stringbench is a set of performance tests comparing byte string
operations with unicode operations.  The two string implementations
are loosely based on each other and sometimes the algorithm for one is
faster than the other.

This test set was started at the Need For Speed sprint in Reykjavik
to identify which string methods could be sped up quickly and to
identify obvious places for improvement.

Here is an example of a benchmark:


@bench('"Andrew".startswith("A")', 'startswith single character', 1000)
def startswith_single(STR):
    s1 = STR("Andrew")
    s2 = STR("A")
    s1_startswith = s1.startswith
    for x in _RANGE_1000:
        s1_startswith(s2)

The bench decorator takes three parameters.  The first is a short
description of how the code works.  In most cases this is a Python code
snippet.  It is not the code which is actually run because the real
code is hand-optimized to focus on the method being tested.

The second parameter is a group title.  All benchmarks with the same
group title are listed together.  This lets you compare different
implementations of the same algorithm, such as "t in s"
vs. "s.find(t)".

The last is a count.  Each benchmark loops over the algorithm either
100 or 1000 times, depending on the algorithm performance.  The output
time is the time per benchmark call, not per loop iteration, so the
reader needs the count to scale the reported time.

These parameters become function attributes.


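As a rough illustration, a decorator of this kind could attach its three
parameters to the function as shown below.  This is only a sketch: the
attribute names (comment, group, repeat_count) and the helper
group_benchmarks are assumptions made for this example, so check
stringbench.py for the real definitions.

```python
from collections import defaultdict

def bench(comment, group, repeat_count):
    """Attach benchmark metadata to the decorated function as attributes."""
    def decorator(func):
        func.comment = comment            # short description shown in the output
        func.group = group                # title used to list related tests together
        func.repeat_count = repeat_count  # iterations of the inner loop per call
        return func
    return decorator

_RANGE_1000 = range(1000)

@bench('"Andrew".startswith("A")', 'startswith single character', 1000)
def startswith_single(STR):
    s1 = STR("Andrew")
    s2 = STR("A")
    s1_startswith = s1.startswith
    for x in _RANGE_1000:
        s1_startswith(s2)

def group_benchmarks(funcs):
    """Collect benchmark functions under their group title (hypothetical helper)."""
    groups = defaultdict(list)
    for func in funcs:
        groups[func.group].append(func)
    return dict(groups)

print(startswith_single.group, startswith_single.repeat_count)
```

Here STR stands for the string constructor under test.  Passing str works
directly; for byte strings a small wrapper that encodes the text, such as
lambda s: s.encode("ascii"), would be needed.
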
Here is an example of the output:


========== count newlines
38.54   41.60   92.7    ...text.with.2000.newlines.count("\n") (*100)
========== early match, single character
1.14    1.18    96.8    ("A"*1000).find("A") (*1000)
0.44    0.41    105.6   "A" in "A"*1000 (*1000)
1.15    1.17    98.1    ("A"*1000).index("A") (*1000)

The first column is the run time in milliseconds for byte strings.
The second is the run time for unicode strings.  The third is a
percentage: 100 * byte time / unicode time.  A value above 100 means
unicode is faster than byte strings; a value below 100 means it is
slower.

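The third column can be reproduced with a one-line calculation.  The
sketch below uses a hypothetical helper name, relative_percentage, and
the two times from the TOTAL line quoted later in this README.  Values
recomputed from the rounded columns of an individual row may differ
slightly from the printed percentage, presumably because the script
works from unrounded timings.

```python
def relative_percentage(bytes_ms, unicode_ms):
    """Return 100 * byte time / unicode time (hypothetical helper)."""
    return 100.0 * bytes_ms / unicode_ms

# Times taken from the TOTAL line quoted later in this README.
print(round(relative_percentage(4079.83, 5432.25), 1))  # 75.1
```
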
The last column contains the code snippet and the repeat count for the
internal benchmark loop.

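To compare rows with different repeat counts, the reported time can be
divided by the count.  A small sketch, with a helper name invented for
this example:

```python
def per_operation_microseconds(time_ms, repeat_count):
    """Convert a reported per-call time (ms) into time per operation (microseconds)."""
    return time_ms * 1000.0 / repeat_count

# ("A"*1000).find("A") (*1000): 1.14 ms per benchmark call for byte strings
print(per_operation_microseconds(1.14, 1000))
# the count newlines row (*100): 38.54 ms per benchmark call for byte strings
print(per_operation_microseconds(38.54, 100))
```

With the sample output above, a single find call takes about 1.14
microseconds and a single count call about 385.4 microseconds.
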
The times are computed with 'timeit.py', which repeats the test more
and more times until the total time exceeds 0.2 seconds, then returns
the best time for a single iteration.

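The timing strategy described above can be approximated with the
standard timeit module.  This is a sketch under assumptions, not the
code stringbench itself uses: the function name best_time, the
tenfold growth of the loop count, and the extra repeat runs are all
choices made for this example.

```python
import timeit

def best_time(stmt, setup="pass", min_total=0.2, repeat=3):
    """Grow the loop count until one run exceeds min_total seconds, then
    return the best observed time for a single iteration, in seconds."""
    timer = timeit.Timer(stmt, setup)
    number = 1
    while True:
        elapsed = timer.timeit(number)
        if elapsed > min_total:
            break
        number *= 10  # assumed growth factor
    # A few more runs at the same loop count; keep the fastest one.
    times = [elapsed] + timer.repeat(repeat=repeat - 1, number=number)
    return min(times) / number

# A lower threshold than the default keeps this demonstration quick.
print(best_time('"A" in s', setup='s = "A" * 1000', min_total=0.05))
```

Python 3.6 and later also provide timeit.Timer.autorange(), which
picks a loop count in a similar way.
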
The final line of the output is the cumulative time for byte and
unicode strings, and the overall performance of unicode relative to
bytes.  For example:

4079.83 5432.25 75.1    TOTAL

However, this total has no meaning because it weights every test evenly.