17db96d56Sopenharmony_ci:mod:`bisect` --- Array bisection algorithm 27db96d56Sopenharmony_ci=========================================== 37db96d56Sopenharmony_ci 47db96d56Sopenharmony_ci.. module:: bisect 57db96d56Sopenharmony_ci :synopsis: Array bisection algorithms for binary searching. 67db96d56Sopenharmony_ci.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org> 77db96d56Sopenharmony_ci.. sectionauthor:: Raymond Hettinger <python at rcn.com> 87db96d56Sopenharmony_ci.. example based on the PyModules FAQ entry by Aaron Watters <arw@pythonpros.com> 97db96d56Sopenharmony_ci 107db96d56Sopenharmony_ci**Source code:** :source:`Lib/bisect.py` 117db96d56Sopenharmony_ci 127db96d56Sopenharmony_ci-------------- 137db96d56Sopenharmony_ci 147db96d56Sopenharmony_ciThis module provides support for maintaining a list in sorted order without 157db96d56Sopenharmony_cihaving to sort the list after each insertion. For long lists of items with 167db96d56Sopenharmony_ciexpensive comparison operations, this can be an improvement over the more common 177db96d56Sopenharmony_ciapproach. The module is called :mod:`bisect` because it uses a basic bisection 187db96d56Sopenharmony_cialgorithm to do its work. The source code may be most useful as a working 197db96d56Sopenharmony_ciexample of the algorithm (the boundary conditions are already right!). 207db96d56Sopenharmony_ci 217db96d56Sopenharmony_ci.. _bisect functions: 227db96d56Sopenharmony_ci 237db96d56Sopenharmony_ciThe following functions are provided: 247db96d56Sopenharmony_ci 257db96d56Sopenharmony_ci 267db96d56Sopenharmony_ci.. function:: bisect_left(a, x, lo=0, hi=len(a), *, key=None) 277db96d56Sopenharmony_ci 287db96d56Sopenharmony_ci Locate the insertion point for *x* in *a* to maintain sorted order. 297db96d56Sopenharmony_ci The parameters *lo* and *hi* may be used to specify a subset of the list 307db96d56Sopenharmony_ci which should be considered; by default the entire list is used. If *x* is 317db96d56Sopenharmony_ci already present in *a*, the insertion point will be before (to the left of) 327db96d56Sopenharmony_ci any existing entries. The return value is suitable for use as the first 337db96d56Sopenharmony_ci parameter to ``list.insert()`` assuming that *a* is already sorted. 347db96d56Sopenharmony_ci 357db96d56Sopenharmony_ci The returned insertion point *i* partitions the array *a* into two halves so 367db96d56Sopenharmony_ci that ``all(val < x for val in a[lo : i])`` for the left side and 377db96d56Sopenharmony_ci ``all(val >= x for val in a[i : hi])`` for the right side. 387db96d56Sopenharmony_ci 397db96d56Sopenharmony_ci *key* specifies a :term:`key function` of one argument that is used to 407db96d56Sopenharmony_ci extract a comparison key from each element in the array. To support 417db96d56Sopenharmony_ci searching complex records, the key function is not applied to the *x* value. 427db96d56Sopenharmony_ci 437db96d56Sopenharmony_ci If *key* is ``None``, the elements are compared directly with no 447db96d56Sopenharmony_ci intervening function call. 457db96d56Sopenharmony_ci 467db96d56Sopenharmony_ci .. versionchanged:: 3.10 477db96d56Sopenharmony_ci Added the *key* parameter. 487db96d56Sopenharmony_ci 497db96d56Sopenharmony_ci 507db96d56Sopenharmony_ci.. function:: bisect_right(a, x, lo=0, hi=len(a), *, key=None) 517db96d56Sopenharmony_ci bisect(a, x, lo=0, hi=len(a), *, key=None) 527db96d56Sopenharmony_ci 537db96d56Sopenharmony_ci Similar to :py:func:`~bisect.bisect_left`, but returns an insertion point which comes 547db96d56Sopenharmony_ci after (to the right of) any existing entries of *x* in *a*. 557db96d56Sopenharmony_ci 567db96d56Sopenharmony_ci The returned insertion point *i* partitions the array *a* into two halves so 577db96d56Sopenharmony_ci that ``all(val <= x for val in a[lo : i])`` for the left side and 587db96d56Sopenharmony_ci ``all(val > x for val in a[i : hi])`` for the right side. 597db96d56Sopenharmony_ci 607db96d56Sopenharmony_ci *key* specifies a :term:`key function` of one argument that is used to 617db96d56Sopenharmony_ci extract a comparison key from each element in the array. To support 627db96d56Sopenharmony_ci searching complex records, the key function is not applied to the *x* value. 637db96d56Sopenharmony_ci 647db96d56Sopenharmony_ci If *key* is ``None``, the elements are compared directly with no 657db96d56Sopenharmony_ci intervening function call. 667db96d56Sopenharmony_ci 677db96d56Sopenharmony_ci .. versionchanged:: 3.10 687db96d56Sopenharmony_ci Added the *key* parameter. 697db96d56Sopenharmony_ci 707db96d56Sopenharmony_ci 717db96d56Sopenharmony_ci.. function:: insort_left(a, x, lo=0, hi=len(a), *, key=None) 727db96d56Sopenharmony_ci 737db96d56Sopenharmony_ci Insert *x* in *a* in sorted order. 747db96d56Sopenharmony_ci 757db96d56Sopenharmony_ci This function first runs :py:func:`~bisect.bisect_left` to locate an insertion point. 767db96d56Sopenharmony_ci Next, it runs the :meth:`insert` method on *a* to insert *x* at the 777db96d56Sopenharmony_ci appropriate position to maintain sort order. 787db96d56Sopenharmony_ci 797db96d56Sopenharmony_ci To support inserting records in a table, the *key* function (if any) is 807db96d56Sopenharmony_ci applied to *x* for the search step but not for the insertion step. 817db96d56Sopenharmony_ci 827db96d56Sopenharmony_ci Keep in mind that the ``O(log n)`` search is dominated by the slow O(n) 837db96d56Sopenharmony_ci insertion step. 847db96d56Sopenharmony_ci 857db96d56Sopenharmony_ci .. versionchanged:: 3.10 867db96d56Sopenharmony_ci Added the *key* parameter. 877db96d56Sopenharmony_ci 887db96d56Sopenharmony_ci 897db96d56Sopenharmony_ci.. function:: insort_right(a, x, lo=0, hi=len(a), *, key=None) 907db96d56Sopenharmony_ci insort(a, x, lo=0, hi=len(a), *, key=None) 917db96d56Sopenharmony_ci 927db96d56Sopenharmony_ci Similar to :py:func:`~bisect.insort_left`, but inserting *x* in *a* after any existing 937db96d56Sopenharmony_ci entries of *x*. 947db96d56Sopenharmony_ci 957db96d56Sopenharmony_ci This function first runs :py:func:`~bisect.bisect_right` to locate an insertion point. 967db96d56Sopenharmony_ci Next, it runs the :meth:`insert` method on *a* to insert *x* at the 977db96d56Sopenharmony_ci appropriate position to maintain sort order. 987db96d56Sopenharmony_ci 997db96d56Sopenharmony_ci To support inserting records in a table, the *key* function (if any) is 1007db96d56Sopenharmony_ci applied to *x* for the search step but not for the insertion step. 1017db96d56Sopenharmony_ci 1027db96d56Sopenharmony_ci Keep in mind that the ``O(log n)`` search is dominated by the slow O(n) 1037db96d56Sopenharmony_ci insertion step. 1047db96d56Sopenharmony_ci 1057db96d56Sopenharmony_ci .. versionchanged:: 3.10 1067db96d56Sopenharmony_ci Added the *key* parameter. 1077db96d56Sopenharmony_ci 1087db96d56Sopenharmony_ci 1097db96d56Sopenharmony_ciPerformance Notes 1107db96d56Sopenharmony_ci----------------- 1117db96d56Sopenharmony_ci 1127db96d56Sopenharmony_ciWhen writing time sensitive code using *bisect()* and *insort()*, keep these 1137db96d56Sopenharmony_cithoughts in mind: 1147db96d56Sopenharmony_ci 1157db96d56Sopenharmony_ci* Bisection is effective for searching ranges of values. 1167db96d56Sopenharmony_ci For locating specific values, dictionaries are more performant. 1177db96d56Sopenharmony_ci 1187db96d56Sopenharmony_ci* The *insort()* functions are ``O(n)`` because the logarithmic search step 1197db96d56Sopenharmony_ci is dominated by the linear time insertion step. 1207db96d56Sopenharmony_ci 1217db96d56Sopenharmony_ci* The search functions are stateless and discard key function results after 1227db96d56Sopenharmony_ci they are used. Consequently, if the search functions are used in a loop, 1237db96d56Sopenharmony_ci the key function may be called again and again on the same array elements. 1247db96d56Sopenharmony_ci If the key function isn't fast, consider wrapping it with 1257db96d56Sopenharmony_ci :py:func:`functools.cache` to avoid duplicate computations. Alternatively, 1267db96d56Sopenharmony_ci consider searching an array of precomputed keys to locate the insertion 1277db96d56Sopenharmony_ci point (as shown in the examples section below). 1287db96d56Sopenharmony_ci 1297db96d56Sopenharmony_ci.. seealso:: 1307db96d56Sopenharmony_ci 1317db96d56Sopenharmony_ci * `Sorted Collections 1327db96d56Sopenharmony_ci <https://grantjenks.com/docs/sortedcollections/>`_ is a high performance 1337db96d56Sopenharmony_ci module that uses *bisect* to managed sorted collections of data. 1347db96d56Sopenharmony_ci 1357db96d56Sopenharmony_ci * The `SortedCollection recipe 1367db96d56Sopenharmony_ci <https://code.activestate.com/recipes/577197-sortedcollection/>`_ uses 1377db96d56Sopenharmony_ci bisect to build a full-featured collection class with straight-forward search 1387db96d56Sopenharmony_ci methods and support for a key-function. The keys are precomputed to save 1397db96d56Sopenharmony_ci unnecessary calls to the key function during searches. 1407db96d56Sopenharmony_ci 1417db96d56Sopenharmony_ci 1427db96d56Sopenharmony_ciSearching Sorted Lists 1437db96d56Sopenharmony_ci---------------------- 1447db96d56Sopenharmony_ci 1457db96d56Sopenharmony_ciThe above `bisect functions`_ are useful for finding insertion points but 1467db96d56Sopenharmony_cican be tricky or awkward to use for common searching tasks. The following five 1477db96d56Sopenharmony_cifunctions show how to transform them into the standard lookups for sorted 1487db96d56Sopenharmony_cilists:: 1497db96d56Sopenharmony_ci 1507db96d56Sopenharmony_ci def index(a, x): 1517db96d56Sopenharmony_ci 'Locate the leftmost value exactly equal to x' 1527db96d56Sopenharmony_ci i = bisect_left(a, x) 1537db96d56Sopenharmony_ci if i != len(a) and a[i] == x: 1547db96d56Sopenharmony_ci return i 1557db96d56Sopenharmony_ci raise ValueError 1567db96d56Sopenharmony_ci 1577db96d56Sopenharmony_ci def find_lt(a, x): 1587db96d56Sopenharmony_ci 'Find rightmost value less than x' 1597db96d56Sopenharmony_ci i = bisect_left(a, x) 1607db96d56Sopenharmony_ci if i: 1617db96d56Sopenharmony_ci return a[i-1] 1627db96d56Sopenharmony_ci raise ValueError 1637db96d56Sopenharmony_ci 1647db96d56Sopenharmony_ci def find_le(a, x): 1657db96d56Sopenharmony_ci 'Find rightmost value less than or equal to x' 1667db96d56Sopenharmony_ci i = bisect_right(a, x) 1677db96d56Sopenharmony_ci if i: 1687db96d56Sopenharmony_ci return a[i-1] 1697db96d56Sopenharmony_ci raise ValueError 1707db96d56Sopenharmony_ci 1717db96d56Sopenharmony_ci def find_gt(a, x): 1727db96d56Sopenharmony_ci 'Find leftmost value greater than x' 1737db96d56Sopenharmony_ci i = bisect_right(a, x) 1747db96d56Sopenharmony_ci if i != len(a): 1757db96d56Sopenharmony_ci return a[i] 1767db96d56Sopenharmony_ci raise ValueError 1777db96d56Sopenharmony_ci 1787db96d56Sopenharmony_ci def find_ge(a, x): 1797db96d56Sopenharmony_ci 'Find leftmost item greater than or equal to x' 1807db96d56Sopenharmony_ci i = bisect_left(a, x) 1817db96d56Sopenharmony_ci if i != len(a): 1827db96d56Sopenharmony_ci return a[i] 1837db96d56Sopenharmony_ci raise ValueError 1847db96d56Sopenharmony_ci 1857db96d56Sopenharmony_ci 1867db96d56Sopenharmony_ciExamples 1877db96d56Sopenharmony_ci-------- 1887db96d56Sopenharmony_ci 1897db96d56Sopenharmony_ci.. _bisect-example: 1907db96d56Sopenharmony_ci 1917db96d56Sopenharmony_ciThe :py:func:`~bisect.bisect` function can be useful for numeric table lookups. This 1927db96d56Sopenharmony_ciexample uses :py:func:`~bisect.bisect` to look up a letter grade for an exam score (say) 1937db96d56Sopenharmony_cibased on a set of ordered numeric breakpoints: 90 and up is an 'A', 80 to 89 is 1947db96d56Sopenharmony_cia 'B', and so on:: 1957db96d56Sopenharmony_ci 1967db96d56Sopenharmony_ci >>> def grade(score, breakpoints=[60, 70, 80, 90], grades='FDCBA'): 1977db96d56Sopenharmony_ci ... i = bisect(breakpoints, score) 1987db96d56Sopenharmony_ci ... return grades[i] 1997db96d56Sopenharmony_ci ... 2007db96d56Sopenharmony_ci >>> [grade(score) for score in [33, 99, 77, 70, 89, 90, 100]] 2017db96d56Sopenharmony_ci ['F', 'A', 'C', 'C', 'B', 'A', 'A'] 2027db96d56Sopenharmony_ci 2037db96d56Sopenharmony_ciThe :py:func:`~bisect.bisect` and :py:func:`~bisect.insort` functions also work with 2047db96d56Sopenharmony_cilists of tuples. The *key* argument can serve to extract the field used for ordering 2057db96d56Sopenharmony_cirecords in a table:: 2067db96d56Sopenharmony_ci 2077db96d56Sopenharmony_ci >>> from collections import namedtuple 2087db96d56Sopenharmony_ci >>> from operator import attrgetter 2097db96d56Sopenharmony_ci >>> from bisect import bisect, insort 2107db96d56Sopenharmony_ci >>> from pprint import pprint 2117db96d56Sopenharmony_ci 2127db96d56Sopenharmony_ci >>> Movie = namedtuple('Movie', ('name', 'released', 'director')) 2137db96d56Sopenharmony_ci 2147db96d56Sopenharmony_ci >>> movies = [ 2157db96d56Sopenharmony_ci ... Movie('Jaws', 1975, 'Spielberg'), 2167db96d56Sopenharmony_ci ... Movie('Titanic', 1997, 'Cameron'), 2177db96d56Sopenharmony_ci ... Movie('The Birds', 1963, 'Hitchcock'), 2187db96d56Sopenharmony_ci ... Movie('Aliens', 1986, 'Cameron') 2197db96d56Sopenharmony_ci ... ] 2207db96d56Sopenharmony_ci 2217db96d56Sopenharmony_ci >>> # Find the first movie released after 1960 2227db96d56Sopenharmony_ci >>> by_year = attrgetter('released') 2237db96d56Sopenharmony_ci >>> movies.sort(key=by_year) 2247db96d56Sopenharmony_ci >>> movies[bisect(movies, 1960, key=by_year)] 2257db96d56Sopenharmony_ci Movie(name='The Birds', released=1963, director='Hitchcock') 2267db96d56Sopenharmony_ci 2277db96d56Sopenharmony_ci >>> # Insert a movie while maintaining sort order 2287db96d56Sopenharmony_ci >>> romance = Movie('Love Story', 1970, 'Hiller') 2297db96d56Sopenharmony_ci >>> insort(movies, romance, key=by_year) 2307db96d56Sopenharmony_ci >>> pprint(movies) 2317db96d56Sopenharmony_ci [Movie(name='The Birds', released=1963, director='Hitchcock'), 2327db96d56Sopenharmony_ci Movie(name='Love Story', released=1970, director='Hiller'), 2337db96d56Sopenharmony_ci Movie(name='Jaws', released=1975, director='Spielberg'), 2347db96d56Sopenharmony_ci Movie(name='Aliens', released=1986, director='Cameron'), 2357db96d56Sopenharmony_ci Movie(name='Titanic', released=1997, director='Cameron')] 2367db96d56Sopenharmony_ci 2377db96d56Sopenharmony_ciIf the key function is expensive, it is possible to avoid repeated function 2387db96d56Sopenharmony_cicalls by searching a list of precomputed keys to find the index of a record:: 2397db96d56Sopenharmony_ci 2407db96d56Sopenharmony_ci >>> data = [('red', 5), ('blue', 1), ('yellow', 8), ('black', 0)] 2417db96d56Sopenharmony_ci >>> data.sort(key=lambda r: r[1]) # Or use operator.itemgetter(1). 2427db96d56Sopenharmony_ci >>> keys = [r[1] for r in data] # Precompute a list of keys. 2437db96d56Sopenharmony_ci >>> data[bisect_left(keys, 0)] 2447db96d56Sopenharmony_ci ('black', 0) 2457db96d56Sopenharmony_ci >>> data[bisect_left(keys, 1)] 2467db96d56Sopenharmony_ci ('blue', 1) 2477db96d56Sopenharmony_ci >>> data[bisect_left(keys, 5)] 2487db96d56Sopenharmony_ci ('red', 5) 2497db96d56Sopenharmony_ci >>> data[bisect_left(keys, 8)] 2507db96d56Sopenharmony_ci ('yellow', 8) 251