162306a36Sopenharmony_ci.. SPDX-License-Identifier: GPL-2.0+
262306a36Sopenharmony_ci
362306a36Sopenharmony_ci
462306a36Sopenharmony_ci==========
562306a36Sopenharmony_ciMaple Tree
662306a36Sopenharmony_ci==========
762306a36Sopenharmony_ci
862306a36Sopenharmony_ci:Author: Liam R. Howlett
962306a36Sopenharmony_ci
1062306a36Sopenharmony_ciOverview
1162306a36Sopenharmony_ci========
1262306a36Sopenharmony_ci
1362306a36Sopenharmony_ciThe Maple Tree is a B-Tree data type which is optimized for storing
1462306a36Sopenharmony_cinon-overlapping ranges, including ranges of size 1.  The tree was designed to
1562306a36Sopenharmony_cibe simple to use and does not require a user written search method.  It
1662306a36Sopenharmony_cisupports iterating over a range of entries and going to the previous or next
1762306a36Sopenharmony_cientry in a cache-efficient manner.  The tree can also be put into an RCU-safe
1862306a36Sopenharmony_cimode of operation which allows reading and writing concurrently.  Writers must
1962306a36Sopenharmony_cisynchronize on a lock, which can be the default spinlock, or the user can set
2062306a36Sopenharmony_cithe lock to an external lock of a different type.
2162306a36Sopenharmony_ci
2262306a36Sopenharmony_ciThe Maple Tree maintains a small memory footprint and was designed to use
2362306a36Sopenharmony_cimodern processor cache efficiently.  The majority of the users will be able to
2462306a36Sopenharmony_ciuse the normal API.  An :ref:`maple-tree-advanced-api` exists for more complex
2562306a36Sopenharmony_ciscenarios.  The most important usage of the Maple Tree is the tracking of the
2662306a36Sopenharmony_civirtual memory areas.
2762306a36Sopenharmony_ci
2862306a36Sopenharmony_ciThe Maple Tree can store values between ``0`` and ``ULONG_MAX``.  The Maple
2962306a36Sopenharmony_ciTree reserves values with the bottom two bits set to '10' which are below 4096
3062306a36Sopenharmony_ci(ie 2, 6, 10 .. 4094) for internal use.  If the entries may use reserved
3162306a36Sopenharmony_cientries then the users can convert the entries using xa_mk_value() and convert
3262306a36Sopenharmony_cithem back by calling xa_to_value().  If the user needs to use a reserved
3362306a36Sopenharmony_civalue, then the user can convert the value when using the
3462306a36Sopenharmony_ci:ref:`maple-tree-advanced-api`, but are blocked by the normal API.
3562306a36Sopenharmony_ci
3662306a36Sopenharmony_ciThe Maple Tree can also be configured to support searching for a gap of a given
3762306a36Sopenharmony_cisize (or larger).
3862306a36Sopenharmony_ci
3962306a36Sopenharmony_ciPre-allocating of nodes is also supported using the
4062306a36Sopenharmony_ci:ref:`maple-tree-advanced-api`.  This is useful for users who must guarantee a
4162306a36Sopenharmony_cisuccessful store operation within a given
4262306a36Sopenharmony_cicode segment when allocating cannot be done.  Allocations of nodes are
4362306a36Sopenharmony_cirelatively small at around 256 bytes.
4462306a36Sopenharmony_ci
4562306a36Sopenharmony_ci.. _maple-tree-normal-api:
4662306a36Sopenharmony_ci
4762306a36Sopenharmony_ciNormal API
4862306a36Sopenharmony_ci==========
4962306a36Sopenharmony_ci
5062306a36Sopenharmony_ciStart by initialising a maple tree, either with DEFINE_MTREE() for statically
5162306a36Sopenharmony_ciallocated maple trees or mt_init() for dynamically allocated ones.  A
5262306a36Sopenharmony_cifreshly-initialised maple tree contains a ``NULL`` pointer for the range ``0``
5362306a36Sopenharmony_ci- ``ULONG_MAX``.  There are currently two types of maple trees supported: the
5462306a36Sopenharmony_ciallocation tree and the regular tree.  The regular tree has a higher branching
5562306a36Sopenharmony_cifactor for internal nodes.  The allocation tree has a lower branching factor
5662306a36Sopenharmony_cibut allows the user to search for a gap of a given size or larger from either
5762306a36Sopenharmony_ci``0`` upwards or ``ULONG_MAX`` down.  An allocation tree can be used by
5862306a36Sopenharmony_cipassing in the ``MT_FLAGS_ALLOC_RANGE`` flag when initialising the tree.
5962306a36Sopenharmony_ci
6062306a36Sopenharmony_ciYou can then set entries using mtree_store() or mtree_store_range().
6162306a36Sopenharmony_cimtree_store() will overwrite any entry with the new entry and return 0 on
6262306a36Sopenharmony_cisuccess or an error code otherwise.  mtree_store_range() works in the same way
6362306a36Sopenharmony_cibut takes a range.  mtree_load() is used to retrieve the entry stored at a
6462306a36Sopenharmony_cigiven index.  You can use mtree_erase() to erase an entire range by only
6562306a36Sopenharmony_ciknowing one value within that range, or mtree_store() call with an entry of
6662306a36Sopenharmony_ciNULL may be used to partially erase a range or many ranges at once.
6762306a36Sopenharmony_ci
6862306a36Sopenharmony_ciIf you want to only store a new entry to a range (or index) if that range is
6962306a36Sopenharmony_cicurrently ``NULL``, you can use mtree_insert_range() or mtree_insert() which
7062306a36Sopenharmony_cireturn -EEXIST if the range is not empty.
7162306a36Sopenharmony_ci
7262306a36Sopenharmony_ciYou can search for an entry from an index upwards by using mt_find().
7362306a36Sopenharmony_ci
7462306a36Sopenharmony_ciYou can walk each entry within a range by calling mt_for_each().  You must
7562306a36Sopenharmony_ciprovide a temporary variable to store a cursor.  If you want to walk each
7662306a36Sopenharmony_cielement of the tree then ``0`` and ``ULONG_MAX`` may be used as the range.  If
7762306a36Sopenharmony_cithe caller is going to hold the lock for the duration of the walk then it is
7862306a36Sopenharmony_ciworth looking at the mas_for_each() API in the :ref:`maple-tree-advanced-api`
7962306a36Sopenharmony_cisection.
8062306a36Sopenharmony_ci
8162306a36Sopenharmony_ciSometimes it is necessary to ensure the next call to store to a maple tree does
8262306a36Sopenharmony_cinot allocate memory, please see :ref:`maple-tree-advanced-api` for this use case.
8362306a36Sopenharmony_ci
8462306a36Sopenharmony_ciFinally, you can remove all entries from a maple tree by calling
8562306a36Sopenharmony_cimtree_destroy().  If the maple tree entries are pointers, you may wish to free
8662306a36Sopenharmony_cithe entries first.
8762306a36Sopenharmony_ci
8862306a36Sopenharmony_ciAllocating Nodes
8962306a36Sopenharmony_ci----------------
9062306a36Sopenharmony_ci
9162306a36Sopenharmony_ciThe allocations are handled by the internal tree code.  See
9262306a36Sopenharmony_ci:ref:`maple-tree-advanced-alloc` for other options.
9362306a36Sopenharmony_ci
9462306a36Sopenharmony_ciLocking
9562306a36Sopenharmony_ci-------
9662306a36Sopenharmony_ci
9762306a36Sopenharmony_ciYou do not have to worry about locking.  See :ref:`maple-tree-advanced-locks`
9862306a36Sopenharmony_cifor other options.
9962306a36Sopenharmony_ci
10062306a36Sopenharmony_ciThe Maple Tree uses RCU and an internal spinlock to synchronise access:
10162306a36Sopenharmony_ci
10262306a36Sopenharmony_ciTakes RCU read lock:
10362306a36Sopenharmony_ci * mtree_load()
10462306a36Sopenharmony_ci * mt_find()
10562306a36Sopenharmony_ci * mt_for_each()
10662306a36Sopenharmony_ci * mt_next()
10762306a36Sopenharmony_ci * mt_prev()
10862306a36Sopenharmony_ci
10962306a36Sopenharmony_ciTakes ma_lock internally:
11062306a36Sopenharmony_ci * mtree_store()
11162306a36Sopenharmony_ci * mtree_store_range()
11262306a36Sopenharmony_ci * mtree_insert()
11362306a36Sopenharmony_ci * mtree_insert_range()
11462306a36Sopenharmony_ci * mtree_erase()
11562306a36Sopenharmony_ci * mtree_destroy()
11662306a36Sopenharmony_ci * mt_set_in_rcu()
11762306a36Sopenharmony_ci * mt_clear_in_rcu()
11862306a36Sopenharmony_ci
11962306a36Sopenharmony_ciIf you want to take advantage of the internal lock to protect the data
12062306a36Sopenharmony_cistructures that you are storing in the Maple Tree, you can call mtree_lock()
12162306a36Sopenharmony_cibefore calling mtree_load(), then take a reference count on the object you
12262306a36Sopenharmony_cihave found before calling mtree_unlock().  This will prevent stores from
12362306a36Sopenharmony_ciremoving the object from the tree between looking up the object and
12462306a36Sopenharmony_ciincrementing the refcount.  You can also use RCU to avoid dereferencing
12562306a36Sopenharmony_cifreed memory, but an explanation of that is beyond the scope of this
12662306a36Sopenharmony_cidocument.
12762306a36Sopenharmony_ci
12862306a36Sopenharmony_ci.. _maple-tree-advanced-api:
12962306a36Sopenharmony_ci
13062306a36Sopenharmony_ciAdvanced API
13162306a36Sopenharmony_ci============
13262306a36Sopenharmony_ci
13362306a36Sopenharmony_ciThe advanced API offers more flexibility and better performance at the
13462306a36Sopenharmony_cicost of an interface which can be harder to use and has fewer safeguards.
13562306a36Sopenharmony_ciYou must take care of your own locking while using the advanced API.
13662306a36Sopenharmony_ciYou can use the ma_lock, RCU or an external lock for protection.
13762306a36Sopenharmony_ciYou can mix advanced and normal operations on the same array, as long
13862306a36Sopenharmony_cias the locking is compatible.  The :ref:`maple-tree-normal-api` is implemented
13962306a36Sopenharmony_ciin terms of the advanced API.
14062306a36Sopenharmony_ci
14162306a36Sopenharmony_ciThe advanced API is based around the ma_state, this is where the 'mas'
14262306a36Sopenharmony_ciprefix originates.  The ma_state struct keeps track of tree operations to make
14362306a36Sopenharmony_cilife easier for both internal and external tree users.
14462306a36Sopenharmony_ci
14562306a36Sopenharmony_ciInitialising the maple tree is the same as in the :ref:`maple-tree-normal-api`.
14662306a36Sopenharmony_ciPlease see above.
14762306a36Sopenharmony_ci
14862306a36Sopenharmony_ciThe maple state keeps track of the range start and end in mas->index and
14962306a36Sopenharmony_cimas->last, respectively.
15062306a36Sopenharmony_ci
15162306a36Sopenharmony_cimas_walk() will walk the tree to the location of mas->index and set the
15262306a36Sopenharmony_cimas->index and mas->last according to the range for the entry.
15362306a36Sopenharmony_ci
15462306a36Sopenharmony_ciYou can set entries using mas_store().  mas_store() will overwrite any entry
15562306a36Sopenharmony_ciwith the new entry and return the first existing entry that is overwritten.
15662306a36Sopenharmony_ciThe range is passed in as members of the maple state: index and last.
15762306a36Sopenharmony_ci
15862306a36Sopenharmony_ciYou can use mas_erase() to erase an entire range by setting index and
15962306a36Sopenharmony_cilast of the maple state to the desired range to erase.  This will erase
16062306a36Sopenharmony_cithe first range that is found in that range, set the maple state index
16162306a36Sopenharmony_ciand last as the range that was erased and return the entry that existed
16262306a36Sopenharmony_ciat that location.
16362306a36Sopenharmony_ci
16462306a36Sopenharmony_ciYou can walk each entry within a range by using mas_for_each().  If you want
16562306a36Sopenharmony_cito walk each element of the tree then ``0`` and ``ULONG_MAX`` may be used as
16662306a36Sopenharmony_cithe range.  If the lock needs to be periodically dropped, see the locking
16762306a36Sopenharmony_cisection mas_pause().
16862306a36Sopenharmony_ci
16962306a36Sopenharmony_ciUsing a maple state allows mas_next() and mas_prev() to function as if the
17062306a36Sopenharmony_citree was a linked list.  With such a high branching factor the amortized
17162306a36Sopenharmony_ciperformance penalty is outweighed by cache optimization.  mas_next() will
17262306a36Sopenharmony_cireturn the next entry which occurs after the entry at index.  mas_prev()
17362306a36Sopenharmony_ciwill return the previous entry which occurs before the entry at index.
17462306a36Sopenharmony_ci
17562306a36Sopenharmony_cimas_find() will find the first entry which exists at or above index on
17662306a36Sopenharmony_cithe first call, and the next entry from every subsequent calls.
17762306a36Sopenharmony_ci
17862306a36Sopenharmony_cimas_find_rev() will find the fist entry which exists at or below the last on
17962306a36Sopenharmony_cithe first call, and the previous entry from every subsequent calls.
18062306a36Sopenharmony_ci
18162306a36Sopenharmony_ciIf the user needs to yield the lock during an operation, then the maple state
18262306a36Sopenharmony_cimust be paused using mas_pause().
18362306a36Sopenharmony_ci
18462306a36Sopenharmony_ciThere are a few extra interfaces provided when using an allocation tree.
18562306a36Sopenharmony_ciIf you wish to search for a gap within a range, then mas_empty_area()
18662306a36Sopenharmony_cior mas_empty_area_rev() can be used.  mas_empty_area() searches for a gap
18762306a36Sopenharmony_cistarting at the lowest index given up to the maximum of the range.
18862306a36Sopenharmony_cimas_empty_area_rev() searches for a gap starting at the highest index given
18962306a36Sopenharmony_ciand continues downward to the lower bound of the range.
19062306a36Sopenharmony_ci
19162306a36Sopenharmony_ci.. _maple-tree-advanced-alloc:
19262306a36Sopenharmony_ci
19362306a36Sopenharmony_ciAdvanced Allocating Nodes
19462306a36Sopenharmony_ci-------------------------
19562306a36Sopenharmony_ci
19662306a36Sopenharmony_ciAllocations are usually handled internally to the tree, however if allocations
19762306a36Sopenharmony_cineed to occur before a write occurs then calling mas_expected_entries() will
19862306a36Sopenharmony_ciallocate the worst-case number of needed nodes to insert the provided number of
19962306a36Sopenharmony_ciranges.  This also causes the tree to enter mass insertion mode.  Once
20062306a36Sopenharmony_ciinsertions are complete calling mas_destroy() on the maple state will free the
20162306a36Sopenharmony_ciunused allocations.
20262306a36Sopenharmony_ci
20362306a36Sopenharmony_ci.. _maple-tree-advanced-locks:
20462306a36Sopenharmony_ci
20562306a36Sopenharmony_ciAdvanced Locking
20662306a36Sopenharmony_ci----------------
20762306a36Sopenharmony_ci
20862306a36Sopenharmony_ciThe maple tree uses a spinlock by default, but external locks can be used for
20962306a36Sopenharmony_citree updates as well.  To use an external lock, the tree must be initialized
21062306a36Sopenharmony_ciwith the ``MT_FLAGS_LOCK_EXTERN flag``, this is usually done with the
21162306a36Sopenharmony_ciMTREE_INIT_EXT() #define, which takes an external lock as an argument.
21262306a36Sopenharmony_ci
21362306a36Sopenharmony_ciFunctions and structures
21462306a36Sopenharmony_ci========================
21562306a36Sopenharmony_ci
21662306a36Sopenharmony_ci.. kernel-doc:: include/linux/maple_tree.h
21762306a36Sopenharmony_ci.. kernel-doc:: lib/maple_tree.c
218