18c2ecf20Sopenharmony_ci========================================
28c2ecf20Sopenharmony_ciGeneric Associative Array Implementation
38c2ecf20Sopenharmony_ci========================================
48c2ecf20Sopenharmony_ci
58c2ecf20Sopenharmony_ciOverview
68c2ecf20Sopenharmony_ci========
78c2ecf20Sopenharmony_ci
88c2ecf20Sopenharmony_ciThis associative array implementation is an object container with the following
98c2ecf20Sopenharmony_ciproperties:
108c2ecf20Sopenharmony_ci
118c2ecf20Sopenharmony_ci1. Objects are opaque pointers.  The implementation does not care where they
128c2ecf20Sopenharmony_ci   point (if anywhere) or what they point to (if anything).
138c2ecf20Sopenharmony_ci
148c2ecf20Sopenharmony_ci   .. note::
158c2ecf20Sopenharmony_ci
168c2ecf20Sopenharmony_ci      Pointers to objects _must_ be zero in the least significant bit.
178c2ecf20Sopenharmony_ci
188c2ecf20Sopenharmony_ci2. Objects do not need to contain linkage blocks for use by the array.  This
198c2ecf20Sopenharmony_ci   permits an object to be located in multiple arrays simultaneously.
208c2ecf20Sopenharmony_ci   Rather, the array is made up of metadata blocks that point to objects.
218c2ecf20Sopenharmony_ci
228c2ecf20Sopenharmony_ci3. Objects require index keys to locate them within the array.
238c2ecf20Sopenharmony_ci
248c2ecf20Sopenharmony_ci4. Index keys must be unique.  Inserting an object with the same key as one
258c2ecf20Sopenharmony_ci   already in the array will replace the old object.
268c2ecf20Sopenharmony_ci
278c2ecf20Sopenharmony_ci5. Index keys can be of any length and can be of different lengths.
288c2ecf20Sopenharmony_ci
298c2ecf20Sopenharmony_ci6. Index keys should encode the length early on, before any variation due to
308c2ecf20Sopenharmony_ci   length is seen.
318c2ecf20Sopenharmony_ci
328c2ecf20Sopenharmony_ci7. Index keys can include a hash to scatter objects throughout the array.
338c2ecf20Sopenharmony_ci
348c2ecf20Sopenharmony_ci8. The array can iterated over.  The objects will not necessarily come out in
358c2ecf20Sopenharmony_ci   key order.
368c2ecf20Sopenharmony_ci
378c2ecf20Sopenharmony_ci9. The array can be iterated over while it is being modified, provided the
388c2ecf20Sopenharmony_ci   RCU readlock is being held by the iterator.  Note, however, under these
398c2ecf20Sopenharmony_ci   circumstances, some objects may be seen more than once.  If this is a
408c2ecf20Sopenharmony_ci   problem, the iterator should lock against modification.  Objects will not
418c2ecf20Sopenharmony_ci   be missed, however, unless deleted.
428c2ecf20Sopenharmony_ci
438c2ecf20Sopenharmony_ci10. Objects in the array can be looked up by means of their index key.
448c2ecf20Sopenharmony_ci
458c2ecf20Sopenharmony_ci11. Objects can be looked up while the array is being modified, provided the
468c2ecf20Sopenharmony_ci    RCU readlock is being held by the thread doing the look up.
478c2ecf20Sopenharmony_ci
488c2ecf20Sopenharmony_ciThe implementation uses a tree of 16-pointer nodes internally that are indexed
498c2ecf20Sopenharmony_cion each level by nibbles from the index key in the same manner as in a radix
508c2ecf20Sopenharmony_citree.  To improve memory efficiency, shortcuts can be emplaced to skip over
518c2ecf20Sopenharmony_ciwhat would otherwise be a series of single-occupancy nodes.  Further, nodes
528c2ecf20Sopenharmony_cipack leaf object pointers into spare space in the node rather than making an
538c2ecf20Sopenharmony_ciextra branch until as such time an object needs to be added to a full node.
548c2ecf20Sopenharmony_ci
558c2ecf20Sopenharmony_ci
568c2ecf20Sopenharmony_ciThe Public API
578c2ecf20Sopenharmony_ci==============
588c2ecf20Sopenharmony_ci
598c2ecf20Sopenharmony_ciThe public API can be found in ``<linux/assoc_array.h>``.  The associative
608c2ecf20Sopenharmony_ciarray is rooted on the following structure::
618c2ecf20Sopenharmony_ci
628c2ecf20Sopenharmony_ci    struct assoc_array {
638c2ecf20Sopenharmony_ci            ...
648c2ecf20Sopenharmony_ci    };
658c2ecf20Sopenharmony_ci
668c2ecf20Sopenharmony_ciThe code is selected by enabling ``CONFIG_ASSOCIATIVE_ARRAY`` with::
678c2ecf20Sopenharmony_ci
688c2ecf20Sopenharmony_ci    ./script/config -e ASSOCIATIVE_ARRAY
698c2ecf20Sopenharmony_ci
708c2ecf20Sopenharmony_ci
718c2ecf20Sopenharmony_ciEdit Script
728c2ecf20Sopenharmony_ci-----------
738c2ecf20Sopenharmony_ci
748c2ecf20Sopenharmony_ciThe insertion and deletion functions produce an 'edit script' that can later be
758c2ecf20Sopenharmony_ciapplied to effect the changes without risking ``ENOMEM``. This retains the
768c2ecf20Sopenharmony_cipreallocated metadata blocks that will be installed in the internal tree and
778c2ecf20Sopenharmony_cikeeps track of the metadata blocks that will be removed from the tree when the
788c2ecf20Sopenharmony_ciscript is applied.
798c2ecf20Sopenharmony_ci
808c2ecf20Sopenharmony_ciThis is also used to keep track of dead blocks and dead objects after the
818c2ecf20Sopenharmony_ciscript has been applied so that they can be freed later.  The freeing is done
828c2ecf20Sopenharmony_ciafter an RCU grace period has passed - thus allowing access functions to
838c2ecf20Sopenharmony_ciproceed under the RCU read lock.
848c2ecf20Sopenharmony_ci
858c2ecf20Sopenharmony_ciThe script appears as outside of the API as a pointer of the type::
868c2ecf20Sopenharmony_ci
878c2ecf20Sopenharmony_ci    struct assoc_array_edit;
888c2ecf20Sopenharmony_ci
898c2ecf20Sopenharmony_ciThere are two functions for dealing with the script:
908c2ecf20Sopenharmony_ci
918c2ecf20Sopenharmony_ci1. Apply an edit script::
928c2ecf20Sopenharmony_ci
938c2ecf20Sopenharmony_ci    void assoc_array_apply_edit(struct assoc_array_edit *edit);
948c2ecf20Sopenharmony_ci
958c2ecf20Sopenharmony_ciThis will perform the edit functions, interpolating various write barriers
968c2ecf20Sopenharmony_cito permit accesses under the RCU read lock to continue.  The edit script
978c2ecf20Sopenharmony_ciwill then be passed to ``call_rcu()`` to free it and any dead stuff it points
988c2ecf20Sopenharmony_cito.
998c2ecf20Sopenharmony_ci
1008c2ecf20Sopenharmony_ci2. Cancel an edit script::
1018c2ecf20Sopenharmony_ci
1028c2ecf20Sopenharmony_ci    void assoc_array_cancel_edit(struct assoc_array_edit *edit);
1038c2ecf20Sopenharmony_ci
1048c2ecf20Sopenharmony_ciThis frees the edit script and all preallocated memory immediately. If
1058c2ecf20Sopenharmony_cithis was for insertion, the new object is _not_ released by this function,
1068c2ecf20Sopenharmony_cibut must rather be released by the caller.
1078c2ecf20Sopenharmony_ci
1088c2ecf20Sopenharmony_ciThese functions are guaranteed not to fail.
1098c2ecf20Sopenharmony_ci
1108c2ecf20Sopenharmony_ci
1118c2ecf20Sopenharmony_ciOperations Table
1128c2ecf20Sopenharmony_ci----------------
1138c2ecf20Sopenharmony_ci
1148c2ecf20Sopenharmony_ciVarious functions take a table of operations::
1158c2ecf20Sopenharmony_ci
1168c2ecf20Sopenharmony_ci    struct assoc_array_ops {
1178c2ecf20Sopenharmony_ci            ...
1188c2ecf20Sopenharmony_ci    };
1198c2ecf20Sopenharmony_ci
1208c2ecf20Sopenharmony_ciThis points to a number of methods, all of which need to be provided:
1218c2ecf20Sopenharmony_ci
1228c2ecf20Sopenharmony_ci1. Get a chunk of index key from caller data::
1238c2ecf20Sopenharmony_ci
1248c2ecf20Sopenharmony_ci    unsigned long (*get_key_chunk)(const void *index_key, int level);
1258c2ecf20Sopenharmony_ci
1268c2ecf20Sopenharmony_ciThis should return a chunk of caller-supplied index key starting at the
1278c2ecf20Sopenharmony_ci*bit* position given by the level argument.  The level argument will be a
1288c2ecf20Sopenharmony_cimultiple of ``ASSOC_ARRAY_KEY_CHUNK_SIZE`` and the function should return
1298c2ecf20Sopenharmony_ci``ASSOC_ARRAY_KEY_CHUNK_SIZE bits``.  No error is possible.
1308c2ecf20Sopenharmony_ci
1318c2ecf20Sopenharmony_ci
1328c2ecf20Sopenharmony_ci2. Get a chunk of an object's index key::
1338c2ecf20Sopenharmony_ci
1348c2ecf20Sopenharmony_ci    unsigned long (*get_object_key_chunk)(const void *object, int level);
1358c2ecf20Sopenharmony_ci
1368c2ecf20Sopenharmony_ciAs the previous function, but gets its data from an object in the array
1378c2ecf20Sopenharmony_cirather than from a caller-supplied index key.
1388c2ecf20Sopenharmony_ci
1398c2ecf20Sopenharmony_ci
1408c2ecf20Sopenharmony_ci3. See if this is the object we're looking for::
1418c2ecf20Sopenharmony_ci
1428c2ecf20Sopenharmony_ci    bool (*compare_object)(const void *object, const void *index_key);
1438c2ecf20Sopenharmony_ci
1448c2ecf20Sopenharmony_ciCompare the object against an index key and return ``true`` if it matches and
1458c2ecf20Sopenharmony_ci``false`` if it doesn't.
1468c2ecf20Sopenharmony_ci
1478c2ecf20Sopenharmony_ci
1488c2ecf20Sopenharmony_ci4. Diff the index keys of two objects::
1498c2ecf20Sopenharmony_ci
1508c2ecf20Sopenharmony_ci    int (*diff_objects)(const void *object, const void *index_key);
1518c2ecf20Sopenharmony_ci
1528c2ecf20Sopenharmony_ciReturn the bit position at which the index key of the specified object
1538c2ecf20Sopenharmony_cidiffers from the given index key or -1 if they are the same.
1548c2ecf20Sopenharmony_ci
1558c2ecf20Sopenharmony_ci
1568c2ecf20Sopenharmony_ci5. Free an object::
1578c2ecf20Sopenharmony_ci
1588c2ecf20Sopenharmony_ci    void (*free_object)(void *object);
1598c2ecf20Sopenharmony_ci
1608c2ecf20Sopenharmony_ciFree the specified object.  Note that this may be called an RCU grace period
1618c2ecf20Sopenharmony_ciafter ``assoc_array_apply_edit()`` was called, so ``synchronize_rcu()`` may be
1628c2ecf20Sopenharmony_cinecessary on module unloading.
1638c2ecf20Sopenharmony_ci
1648c2ecf20Sopenharmony_ci
1658c2ecf20Sopenharmony_ciManipulation Functions
1668c2ecf20Sopenharmony_ci----------------------
1678c2ecf20Sopenharmony_ci
1688c2ecf20Sopenharmony_ciThere are a number of functions for manipulating an associative array:
1698c2ecf20Sopenharmony_ci
1708c2ecf20Sopenharmony_ci1. Initialise an associative array::
1718c2ecf20Sopenharmony_ci
1728c2ecf20Sopenharmony_ci    void assoc_array_init(struct assoc_array *array);
1738c2ecf20Sopenharmony_ci
1748c2ecf20Sopenharmony_ciThis initialises the base structure for an associative array.  It can't fail.
1758c2ecf20Sopenharmony_ci
1768c2ecf20Sopenharmony_ci
1778c2ecf20Sopenharmony_ci2. Insert/replace an object in an associative array::
1788c2ecf20Sopenharmony_ci
1798c2ecf20Sopenharmony_ci    struct assoc_array_edit *
1808c2ecf20Sopenharmony_ci    assoc_array_insert(struct assoc_array *array,
1818c2ecf20Sopenharmony_ci                       const struct assoc_array_ops *ops,
1828c2ecf20Sopenharmony_ci                       const void *index_key,
1838c2ecf20Sopenharmony_ci                       void *object);
1848c2ecf20Sopenharmony_ci
1858c2ecf20Sopenharmony_ciThis inserts the given object into the array.  Note that the least
1868c2ecf20Sopenharmony_cisignificant bit of the pointer must be zero as it's used to type-mark
1878c2ecf20Sopenharmony_cipointers internally.
1888c2ecf20Sopenharmony_ci
1898c2ecf20Sopenharmony_ciIf an object already exists for that key then it will be replaced with the
1908c2ecf20Sopenharmony_cinew object and the old one will be freed automatically.
1918c2ecf20Sopenharmony_ci
1928c2ecf20Sopenharmony_ciThe ``index_key`` argument should hold index key information and is
1938c2ecf20Sopenharmony_cipassed to the methods in the ops table when they are called.
1948c2ecf20Sopenharmony_ci
1958c2ecf20Sopenharmony_ciThis function makes no alteration to the array itself, but rather returns
1968c2ecf20Sopenharmony_cian edit script that must be applied.  ``-ENOMEM`` is returned in the case of
1978c2ecf20Sopenharmony_cian out-of-memory error.
1988c2ecf20Sopenharmony_ci
1998c2ecf20Sopenharmony_ciThe caller should lock exclusively against other modifiers of the array.
2008c2ecf20Sopenharmony_ci
2018c2ecf20Sopenharmony_ci
2028c2ecf20Sopenharmony_ci3. Delete an object from an associative array::
2038c2ecf20Sopenharmony_ci
2048c2ecf20Sopenharmony_ci    struct assoc_array_edit *
2058c2ecf20Sopenharmony_ci    assoc_array_delete(struct assoc_array *array,
2068c2ecf20Sopenharmony_ci                       const struct assoc_array_ops *ops,
2078c2ecf20Sopenharmony_ci                       const void *index_key);
2088c2ecf20Sopenharmony_ci
2098c2ecf20Sopenharmony_ciThis deletes an object that matches the specified data from the array.
2108c2ecf20Sopenharmony_ci
2118c2ecf20Sopenharmony_ciThe ``index_key`` argument should hold index key information and is
2128c2ecf20Sopenharmony_cipassed to the methods in the ops table when they are called.
2138c2ecf20Sopenharmony_ci
2148c2ecf20Sopenharmony_ciThis function makes no alteration to the array itself, but rather returns
2158c2ecf20Sopenharmony_cian edit script that must be applied.  ``-ENOMEM`` is returned in the case of
2168c2ecf20Sopenharmony_cian out-of-memory error.  ``NULL`` will be returned if the specified object is
2178c2ecf20Sopenharmony_cinot found within the array.
2188c2ecf20Sopenharmony_ci
2198c2ecf20Sopenharmony_ciThe caller should lock exclusively against other modifiers of the array.
2208c2ecf20Sopenharmony_ci
2218c2ecf20Sopenharmony_ci
2228c2ecf20Sopenharmony_ci4. Delete all objects from an associative array::
2238c2ecf20Sopenharmony_ci
2248c2ecf20Sopenharmony_ci    struct assoc_array_edit *
2258c2ecf20Sopenharmony_ci    assoc_array_clear(struct assoc_array *array,
2268c2ecf20Sopenharmony_ci                      const struct assoc_array_ops *ops);
2278c2ecf20Sopenharmony_ci
2288c2ecf20Sopenharmony_ciThis deletes all the objects from an associative array and leaves it
2298c2ecf20Sopenharmony_cicompletely empty.
2308c2ecf20Sopenharmony_ci
2318c2ecf20Sopenharmony_ciThis function makes no alteration to the array itself, but rather returns
2328c2ecf20Sopenharmony_cian edit script that must be applied.  ``-ENOMEM`` is returned in the case of
2338c2ecf20Sopenharmony_cian out-of-memory error.
2348c2ecf20Sopenharmony_ci
2358c2ecf20Sopenharmony_ciThe caller should lock exclusively against other modifiers of the array.
2368c2ecf20Sopenharmony_ci
2378c2ecf20Sopenharmony_ci
2388c2ecf20Sopenharmony_ci5. Destroy an associative array, deleting all objects::
2398c2ecf20Sopenharmony_ci
2408c2ecf20Sopenharmony_ci    void assoc_array_destroy(struct assoc_array *array,
2418c2ecf20Sopenharmony_ci                             const struct assoc_array_ops *ops);
2428c2ecf20Sopenharmony_ci
2438c2ecf20Sopenharmony_ciThis destroys the contents of the associative array and leaves it
2448c2ecf20Sopenharmony_cicompletely empty.  It is not permitted for another thread to be traversing
2458c2ecf20Sopenharmony_cithe array under the RCU read lock at the same time as this function is
2468c2ecf20Sopenharmony_cidestroying it as no RCU deferral is performed on memory release -
2478c2ecf20Sopenharmony_cisomething that would require memory to be allocated.
2488c2ecf20Sopenharmony_ci
2498c2ecf20Sopenharmony_ciThe caller should lock exclusively against other modifiers and accessors
2508c2ecf20Sopenharmony_ciof the array.
2518c2ecf20Sopenharmony_ci
2528c2ecf20Sopenharmony_ci
2538c2ecf20Sopenharmony_ci6. Garbage collect an associative array::
2548c2ecf20Sopenharmony_ci
2558c2ecf20Sopenharmony_ci    int assoc_array_gc(struct assoc_array *array,
2568c2ecf20Sopenharmony_ci                       const struct assoc_array_ops *ops,
2578c2ecf20Sopenharmony_ci                       bool (*iterator)(void *object, void *iterator_data),
2588c2ecf20Sopenharmony_ci                       void *iterator_data);
2598c2ecf20Sopenharmony_ci
2608c2ecf20Sopenharmony_ciThis iterates over the objects in an associative array and passes each one to
2618c2ecf20Sopenharmony_ci``iterator()``.  If ``iterator()`` returns ``true``, the object is kept.  If it
2628c2ecf20Sopenharmony_cireturns ``false``, the object will be freed.  If the ``iterator()`` function
2638c2ecf20Sopenharmony_cireturns ``true``, it must perform any appropriate refcount incrementing on the
2648c2ecf20Sopenharmony_ciobject before returning.
2658c2ecf20Sopenharmony_ci
2668c2ecf20Sopenharmony_ciThe internal tree will be packed down if possible as part of the iteration
2678c2ecf20Sopenharmony_cito reduce the number of nodes in it.
2688c2ecf20Sopenharmony_ci
2698c2ecf20Sopenharmony_ciThe ``iterator_data`` is passed directly to ``iterator()`` and is otherwise
2708c2ecf20Sopenharmony_ciignored by the function.
2718c2ecf20Sopenharmony_ci
2728c2ecf20Sopenharmony_ciThe function will return ``0`` if successful and ``-ENOMEM`` if there wasn't
2738c2ecf20Sopenharmony_cienough memory.
2748c2ecf20Sopenharmony_ci
2758c2ecf20Sopenharmony_ciIt is possible for other threads to iterate over or search the array under
2768c2ecf20Sopenharmony_cithe RCU read lock while this function is in progress.  The caller should
2778c2ecf20Sopenharmony_cilock exclusively against other modifiers of the array.
2788c2ecf20Sopenharmony_ci
2798c2ecf20Sopenharmony_ci
2808c2ecf20Sopenharmony_ciAccess Functions
2818c2ecf20Sopenharmony_ci----------------
2828c2ecf20Sopenharmony_ci
2838c2ecf20Sopenharmony_ciThere are two functions for accessing an associative array:
2848c2ecf20Sopenharmony_ci
2858c2ecf20Sopenharmony_ci1. Iterate over all the objects in an associative array::
2868c2ecf20Sopenharmony_ci
2878c2ecf20Sopenharmony_ci    int assoc_array_iterate(const struct assoc_array *array,
2888c2ecf20Sopenharmony_ci                            int (*iterator)(const void *object,
2898c2ecf20Sopenharmony_ci                                            void *iterator_data),
2908c2ecf20Sopenharmony_ci                            void *iterator_data);
2918c2ecf20Sopenharmony_ci
2928c2ecf20Sopenharmony_ciThis passes each object in the array to the iterator callback function.
2938c2ecf20Sopenharmony_ci``iterator_data`` is private data for that function.
2948c2ecf20Sopenharmony_ci
2958c2ecf20Sopenharmony_ciThis may be used on an array at the same time as the array is being
2968c2ecf20Sopenharmony_cimodified, provided the RCU read lock is held.  Under such circumstances,
2978c2ecf20Sopenharmony_ciit is possible for the iteration function to see some objects twice.  If
2988c2ecf20Sopenharmony_cithis is a problem, then modification should be locked against.  The
2998c2ecf20Sopenharmony_ciiteration algorithm should not, however, miss any objects.
3008c2ecf20Sopenharmony_ci
3018c2ecf20Sopenharmony_ciThe function will return ``0`` if no objects were in the array or else it will
3028c2ecf20Sopenharmony_cireturn the result of the last iterator function called.  Iteration stops
3038c2ecf20Sopenharmony_ciimmediately if any call to the iteration function results in a non-zero
3048c2ecf20Sopenharmony_cireturn.
3058c2ecf20Sopenharmony_ci
3068c2ecf20Sopenharmony_ci
3078c2ecf20Sopenharmony_ci2. Find an object in an associative array::
3088c2ecf20Sopenharmony_ci
3098c2ecf20Sopenharmony_ci    void *assoc_array_find(const struct assoc_array *array,
3108c2ecf20Sopenharmony_ci                           const struct assoc_array_ops *ops,
3118c2ecf20Sopenharmony_ci                           const void *index_key);
3128c2ecf20Sopenharmony_ci
3138c2ecf20Sopenharmony_ciThis walks through the array's internal tree directly to the object
3148c2ecf20Sopenharmony_cispecified by the index key..
3158c2ecf20Sopenharmony_ci
3168c2ecf20Sopenharmony_ciThis may be used on an array at the same time as the array is being
3178c2ecf20Sopenharmony_cimodified, provided the RCU read lock is held.
3188c2ecf20Sopenharmony_ci
3198c2ecf20Sopenharmony_ciThe function will return the object if found (and set ``*_type`` to the object
3208c2ecf20Sopenharmony_citype) or will return ``NULL`` if the object was not found.
3218c2ecf20Sopenharmony_ci
3228c2ecf20Sopenharmony_ci
3238c2ecf20Sopenharmony_ciIndex Key Form
3248c2ecf20Sopenharmony_ci--------------
3258c2ecf20Sopenharmony_ci
3268c2ecf20Sopenharmony_ciThe index key can be of any form, but since the algorithms aren't told how long
3278c2ecf20Sopenharmony_cithe key is, it is strongly recommended that the index key includes its length
3288c2ecf20Sopenharmony_civery early on before any variation due to the length would have an effect on
3298c2ecf20Sopenharmony_cicomparisons.
3308c2ecf20Sopenharmony_ci
3318c2ecf20Sopenharmony_ciThis will cause leaves with different length keys to scatter away from each
3328c2ecf20Sopenharmony_ciother - and those with the same length keys to cluster together.
3338c2ecf20Sopenharmony_ci
3348c2ecf20Sopenharmony_ciIt is also recommended that the index key begin with a hash of the rest of the
3358c2ecf20Sopenharmony_cikey to maximise scattering throughout keyspace.
3368c2ecf20Sopenharmony_ci
3378c2ecf20Sopenharmony_ciThe better the scattering, the wider and lower the internal tree will be.
3388c2ecf20Sopenharmony_ci
3398c2ecf20Sopenharmony_ciPoor scattering isn't too much of a problem as there are shortcuts and nodes
3408c2ecf20Sopenharmony_cican contain mixtures of leaves and metadata pointers.
3418c2ecf20Sopenharmony_ci
3428c2ecf20Sopenharmony_ciThe index key is read in chunks of machine word.  Each chunk is subdivided into
3438c2ecf20Sopenharmony_cione nibble (4 bits) per level, so on a 32-bit CPU this is good for 8 levels and
3448c2ecf20Sopenharmony_cion a 64-bit CPU, 16 levels.  Unless the scattering is really poor, it is
3458c2ecf20Sopenharmony_ciunlikely that more than one word of any particular index key will have to be
3468c2ecf20Sopenharmony_ciused.
3478c2ecf20Sopenharmony_ci
3488c2ecf20Sopenharmony_ci
3498c2ecf20Sopenharmony_ciInternal Workings
3508c2ecf20Sopenharmony_ci=================
3518c2ecf20Sopenharmony_ci
3528c2ecf20Sopenharmony_ciThe associative array data structure has an internal tree.  This tree is
3538c2ecf20Sopenharmony_ciconstructed of two types of metadata blocks: nodes and shortcuts.
3548c2ecf20Sopenharmony_ci
3558c2ecf20Sopenharmony_ciA node is an array of slots.  Each slot can contain one of four things:
3568c2ecf20Sopenharmony_ci
3578c2ecf20Sopenharmony_ci* A NULL pointer, indicating that the slot is empty.
3588c2ecf20Sopenharmony_ci* A pointer to an object (a leaf).
3598c2ecf20Sopenharmony_ci* A pointer to a node at the next level.
3608c2ecf20Sopenharmony_ci* A pointer to a shortcut.
3618c2ecf20Sopenharmony_ci
3628c2ecf20Sopenharmony_ci
3638c2ecf20Sopenharmony_ciBasic Internal Tree Layout
3648c2ecf20Sopenharmony_ci--------------------------
3658c2ecf20Sopenharmony_ci
3668c2ecf20Sopenharmony_ciIgnoring shortcuts for the moment, the nodes form a multilevel tree.  The index
3678c2ecf20Sopenharmony_cikey space is strictly subdivided by the nodes in the tree and nodes occur on
3688c2ecf20Sopenharmony_cifixed levels.  For example::
3698c2ecf20Sopenharmony_ci
3708c2ecf20Sopenharmony_ci Level: 0               1               2               3
3718c2ecf20Sopenharmony_ci        =============== =============== =============== ===============
3728c2ecf20Sopenharmony_ci                                                        NODE D
3738c2ecf20Sopenharmony_ci                        NODE B          NODE C  +------>+---+
3748c2ecf20Sopenharmony_ci                +------>+---+   +------>+---+   |       | 0 |
3758c2ecf20Sopenharmony_ci        NODE A  |       | 0 |   |       | 0 |   |       +---+
3768c2ecf20Sopenharmony_ci        +---+   |       +---+   |       +---+   |       :   :
3778c2ecf20Sopenharmony_ci        | 0 |   |       :   :   |       :   :   |       +---+
3788c2ecf20Sopenharmony_ci        +---+   |       +---+   |       +---+   |       | f |
3798c2ecf20Sopenharmony_ci        | 1 |---+       | 3 |---+       | 7 |---+       +---+
3808c2ecf20Sopenharmony_ci        +---+           +---+           +---+
3818c2ecf20Sopenharmony_ci        :   :           :   :           | 8 |---+
3828c2ecf20Sopenharmony_ci        +---+           +---+           +---+   |       NODE E
3838c2ecf20Sopenharmony_ci        | e |---+       | f |           :   :   +------>+---+
3848c2ecf20Sopenharmony_ci        +---+   |       +---+           +---+           | 0 |
3858c2ecf20Sopenharmony_ci        | f |   |                       | f |           +---+
3868c2ecf20Sopenharmony_ci        +---+   |                       +---+           :   :
3878c2ecf20Sopenharmony_ci                |       NODE F                          +---+
3888c2ecf20Sopenharmony_ci                +------>+---+                           | f |
3898c2ecf20Sopenharmony_ci                        | 0 |           NODE G          +---+
3908c2ecf20Sopenharmony_ci                        +---+   +------>+---+
3918c2ecf20Sopenharmony_ci                        :   :   |       | 0 |
3928c2ecf20Sopenharmony_ci                        +---+   |       +---+
3938c2ecf20Sopenharmony_ci                        | 6 |---+       :   :
3948c2ecf20Sopenharmony_ci                        +---+           +---+
3958c2ecf20Sopenharmony_ci                        :   :           | f |
3968c2ecf20Sopenharmony_ci                        +---+           +---+
3978c2ecf20Sopenharmony_ci                        | f |
3988c2ecf20Sopenharmony_ci                        +---+
3998c2ecf20Sopenharmony_ci
4008c2ecf20Sopenharmony_ciIn the above example, there are 7 nodes (A-G), each with 16 slots (0-f).
4018c2ecf20Sopenharmony_ciAssuming no other meta data nodes in the tree, the key space is divided
4028c2ecf20Sopenharmony_cithusly::
4038c2ecf20Sopenharmony_ci
4048c2ecf20Sopenharmony_ci    KEY PREFIX      NODE
4058c2ecf20Sopenharmony_ci    ==========      ====
4068c2ecf20Sopenharmony_ci    137*            D
4078c2ecf20Sopenharmony_ci    138*            E
4088c2ecf20Sopenharmony_ci    13[0-69-f]*     C
4098c2ecf20Sopenharmony_ci    1[0-24-f]*      B
4108c2ecf20Sopenharmony_ci    e6*             G
4118c2ecf20Sopenharmony_ci    e[0-57-f]*      F
4128c2ecf20Sopenharmony_ci    [02-df]*        A
4138c2ecf20Sopenharmony_ci
4148c2ecf20Sopenharmony_ciSo, for instance, keys with the following example index keys will be found in
4158c2ecf20Sopenharmony_cithe appropriate nodes::
4168c2ecf20Sopenharmony_ci
4178c2ecf20Sopenharmony_ci    INDEX KEY       PREFIX  NODE
4188c2ecf20Sopenharmony_ci    =============== ======= ====
4198c2ecf20Sopenharmony_ci    13694892892489  13      C
4208c2ecf20Sopenharmony_ci    13795289025897  137     D
4218c2ecf20Sopenharmony_ci    13889dde88793   138     E
4228c2ecf20Sopenharmony_ci    138bbb89003093  138     E
4238c2ecf20Sopenharmony_ci    1394879524789   12      C
4248c2ecf20Sopenharmony_ci    1458952489      1       B
4258c2ecf20Sopenharmony_ci    9431809de993ba  -       A
4268c2ecf20Sopenharmony_ci    b4542910809cd   -       A
4278c2ecf20Sopenharmony_ci    e5284310def98   e       F
4288c2ecf20Sopenharmony_ci    e68428974237    e6      G
4298c2ecf20Sopenharmony_ci    e7fffcbd443     e       F
4308c2ecf20Sopenharmony_ci    f3842239082     -       A
4318c2ecf20Sopenharmony_ci
4328c2ecf20Sopenharmony_ciTo save memory, if a node can hold all the leaves in its portion of keyspace,
4338c2ecf20Sopenharmony_cithen the node will have all those leaves in it and will not have any metadata
4348c2ecf20Sopenharmony_cipointers - even if some of those leaves would like to be in the same slot.
4358c2ecf20Sopenharmony_ci
4368c2ecf20Sopenharmony_ciA node can contain a heterogeneous mix of leaves and metadata pointers.
4378c2ecf20Sopenharmony_ciMetadata pointers must be in the slots that match their subdivisions of key
4388c2ecf20Sopenharmony_cispace.  The leaves can be in any slot not occupied by a metadata pointer.  It
4398c2ecf20Sopenharmony_ciis guaranteed that none of the leaves in a node will match a slot occupied by a
4408c2ecf20Sopenharmony_cimetadata pointer.  If the metadata pointer is there, any leaf whose key matches
4418c2ecf20Sopenharmony_cithe metadata key prefix must be in the subtree that the metadata pointer points
4428c2ecf20Sopenharmony_cito.
4438c2ecf20Sopenharmony_ci
4448c2ecf20Sopenharmony_ciIn the above example list of index keys, node A will contain::
4458c2ecf20Sopenharmony_ci
4468c2ecf20Sopenharmony_ci    SLOT    CONTENT         INDEX KEY (PREFIX)
4478c2ecf20Sopenharmony_ci    ====    =============== ==================
4488c2ecf20Sopenharmony_ci    1       PTR TO NODE B   1*
4498c2ecf20Sopenharmony_ci    any     LEAF            9431809de993ba
4508c2ecf20Sopenharmony_ci    any     LEAF            b4542910809cd
4518c2ecf20Sopenharmony_ci    e       PTR TO NODE F   e*
4528c2ecf20Sopenharmony_ci    any     LEAF            f3842239082
4538c2ecf20Sopenharmony_ci
4548c2ecf20Sopenharmony_ciand node B::
4558c2ecf20Sopenharmony_ci
4568c2ecf20Sopenharmony_ci    3	PTR TO NODE C	13*
4578c2ecf20Sopenharmony_ci    any	LEAF		1458952489
4588c2ecf20Sopenharmony_ci
4598c2ecf20Sopenharmony_ci
4608c2ecf20Sopenharmony_ciShortcuts
4618c2ecf20Sopenharmony_ci---------
4628c2ecf20Sopenharmony_ci
4638c2ecf20Sopenharmony_ciShortcuts are metadata records that jump over a piece of keyspace.  A shortcut
4648c2ecf20Sopenharmony_ciis a replacement for a series of single-occupancy nodes ascending through the
4658c2ecf20Sopenharmony_cilevels.  Shortcuts exist to save memory and to speed up traversal.
4668c2ecf20Sopenharmony_ci
4678c2ecf20Sopenharmony_ciIt is possible for the root of the tree to be a shortcut - say, for example,
4688c2ecf20Sopenharmony_cithe tree contains at least 17 nodes all with key prefix ``1111``.  The
4698c2ecf20Sopenharmony_ciinsertion algorithm will insert a shortcut to skip over the ``1111`` keyspace
4708c2ecf20Sopenharmony_ciin a single bound and get to the fourth level where these actually become
4718c2ecf20Sopenharmony_cidifferent.
4728c2ecf20Sopenharmony_ci
4738c2ecf20Sopenharmony_ci
4748c2ecf20Sopenharmony_ciSplitting And Collapsing Nodes
4758c2ecf20Sopenharmony_ci------------------------------
4768c2ecf20Sopenharmony_ci
4778c2ecf20Sopenharmony_ciEach node has a maximum capacity of 16 leaves and metadata pointers.  If the
4788c2ecf20Sopenharmony_ciinsertion algorithm finds that it is trying to insert a 17th object into a
4798c2ecf20Sopenharmony_cinode, that node will be split such that at least two leaves that have a common
4808c2ecf20Sopenharmony_cikey segment at that level end up in a separate node rooted on that slot for
4818c2ecf20Sopenharmony_cithat common key segment.
4828c2ecf20Sopenharmony_ci
4838c2ecf20Sopenharmony_ciIf the leaves in a full node and the leaf that is being inserted are
4848c2ecf20Sopenharmony_cisufficiently similar, then a shortcut will be inserted into the tree.
4858c2ecf20Sopenharmony_ci
4868c2ecf20Sopenharmony_ciWhen the number of objects in the subtree rooted at a node falls to 16 or
4878c2ecf20Sopenharmony_cifewer, then the subtree will be collapsed down to a single node - and this will
4888c2ecf20Sopenharmony_ciripple towards the root if possible.
4898c2ecf20Sopenharmony_ci
4908c2ecf20Sopenharmony_ci
4918c2ecf20Sopenharmony_ciNon-Recursive Iteration
4928c2ecf20Sopenharmony_ci-----------------------
4938c2ecf20Sopenharmony_ci
4948c2ecf20Sopenharmony_ciEach node and shortcut contains a back pointer to its parent and the number of
4958c2ecf20Sopenharmony_cislot in that parent that points to it.  None-recursive iteration uses these to
4968c2ecf20Sopenharmony_ciproceed rootwards through the tree, going to the parent node, slot N + 1 to
4978c2ecf20Sopenharmony_cimake sure progress is made without the need for a stack.
4988c2ecf20Sopenharmony_ci
4998c2ecf20Sopenharmony_ciThe backpointers, however, make simultaneous alteration and iteration tricky.
5008c2ecf20Sopenharmony_ci
5018c2ecf20Sopenharmony_ci
5028c2ecf20Sopenharmony_ciSimultaneous Alteration And Iteration
5038c2ecf20Sopenharmony_ci-------------------------------------
5048c2ecf20Sopenharmony_ci
5058c2ecf20Sopenharmony_ciThere are a number of cases to consider:
5068c2ecf20Sopenharmony_ci
5078c2ecf20Sopenharmony_ci1. Simple insert/replace.  This involves simply replacing a NULL or old
5088c2ecf20Sopenharmony_ci   matching leaf pointer with the pointer to the new leaf after a barrier.
5098c2ecf20Sopenharmony_ci   The metadata blocks don't change otherwise.  An old leaf won't be freed
5108c2ecf20Sopenharmony_ci   until after the RCU grace period.
5118c2ecf20Sopenharmony_ci
5128c2ecf20Sopenharmony_ci2. Simple delete.  This involves just clearing an old matching leaf.  The
5138c2ecf20Sopenharmony_ci   metadata blocks don't change otherwise.  The old leaf won't be freed until
5148c2ecf20Sopenharmony_ci   after the RCU grace period.
5158c2ecf20Sopenharmony_ci
5168c2ecf20Sopenharmony_ci3. Insertion replacing part of a subtree that we haven't yet entered.  This
5178c2ecf20Sopenharmony_ci   may involve replacement of part of that subtree - but that won't affect
5188c2ecf20Sopenharmony_ci   the iteration as we won't have reached the pointer to it yet and the
5198c2ecf20Sopenharmony_ci   ancestry blocks are not replaced (the layout of those does not change).
5208c2ecf20Sopenharmony_ci
5218c2ecf20Sopenharmony_ci4. Insertion replacing nodes that we're actively processing.  This isn't a
5228c2ecf20Sopenharmony_ci   problem as we've passed the anchoring pointer and won't switch onto the
5238c2ecf20Sopenharmony_ci   new layout until we follow the back pointers - at which point we've
5248c2ecf20Sopenharmony_ci   already examined the leaves in the replaced node (we iterate over all the
5258c2ecf20Sopenharmony_ci   leaves in a node before following any of its metadata pointers).
5268c2ecf20Sopenharmony_ci
5278c2ecf20Sopenharmony_ci   We might, however, re-see some leaves that have been split out into a new
5288c2ecf20Sopenharmony_ci   branch that's in a slot further along than we were at.
5298c2ecf20Sopenharmony_ci
5308c2ecf20Sopenharmony_ci5. Insertion replacing nodes that we're processing a dependent branch of.
5318c2ecf20Sopenharmony_ci   This won't affect us until we follow the back pointers.  Similar to (4).
5328c2ecf20Sopenharmony_ci
5338c2ecf20Sopenharmony_ci6. Deletion collapsing a branch under us.  This doesn't affect us because the
5348c2ecf20Sopenharmony_ci   back pointers will get us back to the parent of the new node before we
5358c2ecf20Sopenharmony_ci   could see the new node.  The entire collapsed subtree is thrown away
5368c2ecf20Sopenharmony_ci   unchanged - and will still be rooted on the same slot, so we shouldn't
5378c2ecf20Sopenharmony_ci   process it a second time as we'll go back to slot + 1.
5388c2ecf20Sopenharmony_ci
5398c2ecf20Sopenharmony_ci.. note::
5408c2ecf20Sopenharmony_ci
5418c2ecf20Sopenharmony_ci   Under some circumstances, we need to simultaneously change the parent
5428c2ecf20Sopenharmony_ci   pointer and the parent slot pointer on a node (say, for example, we
5438c2ecf20Sopenharmony_ci   inserted another node before it and moved it up a level).  We cannot do
5448c2ecf20Sopenharmony_ci   this without locking against a read - so we have to replace that node too.
5458c2ecf20Sopenharmony_ci
5468c2ecf20Sopenharmony_ci   However, when we're changing a shortcut into a node this isn't a problem
5478c2ecf20Sopenharmony_ci   as shortcuts only have one slot and so the parent slot number isn't used
5488c2ecf20Sopenharmony_ci   when traversing backwards over one.  This means that it's okay to change
5498c2ecf20Sopenharmony_ci   the slot number first - provided suitable barriers are used to make sure
5508c2ecf20Sopenharmony_ci   the parent slot number is read after the back pointer.
5518c2ecf20Sopenharmony_ci
5528c2ecf20Sopenharmony_ciObsolete blocks and leaves are freed up after an RCU grace period has passed,
5538c2ecf20Sopenharmony_ciso as long as anyone doing walking or iteration holds the RCU read lock, the
5548c2ecf20Sopenharmony_ciold superstructure should not go away on them.
555