18c2ecf20Sopenharmony_ci======================================== 28c2ecf20Sopenharmony_ciGeneric Associative Array Implementation 38c2ecf20Sopenharmony_ci======================================== 48c2ecf20Sopenharmony_ci 58c2ecf20Sopenharmony_ciOverview 68c2ecf20Sopenharmony_ci======== 78c2ecf20Sopenharmony_ci 88c2ecf20Sopenharmony_ciThis associative array implementation is an object container with the following 98c2ecf20Sopenharmony_ciproperties: 108c2ecf20Sopenharmony_ci 118c2ecf20Sopenharmony_ci1. Objects are opaque pointers. The implementation does not care where they 128c2ecf20Sopenharmony_ci point (if anywhere) or what they point to (if anything). 138c2ecf20Sopenharmony_ci 148c2ecf20Sopenharmony_ci .. note:: 158c2ecf20Sopenharmony_ci 168c2ecf20Sopenharmony_ci Pointers to objects _must_ be zero in the least significant bit. 178c2ecf20Sopenharmony_ci 188c2ecf20Sopenharmony_ci2. Objects do not need to contain linkage blocks for use by the array. This 198c2ecf20Sopenharmony_ci permits an object to be located in multiple arrays simultaneously. 208c2ecf20Sopenharmony_ci Rather, the array is made up of metadata blocks that point to objects. 218c2ecf20Sopenharmony_ci 228c2ecf20Sopenharmony_ci3. Objects require index keys to locate them within the array. 238c2ecf20Sopenharmony_ci 248c2ecf20Sopenharmony_ci4. Index keys must be unique. Inserting an object with the same key as one 258c2ecf20Sopenharmony_ci already in the array will replace the old object. 268c2ecf20Sopenharmony_ci 278c2ecf20Sopenharmony_ci5. Index keys can be of any length and can be of different lengths. 288c2ecf20Sopenharmony_ci 298c2ecf20Sopenharmony_ci6. Index keys should encode the length early on, before any variation due to 308c2ecf20Sopenharmony_ci length is seen. 318c2ecf20Sopenharmony_ci 328c2ecf20Sopenharmony_ci7. Index keys can include a hash to scatter objects throughout the array. 338c2ecf20Sopenharmony_ci 348c2ecf20Sopenharmony_ci8. The array can iterated over. The objects will not necessarily come out in 358c2ecf20Sopenharmony_ci key order. 368c2ecf20Sopenharmony_ci 378c2ecf20Sopenharmony_ci9. The array can be iterated over while it is being modified, provided the 388c2ecf20Sopenharmony_ci RCU readlock is being held by the iterator. Note, however, under these 398c2ecf20Sopenharmony_ci circumstances, some objects may be seen more than once. If this is a 408c2ecf20Sopenharmony_ci problem, the iterator should lock against modification. Objects will not 418c2ecf20Sopenharmony_ci be missed, however, unless deleted. 428c2ecf20Sopenharmony_ci 438c2ecf20Sopenharmony_ci10. Objects in the array can be looked up by means of their index key. 448c2ecf20Sopenharmony_ci 458c2ecf20Sopenharmony_ci11. Objects can be looked up while the array is being modified, provided the 468c2ecf20Sopenharmony_ci RCU readlock is being held by the thread doing the look up. 478c2ecf20Sopenharmony_ci 488c2ecf20Sopenharmony_ciThe implementation uses a tree of 16-pointer nodes internally that are indexed 498c2ecf20Sopenharmony_cion each level by nibbles from the index key in the same manner as in a radix 508c2ecf20Sopenharmony_citree. To improve memory efficiency, shortcuts can be emplaced to skip over 518c2ecf20Sopenharmony_ciwhat would otherwise be a series of single-occupancy nodes. Further, nodes 528c2ecf20Sopenharmony_cipack leaf object pointers into spare space in the node rather than making an 538c2ecf20Sopenharmony_ciextra branch until as such time an object needs to be added to a full node. 548c2ecf20Sopenharmony_ci 558c2ecf20Sopenharmony_ci 568c2ecf20Sopenharmony_ciThe Public API 578c2ecf20Sopenharmony_ci============== 588c2ecf20Sopenharmony_ci 598c2ecf20Sopenharmony_ciThe public API can be found in ``<linux/assoc_array.h>``. The associative 608c2ecf20Sopenharmony_ciarray is rooted on the following structure:: 618c2ecf20Sopenharmony_ci 628c2ecf20Sopenharmony_ci struct assoc_array { 638c2ecf20Sopenharmony_ci ... 648c2ecf20Sopenharmony_ci }; 658c2ecf20Sopenharmony_ci 668c2ecf20Sopenharmony_ciThe code is selected by enabling ``CONFIG_ASSOCIATIVE_ARRAY`` with:: 678c2ecf20Sopenharmony_ci 688c2ecf20Sopenharmony_ci ./script/config -e ASSOCIATIVE_ARRAY 698c2ecf20Sopenharmony_ci 708c2ecf20Sopenharmony_ci 718c2ecf20Sopenharmony_ciEdit Script 728c2ecf20Sopenharmony_ci----------- 738c2ecf20Sopenharmony_ci 748c2ecf20Sopenharmony_ciThe insertion and deletion functions produce an 'edit script' that can later be 758c2ecf20Sopenharmony_ciapplied to effect the changes without risking ``ENOMEM``. This retains the 768c2ecf20Sopenharmony_cipreallocated metadata blocks that will be installed in the internal tree and 778c2ecf20Sopenharmony_cikeeps track of the metadata blocks that will be removed from the tree when the 788c2ecf20Sopenharmony_ciscript is applied. 798c2ecf20Sopenharmony_ci 808c2ecf20Sopenharmony_ciThis is also used to keep track of dead blocks and dead objects after the 818c2ecf20Sopenharmony_ciscript has been applied so that they can be freed later. The freeing is done 828c2ecf20Sopenharmony_ciafter an RCU grace period has passed - thus allowing access functions to 838c2ecf20Sopenharmony_ciproceed under the RCU read lock. 848c2ecf20Sopenharmony_ci 858c2ecf20Sopenharmony_ciThe script appears as outside of the API as a pointer of the type:: 868c2ecf20Sopenharmony_ci 878c2ecf20Sopenharmony_ci struct assoc_array_edit; 888c2ecf20Sopenharmony_ci 898c2ecf20Sopenharmony_ciThere are two functions for dealing with the script: 908c2ecf20Sopenharmony_ci 918c2ecf20Sopenharmony_ci1. Apply an edit script:: 928c2ecf20Sopenharmony_ci 938c2ecf20Sopenharmony_ci void assoc_array_apply_edit(struct assoc_array_edit *edit); 948c2ecf20Sopenharmony_ci 958c2ecf20Sopenharmony_ciThis will perform the edit functions, interpolating various write barriers 968c2ecf20Sopenharmony_cito permit accesses under the RCU read lock to continue. The edit script 978c2ecf20Sopenharmony_ciwill then be passed to ``call_rcu()`` to free it and any dead stuff it points 988c2ecf20Sopenharmony_cito. 998c2ecf20Sopenharmony_ci 1008c2ecf20Sopenharmony_ci2. Cancel an edit script:: 1018c2ecf20Sopenharmony_ci 1028c2ecf20Sopenharmony_ci void assoc_array_cancel_edit(struct assoc_array_edit *edit); 1038c2ecf20Sopenharmony_ci 1048c2ecf20Sopenharmony_ciThis frees the edit script and all preallocated memory immediately. If 1058c2ecf20Sopenharmony_cithis was for insertion, the new object is _not_ released by this function, 1068c2ecf20Sopenharmony_cibut must rather be released by the caller. 1078c2ecf20Sopenharmony_ci 1088c2ecf20Sopenharmony_ciThese functions are guaranteed not to fail. 1098c2ecf20Sopenharmony_ci 1108c2ecf20Sopenharmony_ci 1118c2ecf20Sopenharmony_ciOperations Table 1128c2ecf20Sopenharmony_ci---------------- 1138c2ecf20Sopenharmony_ci 1148c2ecf20Sopenharmony_ciVarious functions take a table of operations:: 1158c2ecf20Sopenharmony_ci 1168c2ecf20Sopenharmony_ci struct assoc_array_ops { 1178c2ecf20Sopenharmony_ci ... 1188c2ecf20Sopenharmony_ci }; 1198c2ecf20Sopenharmony_ci 1208c2ecf20Sopenharmony_ciThis points to a number of methods, all of which need to be provided: 1218c2ecf20Sopenharmony_ci 1228c2ecf20Sopenharmony_ci1. Get a chunk of index key from caller data:: 1238c2ecf20Sopenharmony_ci 1248c2ecf20Sopenharmony_ci unsigned long (*get_key_chunk)(const void *index_key, int level); 1258c2ecf20Sopenharmony_ci 1268c2ecf20Sopenharmony_ciThis should return a chunk of caller-supplied index key starting at the 1278c2ecf20Sopenharmony_ci*bit* position given by the level argument. The level argument will be a 1288c2ecf20Sopenharmony_cimultiple of ``ASSOC_ARRAY_KEY_CHUNK_SIZE`` and the function should return 1298c2ecf20Sopenharmony_ci``ASSOC_ARRAY_KEY_CHUNK_SIZE bits``. No error is possible. 1308c2ecf20Sopenharmony_ci 1318c2ecf20Sopenharmony_ci 1328c2ecf20Sopenharmony_ci2. Get a chunk of an object's index key:: 1338c2ecf20Sopenharmony_ci 1348c2ecf20Sopenharmony_ci unsigned long (*get_object_key_chunk)(const void *object, int level); 1358c2ecf20Sopenharmony_ci 1368c2ecf20Sopenharmony_ciAs the previous function, but gets its data from an object in the array 1378c2ecf20Sopenharmony_cirather than from a caller-supplied index key. 1388c2ecf20Sopenharmony_ci 1398c2ecf20Sopenharmony_ci 1408c2ecf20Sopenharmony_ci3. See if this is the object we're looking for:: 1418c2ecf20Sopenharmony_ci 1428c2ecf20Sopenharmony_ci bool (*compare_object)(const void *object, const void *index_key); 1438c2ecf20Sopenharmony_ci 1448c2ecf20Sopenharmony_ciCompare the object against an index key and return ``true`` if it matches and 1458c2ecf20Sopenharmony_ci``false`` if it doesn't. 1468c2ecf20Sopenharmony_ci 1478c2ecf20Sopenharmony_ci 1488c2ecf20Sopenharmony_ci4. Diff the index keys of two objects:: 1498c2ecf20Sopenharmony_ci 1508c2ecf20Sopenharmony_ci int (*diff_objects)(const void *object, const void *index_key); 1518c2ecf20Sopenharmony_ci 1528c2ecf20Sopenharmony_ciReturn the bit position at which the index key of the specified object 1538c2ecf20Sopenharmony_cidiffers from the given index key or -1 if they are the same. 1548c2ecf20Sopenharmony_ci 1558c2ecf20Sopenharmony_ci 1568c2ecf20Sopenharmony_ci5. Free an object:: 1578c2ecf20Sopenharmony_ci 1588c2ecf20Sopenharmony_ci void (*free_object)(void *object); 1598c2ecf20Sopenharmony_ci 1608c2ecf20Sopenharmony_ciFree the specified object. Note that this may be called an RCU grace period 1618c2ecf20Sopenharmony_ciafter ``assoc_array_apply_edit()`` was called, so ``synchronize_rcu()`` may be 1628c2ecf20Sopenharmony_cinecessary on module unloading. 1638c2ecf20Sopenharmony_ci 1648c2ecf20Sopenharmony_ci 1658c2ecf20Sopenharmony_ciManipulation Functions 1668c2ecf20Sopenharmony_ci---------------------- 1678c2ecf20Sopenharmony_ci 1688c2ecf20Sopenharmony_ciThere are a number of functions for manipulating an associative array: 1698c2ecf20Sopenharmony_ci 1708c2ecf20Sopenharmony_ci1. Initialise an associative array:: 1718c2ecf20Sopenharmony_ci 1728c2ecf20Sopenharmony_ci void assoc_array_init(struct assoc_array *array); 1738c2ecf20Sopenharmony_ci 1748c2ecf20Sopenharmony_ciThis initialises the base structure for an associative array. It can't fail. 1758c2ecf20Sopenharmony_ci 1768c2ecf20Sopenharmony_ci 1778c2ecf20Sopenharmony_ci2. Insert/replace an object in an associative array:: 1788c2ecf20Sopenharmony_ci 1798c2ecf20Sopenharmony_ci struct assoc_array_edit * 1808c2ecf20Sopenharmony_ci assoc_array_insert(struct assoc_array *array, 1818c2ecf20Sopenharmony_ci const struct assoc_array_ops *ops, 1828c2ecf20Sopenharmony_ci const void *index_key, 1838c2ecf20Sopenharmony_ci void *object); 1848c2ecf20Sopenharmony_ci 1858c2ecf20Sopenharmony_ciThis inserts the given object into the array. Note that the least 1868c2ecf20Sopenharmony_cisignificant bit of the pointer must be zero as it's used to type-mark 1878c2ecf20Sopenharmony_cipointers internally. 1888c2ecf20Sopenharmony_ci 1898c2ecf20Sopenharmony_ciIf an object already exists for that key then it will be replaced with the 1908c2ecf20Sopenharmony_cinew object and the old one will be freed automatically. 1918c2ecf20Sopenharmony_ci 1928c2ecf20Sopenharmony_ciThe ``index_key`` argument should hold index key information and is 1938c2ecf20Sopenharmony_cipassed to the methods in the ops table when they are called. 1948c2ecf20Sopenharmony_ci 1958c2ecf20Sopenharmony_ciThis function makes no alteration to the array itself, but rather returns 1968c2ecf20Sopenharmony_cian edit script that must be applied. ``-ENOMEM`` is returned in the case of 1978c2ecf20Sopenharmony_cian out-of-memory error. 1988c2ecf20Sopenharmony_ci 1998c2ecf20Sopenharmony_ciThe caller should lock exclusively against other modifiers of the array. 2008c2ecf20Sopenharmony_ci 2018c2ecf20Sopenharmony_ci 2028c2ecf20Sopenharmony_ci3. Delete an object from an associative array:: 2038c2ecf20Sopenharmony_ci 2048c2ecf20Sopenharmony_ci struct assoc_array_edit * 2058c2ecf20Sopenharmony_ci assoc_array_delete(struct assoc_array *array, 2068c2ecf20Sopenharmony_ci const struct assoc_array_ops *ops, 2078c2ecf20Sopenharmony_ci const void *index_key); 2088c2ecf20Sopenharmony_ci 2098c2ecf20Sopenharmony_ciThis deletes an object that matches the specified data from the array. 2108c2ecf20Sopenharmony_ci 2118c2ecf20Sopenharmony_ciThe ``index_key`` argument should hold index key information and is 2128c2ecf20Sopenharmony_cipassed to the methods in the ops table when they are called. 2138c2ecf20Sopenharmony_ci 2148c2ecf20Sopenharmony_ciThis function makes no alteration to the array itself, but rather returns 2158c2ecf20Sopenharmony_cian edit script that must be applied. ``-ENOMEM`` is returned in the case of 2168c2ecf20Sopenharmony_cian out-of-memory error. ``NULL`` will be returned if the specified object is 2178c2ecf20Sopenharmony_cinot found within the array. 2188c2ecf20Sopenharmony_ci 2198c2ecf20Sopenharmony_ciThe caller should lock exclusively against other modifiers of the array. 2208c2ecf20Sopenharmony_ci 2218c2ecf20Sopenharmony_ci 2228c2ecf20Sopenharmony_ci4. Delete all objects from an associative array:: 2238c2ecf20Sopenharmony_ci 2248c2ecf20Sopenharmony_ci struct assoc_array_edit * 2258c2ecf20Sopenharmony_ci assoc_array_clear(struct assoc_array *array, 2268c2ecf20Sopenharmony_ci const struct assoc_array_ops *ops); 2278c2ecf20Sopenharmony_ci 2288c2ecf20Sopenharmony_ciThis deletes all the objects from an associative array and leaves it 2298c2ecf20Sopenharmony_cicompletely empty. 2308c2ecf20Sopenharmony_ci 2318c2ecf20Sopenharmony_ciThis function makes no alteration to the array itself, but rather returns 2328c2ecf20Sopenharmony_cian edit script that must be applied. ``-ENOMEM`` is returned in the case of 2338c2ecf20Sopenharmony_cian out-of-memory error. 2348c2ecf20Sopenharmony_ci 2358c2ecf20Sopenharmony_ciThe caller should lock exclusively against other modifiers of the array. 2368c2ecf20Sopenharmony_ci 2378c2ecf20Sopenharmony_ci 2388c2ecf20Sopenharmony_ci5. Destroy an associative array, deleting all objects:: 2398c2ecf20Sopenharmony_ci 2408c2ecf20Sopenharmony_ci void assoc_array_destroy(struct assoc_array *array, 2418c2ecf20Sopenharmony_ci const struct assoc_array_ops *ops); 2428c2ecf20Sopenharmony_ci 2438c2ecf20Sopenharmony_ciThis destroys the contents of the associative array and leaves it 2448c2ecf20Sopenharmony_cicompletely empty. It is not permitted for another thread to be traversing 2458c2ecf20Sopenharmony_cithe array under the RCU read lock at the same time as this function is 2468c2ecf20Sopenharmony_cidestroying it as no RCU deferral is performed on memory release - 2478c2ecf20Sopenharmony_cisomething that would require memory to be allocated. 2488c2ecf20Sopenharmony_ci 2498c2ecf20Sopenharmony_ciThe caller should lock exclusively against other modifiers and accessors 2508c2ecf20Sopenharmony_ciof the array. 2518c2ecf20Sopenharmony_ci 2528c2ecf20Sopenharmony_ci 2538c2ecf20Sopenharmony_ci6. Garbage collect an associative array:: 2548c2ecf20Sopenharmony_ci 2558c2ecf20Sopenharmony_ci int assoc_array_gc(struct assoc_array *array, 2568c2ecf20Sopenharmony_ci const struct assoc_array_ops *ops, 2578c2ecf20Sopenharmony_ci bool (*iterator)(void *object, void *iterator_data), 2588c2ecf20Sopenharmony_ci void *iterator_data); 2598c2ecf20Sopenharmony_ci 2608c2ecf20Sopenharmony_ciThis iterates over the objects in an associative array and passes each one to 2618c2ecf20Sopenharmony_ci``iterator()``. If ``iterator()`` returns ``true``, the object is kept. If it 2628c2ecf20Sopenharmony_cireturns ``false``, the object will be freed. If the ``iterator()`` function 2638c2ecf20Sopenharmony_cireturns ``true``, it must perform any appropriate refcount incrementing on the 2648c2ecf20Sopenharmony_ciobject before returning. 2658c2ecf20Sopenharmony_ci 2668c2ecf20Sopenharmony_ciThe internal tree will be packed down if possible as part of the iteration 2678c2ecf20Sopenharmony_cito reduce the number of nodes in it. 2688c2ecf20Sopenharmony_ci 2698c2ecf20Sopenharmony_ciThe ``iterator_data`` is passed directly to ``iterator()`` and is otherwise 2708c2ecf20Sopenharmony_ciignored by the function. 2718c2ecf20Sopenharmony_ci 2728c2ecf20Sopenharmony_ciThe function will return ``0`` if successful and ``-ENOMEM`` if there wasn't 2738c2ecf20Sopenharmony_cienough memory. 2748c2ecf20Sopenharmony_ci 2758c2ecf20Sopenharmony_ciIt is possible for other threads to iterate over or search the array under 2768c2ecf20Sopenharmony_cithe RCU read lock while this function is in progress. The caller should 2778c2ecf20Sopenharmony_cilock exclusively against other modifiers of the array. 2788c2ecf20Sopenharmony_ci 2798c2ecf20Sopenharmony_ci 2808c2ecf20Sopenharmony_ciAccess Functions 2818c2ecf20Sopenharmony_ci---------------- 2828c2ecf20Sopenharmony_ci 2838c2ecf20Sopenharmony_ciThere are two functions for accessing an associative array: 2848c2ecf20Sopenharmony_ci 2858c2ecf20Sopenharmony_ci1. Iterate over all the objects in an associative array:: 2868c2ecf20Sopenharmony_ci 2878c2ecf20Sopenharmony_ci int assoc_array_iterate(const struct assoc_array *array, 2888c2ecf20Sopenharmony_ci int (*iterator)(const void *object, 2898c2ecf20Sopenharmony_ci void *iterator_data), 2908c2ecf20Sopenharmony_ci void *iterator_data); 2918c2ecf20Sopenharmony_ci 2928c2ecf20Sopenharmony_ciThis passes each object in the array to the iterator callback function. 2938c2ecf20Sopenharmony_ci``iterator_data`` is private data for that function. 2948c2ecf20Sopenharmony_ci 2958c2ecf20Sopenharmony_ciThis may be used on an array at the same time as the array is being 2968c2ecf20Sopenharmony_cimodified, provided the RCU read lock is held. Under such circumstances, 2978c2ecf20Sopenharmony_ciit is possible for the iteration function to see some objects twice. If 2988c2ecf20Sopenharmony_cithis is a problem, then modification should be locked against. The 2998c2ecf20Sopenharmony_ciiteration algorithm should not, however, miss any objects. 3008c2ecf20Sopenharmony_ci 3018c2ecf20Sopenharmony_ciThe function will return ``0`` if no objects were in the array or else it will 3028c2ecf20Sopenharmony_cireturn the result of the last iterator function called. Iteration stops 3038c2ecf20Sopenharmony_ciimmediately if any call to the iteration function results in a non-zero 3048c2ecf20Sopenharmony_cireturn. 3058c2ecf20Sopenharmony_ci 3068c2ecf20Sopenharmony_ci 3078c2ecf20Sopenharmony_ci2. Find an object in an associative array:: 3088c2ecf20Sopenharmony_ci 3098c2ecf20Sopenharmony_ci void *assoc_array_find(const struct assoc_array *array, 3108c2ecf20Sopenharmony_ci const struct assoc_array_ops *ops, 3118c2ecf20Sopenharmony_ci const void *index_key); 3128c2ecf20Sopenharmony_ci 3138c2ecf20Sopenharmony_ciThis walks through the array's internal tree directly to the object 3148c2ecf20Sopenharmony_cispecified by the index key.. 3158c2ecf20Sopenharmony_ci 3168c2ecf20Sopenharmony_ciThis may be used on an array at the same time as the array is being 3178c2ecf20Sopenharmony_cimodified, provided the RCU read lock is held. 3188c2ecf20Sopenharmony_ci 3198c2ecf20Sopenharmony_ciThe function will return the object if found (and set ``*_type`` to the object 3208c2ecf20Sopenharmony_citype) or will return ``NULL`` if the object was not found. 3218c2ecf20Sopenharmony_ci 3228c2ecf20Sopenharmony_ci 3238c2ecf20Sopenharmony_ciIndex Key Form 3248c2ecf20Sopenharmony_ci-------------- 3258c2ecf20Sopenharmony_ci 3268c2ecf20Sopenharmony_ciThe index key can be of any form, but since the algorithms aren't told how long 3278c2ecf20Sopenharmony_cithe key is, it is strongly recommended that the index key includes its length 3288c2ecf20Sopenharmony_civery early on before any variation due to the length would have an effect on 3298c2ecf20Sopenharmony_cicomparisons. 3308c2ecf20Sopenharmony_ci 3318c2ecf20Sopenharmony_ciThis will cause leaves with different length keys to scatter away from each 3328c2ecf20Sopenharmony_ciother - and those with the same length keys to cluster together. 3338c2ecf20Sopenharmony_ci 3348c2ecf20Sopenharmony_ciIt is also recommended that the index key begin with a hash of the rest of the 3358c2ecf20Sopenharmony_cikey to maximise scattering throughout keyspace. 3368c2ecf20Sopenharmony_ci 3378c2ecf20Sopenharmony_ciThe better the scattering, the wider and lower the internal tree will be. 3388c2ecf20Sopenharmony_ci 3398c2ecf20Sopenharmony_ciPoor scattering isn't too much of a problem as there are shortcuts and nodes 3408c2ecf20Sopenharmony_cican contain mixtures of leaves and metadata pointers. 3418c2ecf20Sopenharmony_ci 3428c2ecf20Sopenharmony_ciThe index key is read in chunks of machine word. Each chunk is subdivided into 3438c2ecf20Sopenharmony_cione nibble (4 bits) per level, so on a 32-bit CPU this is good for 8 levels and 3448c2ecf20Sopenharmony_cion a 64-bit CPU, 16 levels. Unless the scattering is really poor, it is 3458c2ecf20Sopenharmony_ciunlikely that more than one word of any particular index key will have to be 3468c2ecf20Sopenharmony_ciused. 3478c2ecf20Sopenharmony_ci 3488c2ecf20Sopenharmony_ci 3498c2ecf20Sopenharmony_ciInternal Workings 3508c2ecf20Sopenharmony_ci================= 3518c2ecf20Sopenharmony_ci 3528c2ecf20Sopenharmony_ciThe associative array data structure has an internal tree. This tree is 3538c2ecf20Sopenharmony_ciconstructed of two types of metadata blocks: nodes and shortcuts. 3548c2ecf20Sopenharmony_ci 3558c2ecf20Sopenharmony_ciA node is an array of slots. Each slot can contain one of four things: 3568c2ecf20Sopenharmony_ci 3578c2ecf20Sopenharmony_ci* A NULL pointer, indicating that the slot is empty. 3588c2ecf20Sopenharmony_ci* A pointer to an object (a leaf). 3598c2ecf20Sopenharmony_ci* A pointer to a node at the next level. 3608c2ecf20Sopenharmony_ci* A pointer to a shortcut. 3618c2ecf20Sopenharmony_ci 3628c2ecf20Sopenharmony_ci 3638c2ecf20Sopenharmony_ciBasic Internal Tree Layout 3648c2ecf20Sopenharmony_ci-------------------------- 3658c2ecf20Sopenharmony_ci 3668c2ecf20Sopenharmony_ciIgnoring shortcuts for the moment, the nodes form a multilevel tree. The index 3678c2ecf20Sopenharmony_cikey space is strictly subdivided by the nodes in the tree and nodes occur on 3688c2ecf20Sopenharmony_cifixed levels. For example:: 3698c2ecf20Sopenharmony_ci 3708c2ecf20Sopenharmony_ci Level: 0 1 2 3 3718c2ecf20Sopenharmony_ci =============== =============== =============== =============== 3728c2ecf20Sopenharmony_ci NODE D 3738c2ecf20Sopenharmony_ci NODE B NODE C +------>+---+ 3748c2ecf20Sopenharmony_ci +------>+---+ +------>+---+ | | 0 | 3758c2ecf20Sopenharmony_ci NODE A | | 0 | | | 0 | | +---+ 3768c2ecf20Sopenharmony_ci +---+ | +---+ | +---+ | : : 3778c2ecf20Sopenharmony_ci | 0 | | : : | : : | +---+ 3788c2ecf20Sopenharmony_ci +---+ | +---+ | +---+ | | f | 3798c2ecf20Sopenharmony_ci | 1 |---+ | 3 |---+ | 7 |---+ +---+ 3808c2ecf20Sopenharmony_ci +---+ +---+ +---+ 3818c2ecf20Sopenharmony_ci : : : : | 8 |---+ 3828c2ecf20Sopenharmony_ci +---+ +---+ +---+ | NODE E 3838c2ecf20Sopenharmony_ci | e |---+ | f | : : +------>+---+ 3848c2ecf20Sopenharmony_ci +---+ | +---+ +---+ | 0 | 3858c2ecf20Sopenharmony_ci | f | | | f | +---+ 3868c2ecf20Sopenharmony_ci +---+ | +---+ : : 3878c2ecf20Sopenharmony_ci | NODE F +---+ 3888c2ecf20Sopenharmony_ci +------>+---+ | f | 3898c2ecf20Sopenharmony_ci | 0 | NODE G +---+ 3908c2ecf20Sopenharmony_ci +---+ +------>+---+ 3918c2ecf20Sopenharmony_ci : : | | 0 | 3928c2ecf20Sopenharmony_ci +---+ | +---+ 3938c2ecf20Sopenharmony_ci | 6 |---+ : : 3948c2ecf20Sopenharmony_ci +---+ +---+ 3958c2ecf20Sopenharmony_ci : : | f | 3968c2ecf20Sopenharmony_ci +---+ +---+ 3978c2ecf20Sopenharmony_ci | f | 3988c2ecf20Sopenharmony_ci +---+ 3998c2ecf20Sopenharmony_ci 4008c2ecf20Sopenharmony_ciIn the above example, there are 7 nodes (A-G), each with 16 slots (0-f). 4018c2ecf20Sopenharmony_ciAssuming no other meta data nodes in the tree, the key space is divided 4028c2ecf20Sopenharmony_cithusly:: 4038c2ecf20Sopenharmony_ci 4048c2ecf20Sopenharmony_ci KEY PREFIX NODE 4058c2ecf20Sopenharmony_ci ========== ==== 4068c2ecf20Sopenharmony_ci 137* D 4078c2ecf20Sopenharmony_ci 138* E 4088c2ecf20Sopenharmony_ci 13[0-69-f]* C 4098c2ecf20Sopenharmony_ci 1[0-24-f]* B 4108c2ecf20Sopenharmony_ci e6* G 4118c2ecf20Sopenharmony_ci e[0-57-f]* F 4128c2ecf20Sopenharmony_ci [02-df]* A 4138c2ecf20Sopenharmony_ci 4148c2ecf20Sopenharmony_ciSo, for instance, keys with the following example index keys will be found in 4158c2ecf20Sopenharmony_cithe appropriate nodes:: 4168c2ecf20Sopenharmony_ci 4178c2ecf20Sopenharmony_ci INDEX KEY PREFIX NODE 4188c2ecf20Sopenharmony_ci =============== ======= ==== 4198c2ecf20Sopenharmony_ci 13694892892489 13 C 4208c2ecf20Sopenharmony_ci 13795289025897 137 D 4218c2ecf20Sopenharmony_ci 13889dde88793 138 E 4228c2ecf20Sopenharmony_ci 138bbb89003093 138 E 4238c2ecf20Sopenharmony_ci 1394879524789 12 C 4248c2ecf20Sopenharmony_ci 1458952489 1 B 4258c2ecf20Sopenharmony_ci 9431809de993ba - A 4268c2ecf20Sopenharmony_ci b4542910809cd - A 4278c2ecf20Sopenharmony_ci e5284310def98 e F 4288c2ecf20Sopenharmony_ci e68428974237 e6 G 4298c2ecf20Sopenharmony_ci e7fffcbd443 e F 4308c2ecf20Sopenharmony_ci f3842239082 - A 4318c2ecf20Sopenharmony_ci 4328c2ecf20Sopenharmony_ciTo save memory, if a node can hold all the leaves in its portion of keyspace, 4338c2ecf20Sopenharmony_cithen the node will have all those leaves in it and will not have any metadata 4348c2ecf20Sopenharmony_cipointers - even if some of those leaves would like to be in the same slot. 4358c2ecf20Sopenharmony_ci 4368c2ecf20Sopenharmony_ciA node can contain a heterogeneous mix of leaves and metadata pointers. 4378c2ecf20Sopenharmony_ciMetadata pointers must be in the slots that match their subdivisions of key 4388c2ecf20Sopenharmony_cispace. The leaves can be in any slot not occupied by a metadata pointer. It 4398c2ecf20Sopenharmony_ciis guaranteed that none of the leaves in a node will match a slot occupied by a 4408c2ecf20Sopenharmony_cimetadata pointer. If the metadata pointer is there, any leaf whose key matches 4418c2ecf20Sopenharmony_cithe metadata key prefix must be in the subtree that the metadata pointer points 4428c2ecf20Sopenharmony_cito. 4438c2ecf20Sopenharmony_ci 4448c2ecf20Sopenharmony_ciIn the above example list of index keys, node A will contain:: 4458c2ecf20Sopenharmony_ci 4468c2ecf20Sopenharmony_ci SLOT CONTENT INDEX KEY (PREFIX) 4478c2ecf20Sopenharmony_ci ==== =============== ================== 4488c2ecf20Sopenharmony_ci 1 PTR TO NODE B 1* 4498c2ecf20Sopenharmony_ci any LEAF 9431809de993ba 4508c2ecf20Sopenharmony_ci any LEAF b4542910809cd 4518c2ecf20Sopenharmony_ci e PTR TO NODE F e* 4528c2ecf20Sopenharmony_ci any LEAF f3842239082 4538c2ecf20Sopenharmony_ci 4548c2ecf20Sopenharmony_ciand node B:: 4558c2ecf20Sopenharmony_ci 4568c2ecf20Sopenharmony_ci 3 PTR TO NODE C 13* 4578c2ecf20Sopenharmony_ci any LEAF 1458952489 4588c2ecf20Sopenharmony_ci 4598c2ecf20Sopenharmony_ci 4608c2ecf20Sopenharmony_ciShortcuts 4618c2ecf20Sopenharmony_ci--------- 4628c2ecf20Sopenharmony_ci 4638c2ecf20Sopenharmony_ciShortcuts are metadata records that jump over a piece of keyspace. A shortcut 4648c2ecf20Sopenharmony_ciis a replacement for a series of single-occupancy nodes ascending through the 4658c2ecf20Sopenharmony_cilevels. Shortcuts exist to save memory and to speed up traversal. 4668c2ecf20Sopenharmony_ci 4678c2ecf20Sopenharmony_ciIt is possible for the root of the tree to be a shortcut - say, for example, 4688c2ecf20Sopenharmony_cithe tree contains at least 17 nodes all with key prefix ``1111``. The 4698c2ecf20Sopenharmony_ciinsertion algorithm will insert a shortcut to skip over the ``1111`` keyspace 4708c2ecf20Sopenharmony_ciin a single bound and get to the fourth level where these actually become 4718c2ecf20Sopenharmony_cidifferent. 4728c2ecf20Sopenharmony_ci 4738c2ecf20Sopenharmony_ci 4748c2ecf20Sopenharmony_ciSplitting And Collapsing Nodes 4758c2ecf20Sopenharmony_ci------------------------------ 4768c2ecf20Sopenharmony_ci 4778c2ecf20Sopenharmony_ciEach node has a maximum capacity of 16 leaves and metadata pointers. If the 4788c2ecf20Sopenharmony_ciinsertion algorithm finds that it is trying to insert a 17th object into a 4798c2ecf20Sopenharmony_cinode, that node will be split such that at least two leaves that have a common 4808c2ecf20Sopenharmony_cikey segment at that level end up in a separate node rooted on that slot for 4818c2ecf20Sopenharmony_cithat common key segment. 4828c2ecf20Sopenharmony_ci 4838c2ecf20Sopenharmony_ciIf the leaves in a full node and the leaf that is being inserted are 4848c2ecf20Sopenharmony_cisufficiently similar, then a shortcut will be inserted into the tree. 4858c2ecf20Sopenharmony_ci 4868c2ecf20Sopenharmony_ciWhen the number of objects in the subtree rooted at a node falls to 16 or 4878c2ecf20Sopenharmony_cifewer, then the subtree will be collapsed down to a single node - and this will 4888c2ecf20Sopenharmony_ciripple towards the root if possible. 4898c2ecf20Sopenharmony_ci 4908c2ecf20Sopenharmony_ci 4918c2ecf20Sopenharmony_ciNon-Recursive Iteration 4928c2ecf20Sopenharmony_ci----------------------- 4938c2ecf20Sopenharmony_ci 4948c2ecf20Sopenharmony_ciEach node and shortcut contains a back pointer to its parent and the number of 4958c2ecf20Sopenharmony_cislot in that parent that points to it. None-recursive iteration uses these to 4968c2ecf20Sopenharmony_ciproceed rootwards through the tree, going to the parent node, slot N + 1 to 4978c2ecf20Sopenharmony_cimake sure progress is made without the need for a stack. 4988c2ecf20Sopenharmony_ci 4998c2ecf20Sopenharmony_ciThe backpointers, however, make simultaneous alteration and iteration tricky. 5008c2ecf20Sopenharmony_ci 5018c2ecf20Sopenharmony_ci 5028c2ecf20Sopenharmony_ciSimultaneous Alteration And Iteration 5038c2ecf20Sopenharmony_ci------------------------------------- 5048c2ecf20Sopenharmony_ci 5058c2ecf20Sopenharmony_ciThere are a number of cases to consider: 5068c2ecf20Sopenharmony_ci 5078c2ecf20Sopenharmony_ci1. Simple insert/replace. This involves simply replacing a NULL or old 5088c2ecf20Sopenharmony_ci matching leaf pointer with the pointer to the new leaf after a barrier. 5098c2ecf20Sopenharmony_ci The metadata blocks don't change otherwise. An old leaf won't be freed 5108c2ecf20Sopenharmony_ci until after the RCU grace period. 5118c2ecf20Sopenharmony_ci 5128c2ecf20Sopenharmony_ci2. Simple delete. This involves just clearing an old matching leaf. The 5138c2ecf20Sopenharmony_ci metadata blocks don't change otherwise. The old leaf won't be freed until 5148c2ecf20Sopenharmony_ci after the RCU grace period. 5158c2ecf20Sopenharmony_ci 5168c2ecf20Sopenharmony_ci3. Insertion replacing part of a subtree that we haven't yet entered. This 5178c2ecf20Sopenharmony_ci may involve replacement of part of that subtree - but that won't affect 5188c2ecf20Sopenharmony_ci the iteration as we won't have reached the pointer to it yet and the 5198c2ecf20Sopenharmony_ci ancestry blocks are not replaced (the layout of those does not change). 5208c2ecf20Sopenharmony_ci 5218c2ecf20Sopenharmony_ci4. Insertion replacing nodes that we're actively processing. This isn't a 5228c2ecf20Sopenharmony_ci problem as we've passed the anchoring pointer and won't switch onto the 5238c2ecf20Sopenharmony_ci new layout until we follow the back pointers - at which point we've 5248c2ecf20Sopenharmony_ci already examined the leaves in the replaced node (we iterate over all the 5258c2ecf20Sopenharmony_ci leaves in a node before following any of its metadata pointers). 5268c2ecf20Sopenharmony_ci 5278c2ecf20Sopenharmony_ci We might, however, re-see some leaves that have been split out into a new 5288c2ecf20Sopenharmony_ci branch that's in a slot further along than we were at. 5298c2ecf20Sopenharmony_ci 5308c2ecf20Sopenharmony_ci5. Insertion replacing nodes that we're processing a dependent branch of. 5318c2ecf20Sopenharmony_ci This won't affect us until we follow the back pointers. Similar to (4). 5328c2ecf20Sopenharmony_ci 5338c2ecf20Sopenharmony_ci6. Deletion collapsing a branch under us. This doesn't affect us because the 5348c2ecf20Sopenharmony_ci back pointers will get us back to the parent of the new node before we 5358c2ecf20Sopenharmony_ci could see the new node. The entire collapsed subtree is thrown away 5368c2ecf20Sopenharmony_ci unchanged - and will still be rooted on the same slot, so we shouldn't 5378c2ecf20Sopenharmony_ci process it a second time as we'll go back to slot + 1. 5388c2ecf20Sopenharmony_ci 5398c2ecf20Sopenharmony_ci.. note:: 5408c2ecf20Sopenharmony_ci 5418c2ecf20Sopenharmony_ci Under some circumstances, we need to simultaneously change the parent 5428c2ecf20Sopenharmony_ci pointer and the parent slot pointer on a node (say, for example, we 5438c2ecf20Sopenharmony_ci inserted another node before it and moved it up a level). We cannot do 5448c2ecf20Sopenharmony_ci this without locking against a read - so we have to replace that node too. 5458c2ecf20Sopenharmony_ci 5468c2ecf20Sopenharmony_ci However, when we're changing a shortcut into a node this isn't a problem 5478c2ecf20Sopenharmony_ci as shortcuts only have one slot and so the parent slot number isn't used 5488c2ecf20Sopenharmony_ci when traversing backwards over one. This means that it's okay to change 5498c2ecf20Sopenharmony_ci the slot number first - provided suitable barriers are used to make sure 5508c2ecf20Sopenharmony_ci the parent slot number is read after the back pointer. 5518c2ecf20Sopenharmony_ci 5528c2ecf20Sopenharmony_ciObsolete blocks and leaves are freed up after an RCU grace period has passed, 5538c2ecf20Sopenharmony_ciso as long as anyone doing walking or iteration holds the RCU read lock, the 5548c2ecf20Sopenharmony_ciold superstructure should not go away on them. 555