18c2ecf20Sopenharmony_ci.. SPDX-License-Identifier: GPL-2.0+ 28c2ecf20Sopenharmony_ci 38c2ecf20Sopenharmony_ci====== 48c2ecf20Sopenharmony_ciXArray 58c2ecf20Sopenharmony_ci====== 68c2ecf20Sopenharmony_ci 78c2ecf20Sopenharmony_ci:Author: Matthew Wilcox 88c2ecf20Sopenharmony_ci 98c2ecf20Sopenharmony_ciOverview 108c2ecf20Sopenharmony_ci======== 118c2ecf20Sopenharmony_ci 128c2ecf20Sopenharmony_ciThe XArray is an abstract data type which behaves like a very large array 138c2ecf20Sopenharmony_ciof pointers. It meets many of the same needs as a hash or a conventional 148c2ecf20Sopenharmony_ciresizable array. Unlike a hash, it allows you to sensibly go to the 158c2ecf20Sopenharmony_cinext or previous entry in a cache-efficient manner. In contrast to a 168c2ecf20Sopenharmony_ciresizable array, there is no need to copy data or change MMU mappings in 178c2ecf20Sopenharmony_ciorder to grow the array. It is more memory-efficient, parallelisable 188c2ecf20Sopenharmony_ciand cache friendly than a doubly-linked list. It takes advantage of 198c2ecf20Sopenharmony_ciRCU to perform lookups without locking. 208c2ecf20Sopenharmony_ci 218c2ecf20Sopenharmony_ciThe XArray implementation is efficient when the indices used are densely 228c2ecf20Sopenharmony_ciclustered; hashing the object and using the hash as the index will not 238c2ecf20Sopenharmony_ciperform well. The XArray is optimised for small indices, but still has 248c2ecf20Sopenharmony_cigood performance with large indices. If your index can be larger than 258c2ecf20Sopenharmony_ci``ULONG_MAX`` then the XArray is not the data type for you. The most 268c2ecf20Sopenharmony_ciimportant user of the XArray is the page cache. 278c2ecf20Sopenharmony_ci 288c2ecf20Sopenharmony_ciNormal pointers may be stored in the XArray directly. They must be 4-byte 298c2ecf20Sopenharmony_cialigned, which is true for any pointer returned from kmalloc() and 308c2ecf20Sopenharmony_cialloc_page(). It isn't true for arbitrary user-space pointers, 318c2ecf20Sopenharmony_cinor for function pointers. You can store pointers to statically allocated 328c2ecf20Sopenharmony_ciobjects, as long as those objects have an alignment of at least 4. 338c2ecf20Sopenharmony_ci 348c2ecf20Sopenharmony_ciYou can also store integers between 0 and ``LONG_MAX`` in the XArray. 358c2ecf20Sopenharmony_ciYou must first convert it into an entry using xa_mk_value(). 368c2ecf20Sopenharmony_ciWhen you retrieve an entry from the XArray, you can check whether it is 378c2ecf20Sopenharmony_cia value entry by calling xa_is_value(), and convert it back to 388c2ecf20Sopenharmony_cian integer by calling xa_to_value(). 398c2ecf20Sopenharmony_ci 408c2ecf20Sopenharmony_ciSome users want to tag the pointers they store in the XArray. You can 418c2ecf20Sopenharmony_cicall xa_tag_pointer() to create an entry with a tag, xa_untag_pointer() 428c2ecf20Sopenharmony_cito turn a tagged entry back into an untagged pointer and xa_pointer_tag() 438c2ecf20Sopenharmony_cito retrieve the tag of an entry. Tagged pointers use the same bits that 448c2ecf20Sopenharmony_ciare used to distinguish value entries from normal pointers, so you must 458c2ecf20Sopenharmony_cidecide whether they want to store value entries or tagged pointers in 468c2ecf20Sopenharmony_ciany particular XArray. 478c2ecf20Sopenharmony_ci 488c2ecf20Sopenharmony_ciThe XArray does not support storing IS_ERR() pointers as some 498c2ecf20Sopenharmony_ciconflict with value entries or internal entries. 508c2ecf20Sopenharmony_ci 518c2ecf20Sopenharmony_ciAn unusual feature of the XArray is the ability to create entries which 528c2ecf20Sopenharmony_cioccupy a range of indices. Once stored to, looking up any index in 538c2ecf20Sopenharmony_cithe range will return the same entry as looking up any other index in 548c2ecf20Sopenharmony_cithe range. Storing to any index will store to all of them. Multi-index 558c2ecf20Sopenharmony_cientries can be explicitly split into smaller entries, or storing ``NULL`` 568c2ecf20Sopenharmony_ciinto any entry will cause the XArray to forget about the range. 578c2ecf20Sopenharmony_ci 588c2ecf20Sopenharmony_ciNormal API 598c2ecf20Sopenharmony_ci========== 608c2ecf20Sopenharmony_ci 618c2ecf20Sopenharmony_ciStart by initialising an XArray, either with DEFINE_XARRAY() 628c2ecf20Sopenharmony_cifor statically allocated XArrays or xa_init() for dynamically 638c2ecf20Sopenharmony_ciallocated ones. A freshly-initialised XArray contains a ``NULL`` 648c2ecf20Sopenharmony_cipointer at every index. 658c2ecf20Sopenharmony_ci 668c2ecf20Sopenharmony_ciYou can then set entries using xa_store() and get entries 678c2ecf20Sopenharmony_ciusing xa_load(). xa_store will overwrite any entry with the 688c2ecf20Sopenharmony_cinew entry and return the previous entry stored at that index. You can 698c2ecf20Sopenharmony_ciuse xa_erase() instead of calling xa_store() with a 708c2ecf20Sopenharmony_ci``NULL`` entry. There is no difference between an entry that has never 718c2ecf20Sopenharmony_cibeen stored to, one that has been erased and one that has most recently 728c2ecf20Sopenharmony_cihad ``NULL`` stored to it. 738c2ecf20Sopenharmony_ci 748c2ecf20Sopenharmony_ciYou can conditionally replace an entry at an index by using 758c2ecf20Sopenharmony_cixa_cmpxchg(). Like cmpxchg(), it will only succeed if 768c2ecf20Sopenharmony_cithe entry at that index has the 'old' value. It also returns the entry 778c2ecf20Sopenharmony_ciwhich was at that index; if it returns the same entry which was passed as 788c2ecf20Sopenharmony_ci'old', then xa_cmpxchg() succeeded. 798c2ecf20Sopenharmony_ci 808c2ecf20Sopenharmony_ciIf you want to only store a new entry to an index if the current entry 818c2ecf20Sopenharmony_ciat that index is ``NULL``, you can use xa_insert() which 828c2ecf20Sopenharmony_cireturns ``-EBUSY`` if the entry is not empty. 838c2ecf20Sopenharmony_ci 848c2ecf20Sopenharmony_ciYou can copy entries out of the XArray into a plain array by calling 858c2ecf20Sopenharmony_cixa_extract(). Or you can iterate over the present entries in the XArray 868c2ecf20Sopenharmony_ciby calling xa_for_each(), xa_for_each_start() or xa_for_each_range(). 878c2ecf20Sopenharmony_ciYou may prefer to use xa_find() or xa_find_after() to move to the next 888c2ecf20Sopenharmony_cipresent entry in the XArray. 898c2ecf20Sopenharmony_ci 908c2ecf20Sopenharmony_ciCalling xa_store_range() stores the same entry in a range 918c2ecf20Sopenharmony_ciof indices. If you do this, some of the other operations will behave 928c2ecf20Sopenharmony_ciin a slightly odd way. For example, marking the entry at one index 938c2ecf20Sopenharmony_cimay result in the entry being marked at some, but not all of the other 948c2ecf20Sopenharmony_ciindices. Storing into one index may result in the entry retrieved by 958c2ecf20Sopenharmony_cisome, but not all of the other indices changing. 968c2ecf20Sopenharmony_ci 978c2ecf20Sopenharmony_ciSometimes you need to ensure that a subsequent call to xa_store() 988c2ecf20Sopenharmony_ciwill not need to allocate memory. The xa_reserve() function 998c2ecf20Sopenharmony_ciwill store a reserved entry at the indicated index. Users of the 1008c2ecf20Sopenharmony_cinormal API will see this entry as containing ``NULL``. If you do 1018c2ecf20Sopenharmony_cinot need to use the reserved entry, you can call xa_release() 1028c2ecf20Sopenharmony_cito remove the unused entry. If another user has stored to the entry 1038c2ecf20Sopenharmony_ciin the meantime, xa_release() will do nothing; if instead you 1048c2ecf20Sopenharmony_ciwant the entry to become ``NULL``, you should use xa_erase(). 1058c2ecf20Sopenharmony_ciUsing xa_insert() on a reserved entry will fail. 1068c2ecf20Sopenharmony_ci 1078c2ecf20Sopenharmony_ciIf all entries in the array are ``NULL``, the xa_empty() function 1088c2ecf20Sopenharmony_ciwill return ``true``. 1098c2ecf20Sopenharmony_ci 1108c2ecf20Sopenharmony_ciFinally, you can remove all entries from an XArray by calling 1118c2ecf20Sopenharmony_cixa_destroy(). If the XArray entries are pointers, you may wish 1128c2ecf20Sopenharmony_cito free the entries first. You can do this by iterating over all present 1138c2ecf20Sopenharmony_cientries in the XArray using the xa_for_each() iterator. 1148c2ecf20Sopenharmony_ci 1158c2ecf20Sopenharmony_ciSearch Marks 1168c2ecf20Sopenharmony_ci------------ 1178c2ecf20Sopenharmony_ci 1188c2ecf20Sopenharmony_ciEach entry in the array has three bits associated with it called marks. 1198c2ecf20Sopenharmony_ciEach mark may be set or cleared independently of the others. You can 1208c2ecf20Sopenharmony_ciiterate over marked entries by using the xa_for_each_marked() iterator. 1218c2ecf20Sopenharmony_ci 1228c2ecf20Sopenharmony_ciYou can enquire whether a mark is set on an entry by using 1238c2ecf20Sopenharmony_cixa_get_mark(). If the entry is not ``NULL``, you can set a mark on it 1248c2ecf20Sopenharmony_ciby using xa_set_mark() and remove the mark from an entry by calling 1258c2ecf20Sopenharmony_cixa_clear_mark(). You can ask whether any entry in the XArray has a 1268c2ecf20Sopenharmony_ciparticular mark set by calling xa_marked(). Erasing an entry from the 1278c2ecf20Sopenharmony_ciXArray causes all marks associated with that entry to be cleared. 1288c2ecf20Sopenharmony_ci 1298c2ecf20Sopenharmony_ciSetting or clearing a mark on any index of a multi-index entry will 1308c2ecf20Sopenharmony_ciaffect all indices covered by that entry. Querying the mark on any 1318c2ecf20Sopenharmony_ciindex will return the same result. 1328c2ecf20Sopenharmony_ci 1338c2ecf20Sopenharmony_ciThere is no way to iterate over entries which are not marked; the data 1348c2ecf20Sopenharmony_cistructure does not allow this to be implemented efficiently. There are 1358c2ecf20Sopenharmony_cinot currently iterators to search for logical combinations of bits (eg 1368c2ecf20Sopenharmony_ciiterate over all entries which have both ``XA_MARK_1`` and ``XA_MARK_2`` 1378c2ecf20Sopenharmony_ciset, or iterate over all entries which have ``XA_MARK_0`` or ``XA_MARK_2`` 1388c2ecf20Sopenharmony_ciset). It would be possible to add these if a user arises. 1398c2ecf20Sopenharmony_ci 1408c2ecf20Sopenharmony_ciAllocating XArrays 1418c2ecf20Sopenharmony_ci------------------ 1428c2ecf20Sopenharmony_ci 1438c2ecf20Sopenharmony_ciIf you use DEFINE_XARRAY_ALLOC() to define the XArray, or 1448c2ecf20Sopenharmony_ciinitialise it by passing ``XA_FLAGS_ALLOC`` to xa_init_flags(), 1458c2ecf20Sopenharmony_cithe XArray changes to track whether entries are in use or not. 1468c2ecf20Sopenharmony_ci 1478c2ecf20Sopenharmony_ciYou can call xa_alloc() to store the entry at an unused index 1488c2ecf20Sopenharmony_ciin the XArray. If you need to modify the array from interrupt context, 1498c2ecf20Sopenharmony_ciyou can use xa_alloc_bh() or xa_alloc_irq() to disable 1508c2ecf20Sopenharmony_ciinterrupts while allocating the ID. 1518c2ecf20Sopenharmony_ci 1528c2ecf20Sopenharmony_ciUsing xa_store(), xa_cmpxchg() or xa_insert() will 1538c2ecf20Sopenharmony_cialso mark the entry as being allocated. Unlike a normal XArray, storing 1548c2ecf20Sopenharmony_ci``NULL`` will mark the entry as being in use, like xa_reserve(). 1558c2ecf20Sopenharmony_ciTo free an entry, use xa_erase() (or xa_release() if 1568c2ecf20Sopenharmony_ciyou only want to free the entry if it's ``NULL``). 1578c2ecf20Sopenharmony_ci 1588c2ecf20Sopenharmony_ciBy default, the lowest free entry is allocated starting from 0. If you 1598c2ecf20Sopenharmony_ciwant to allocate entries starting at 1, it is more efficient to use 1608c2ecf20Sopenharmony_ciDEFINE_XARRAY_ALLOC1() or ``XA_FLAGS_ALLOC1``. If you want to 1618c2ecf20Sopenharmony_ciallocate IDs up to a maximum, then wrap back around to the lowest free 1628c2ecf20Sopenharmony_ciID, you can use xa_alloc_cyclic(). 1638c2ecf20Sopenharmony_ci 1648c2ecf20Sopenharmony_ciYou cannot use ``XA_MARK_0`` with an allocating XArray as this mark 1658c2ecf20Sopenharmony_ciis used to track whether an entry is free or not. The other marks are 1668c2ecf20Sopenharmony_ciavailable for your use. 1678c2ecf20Sopenharmony_ci 1688c2ecf20Sopenharmony_ciMemory allocation 1698c2ecf20Sopenharmony_ci----------------- 1708c2ecf20Sopenharmony_ci 1718c2ecf20Sopenharmony_ciThe xa_store(), xa_cmpxchg(), xa_alloc(), 1728c2ecf20Sopenharmony_cixa_reserve() and xa_insert() functions take a gfp_t 1738c2ecf20Sopenharmony_ciparameter in case the XArray needs to allocate memory to store this entry. 1748c2ecf20Sopenharmony_ciIf the entry is being deleted, no memory allocation needs to be performed, 1758c2ecf20Sopenharmony_ciand the GFP flags specified will be ignored. 1768c2ecf20Sopenharmony_ci 1778c2ecf20Sopenharmony_ciIt is possible for no memory to be allocatable, particularly if you pass 1788c2ecf20Sopenharmony_cia restrictive set of GFP flags. In that case, the functions return a 1798c2ecf20Sopenharmony_cispecial value which can be turned into an errno using xa_err(). 1808c2ecf20Sopenharmony_ciIf you don't need to know exactly which error occurred, using 1818c2ecf20Sopenharmony_cixa_is_err() is slightly more efficient. 1828c2ecf20Sopenharmony_ci 1838c2ecf20Sopenharmony_ciLocking 1848c2ecf20Sopenharmony_ci------- 1858c2ecf20Sopenharmony_ci 1868c2ecf20Sopenharmony_ciWhen using the Normal API, you do not have to worry about locking. 1878c2ecf20Sopenharmony_ciThe XArray uses RCU and an internal spinlock to synchronise access: 1888c2ecf20Sopenharmony_ci 1898c2ecf20Sopenharmony_ciNo lock needed: 1908c2ecf20Sopenharmony_ci * xa_empty() 1918c2ecf20Sopenharmony_ci * xa_marked() 1928c2ecf20Sopenharmony_ci 1938c2ecf20Sopenharmony_ciTakes RCU read lock: 1948c2ecf20Sopenharmony_ci * xa_load() 1958c2ecf20Sopenharmony_ci * xa_for_each() 1968c2ecf20Sopenharmony_ci * xa_for_each_start() 1978c2ecf20Sopenharmony_ci * xa_for_each_range() 1988c2ecf20Sopenharmony_ci * xa_find() 1998c2ecf20Sopenharmony_ci * xa_find_after() 2008c2ecf20Sopenharmony_ci * xa_extract() 2018c2ecf20Sopenharmony_ci * xa_get_mark() 2028c2ecf20Sopenharmony_ci 2038c2ecf20Sopenharmony_ciTakes xa_lock internally: 2048c2ecf20Sopenharmony_ci * xa_store() 2058c2ecf20Sopenharmony_ci * xa_store_bh() 2068c2ecf20Sopenharmony_ci * xa_store_irq() 2078c2ecf20Sopenharmony_ci * xa_insert() 2088c2ecf20Sopenharmony_ci * xa_insert_bh() 2098c2ecf20Sopenharmony_ci * xa_insert_irq() 2108c2ecf20Sopenharmony_ci * xa_erase() 2118c2ecf20Sopenharmony_ci * xa_erase_bh() 2128c2ecf20Sopenharmony_ci * xa_erase_irq() 2138c2ecf20Sopenharmony_ci * xa_cmpxchg() 2148c2ecf20Sopenharmony_ci * xa_cmpxchg_bh() 2158c2ecf20Sopenharmony_ci * xa_cmpxchg_irq() 2168c2ecf20Sopenharmony_ci * xa_store_range() 2178c2ecf20Sopenharmony_ci * xa_alloc() 2188c2ecf20Sopenharmony_ci * xa_alloc_bh() 2198c2ecf20Sopenharmony_ci * xa_alloc_irq() 2208c2ecf20Sopenharmony_ci * xa_reserve() 2218c2ecf20Sopenharmony_ci * xa_reserve_bh() 2228c2ecf20Sopenharmony_ci * xa_reserve_irq() 2238c2ecf20Sopenharmony_ci * xa_destroy() 2248c2ecf20Sopenharmony_ci * xa_set_mark() 2258c2ecf20Sopenharmony_ci * xa_clear_mark() 2268c2ecf20Sopenharmony_ci 2278c2ecf20Sopenharmony_ciAssumes xa_lock held on entry: 2288c2ecf20Sopenharmony_ci * __xa_store() 2298c2ecf20Sopenharmony_ci * __xa_insert() 2308c2ecf20Sopenharmony_ci * __xa_erase() 2318c2ecf20Sopenharmony_ci * __xa_cmpxchg() 2328c2ecf20Sopenharmony_ci * __xa_alloc() 2338c2ecf20Sopenharmony_ci * __xa_set_mark() 2348c2ecf20Sopenharmony_ci * __xa_clear_mark() 2358c2ecf20Sopenharmony_ci 2368c2ecf20Sopenharmony_ciIf you want to take advantage of the lock to protect the data structures 2378c2ecf20Sopenharmony_cithat you are storing in the XArray, you can call xa_lock() 2388c2ecf20Sopenharmony_cibefore calling xa_load(), then take a reference count on the 2398c2ecf20Sopenharmony_ciobject you have found before calling xa_unlock(). This will 2408c2ecf20Sopenharmony_ciprevent stores from removing the object from the array between looking 2418c2ecf20Sopenharmony_ciup the object and incrementing the refcount. You can also use RCU to 2428c2ecf20Sopenharmony_ciavoid dereferencing freed memory, but an explanation of that is beyond 2438c2ecf20Sopenharmony_cithe scope of this document. 2448c2ecf20Sopenharmony_ci 2458c2ecf20Sopenharmony_ciThe XArray does not disable interrupts or softirqs while modifying 2468c2ecf20Sopenharmony_cithe array. It is safe to read the XArray from interrupt or softirq 2478c2ecf20Sopenharmony_cicontext as the RCU lock provides enough protection. 2488c2ecf20Sopenharmony_ci 2498c2ecf20Sopenharmony_ciIf, for example, you want to store entries in the XArray in process 2508c2ecf20Sopenharmony_cicontext and then erase them in softirq context, you can do that this way:: 2518c2ecf20Sopenharmony_ci 2528c2ecf20Sopenharmony_ci void foo_init(struct foo *foo) 2538c2ecf20Sopenharmony_ci { 2548c2ecf20Sopenharmony_ci xa_init_flags(&foo->array, XA_FLAGS_LOCK_BH); 2558c2ecf20Sopenharmony_ci } 2568c2ecf20Sopenharmony_ci 2578c2ecf20Sopenharmony_ci int foo_store(struct foo *foo, unsigned long index, void *entry) 2588c2ecf20Sopenharmony_ci { 2598c2ecf20Sopenharmony_ci int err; 2608c2ecf20Sopenharmony_ci 2618c2ecf20Sopenharmony_ci xa_lock_bh(&foo->array); 2628c2ecf20Sopenharmony_ci err = xa_err(__xa_store(&foo->array, index, entry, GFP_KERNEL)); 2638c2ecf20Sopenharmony_ci if (!err) 2648c2ecf20Sopenharmony_ci foo->count++; 2658c2ecf20Sopenharmony_ci xa_unlock_bh(&foo->array); 2668c2ecf20Sopenharmony_ci return err; 2678c2ecf20Sopenharmony_ci } 2688c2ecf20Sopenharmony_ci 2698c2ecf20Sopenharmony_ci /* foo_erase() is only called from softirq context */ 2708c2ecf20Sopenharmony_ci void foo_erase(struct foo *foo, unsigned long index) 2718c2ecf20Sopenharmony_ci { 2728c2ecf20Sopenharmony_ci xa_lock(&foo->array); 2738c2ecf20Sopenharmony_ci __xa_erase(&foo->array, index); 2748c2ecf20Sopenharmony_ci foo->count--; 2758c2ecf20Sopenharmony_ci xa_unlock(&foo->array); 2768c2ecf20Sopenharmony_ci } 2778c2ecf20Sopenharmony_ci 2788c2ecf20Sopenharmony_ciIf you are going to modify the XArray from interrupt or softirq context, 2798c2ecf20Sopenharmony_ciyou need to initialise the array using xa_init_flags(), passing 2808c2ecf20Sopenharmony_ci``XA_FLAGS_LOCK_IRQ`` or ``XA_FLAGS_LOCK_BH``. 2818c2ecf20Sopenharmony_ci 2828c2ecf20Sopenharmony_ciThe above example also shows a common pattern of wanting to extend the 2838c2ecf20Sopenharmony_cicoverage of the xa_lock on the store side to protect some statistics 2848c2ecf20Sopenharmony_ciassociated with the array. 2858c2ecf20Sopenharmony_ci 2868c2ecf20Sopenharmony_ciSharing the XArray with interrupt context is also possible, either 2878c2ecf20Sopenharmony_ciusing xa_lock_irqsave() in both the interrupt handler and process 2888c2ecf20Sopenharmony_cicontext, or xa_lock_irq() in process context and xa_lock() 2898c2ecf20Sopenharmony_ciin the interrupt handler. Some of the more common patterns have helper 2908c2ecf20Sopenharmony_cifunctions such as xa_store_bh(), xa_store_irq(), 2918c2ecf20Sopenharmony_cixa_erase_bh(), xa_erase_irq(), xa_cmpxchg_bh() 2928c2ecf20Sopenharmony_ciand xa_cmpxchg_irq(). 2938c2ecf20Sopenharmony_ci 2948c2ecf20Sopenharmony_ciSometimes you need to protect access to the XArray with a mutex because 2958c2ecf20Sopenharmony_cithat lock sits above another mutex in the locking hierarchy. That does 2968c2ecf20Sopenharmony_cinot entitle you to use functions like __xa_erase() without taking 2978c2ecf20Sopenharmony_cithe xa_lock; the xa_lock is used for lockdep validation and will be used 2988c2ecf20Sopenharmony_cifor other purposes in the future. 2998c2ecf20Sopenharmony_ci 3008c2ecf20Sopenharmony_ciThe __xa_set_mark() and __xa_clear_mark() functions are also 3018c2ecf20Sopenharmony_ciavailable for situations where you look up an entry and want to atomically 3028c2ecf20Sopenharmony_ciset or clear a mark. It may be more efficient to use the advanced API 3038c2ecf20Sopenharmony_ciin this case, as it will save you from walking the tree twice. 3048c2ecf20Sopenharmony_ci 3058c2ecf20Sopenharmony_ciAdvanced API 3068c2ecf20Sopenharmony_ci============ 3078c2ecf20Sopenharmony_ci 3088c2ecf20Sopenharmony_ciThe advanced API offers more flexibility and better performance at the 3098c2ecf20Sopenharmony_cicost of an interface which can be harder to use and has fewer safeguards. 3108c2ecf20Sopenharmony_ciNo locking is done for you by the advanced API, and you are required 3118c2ecf20Sopenharmony_cito use the xa_lock while modifying the array. You can choose whether 3128c2ecf20Sopenharmony_cito use the xa_lock or the RCU lock while doing read-only operations on 3138c2ecf20Sopenharmony_cithe array. You can mix advanced and normal operations on the same array; 3148c2ecf20Sopenharmony_ciindeed the normal API is implemented in terms of the advanced API. The 3158c2ecf20Sopenharmony_ciadvanced API is only available to modules with a GPL-compatible license. 3168c2ecf20Sopenharmony_ci 3178c2ecf20Sopenharmony_ciThe advanced API is based around the xa_state. This is an opaque data 3188c2ecf20Sopenharmony_cistructure which you declare on the stack using the XA_STATE() 3198c2ecf20Sopenharmony_cimacro. This macro initialises the xa_state ready to start walking 3208c2ecf20Sopenharmony_ciaround the XArray. It is used as a cursor to maintain the position 3218c2ecf20Sopenharmony_ciin the XArray and let you compose various operations together without 3228c2ecf20Sopenharmony_cihaving to restart from the top every time. 3238c2ecf20Sopenharmony_ci 3248c2ecf20Sopenharmony_ciThe xa_state is also used to store errors. You can call 3258c2ecf20Sopenharmony_cixas_error() to retrieve the error. All operations check whether 3268c2ecf20Sopenharmony_cithe xa_state is in an error state before proceeding, so there's no need 3278c2ecf20Sopenharmony_cifor you to check for an error after each call; you can make multiple 3288c2ecf20Sopenharmony_cicalls in succession and only check at a convenient point. The only 3298c2ecf20Sopenharmony_cierrors currently generated by the XArray code itself are ``ENOMEM`` and 3308c2ecf20Sopenharmony_ci``EINVAL``, but it supports arbitrary errors in case you want to call 3318c2ecf20Sopenharmony_cixas_set_err() yourself. 3328c2ecf20Sopenharmony_ci 3338c2ecf20Sopenharmony_ciIf the xa_state is holding an ``ENOMEM`` error, calling xas_nomem() 3348c2ecf20Sopenharmony_ciwill attempt to allocate more memory using the specified gfp flags and 3358c2ecf20Sopenharmony_cicache it in the xa_state for the next attempt. The idea is that you take 3368c2ecf20Sopenharmony_cithe xa_lock, attempt the operation and drop the lock. The operation 3378c2ecf20Sopenharmony_ciattempts to allocate memory while holding the lock, but it is more 3388c2ecf20Sopenharmony_cilikely to fail. Once you have dropped the lock, xas_nomem() 3398c2ecf20Sopenharmony_cican try harder to allocate more memory. It will return ``true`` if it 3408c2ecf20Sopenharmony_ciis worth retrying the operation (i.e. that there was a memory error *and* 3418c2ecf20Sopenharmony_cimore memory was allocated). If it has previously allocated memory, and 3428c2ecf20Sopenharmony_cithat memory wasn't used, and there is no error (or some error that isn't 3438c2ecf20Sopenharmony_ci``ENOMEM``), then it will free the memory previously allocated. 3448c2ecf20Sopenharmony_ci 3458c2ecf20Sopenharmony_ciInternal Entries 3468c2ecf20Sopenharmony_ci---------------- 3478c2ecf20Sopenharmony_ci 3488c2ecf20Sopenharmony_ciThe XArray reserves some entries for its own purposes. These are never 3498c2ecf20Sopenharmony_ciexposed through the normal API, but when using the advanced API, it's 3508c2ecf20Sopenharmony_cipossible to see them. Usually the best way to handle them is to pass them 3518c2ecf20Sopenharmony_cito xas_retry(), and retry the operation if it returns ``true``. 3528c2ecf20Sopenharmony_ci 3538c2ecf20Sopenharmony_ci.. flat-table:: 3548c2ecf20Sopenharmony_ci :widths: 1 1 6 3558c2ecf20Sopenharmony_ci 3568c2ecf20Sopenharmony_ci * - Name 3578c2ecf20Sopenharmony_ci - Test 3588c2ecf20Sopenharmony_ci - Usage 3598c2ecf20Sopenharmony_ci 3608c2ecf20Sopenharmony_ci * - Node 3618c2ecf20Sopenharmony_ci - xa_is_node() 3628c2ecf20Sopenharmony_ci - An XArray node. May be visible when using a multi-index xa_state. 3638c2ecf20Sopenharmony_ci 3648c2ecf20Sopenharmony_ci * - Sibling 3658c2ecf20Sopenharmony_ci - xa_is_sibling() 3668c2ecf20Sopenharmony_ci - A non-canonical entry for a multi-index entry. The value indicates 3678c2ecf20Sopenharmony_ci which slot in this node has the canonical entry. 3688c2ecf20Sopenharmony_ci 3698c2ecf20Sopenharmony_ci * - Retry 3708c2ecf20Sopenharmony_ci - xa_is_retry() 3718c2ecf20Sopenharmony_ci - This entry is currently being modified by a thread which has the 3728c2ecf20Sopenharmony_ci xa_lock. The node containing this entry may be freed at the end 3738c2ecf20Sopenharmony_ci of this RCU period. You should restart the lookup from the head 3748c2ecf20Sopenharmony_ci of the array. 3758c2ecf20Sopenharmony_ci 3768c2ecf20Sopenharmony_ci * - Zero 3778c2ecf20Sopenharmony_ci - xa_is_zero() 3788c2ecf20Sopenharmony_ci - Zero entries appear as ``NULL`` through the Normal API, but occupy 3798c2ecf20Sopenharmony_ci an entry in the XArray which can be used to reserve the index for 3808c2ecf20Sopenharmony_ci future use. This is used by allocating XArrays for allocated entries 3818c2ecf20Sopenharmony_ci which are ``NULL``. 3828c2ecf20Sopenharmony_ci 3838c2ecf20Sopenharmony_ciOther internal entries may be added in the future. As far as possible, they 3848c2ecf20Sopenharmony_ciwill be handled by xas_retry(). 3858c2ecf20Sopenharmony_ci 3868c2ecf20Sopenharmony_ciAdditional functionality 3878c2ecf20Sopenharmony_ci------------------------ 3888c2ecf20Sopenharmony_ci 3898c2ecf20Sopenharmony_ciThe xas_create_range() function allocates all the necessary memory 3908c2ecf20Sopenharmony_cito store every entry in a range. It will set ENOMEM in the xa_state if 3918c2ecf20Sopenharmony_ciit cannot allocate memory. 3928c2ecf20Sopenharmony_ci 3938c2ecf20Sopenharmony_ciYou can use xas_init_marks() to reset the marks on an entry 3948c2ecf20Sopenharmony_cito their default state. This is usually all marks clear, unless the 3958c2ecf20Sopenharmony_ciXArray is marked with ``XA_FLAGS_TRACK_FREE``, in which case mark 0 is set 3968c2ecf20Sopenharmony_ciand all other marks are clear. Replacing one entry with another using 3978c2ecf20Sopenharmony_cixas_store() will not reset the marks on that entry; if you want 3988c2ecf20Sopenharmony_cithe marks reset, you should do that explicitly. 3998c2ecf20Sopenharmony_ci 4008c2ecf20Sopenharmony_ciThe xas_load() will walk the xa_state as close to the entry 4018c2ecf20Sopenharmony_cias it can. If you know the xa_state has already been walked to the 4028c2ecf20Sopenharmony_cientry and need to check that the entry hasn't changed, you can use 4038c2ecf20Sopenharmony_cixas_reload() to save a function call. 4048c2ecf20Sopenharmony_ci 4058c2ecf20Sopenharmony_ciIf you need to move to a different index in the XArray, call 4068c2ecf20Sopenharmony_cixas_set(). This resets the cursor to the top of the tree, which 4078c2ecf20Sopenharmony_ciwill generally make the next operation walk the cursor to the desired 4088c2ecf20Sopenharmony_cispot in the tree. If you want to move to the next or previous index, 4098c2ecf20Sopenharmony_cicall xas_next() or xas_prev(). Setting the index does 4108c2ecf20Sopenharmony_cinot walk the cursor around the array so does not require a lock to be 4118c2ecf20Sopenharmony_ciheld, while moving to the next or previous index does. 4128c2ecf20Sopenharmony_ci 4138c2ecf20Sopenharmony_ciYou can search for the next present entry using xas_find(). This 4148c2ecf20Sopenharmony_ciis the equivalent of both xa_find() and xa_find_after(); 4158c2ecf20Sopenharmony_ciif the cursor has been walked to an entry, then it will find the next 4168c2ecf20Sopenharmony_cientry after the one currently referenced. If not, it will return the 4178c2ecf20Sopenharmony_cientry at the index of the xa_state. Using xas_next_entry() to 4188c2ecf20Sopenharmony_cimove to the next present entry instead of xas_find() will save 4198c2ecf20Sopenharmony_cia function call in the majority of cases at the expense of emitting more 4208c2ecf20Sopenharmony_ciinline code. 4218c2ecf20Sopenharmony_ci 4228c2ecf20Sopenharmony_ciThe xas_find_marked() function is similar. If the xa_state has 4238c2ecf20Sopenharmony_cinot been walked, it will return the entry at the index of the xa_state, 4248c2ecf20Sopenharmony_ciif it is marked. Otherwise, it will return the first marked entry after 4258c2ecf20Sopenharmony_cithe entry referenced by the xa_state. The xas_next_marked() 4268c2ecf20Sopenharmony_cifunction is the equivalent of xas_next_entry(). 4278c2ecf20Sopenharmony_ci 4288c2ecf20Sopenharmony_ciWhen iterating over a range of the XArray using xas_for_each() 4298c2ecf20Sopenharmony_cior xas_for_each_marked(), it may be necessary to temporarily stop 4308c2ecf20Sopenharmony_cithe iteration. The xas_pause() function exists for this purpose. 4318c2ecf20Sopenharmony_ciAfter you have done the necessary work and wish to resume, the xa_state 4328c2ecf20Sopenharmony_ciis in an appropriate state to continue the iteration after the entry 4338c2ecf20Sopenharmony_ciyou last processed. If you have interrupts disabled while iterating, 4348c2ecf20Sopenharmony_cithen it is good manners to pause the iteration and reenable interrupts 4358c2ecf20Sopenharmony_cievery ``XA_CHECK_SCHED`` entries. 4368c2ecf20Sopenharmony_ci 4378c2ecf20Sopenharmony_ciThe xas_get_mark(), xas_set_mark() and xas_clear_mark() functions require 4388c2ecf20Sopenharmony_cithe xa_state cursor to have been moved to the appropriate location in the 4398c2ecf20Sopenharmony_ciXArray; they will do nothing if you have called xas_pause() or xas_set() 4408c2ecf20Sopenharmony_ciimmediately before. 4418c2ecf20Sopenharmony_ci 4428c2ecf20Sopenharmony_ciYou can call xas_set_update() to have a callback function 4438c2ecf20Sopenharmony_cicalled each time the XArray updates a node. This is used by the page 4448c2ecf20Sopenharmony_cicache workingset code to maintain its list of nodes which contain only 4458c2ecf20Sopenharmony_cishadow entries. 4468c2ecf20Sopenharmony_ci 4478c2ecf20Sopenharmony_ciMulti-Index Entries 4488c2ecf20Sopenharmony_ci------------------- 4498c2ecf20Sopenharmony_ci 4508c2ecf20Sopenharmony_ciThe XArray has the ability to tie multiple indices together so that 4518c2ecf20Sopenharmony_cioperations on one index affect all indices. For example, storing into 4528c2ecf20Sopenharmony_ciany index will change the value of the entry retrieved from any index. 4538c2ecf20Sopenharmony_ciSetting or clearing a mark on any index will set or clear the mark 4548c2ecf20Sopenharmony_cion every index that is tied together. The current implementation 4558c2ecf20Sopenharmony_cionly allows tying ranges which are aligned powers of two together; 4568c2ecf20Sopenharmony_cieg indices 64-127 may be tied together, but 2-6 may not be. This may 4578c2ecf20Sopenharmony_cisave substantial quantities of memory; for example tying 512 entries 4588c2ecf20Sopenharmony_citogether will save over 4kB. 4598c2ecf20Sopenharmony_ci 4608c2ecf20Sopenharmony_ciYou can create a multi-index entry by using XA_STATE_ORDER() 4618c2ecf20Sopenharmony_cior xas_set_order() followed by a call to xas_store(). 4628c2ecf20Sopenharmony_ciCalling xas_load() with a multi-index xa_state will walk the 4638c2ecf20Sopenharmony_cixa_state to the right location in the tree, but the return value is not 4648c2ecf20Sopenharmony_cimeaningful, potentially being an internal entry or ``NULL`` even when there 4658c2ecf20Sopenharmony_ciis an entry stored within the range. Calling xas_find_conflict() 4668c2ecf20Sopenharmony_ciwill return the first entry within the range or ``NULL`` if there are no 4678c2ecf20Sopenharmony_cientries in the range. The xas_for_each_conflict() iterator will 4688c2ecf20Sopenharmony_ciiterate over every entry which overlaps the specified range. 4698c2ecf20Sopenharmony_ci 4708c2ecf20Sopenharmony_ciIf xas_load() encounters a multi-index entry, the xa_index 4718c2ecf20Sopenharmony_ciin the xa_state will not be changed. When iterating over an XArray 4728c2ecf20Sopenharmony_cior calling xas_find(), if the initial index is in the middle 4738c2ecf20Sopenharmony_ciof a multi-index entry, it will not be altered. Subsequent calls 4748c2ecf20Sopenharmony_cior iterations will move the index to the first index in the range. 4758c2ecf20Sopenharmony_ciEach entry will only be returned once, no matter how many indices it 4768c2ecf20Sopenharmony_cioccupies. 4778c2ecf20Sopenharmony_ci 4788c2ecf20Sopenharmony_ciUsing xas_next() or xas_prev() with a multi-index xa_state is not 4798c2ecf20Sopenharmony_cisupported. Using either of these functions on a multi-index entry will 4808c2ecf20Sopenharmony_cireveal sibling entries; these should be skipped over by the caller. 4818c2ecf20Sopenharmony_ci 4828c2ecf20Sopenharmony_ciStoring ``NULL`` into any index of a multi-index entry will set the 4838c2ecf20Sopenharmony_cientry at every index to ``NULL`` and dissolve the tie. A multi-index 4848c2ecf20Sopenharmony_cientry can be split into entries occupying smaller ranges by calling 4858c2ecf20Sopenharmony_cixas_split_alloc() without the xa_lock held, followed by taking the lock 4868c2ecf20Sopenharmony_ciand calling xas_split(). 4878c2ecf20Sopenharmony_ci 4888c2ecf20Sopenharmony_ciFunctions and structures 4898c2ecf20Sopenharmony_ci======================== 4908c2ecf20Sopenharmony_ci 4918c2ecf20Sopenharmony_ci.. kernel-doc:: include/linux/xarray.h 4928c2ecf20Sopenharmony_ci.. kernel-doc:: lib/xarray.c 493