18c2ecf20Sopenharmony_ci.. _array_rcu_doc: 28c2ecf20Sopenharmony_ci 38c2ecf20Sopenharmony_ciUsing RCU to Protect Read-Mostly Arrays 48c2ecf20Sopenharmony_ci======================================= 58c2ecf20Sopenharmony_ci 68c2ecf20Sopenharmony_ciAlthough RCU is more commonly used to protect linked lists, it can 78c2ecf20Sopenharmony_cialso be used to protect arrays. Three situations are as follows: 88c2ecf20Sopenharmony_ci 98c2ecf20Sopenharmony_ci1. :ref:`Hash Tables <hash_tables>` 108c2ecf20Sopenharmony_ci 118c2ecf20Sopenharmony_ci2. :ref:`Static Arrays <static_arrays>` 128c2ecf20Sopenharmony_ci 138c2ecf20Sopenharmony_ci3. :ref:`Resizable Arrays <resizable_arrays>` 148c2ecf20Sopenharmony_ci 158c2ecf20Sopenharmony_ciEach of these three situations involves an RCU-protected pointer to an 168c2ecf20Sopenharmony_ciarray that is separately indexed. It might be tempting to consider use 178c2ecf20Sopenharmony_ciof RCU to instead protect the index into an array, however, this use 188c2ecf20Sopenharmony_cicase is **not** supported. The problem with RCU-protected indexes into 198c2ecf20Sopenharmony_ciarrays is that compilers can play way too many optimization games with 208c2ecf20Sopenharmony_ciintegers, which means that the rules governing handling of these indexes 218c2ecf20Sopenharmony_ciare far more trouble than they are worth. If RCU-protected indexes into 228c2ecf20Sopenharmony_ciarrays prove to be particularly valuable (which they have not thus far), 238c2ecf20Sopenharmony_ciexplicit cooperation from the compiler will be required to permit them 248c2ecf20Sopenharmony_cito be safely used. 258c2ecf20Sopenharmony_ci 268c2ecf20Sopenharmony_ciThat aside, each of the three RCU-protected pointer situations are 278c2ecf20Sopenharmony_cidescribed in the following sections. 288c2ecf20Sopenharmony_ci 298c2ecf20Sopenharmony_ci.. _hash_tables: 308c2ecf20Sopenharmony_ci 318c2ecf20Sopenharmony_ciSituation 1: Hash Tables 328c2ecf20Sopenharmony_ci------------------------ 338c2ecf20Sopenharmony_ci 348c2ecf20Sopenharmony_ciHash tables are often implemented as an array, where each array entry 358c2ecf20Sopenharmony_cihas a linked-list hash chain. Each hash chain can be protected by RCU 368c2ecf20Sopenharmony_cias described in the listRCU.txt document. This approach also applies 378c2ecf20Sopenharmony_cito other array-of-list situations, such as radix trees. 388c2ecf20Sopenharmony_ci 398c2ecf20Sopenharmony_ci.. _static_arrays: 408c2ecf20Sopenharmony_ci 418c2ecf20Sopenharmony_ciSituation 2: Static Arrays 428c2ecf20Sopenharmony_ci-------------------------- 438c2ecf20Sopenharmony_ci 448c2ecf20Sopenharmony_ciStatic arrays, where the data (rather than a pointer to the data) is 458c2ecf20Sopenharmony_cilocated in each array element, and where the array is never resized, 468c2ecf20Sopenharmony_cihave not been used with RCU. Rik van Riel recommends using seqlock in 478c2ecf20Sopenharmony_cithis situation, which would also have minimal read-side overhead as long 488c2ecf20Sopenharmony_cias updates are rare. 498c2ecf20Sopenharmony_ci 508c2ecf20Sopenharmony_ciQuick Quiz: 518c2ecf20Sopenharmony_ci Why is it so important that updates be rare when using seqlock? 528c2ecf20Sopenharmony_ci 538c2ecf20Sopenharmony_ci:ref:`Answer to Quick Quiz <answer_quick_quiz_seqlock>` 548c2ecf20Sopenharmony_ci 558c2ecf20Sopenharmony_ci.. _resizable_arrays: 568c2ecf20Sopenharmony_ci 578c2ecf20Sopenharmony_ciSituation 3: Resizable Arrays 588c2ecf20Sopenharmony_ci------------------------------ 598c2ecf20Sopenharmony_ci 608c2ecf20Sopenharmony_ciUse of RCU for resizable arrays is demonstrated by the grow_ary() 618c2ecf20Sopenharmony_cifunction formerly used by the System V IPC code. The array is used 628c2ecf20Sopenharmony_cito map from semaphore, message-queue, and shared-memory IDs to the data 638c2ecf20Sopenharmony_cistructure that represents the corresponding IPC construct. The grow_ary() 648c2ecf20Sopenharmony_cifunction does not acquire any locks; instead its caller must hold the 658c2ecf20Sopenharmony_ciids->sem semaphore. 668c2ecf20Sopenharmony_ci 678c2ecf20Sopenharmony_ciThe grow_ary() function, shown below, does some limit checks, allocates a 688c2ecf20Sopenharmony_cinew ipc_id_ary, copies the old to the new portion of the new, initializes 698c2ecf20Sopenharmony_cithe remainder of the new, updates the ids->entries pointer to point to 708c2ecf20Sopenharmony_cithe new array, and invokes ipc_rcu_putref() to free up the old array. 718c2ecf20Sopenharmony_ciNote that rcu_assign_pointer() is used to update the ids->entries pointer, 728c2ecf20Sopenharmony_ciwhich includes any memory barriers required on whatever architecture 738c2ecf20Sopenharmony_ciyou are running on:: 748c2ecf20Sopenharmony_ci 758c2ecf20Sopenharmony_ci static int grow_ary(struct ipc_ids* ids, int newsize) 768c2ecf20Sopenharmony_ci { 778c2ecf20Sopenharmony_ci struct ipc_id_ary* new; 788c2ecf20Sopenharmony_ci struct ipc_id_ary* old; 798c2ecf20Sopenharmony_ci int i; 808c2ecf20Sopenharmony_ci int size = ids->entries->size; 818c2ecf20Sopenharmony_ci 828c2ecf20Sopenharmony_ci if(newsize > IPCMNI) 838c2ecf20Sopenharmony_ci newsize = IPCMNI; 848c2ecf20Sopenharmony_ci if(newsize <= size) 858c2ecf20Sopenharmony_ci return newsize; 868c2ecf20Sopenharmony_ci 878c2ecf20Sopenharmony_ci new = ipc_rcu_alloc(sizeof(struct kern_ipc_perm *)*newsize + 888c2ecf20Sopenharmony_ci sizeof(struct ipc_id_ary)); 898c2ecf20Sopenharmony_ci if(new == NULL) 908c2ecf20Sopenharmony_ci return size; 918c2ecf20Sopenharmony_ci new->size = newsize; 928c2ecf20Sopenharmony_ci memcpy(new->p, ids->entries->p, 938c2ecf20Sopenharmony_ci sizeof(struct kern_ipc_perm *)*size + 948c2ecf20Sopenharmony_ci sizeof(struct ipc_id_ary)); 958c2ecf20Sopenharmony_ci for(i=size;i<newsize;i++) { 968c2ecf20Sopenharmony_ci new->p[i] = NULL; 978c2ecf20Sopenharmony_ci } 988c2ecf20Sopenharmony_ci old = ids->entries; 998c2ecf20Sopenharmony_ci 1008c2ecf20Sopenharmony_ci /* 1018c2ecf20Sopenharmony_ci * Use rcu_assign_pointer() to make sure the memcpyed 1028c2ecf20Sopenharmony_ci * contents of the new array are visible before the new 1038c2ecf20Sopenharmony_ci * array becomes visible. 1048c2ecf20Sopenharmony_ci */ 1058c2ecf20Sopenharmony_ci rcu_assign_pointer(ids->entries, new); 1068c2ecf20Sopenharmony_ci 1078c2ecf20Sopenharmony_ci ipc_rcu_putref(old); 1088c2ecf20Sopenharmony_ci return newsize; 1098c2ecf20Sopenharmony_ci } 1108c2ecf20Sopenharmony_ci 1118c2ecf20Sopenharmony_ciThe ipc_rcu_putref() function decrements the array's reference count 1128c2ecf20Sopenharmony_ciand then, if the reference count has dropped to zero, uses call_rcu() 1138c2ecf20Sopenharmony_cito free the array after a grace period has elapsed. 1148c2ecf20Sopenharmony_ci 1158c2ecf20Sopenharmony_ciThe array is traversed by the ipc_lock() function. This function 1168c2ecf20Sopenharmony_ciindexes into the array under the protection of rcu_read_lock(), 1178c2ecf20Sopenharmony_ciusing rcu_dereference() to pick up the pointer to the array so 1188c2ecf20Sopenharmony_cithat it may later safely be dereferenced -- memory barriers are 1198c2ecf20Sopenharmony_cirequired on the Alpha CPU. Since the size of the array is stored 1208c2ecf20Sopenharmony_ciwith the array itself, there can be no array-size mismatches, so 1218c2ecf20Sopenharmony_cia simple check suffices. The pointer to the structure corresponding 1228c2ecf20Sopenharmony_cito the desired IPC object is placed in "out", with NULL indicating 1238c2ecf20Sopenharmony_cia non-existent entry. After acquiring "out->lock", the "out->deleted" 1248c2ecf20Sopenharmony_ciflag indicates whether the IPC object is in the process of being 1258c2ecf20Sopenharmony_cideleted, and, if not, the pointer is returned:: 1268c2ecf20Sopenharmony_ci 1278c2ecf20Sopenharmony_ci struct kern_ipc_perm* ipc_lock(struct ipc_ids* ids, int id) 1288c2ecf20Sopenharmony_ci { 1298c2ecf20Sopenharmony_ci struct kern_ipc_perm* out; 1308c2ecf20Sopenharmony_ci int lid = id % SEQ_MULTIPLIER; 1318c2ecf20Sopenharmony_ci struct ipc_id_ary* entries; 1328c2ecf20Sopenharmony_ci 1338c2ecf20Sopenharmony_ci rcu_read_lock(); 1348c2ecf20Sopenharmony_ci entries = rcu_dereference(ids->entries); 1358c2ecf20Sopenharmony_ci if(lid >= entries->size) { 1368c2ecf20Sopenharmony_ci rcu_read_unlock(); 1378c2ecf20Sopenharmony_ci return NULL; 1388c2ecf20Sopenharmony_ci } 1398c2ecf20Sopenharmony_ci out = entries->p[lid]; 1408c2ecf20Sopenharmony_ci if(out == NULL) { 1418c2ecf20Sopenharmony_ci rcu_read_unlock(); 1428c2ecf20Sopenharmony_ci return NULL; 1438c2ecf20Sopenharmony_ci } 1448c2ecf20Sopenharmony_ci spin_lock(&out->lock); 1458c2ecf20Sopenharmony_ci 1468c2ecf20Sopenharmony_ci /* ipc_rmid() may have already freed the ID while ipc_lock 1478c2ecf20Sopenharmony_ci * was spinning: here verify that the structure is still valid 1488c2ecf20Sopenharmony_ci */ 1498c2ecf20Sopenharmony_ci if (out->deleted) { 1508c2ecf20Sopenharmony_ci spin_unlock(&out->lock); 1518c2ecf20Sopenharmony_ci rcu_read_unlock(); 1528c2ecf20Sopenharmony_ci return NULL; 1538c2ecf20Sopenharmony_ci } 1548c2ecf20Sopenharmony_ci return out; 1558c2ecf20Sopenharmony_ci } 1568c2ecf20Sopenharmony_ci 1578c2ecf20Sopenharmony_ci.. _answer_quick_quiz_seqlock: 1588c2ecf20Sopenharmony_ci 1598c2ecf20Sopenharmony_ciAnswer to Quick Quiz: 1608c2ecf20Sopenharmony_ci Why is it so important that updates be rare when using seqlock? 1618c2ecf20Sopenharmony_ci 1628c2ecf20Sopenharmony_ci The reason that it is important that updates be rare when 1638c2ecf20Sopenharmony_ci using seqlock is that frequent updates can livelock readers. 1648c2ecf20Sopenharmony_ci One way to avoid this problem is to assign a seqlock for 1658c2ecf20Sopenharmony_ci each array entry rather than to the entire array. 166