======================================================
A Tour Through TREE_RCU's Grace-Period Memory Ordering
======================================================

August 8, 2017

This article was contributed by Paul E. McKenney

Introduction
============

This document gives a rough visual overview of how Tree RCU's
grace-period memory ordering guarantee is provided.

What Is Tree RCU's Grace Period Memory Ordering Guarantee?
==========================================================

RCU grace periods provide extremely strong memory-ordering guarantees
for non-idle non-offline code.
Any code that happens after the end of a given RCU grace period is guaranteed
to see the effects of all accesses prior to the beginning of that grace
period that are within RCU read-side critical sections.
Similarly, any code that happens before the beginning of a given RCU grace
period is guaranteed to not see the effects of any accesses following the end
of that grace period that are within RCU read-side critical sections.

Note well that RCU-sched read-side critical sections include any region
of code for which preemption is disabled.
Given that each individual machine instruction can be thought of as
an extremely small region of preemption-disabled code, one can think of
``synchronize_rcu()`` as ``smp_mb()`` on steroids.

RCU updaters use this guarantee by splitting their updates into
two phases, one of which is executed before the grace period and
the other of which is executed after the grace period.
In the most common use case, phase one removes an element from
a linked RCU-protected data structure, and phase two frees that element.
For this to work, any readers that have witnessed state prior to the
phase-one update (in the common case, removal) must not witness state
following the phase-two update (in the common case, freeing).

The RCU implementation provides this guarantee using a network
of lock-based critical sections, memory barriers, and per-CPU
processing, as is described in the following sections.

Tree RCU Grace Period Memory Ordering Building Blocks
=====================================================

The workhorse for RCU's grace-period memory ordering is the
critical section for the ``rcu_node`` structure's
``->lock``. These critical sections use helper functions for lock
acquisition, including ``raw_spin_lock_rcu_node()``,
``raw_spin_lock_irq_rcu_node()``, and ``raw_spin_lock_irqsave_rcu_node()``.
Their lock-release counterparts are ``raw_spin_unlock_rcu_node()``,
``raw_spin_unlock_irq_rcu_node()``, and
``raw_spin_unlock_irqrestore_rcu_node()``, respectively.
For completeness, a ``raw_spin_trylock_rcu_node()`` is also provided.
The key point is that the lock-acquisition functions, including
``raw_spin_trylock_rcu_node()``, all invoke ``smp_mb__after_unlock_lock()``
immediately after successful acquisition of the lock.

Therefore, for any given ``rcu_node`` structure, any access
happening before one of the above lock-release functions will be seen
by all CPUs as happening before any access happening after a later
one of the above lock-acquisition functions.
Furthermore, any access happening before one of the
above lock-release functions on any given CPU will be seen by all
CPUs as happening before any access happening after a later one
of the above lock-acquisition functions executing on that same CPU,
even if the lock-release and lock-acquisition functions are operating
on different ``rcu_node`` structures.
Tree RCU uses these two ordering guarantees to form an ordering
network among all CPUs that were in any way involved in the grace
period, including any CPUs that came online or went offline during
the grace period in question.
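
The acquisition pattern described above can be sketched in userspace C11.
This is only an illustrative model under stated assumptions:
``struct node_lock`` and the ``node_lock_*()`` helpers are hypothetical
names, and a C11 ``memory_order_seq_cst`` fence is used as a loose stand-in
for ``smp_mb__after_unlock_lock()``. The C11 and Linux-kernel memory models
are not equivalent, so this shows the shape of the pattern rather than the
kernel's implementation.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for an rcu_node structure's ->lock. */
struct node_lock {
	atomic_flag held;
};

#define NODE_LOCK_INIT { ATOMIC_FLAG_INIT }

/*
 * Spin until the lock is acquired, then execute a full fence.  The fence
 * plays the role that smp_mb__after_unlock_lock() plays in the text: it
 * upgrades the acquisition from acquire ordering to full ordering.
 */
static inline void node_lock_acquire(struct node_lock *l)
{
	while (atomic_flag_test_and_set_explicit(&l->held,
						 memory_order_acquire))
		; /* spin */
	atomic_thread_fence(memory_order_seq_cst);
}

/* Try once; the fence is executed only after a successful acquisition. */
static inline bool node_lock_try(struct node_lock *l)
{
	if (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
		return false;
	atomic_thread_fence(memory_order_seq_cst);
	return true;
}

/* Release needs only release ordering; the next acquirer adds the fence. */
static inline void node_lock_release(struct node_lock *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

Note that in this sketch the extra fence sits on the acquisition side only,
which mirrors the placement described above: every critical section begins
with full ordering against whatever preceded the prior release.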

The following litmus test exhibits the ordering effects of these
lock-acquisition and lock-release functions::

    int x, y, z;

    void task0(void)
    {
        raw_spin_lock_rcu_node(rnp);
        WRITE_ONCE(x, 1);
        r1 = READ_ONCE(y);
        raw_spin_unlock_rcu_node(rnp);
    }

    void task1(void)
    {
        raw_spin_lock_rcu_node(rnp);
        WRITE_ONCE(y, 1);
        r2 = READ_ONCE(z);
        raw_spin_unlock_rcu_node(rnp);
    }

    void task2(void)
    {
        WRITE_ONCE(z, 1);
        smp_mb();
        r3 = READ_ONCE(x);
    }

    WARN_ON(r1 == 0 && r2 == 0 && r3 == 0);

The ``WARN_ON()`` is evaluated at "the end of time",
after all changes have propagated throughout the system.
Without the ``smp_mb__after_unlock_lock()`` provided by the
acquisition functions, this ``WARN_ON()`` could trigger, for example
on PowerPC.
The ``smp_mb__after_unlock_lock()`` invocations prevent this
``WARN_ON()`` from triggering.
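
One way to see why the outcome checked by the ``WARN_ON()`` is the
interesting one is to enumerate every sequentially consistent interleaving
of the litmus test and confirm that none of them produces it. The sketch
below does this in plain C. It is a simplified model, not a weak-memory
simulator: the two critical sections are treated as indivisible events
because they run under the same lock, and all names (``enum event``,
``run_order()``, ``count_forbidden_outcomes()``) are hypothetical.

```c
#include <stdbool.h>

/*
 * task0 and task1 each contribute one indivisible critical section;
 * task2 contributes its store and its load as two separate events.
 */
enum event { T0_CS, T1_CS, T2_WRITE_Z, T2_READ_X, NR_EVENTS };

struct outcome { int r1, r2, r3; };

/* Execute the four events in the given order against fresh state. */
static inline struct outcome run_order(const enum event order[NR_EVENTS])
{
	int x = 0, y = 0, z = 0;
	struct outcome o = { -1, -1, -1 };

	for (int i = 0; i < NR_EVENTS; i++) {
		switch (order[i]) {
		case T0_CS:      x = 1; o.r1 = y; break;
		case T1_CS:      y = 1; o.r2 = z; break;
		case T2_WRITE_Z: z = 1;           break;
		case T2_READ_X:  o.r3 = x;        break;
		default:                          break;
		}
	}
	return o;
}

/* A valid order uses each event once and keeps task2's program order. */
static inline bool valid_order(const enum event order[NR_EVENTS])
{
	int pos[NR_EVENTS] = { -1, -1, -1, -1 };

	for (int i = 0; i < NR_EVENTS; i++) {
		if (pos[order[i]] != -1)
			return false; /* event repeated */
		pos[order[i]] = i;
	}
	return pos[T2_WRITE_Z] < pos[T2_READ_X];
}

/* Count valid orders that end with r1 == 0 && r2 == 0 && r3 == 0. */
static inline int count_forbidden_outcomes(int *nr_orders)
{
	int forbidden = 0;

	*nr_orders = 0;
	for (int n = 0; n < 256; n++) { /* 4^4 candidate sequences */
		enum event order[NR_EVENTS];
		int code = n;

		for (int i = 0; i < NR_EVENTS; i++) {
			order[i] = (enum event)(code % 4);
			code /= 4;
		}
		if (!valid_order(order))
			continue;
		(*nr_orders)++;
		struct outcome o = run_order(order);
		if (o.r1 == 0 && o.r2 == 0 && o.r3 == 0)
			forbidden++;
	}
	return forbidden;
}
```

In this model no interleaving yields all three registers equal to zero.
Weakly ordered hardware can produce executions that correspond to no such
interleaving, which is where the full ordering supplied by the
lock-acquisition functions comes in.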

+-----------------------------------------------------------------------+
| **Quick Quiz**:                                                       |
+-----------------------------------------------------------------------+
| But the chain of rcu_node-structure lock acquisitions guarantees      |
| that new readers will see all of the updater's pre-grace-period       |
| accesses and also guarantees that the updater's post-grace-period     |
| accesses will see all of the old reader's accesses. So why do we      |
| need all of those calls to smp_mb__after_unlock_lock()?               |
+-----------------------------------------------------------------------+
| **Answer**:                                                           |
+-----------------------------------------------------------------------+
| Because we must provide ordering for RCU's polling grace-period       |
| primitives, for example, get_state_synchronize_rcu() and              |
| poll_state_synchronize_rcu(). Consider this code::                    |
|                                                                       |
|  CPU 0                                     CPU 1                      |
|  ----                                      ----                       |
|  WRITE_ONCE(X, 1)                          WRITE_ONCE(Y, 1)           |
|  g = get_state_synchronize_rcu()           smp_mb()                   |
|  while (!poll_state_synchronize_rcu(g))    r1 = READ_ONCE(X)          |
|          continue;                                                    |
|  r0 = READ_ONCE(Y)                                                    |
|                                                                       |
| RCU guarantees that the outcome r0 == 0 && r1 == 0 will not           |
| happen, even if CPU 1 is in an RCU extended quiescent state           |
| (idle or offline) and thus won't interact directly with the RCU       |
| core processing at all.                                               |
+-----------------------------------------------------------------------+

This approach must be extended to include idle CPUs, which need
RCU's grace-period memory ordering guarantee to extend to any
RCU read-side critical sections preceding and following the current
idle sojourn.
This case is handled by calls to the strongly ordered
``atomic_add_return()`` read-modify-write atomic operation that
is invoked within ``rcu_dynticks_eqs_enter()`` at idle-entry
time and within ``rcu_dynticks_eqs_exit()`` at idle-exit time.
The grace-period kthread invokes ``rcu_dynticks_snap()`` and
``rcu_dynticks_in_eqs_since()`` (both of which invoke
an ``atomic_add_return()`` of zero) to detect idle CPUs.
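
The idle-detection scheme described above can be modeled with a single
atomic counter. The following userspace sketch is an assumption-laden
illustration: the counter encoding (odd while non-idle, even while idle),
``struct cpu_dynticks``, and every helper name are hypothetical, and C11's
sequentially consistent ``atomic_fetch_add()`` merely approximates the
kernel's fully ordered ``atomic_add_return()``.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-CPU counter: odd while non-idle, even while idle. */
struct cpu_dynticks {
	atomic_int ctr;
};

/* CPUs start out non-idle in this model. */
static inline void dynticks_init(struct cpu_dynticks *d)
{
	atomic_init(&d->ctr, 1);
}

/* Idle entry and exit each bump the counter with a fully ordered RMW. */
static inline void eqs_enter(struct cpu_dynticks *d)
{
	atomic_fetch_add(&d->ctr, 1);
}

static inline void eqs_exit(struct cpu_dynticks *d)
{
	atomic_fetch_add(&d->ctr, 1);
}

/* Sample the counter by adding zero, so the sample is itself ordered. */
static inline int dynticks_snap(struct cpu_dynticks *d)
{
	return atomic_fetch_add(&d->ctr, 0);
}

/* Was the CPU idle at the time the snapshot was taken? */
static inline bool snap_in_eqs(int snap)
{
	return (snap & 1) == 0;
}

/* Has the CPU passed through idle entry or exit since the snapshot? */
static inline bool in_eqs_since(struct cpu_dynticks *d, int snap)
{
	return dynticks_snap(d) != snap;
}
```

A grace-period thread in this model would take a snapshot, and later treat
the CPU as quiescent if either ``snap_in_eqs()`` was true for that snapshot
or ``in_eqs_since()`` reports a change.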

+-----------------------------------------------------------------------+
| **Quick Quiz**:                                                       |
+-----------------------------------------------------------------------+
| But what about CPUs that remain offline for the entire grace period?  |
+-----------------------------------------------------------------------+
| **Answer**:                                                           |
+-----------------------------------------------------------------------+
| Such CPUs will be offline at the beginning of the grace period, so    |
| the grace period won't expect quiescent states from them. Races       |
| between grace-period start and CPU-hotplug operations are mediated    |
| by the CPU's leaf ``rcu_node`` structure's ``->lock`` as described    |
| above.                                                                |
+-----------------------------------------------------------------------+

The approach must be extended to handle one final case, that of waking a
task blocked in ``synchronize_rcu()``. This task might be affined to
a CPU that is not yet aware that the grace period has ended, and thus
might not yet be subject to the grace period's memory ordering.
Therefore, there is an ``smp_mb()`` after the return from
``wait_for_completion()`` in the ``synchronize_rcu()`` code path.

+-----------------------------------------------------------------------+
| **Quick Quiz**:                                                       |
+-----------------------------------------------------------------------+
| What? Where??? I don't see any ``smp_mb()`` after the return from     |
| ``wait_for_completion()``!!!                                          |
+-----------------------------------------------------------------------+
| **Answer**:                                                           |
+-----------------------------------------------------------------------+
| That would be because I spotted the need for that ``smp_mb()`` during |
| the creation of this documentation, and it is therefore unlikely to   |
| hit mainline before v4.14. Kudos to Lance Roy, Will Deacon, Peter     |
| Zijlstra, and Jonathan Cameron for asking questions that sensitized   |
| me to the rather elaborate sequence of events that demonstrate the    |
| need for this memory barrier.                                         |
+-----------------------------------------------------------------------+

Tree RCU's grace-period memory-ordering guarantees rely most heavily on
the ``rcu_node`` structure's ``->lock`` field, so much so that it is
necessary to abbreviate this pattern in the diagrams in the next
section. For example, consider the ``rcu_prepare_for_idle()`` function
shown below, which is one of several functions that enforce ordering of
newly arrived RCU callbacks against future grace periods:

::

    1 static void rcu_prepare_for_idle(void)
    2 {
    3   bool needwake;
    4   struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
    5   struct rcu_node *rnp;
    6   int tne;
    7
    8   lockdep_assert_irqs_disabled();
    9   if (rcu_rdp_is_offloaded(rdp))
   10     return;
   11
   12   /* Handle nohz enablement switches conservatively. */
   13   tne = READ_ONCE(tick_nohz_active);
   14   if (tne != rdp->tick_nohz_enabled_snap) {
   15     if (!rcu_segcblist_empty(&rdp->cblist))
   16       invoke_rcu_core(); /* force nohz to see update. */
   17     rdp->tick_nohz_enabled_snap = tne;
   18     return;
   19   }
   20   if (!tne)
   21     return;
   22
   23   /*
   24    * If we have not yet accelerated this jiffy, accelerate all
   25    * callbacks on this CPU.
   26    */
   27   if (rdp->last_accelerate == jiffies)
   28     return;
   29   rdp->last_accelerate = jiffies;
   30   if (rcu_segcblist_pend_cbs(&rdp->cblist)) {
   31     rnp = rdp->mynode;
   32     raw_spin_lock_rcu_node(rnp); /* irqs already disabled. */
   33     needwake = rcu_accelerate_cbs(rnp, rdp);
   34     raw_spin_unlock_rcu_node(rnp); /* irqs remain disabled. */
   35     if (needwake)
   36       rcu_gp_kthread_wake();
   37   }
   38 }

But the only part of ``rcu_prepare_for_idle()`` that really matters for
this discussion is lines 32–34. We will therefore abbreviate this
function as follows:

.. kernel-figure:: rcu_node-lock.svg

The box represents the ``rcu_node`` structure's ``->lock`` critical
section, with the double line on top representing the additional
``smp_mb__after_unlock_lock()``.

Tree RCU Grace Period Memory Ordering Components
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Tree RCU's grace-period memory-ordering guarantee is provided by a
number of RCU components:

#. `Callback Registry`_
#. `Grace-Period Initialization`_
#. `Self-Reported Quiescent States`_
#. `Dynamic Tick Interface`_
#. `CPU-Hotplug Interface`_
#. `Forcing Quiescent States`_
#. `Grace-Period Cleanup`_
#. `Callback Invocation`_

Each of the following sections looks at the corresponding component in
detail.

Callback Registry
^^^^^^^^^^^^^^^^^

If RCU's grace-period guarantee is to mean anything at all, any access
that happens before a given invocation of ``call_rcu()`` must also
happen before the corresponding grace period. The implementation of this
portion of RCU's grace period guarantee is shown in the following
figure:

.. kernel-figure:: TreeRCU-callback-registry.svg

Because ``call_rcu()`` normally acts only on CPU-local state, it
provides no ordering guarantees, either for itself or for phase one of
the update (which again will usually be removal of an element from an
RCU-protected data structure). It simply enqueues the ``rcu_head``
structure on a per-CPU list, which cannot become associated with a grace
period until a later call to ``rcu_accelerate_cbs()``, as shown in the
diagram above.
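
The split between enqueueing a callback and associating it with a grace
period can be illustrated with a small single-threaded model. Everything
here is a simplifying assumption: ``struct cb``, ``struct cb_list``,
``GP_UNASSIGNED``, and the helper names are hypothetical, the list is a
plain singly linked list rather than the kernel's segmented callback list,
and the locking that the kernel performs around acceleration is only noted
in comments.

```c
#include <stddef.h>

#define GP_UNASSIGNED 0UL /* hypothetical marker: no grace period yet */

struct cb {
	struct cb *next;
	unsigned long gp_seq; /* grace period this callback waits for */
};

struct cb_list {
	struct cb *head;
	struct cb **tail;
};

static inline void cb_list_init(struct cb_list *l)
{
	l->head = NULL;
	l->tail = &l->head;
}

/*
 * Model of the enqueue step: purely CPU-local, no grace-period number
 * is assigned, and therefore no ordering is implied.
 */
static inline void enqueue_cb(struct cb_list *l, struct cb *c)
{
	c->next = NULL;
	c->gp_seq = GP_UNASSIGNED;
	*l->tail = c;
	l->tail = &c->next;
}

/*
 * Model of the acceleration step: every callback that is not yet
 * associated with a grace period is assigned future_gp.  In the kernel
 * this association is made while holding the leaf rcu_node ->lock, which
 * is what supplies the ordering.  Returns the number newly assigned.
 */
static inline int accelerate_cbs(struct cb_list *l, unsigned long future_gp)
{
	int n = 0;

	for (struct cb *c = l->head; c; c = c->next) {
		if (c->gp_seq == GP_UNASSIGNED) {
			c->gp_seq = future_gp;
			n++;
		}
	}
	return n;
}
```

In this sketch a callback enqueued after one acceleration pass stays
unassigned until the next pass, at which point it receives a later
grace-period number than its predecessors.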

One set of code paths shown on the left invokes ``rcu_accelerate_cbs()``
via ``note_gp_changes()``, either directly from ``call_rcu()`` (if the
current CPU is inundated with queued ``rcu_head`` structures) or more
likely from an ``RCU_SOFTIRQ`` handler. Another code path in the middle
is taken only in kernels built with ``CONFIG_RCU_FAST_NO_HZ=y``, which
invokes ``rcu_accelerate_cbs()`` via ``rcu_prepare_for_idle()``. The
final code path on the right is taken only in kernels built with
``CONFIG_HOTPLUG_CPU=y``, which invokes ``rcu_accelerate_cbs()`` via
``rcu_advance_cbs()``, ``rcu_migrate_callbacks()``,
``rcutree_migrate_callbacks()``, and ``takedown_cpu()``, which in turn
is invoked on a surviving CPU after the outgoing CPU has been completely
offlined.

There are a few other code paths within grace-period processing that
opportunistically invoke ``rcu_accelerate_cbs()``. However, either way,
all of the CPU's recently queued ``rcu_head`` structures are associated
with a future grace-period number under the protection of the CPU's leaf
``rcu_node`` structure's ``->lock``. In all cases, there is full
ordering against any prior critical section for that same ``rcu_node``
structure's ``->lock``, and also full ordering against any of the
current task's or CPU's prior critical sections for any ``rcu_node``
structure's ``->lock``.

The next section will show how this ordering ensures that any accesses
prior to the ``call_rcu()`` (particularly including phase one of the
update) happen before the start of the corresponding grace period.

+-----------------------------------------------------------------------+
| **Quick Quiz**:                                                       |
+-----------------------------------------------------------------------+
| But what about ``synchronize_rcu()``?                                 |
+-----------------------------------------------------------------------+
| **Answer**:                                                           |
+-----------------------------------------------------------------------+
| The ``synchronize_rcu()`` passes ``call_rcu()`` to ``wait_rcu_gp()``, |
| which invokes it. So either way, it eventually comes down to          |
| ``call_rcu()``.                                                       |
+-----------------------------------------------------------------------+

Grace-Period Initialization
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Grace-period initialization is carried out by the grace-period kernel
thread, which makes several passes over the ``rcu_node`` tree within the
``rcu_gp_init()`` function. This means that showing the full flow of
ordering through the grace-period computation will require duplicating
this tree. If you find this confusing, please note that the state of the
``rcu_node`` changes over time, just like Heraclitus's river. However,
to keep the ``rcu_node`` river tractable, the grace-period kernel
thread's traversals are presented in multiple parts, starting in this
section with the various phases of grace-period initialization.

The first ordering-related grace-period initialization action is to
advance the ``rcu_state`` structure's ``->gp_seq`` grace-period-number
counter, as shown below:

.. kernel-figure:: TreeRCU-gp-init-1.svg

The actual increment is carried out using ``smp_store_release()``, which
helps reject false-positive RCU CPU stall detection. Note that only the
root ``rcu_node`` structure is touched.

The first pass through the ``rcu_node`` tree updates bitmasks based on
CPUs having come online or gone offline since the start of the previous
grace period. In the common case where the number of online CPUs for
this ``rcu_node`` structure has not transitioned to or from zero, this
pass will scan only the leaf ``rcu_node`` structures. However, if the
number of online CPUs for a given leaf ``rcu_node`` structure has
transitioned from zero, ``rcu_init_new_rnp()`` will be invoked for the
first incoming CPU. Similarly, if the number of online CPUs for a given
leaf ``rcu_node`` structure has transitioned to zero,
``rcu_cleanup_dead_rnp()`` will be invoked for the last outgoing CPU.
The diagram below shows the path of ordering if the leftmost
``rcu_node`` structure onlines its first CPU and if the next
``rcu_node`` structure has no online CPUs (or, alternatively, if the
leftmost ``rcu_node`` structure offlines its last CPU and if the next
``rcu_node`` structure has no online CPUs).

.. kernel-figure:: TreeRCU-gp-init-2.svg

The final ``rcu_gp_init()`` pass through the ``rcu_node`` tree traverses
breadth-first, setting each ``rcu_node`` structure's ``->gp_seq`` field
to the newly advanced value from the ``rcu_state`` structure, as shown
in the following diagram.

.. kernel-figure:: TreeRCU-gp-init-3.svg

This change will also cause each CPU's next call to
``__note_gp_changes()`` to notice that a new grace period has started,
as described in the next section. But because the grace-period kthread
started the grace period at the root (with the advancing of the
``rcu_state`` structure's ``->gp_seq`` field) before setting each leaf
``rcu_node`` structure's ``->gp_seq`` field, each CPU's observation of
the start of the grace period will happen after the actual start of the
grace period.
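
The "start at the root, then propagate, then let each CPU notice" sequence
can be sketched as a single-threaded model. The names below
(``struct gp_state``, ``struct node``, ``struct cpu``, ``gp_start()``,
``gp_propagate()``, ``cpu_note_gp_changes()``) are hypothetical, the tree
is a fixed seven-node array stored breadth-first, and the counter is a
plain increment; the kernel's ``->gp_seq`` encoding and its per-node
locking are deliberately left out.

```c
#include <stdbool.h>

#define NR_NODES   7 /* hypothetical three-level binary tree */
#define FIRST_LEAF 3 /* nodes 3..6 are leaves in breadth-first order */

struct gp_state { unsigned long gp_seq; };
struct node     { unsigned long gp_seq; };
struct cpu      { unsigned long gp_seq; int leaf; };

/* Step 1: advance the global counter; the grace period has now started. */
static inline void gp_start(struct gp_state *s)
{
	s->gp_seq++;
}

/*
 * Step 2: breadth-first pass copying the new value into every node.
 * Array order is breadth-first order, so the root is updated first and
 * the leaves last.  The kernel holds each node's ->lock while doing so.
 */
static inline void gp_propagate(const struct gp_state *s,
				struct node tree[NR_NODES])
{
	for (int i = 0; i < NR_NODES; i++)
		tree[i].gp_seq = s->gp_seq;
}

/*
 * Step 3: a CPU compares its own view with its leaf node and reports
 * whether it has just become aware of a new grace period.
 */
static inline bool cpu_note_gp_changes(struct cpu *c,
				       const struct node tree[NR_NODES])
{
	if (c->gp_seq == tree[c->leaf].gp_seq)
		return false;
	c->gp_seq = tree[c->leaf].gp_seq;
	return true;
}
```

Because a CPU in this model looks only at its leaf, it cannot notice the
new grace period between steps 1 and 2, so its observation always follows
the actual start.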
38362306a36Sopenharmony_ci 38462306a36Sopenharmony_ci+-----------------------------------------------------------------------+ 38562306a36Sopenharmony_ci| **Quick Quiz**: | 38662306a36Sopenharmony_ci+-----------------------------------------------------------------------+ 38762306a36Sopenharmony_ci| But what about the CPU that started the grace period? Why wouldn't it | 38862306a36Sopenharmony_ci| see the start of the grace period right when it started that grace | 38962306a36Sopenharmony_ci| period? | 39062306a36Sopenharmony_ci+-----------------------------------------------------------------------+ 39162306a36Sopenharmony_ci| **Answer**: | 39262306a36Sopenharmony_ci+-----------------------------------------------------------------------+ 39362306a36Sopenharmony_ci| In some deep philosophical and overly anthromorphized sense, yes, the | 39462306a36Sopenharmony_ci| CPU starting the grace period is immediately aware of having done so. | 39562306a36Sopenharmony_ci| However, if we instead assume that RCU is not self-aware, then even | 39662306a36Sopenharmony_ci| the CPU starting the grace period does not really become aware of the | 39762306a36Sopenharmony_ci| start of this grace period until its first call to | 39862306a36Sopenharmony_ci| ``__note_gp_changes()``. On the other hand, this CPU potentially gets | 39962306a36Sopenharmony_ci| early notification because it invokes ``__note_gp_changes()`` during | 40062306a36Sopenharmony_ci| its last ``rcu_gp_init()`` pass through its leaf ``rcu_node`` | 40162306a36Sopenharmony_ci| structure. 
| 40262306a36Sopenharmony_ci+-----------------------------------------------------------------------+ 40362306a36Sopenharmony_ci 40462306a36Sopenharmony_ciSelf-Reported Quiescent States 40562306a36Sopenharmony_ci^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 40662306a36Sopenharmony_ci 40762306a36Sopenharmony_ciWhen all entities that might block the grace period have reported 40862306a36Sopenharmony_ciquiescent states (or as described in a later section, had quiescent 40962306a36Sopenharmony_cistates reported on their behalf), the grace period can end. Online 41062306a36Sopenharmony_cinon-idle CPUs report their own quiescent states, as shown in the 41162306a36Sopenharmony_cifollowing diagram: 41262306a36Sopenharmony_ci 41362306a36Sopenharmony_ci.. kernel-figure:: TreeRCU-qs.svg 41462306a36Sopenharmony_ci 41562306a36Sopenharmony_ciThis is for the last CPU to report a quiescent state, which signals the 41662306a36Sopenharmony_ciend of the grace period. Earlier quiescent states would push up the 41762306a36Sopenharmony_ci``rcu_node`` tree only until they encountered an ``rcu_node`` structure 41862306a36Sopenharmony_cithat is waiting for additional quiescent states. However, ordering is 41962306a36Sopenharmony_cinevertheless preserved because some later quiescent state will acquire 42062306a36Sopenharmony_cithat ``rcu_node`` structure's ``->lock``. 42162306a36Sopenharmony_ci 42262306a36Sopenharmony_ciAny number of events can lead up to a CPU invoking ``note_gp_changes`` 42362306a36Sopenharmony_ci(or alternatively, directly invoking ``__note_gp_changes()``), at which 42462306a36Sopenharmony_cipoint that CPU will notice the start of a new grace period while holding 42562306a36Sopenharmony_ciits leaf ``rcu_node`` lock. Therefore, all execution shown in this 42662306a36Sopenharmony_cidiagram happens after the start of the grace period. 
In addition, this 42762306a36Sopenharmony_ciCPU will consider any RCU read-side critical section that started before 42862306a36Sopenharmony_cithe invocation of ``__note_gp_changes()`` to have started before the 42962306a36Sopenharmony_cigrace period, and thus a critical section that the grace period must 43062306a36Sopenharmony_ciwait on. 43162306a36Sopenharmony_ci 43262306a36Sopenharmony_ci+-----------------------------------------------------------------------+ 43362306a36Sopenharmony_ci| **Quick Quiz**: | 43462306a36Sopenharmony_ci+-----------------------------------------------------------------------+ 43562306a36Sopenharmony_ci| But a RCU read-side critical section might have started after the | 43662306a36Sopenharmony_ci| beginning of the grace period (the advancing of ``->gp_seq`` from | 43762306a36Sopenharmony_ci| earlier), so why should the grace period wait on such a critical | 43862306a36Sopenharmony_ci| section? | 43962306a36Sopenharmony_ci+-----------------------------------------------------------------------+ 44062306a36Sopenharmony_ci| **Answer**: | 44162306a36Sopenharmony_ci+-----------------------------------------------------------------------+ 44262306a36Sopenharmony_ci| It is indeed not necessary for the grace period to wait on such a | 44362306a36Sopenharmony_ci| critical section. However, it is permissible to wait on it. And it is | 44462306a36Sopenharmony_ci| furthermore important to wait on it, as this lazy approach is far | 44562306a36Sopenharmony_ci| more scalable than a “big bang” all-at-once grace-period start could | 44662306a36Sopenharmony_ci| possibly be. | 44762306a36Sopenharmony_ci+-----------------------------------------------------------------------+ 44862306a36Sopenharmony_ci 44962306a36Sopenharmony_ciIf the CPU does a context switch, a quiescent state will be noted by 45062306a36Sopenharmony_ci``rcu_note_context_switch()`` on the left. 
On the other hand, if the CPU
takes a scheduler-clock interrupt while executing in usermode, a
quiescent state will be noted by ``rcu_sched_clock_irq()`` on the right.
Either way, the passage through a quiescent state will be noted in a
per-CPU variable.

The next time an ``RCU_SOFTIRQ`` handler executes on this CPU (for
example, after the next scheduler-clock interrupt), ``rcu_core()`` will
invoke ``rcu_check_quiescent_state()``, which will notice the recorded
quiescent state, and invoke ``rcu_report_qs_rdp()``. If
``rcu_report_qs_rdp()`` verifies that the quiescent state really does
apply to the current grace period, it invokes ``rcu_report_rnp()`` which
traverses up the ``rcu_node`` tree as shown at the bottom of the
diagram, clearing bits from each ``rcu_node`` structure's ``->qsmask``
field, and propagating up the tree when the result is zero.

Note that traversal passes upwards out of a given ``rcu_node`` structure
only if the current CPU is reporting the last quiescent state for the
subtree headed by that ``rcu_node`` structure. A key point is that if a
CPU's traversal stops at a given ``rcu_node`` structure, then there will
be a later traversal by another CPU (or perhaps the same one) that
proceeds upwards from that point, and the ``rcu_node`` ``->lock``
guarantees that the first CPU's quiescent state happens before the
remainder of the second CPU's traversal.
Applying this line of thought
repeatedly shows that all CPUs' quiescent states happen before the last
CPU traverses through the root ``rcu_node`` structure, the “last CPU”
being the one that clears the last bit in the root ``rcu_node``
structure's ``->qsmask`` field.

Dynamic Tick Interface
^^^^^^^^^^^^^^^^^^^^^^

Due to energy-efficiency considerations, RCU is forbidden from
disturbing idle CPUs. CPUs are therefore required to notify RCU when
entering or leaving idle state, which they do via fully ordered
value-returning atomic operations on a per-CPU variable. The ordering
effects are as shown below:

.. kernel-figure:: TreeRCU-dyntick.svg

The RCU grace-period kernel thread samples the per-CPU idleness variable
while holding the corresponding CPU's leaf ``rcu_node`` structure's
``->lock``. This means that any RCU read-side critical sections that
precede the idle period (the oval near the top of the diagram above)
will happen before the end of the current grace period. Similarly, the
beginning of the current grace period will happen before any RCU
read-side critical sections that follow the idle period (the oval near
the bottom of the diagram above).

Plumbing this into the full grace-period execution is described
`below <Forcing Quiescent States_>`__.
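
The idle-entry and idle-exit notifications of the Dynamic Tick
Interface can be modeled in userspace C11. This is only an illustrative
sketch, not the kernel's implementation. The ``cpu_idle_state``
structure, the helper names, and the even-means-idle counter encoding
are all assumptions made for this example, and the sketch omits the
leaf ``rcu_node`` structure's ``->lock`` that the grace-period kernel
thread holds while sampling.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for the per-CPU idleness variable.
 * Assumed encoding: even value means idle, odd value means non-idle.
 */
struct cpu_idle_state {
	atomic_uint counter;
};

/* Start out non-idle (odd). */
static void cpu_idle_init(struct cpu_idle_state *s)
{
	atomic_init(&s->counter, 1);
}

/*
 * Idle entry: a value-returning read-modify-write. The default
 * memory order of atomic_fetch_add() is sequentially consistent,
 * which stands in here for the kernel's fully ordered atomics.
 * Returns the new counter value.
 */
static unsigned int cpu_idle_enter(struct cpu_idle_state *s)
{
	return atomic_fetch_add(&s->counter, 1) + 1;
}

/* Idle exit: same operation, moving the counter back to odd. */
static unsigned int cpu_idle_exit(struct cpu_idle_state *s)
{
	return atomic_fetch_add(&s->counter, 1) + 1;
}

/* Grace-period side: record the counter at the first sampling pass. */
static unsigned int gp_snapshot(struct cpu_idle_state *s)
{
	return atomic_load(&s->counter);
}

/*
 * Grace-period side: the CPU has passed through a quiescent state if
 * it was idle when sampled, or if the counter has changed since then.
 */
static bool gp_saw_quiescent_state(struct cpu_idle_state *s,
				   unsigned int snap)
{
	return (snap % 2 == 0) || (atomic_load(&s->counter) != snap);
}
```

In this model a grace-period thread would call ``gp_snapshot()`` once
and then poll ``gp_saw_quiescent_state()`` on later passes, never
needing to interrupt the idle CPU itself.
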

CPU-Hotplug Interface
^^^^^^^^^^^^^^^^^^^^^

RCU is also forbidden from disturbing offline CPUs, which might well be
powered off and removed from the system completely. CPUs are therefore
required to notify RCU of their comings and goings as part of the
corresponding CPU hotplug operations. The ordering effects are shown
below:

.. kernel-figure:: TreeRCU-hotplug.svg

Because CPU hotplug operations are much less frequent than idle
transitions, they are heavier weight, and thus acquire the CPU's leaf
``rcu_node`` structure's ``->lock`` and update this structure's
``->qsmaskinitnext``. The RCU grace-period kernel thread samples this
mask to detect CPUs having gone offline since the beginning of this
grace period.

Plumbing this into the full grace-period execution is described
`below <Forcing Quiescent States_>`__.

Forcing Quiescent States
^^^^^^^^^^^^^^^^^^^^^^^^

As noted above, idle and offline CPUs cannot report their own quiescent
states, and therefore the grace-period kernel thread must do the
reporting on their behalf.
This process is called “forcing quiescent
states”, it is repeated every few jiffies, and its ordering effects are
shown below:

.. kernel-figure:: TreeRCU-gp-fqs.svg

Each pass of quiescent state forcing is guaranteed to traverse the leaf
``rcu_node`` structures, and if there are no new quiescent states due to
recently idled and/or offlined CPUs, then only the leaves are traversed.
However, if there is a newly offlined CPU as illustrated on the left or
a newly idled CPU as illustrated on the right, the corresponding
quiescent state will be driven up towards the root. As with
self-reported quiescent states, the upwards driving stops once it
reaches an ``rcu_node`` structure that has quiescent states outstanding
from other CPUs.

+-----------------------------------------------------------------------+
| **Quick Quiz**:                                                       |
+-----------------------------------------------------------------------+
| The leftmost drive to root stopped before it reached the root         |
| ``rcu_node`` structure, which means that there are still CPUs         |
| subordinate to that structure on which the current grace period is    |
| waiting. Given that, how is it possible that the rightmost drive to   |
| root ended the grace period?                                          |
+-----------------------------------------------------------------------+
| **Answer**:                                                           |
+-----------------------------------------------------------------------+
| Good analysis! It is in fact impossible in the absence of bugs in     |
| RCU. But this diagram is complex enough as it is, so simplicity       |
| overrode accuracy. You can think of it as poetic license, or you can  |
| think of it as misdirection that is resolved in the                   |
| `stitched-together diagram <Putting It All Together_>`__.             |
+-----------------------------------------------------------------------+

Grace-Period Cleanup
^^^^^^^^^^^^^^^^^^^^

Grace-period cleanup first scans the ``rcu_node`` tree breadth-first
advancing all the ``->gp_seq`` fields, then it advances the
``rcu_state`` structure's ``->gp_seq`` field. The ordering effects are
shown below:

.. kernel-figure:: TreeRCU-gp-cleanup.svg

As indicated by the oval at the bottom of the diagram, once grace-period
cleanup is complete, the next grace period can begin.

+-----------------------------------------------------------------------+
| **Quick Quiz**:                                                       |
+-----------------------------------------------------------------------+
| But when precisely does the grace period end?                         |
+-----------------------------------------------------------------------+
| **Answer**:                                                           |
+-----------------------------------------------------------------------+
| There is no useful single point at which the grace period can be said |
| to end. The earliest reasonable candidate is as soon as the last CPU  |
| has reported its quiescent state, but it may be some milliseconds     |
| before RCU becomes aware of this. The latest reasonable candidate is  |
| once the ``rcu_state`` structure's ``->gp_seq`` field has been        |
| updated, but it is quite possible that some CPUs have already         |
| completed phase two of their updates by that time. In short, if you   |
| are going to work with RCU, you need to learn to embrace uncertainty. |
+-----------------------------------------------------------------------+

Callback Invocation
^^^^^^^^^^^^^^^^^^^

Once a given CPU's leaf ``rcu_node`` structure's ``->gp_seq`` field has
been updated, that CPU can begin invoking its RCU callbacks that were
waiting for this grace period to end. These callbacks are identified by
``rcu_advance_cbs()``, which is usually invoked by
``__note_gp_changes()``.
As shown in the diagram below, this invocation
can be triggered by the scheduling-clock interrupt
(``rcu_sched_clock_irq()`` on the left) or by idle entry
(``rcu_cleanup_after_idle()`` on the right, but only for kernels built
with ``CONFIG_RCU_FAST_NO_HZ=y``). Either way, ``RCU_SOFTIRQ`` is
raised, which results in ``rcu_do_batch()`` invoking the callbacks,
which in turn allows those callbacks to carry out (either directly or
indirectly via wakeup) the needed phase-two processing for each update.

.. kernel-figure:: TreeRCU-callback-invocation.svg

Please note that callback invocation can also be prompted by any number
of corner-case code paths, for example, when a CPU notes that it has
excessive numbers of callbacks queued. In all cases, the CPU acquires
its leaf ``rcu_node`` structure's ``->lock`` before invoking callbacks,
which preserves the required ordering against the newly completed grace
period.

However, if the callback function communicates to other CPUs, for
example, doing a wakeup, then it is that function's responsibility to
maintain ordering. For example, if the callback function wakes up a task
that runs on some other CPU, proper ordering must be in place in both
the callback function and the task being awakened. To see why this is
important, consider the top half of the `grace-period
cleanup`_ diagram.
The callback might be
running on a CPU corresponding to the leftmost leaf ``rcu_node``
structure, and awaken a task that is to run on a CPU corresponding to
the rightmost leaf ``rcu_node`` structure, and the grace-period kernel
thread might not yet have reached the rightmost leaf. In this case, the
grace period's memory ordering might not yet have reached that CPU, so
again the callback function and the awakened task must supply proper
ordering.

Putting It All Together
~~~~~~~~~~~~~~~~~~~~~~~

A stitched-together diagram is here:

.. kernel-figure:: TreeRCU-gp.svg

Legal Statement
~~~~~~~~~~~~~~~

This work represents the view of the author and does not necessarily
represent the view of IBM.

Linux is a registered trademark of Linus Torvalds.

Other company, product, and service names may be trademarks or service
marks of others.