======================================================
A Tour Through TREE_RCU's Grace-Period Memory Ordering
======================================================

August 8, 2017

This article was contributed by Paul E. McKenney

Introduction
============

This document gives a rough visual overview of how Tree RCU's
grace-period memory ordering guarantee is provided.

What Is Tree RCU's Grace Period Memory Ordering Guarantee?
==========================================================

RCU grace periods provide extremely strong memory-ordering guarantees
for non-idle non-offline code.
Any code that happens after the end of a given RCU grace period is guaranteed
to see the effects of all accesses prior to the beginning of that grace
period that are within RCU read-side critical sections.
Similarly, any code that happens before the beginning of a given RCU grace
period is guaranteed not to see the effects of any accesses following the end
of that grace period that are within RCU read-side critical sections.

Note well that RCU-sched read-side critical sections include any region
of code for which preemption is disabled.
Given that each individual machine instruction can be thought of as
an extremely small region of preemption-disabled code, one can think of
``synchronize_rcu()`` as ``smp_mb()`` on steroids.

RCU updaters use this guarantee by splitting their updates into
two phases, one of which is executed before the grace period and
the other of which is executed after the grace period.
In the most common use case, phase one removes an element from
a linked RCU-protected data structure, and phase two frees that element.
For this to work, any readers that have witnessed state prior to the
phase-one update (in the common case, removal) must not witness state
following the phase-two update (in the common case, freeing).

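The two-phase pattern can be sketched in user-space C11. This is a toy
rather than kernel code: every ``toy_*`` name is hypothetical, and a
global reader count merely stands in for a real grace period. In the
kernel, the corresponding steps would typically be something like
``list_del_rcu()``, then ``synchronize_rcu()``, then ``kfree()``.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

struct toy_elem {
	int value;
};

/* Hypothetical RCU-protected pointer and a toy count of active readers. */
static _Atomic(struct toy_elem *) toy_gptr;
static atomic_int toy_readers;

static void toy_read_lock(void)   { atomic_fetch_add(&toy_readers, 1); }
static void toy_read_unlock(void) { atomic_fetch_sub(&toy_readers, 1); }

/* Toy grace period: wait for an instant at which no reader is active. */
static void toy_synchronize(void)
{
	while (atomic_load(&toy_readers) != 0)
		continue;
}

/* Install a freshly allocated element, returning the one it replaced. */
static struct toy_elem *toy_publish(int value)
{
	struct toy_elem *p = malloc(sizeof(*p));

	assert(p != NULL);
	p->value = value;
	return atomic_exchange(&toy_gptr, p);
}

/* Phase one, grace period, phase two. */
static void toy_remove_and_free(void)
{
	/* Phase one: unlink, so that new readers cannot find the element. */
	struct toy_elem *old = atomic_exchange(&toy_gptr, NULL);

	toy_synchronize();	/* wait for pre-existing readers */
	free(old);		/* phase two: no reader can still hold it */
}

/* Reader: returns the current value, or -1 if nothing is published. */
static int toy_read(void)
{
	struct toy_elem *p;
	int v = -1;

	toy_read_lock();
	p = atomic_load(&toy_gptr);
	if (p)
		v = p->value;
	toy_read_unlock();
	return v;
}
```

A shared reader count scales poorly, which is exactly the cost that
Tree RCU's ordering network is designed to avoid.
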
The RCU implementation provides this guarantee using a network
of lock-based critical sections, memory barriers, and per-CPU
processing, as is described in the following sections.

Tree RCU Grace Period Memory Ordering Building Blocks
=====================================================

The workhorse for RCU's grace-period memory ordering is the
critical section for the ``rcu_node`` structure's
``->lock``. These critical sections use helper functions for lock
acquisition, including ``raw_spin_lock_rcu_node()``,
``raw_spin_lock_irq_rcu_node()``, and ``raw_spin_lock_irqsave_rcu_node()``.
Their lock-release counterparts are ``raw_spin_unlock_rcu_node()``,
``raw_spin_unlock_irq_rcu_node()``, and
``raw_spin_unlock_irqrestore_rcu_node()``, respectively.
For completeness, a ``raw_spin_trylock_rcu_node()`` is also provided.
The key point is that the lock-acquisition functions, including
``raw_spin_trylock_rcu_node()``, all invoke ``smp_mb__after_unlock_lock()``
immediately after successful acquisition of the lock.

Therefore, for any given ``rcu_node`` structure, any access
happening before one of the above lock-release functions will be seen
by all CPUs as happening before any access happening after a later
one of the above lock-acquisition functions.
Furthermore, any access happening before one of the
above lock-release functions on any given CPU will be seen by all
CPUs as happening before any access happening after a later one
of the above lock-acquisition functions executing on that same CPU,
even if the lock-release and lock-acquisition functions are operating
on different ``rcu_node`` structures.
Tree RCU uses these two ordering guarantees to form an ordering
network among all CPUs that were in any way involved in the grace
period, including any CPUs that came online or went offline during
the grace period in question.

The following litmus test exhibits the ordering effects of these
lock-acquisition and lock-release functions::

    int x, y, z;
    int r1, r2, r3;

    void task0(void)
    {
      raw_spin_lock_rcu_node(rnp);
      WRITE_ONCE(x, 1);
      r1 = READ_ONCE(y);
      raw_spin_unlock_rcu_node(rnp);
    }

    void task1(void)
    {
      raw_spin_lock_rcu_node(rnp);
      WRITE_ONCE(y, 1);
      r2 = READ_ONCE(z);
      raw_spin_unlock_rcu_node(rnp);
    }

    void task2(void)
    {
      WRITE_ONCE(z, 1);
      smp_mb();
      r3 = READ_ONCE(x);
    }

    /* Evaluated only after all three tasks have completed. */
    WARN_ON(r1 == 0 && r2 == 0 && r3 == 0);

The ``WARN_ON()`` is evaluated at "the end of time",
after all changes have propagated throughout the system.
Without the ``smp_mb__after_unlock_lock()`` provided by the
acquisition functions, this ``WARN_ON()`` could trigger, for example
on PowerPC.
The ``smp_mb__after_unlock_lock()`` invocations prevent this
``WARN_ON()`` from triggering.

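A rough user-space analog of these helpers, assuming C11 atomics, is a
lock acquisition followed by a sequentially consistent fence. The
``toy_*`` names below are hypothetical, and the mapping between kernel
barriers and C11 fences is only approximate, so treat this as a sketch
of the shape of the helpers rather than of their exact semantics.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy stand-in for an rcu_node structure's ->lock. */
struct toy_node {
	atomic_flag lock;
};

static struct toy_node toy_rnp = { .lock = ATOMIC_FLAG_INIT };
static atomic_int x, y, z;
static int r1, r2, r3;

/*
 * Acquire the lock, then execute a full fence, loosely mirroring the
 * smp_mb__after_unlock_lock() that follows each successful acquisition.
 */
static void toy_lock_node(struct toy_node *rnp)
{
	while (atomic_flag_test_and_set_explicit(&rnp->lock,
						 memory_order_acquire))
		continue;
	atomic_thread_fence(memory_order_seq_cst);
}

static void toy_unlock_node(struct toy_node *rnp)
{
	atomic_flag_clear_explicit(&rnp->lock, memory_order_release);
}

static void task0(void)
{
	toy_lock_node(&toy_rnp);
	atomic_store_explicit(&x, 1, memory_order_relaxed);
	r1 = atomic_load_explicit(&y, memory_order_relaxed);
	toy_unlock_node(&toy_rnp);
}

static void task1(void)
{
	toy_lock_node(&toy_rnp);
	atomic_store_explicit(&y, 1, memory_order_relaxed);
	r2 = atomic_load_explicit(&z, memory_order_relaxed);
	toy_unlock_node(&toy_rnp);
}

static void task2(void)
{
	atomic_store_explicit(&z, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	r3 = atomic_load_explicit(&x, memory_order_relaxed);
}

/* The outcome that the litmus test's WARN_ON() checks for. */
static bool toy_forbidden_outcome(void)
{
	return r1 == 0 && r2 == 0 && r3 == 0;
}
```

Running the three tasks one after another on a single thread exercises
only one interleaving. Observing the interesting cases requires running
them concurrently on weakly ordered hardware or under a memory-model
checker.
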
+-----------------------------------------------------------------------+
| **Quick Quiz**:                                                       |
+-----------------------------------------------------------------------+
| But the chain of rcu_node-structure lock acquisitions guarantees      |
| that new readers will see all of the updater's pre-grace-period       |
| accesses and also guarantees that the updater's post-grace-period     |
| accesses will see all of the old reader's accesses.  So why do we     |
| need all of those calls to smp_mb__after_unlock_lock()?               |
+-----------------------------------------------------------------------+
| **Answer**:                                                           |
+-----------------------------------------------------------------------+
| Because we must provide ordering for RCU's polling grace-period       |
| primitives, for example, get_state_synchronize_rcu() and              |
| poll_state_synchronize_rcu().  Consider this code::                   |
|                                                                       |
|  CPU 0                                     CPU 1                      |
|  ----                                      ----                       |
|  WRITE_ONCE(X, 1)                          WRITE_ONCE(Y, 1)           |
|  g = get_state_synchronize_rcu()           smp_mb()                   |
|  while (!poll_state_synchronize_rcu(g))    r1 = READ_ONCE(X)          |
|          continue;                                                    |
|  r0 = READ_ONCE(Y)                                                    |
|                                                                       |
| RCU guarantees that the outcome r0 == 0 && r1 == 0 will not           |
| happen, even if CPU 1 is in an RCU extended quiescent state           |
| (idle or offline) and thus won't interact directly with the RCU       |
| core processing at all.                                               |
+-----------------------------------------------------------------------+

This approach must be extended to include idle CPUs, which need
RCU's grace-period memory ordering guarantee to extend to any
RCU read-side critical sections preceding and following the current
idle sojourn.
This case is handled by calls to the strongly ordered
``atomic_add_return()`` read-modify-write atomic operation that
is invoked within ``rcu_dynticks_eqs_enter()`` at idle-entry
time and within ``rcu_dynticks_eqs_exit()`` at idle-exit time.
The grace-period kthread invokes ``rcu_dynticks_snap()`` and
``rcu_dynticks_in_eqs_since()`` (both of which invoke
an ``atomic_add_return()`` of zero) to detect idle CPUs.

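The counter-based idle detection can be sketched in user-space C11 as
follows. The ``toy_*`` names are hypothetical, the convention that an
odd counter value means "non-idle" is an assumption of this sketch, and
the sequentially consistent ``atomic_fetch_add()`` merely stands in for
the kernel's fully ordered ``atomic_add_return()``.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy per-CPU counter: odd means non-idle, even means idle (assumed). */
static atomic_int toy_dynticks = 1;

/* Idle entry and exit each bump the counter with full ordering. */
static void toy_eqs_enter(void) { atomic_fetch_add(&toy_dynticks, 1); }
static void toy_eqs_exit(void)  { atomic_fetch_add(&toy_dynticks, 1); }

/* Fully ordered read of the counter, implemented as an add of zero. */
static int toy_snap(void)
{
	return atomic_fetch_add(&toy_dynticks, 0);
}

/* Was the CPU idle at the time of the snapshot? */
static bool toy_in_eqs(int snap)
{
	return (snap & 1) == 0;
}

/* Has the CPU passed through idle entry or exit since the snapshot? */
static bool toy_in_eqs_since(int snap)
{
	return snap != toy_snap();
}
```

A grace-period thread would take a snapshot, and later treat the CPU as
quiescent if either ``toy_in_eqs()`` was true for that snapshot or
``toy_in_eqs_since()`` reports a change.
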
+-----------------------------------------------------------------------+
| **Quick Quiz**:                                                       |
+-----------------------------------------------------------------------+
| But what about CPUs that remain offline for the entire grace period?  |
+-----------------------------------------------------------------------+
| **Answer**:                                                           |
+-----------------------------------------------------------------------+
| Such CPUs will be offline at the beginning of the grace period, so    |
| the grace period won't expect quiescent states from them. Races       |
| between grace-period start and CPU-hotplug operations are mediated    |
| by the CPU's leaf ``rcu_node`` structure's ``->lock`` as described    |
| above.                                                                |
+-----------------------------------------------------------------------+

The approach must be extended to handle one final case, that of waking a
task blocked in ``synchronize_rcu()``. This task might be affined to
a CPU that is not yet aware that the grace period has ended, and thus
might not yet be subject to the grace period's memory ordering.
Therefore, there is an ``smp_mb()`` after the return from
``wait_for_completion()`` in the ``synchronize_rcu()`` code path.

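The shape of this wait-then-fence step can be sketched in user-space
C11. The ``toy_*`` names are hypothetical, the spin loop stands in for
the kernel's blocking ``wait_for_completion()``, and the sequentially
consistent fence is only an approximate analog of ``smp_mb()``.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_bool toy_done;
static int toy_result;	/* plain data published by the completer */

/* Completer: publish the result, then signal completion. */
static void toy_complete(int result)
{
	toy_result = result;
	atomic_store_explicit(&toy_done, true, memory_order_release);
}

/* Waiter: wait for the signal, then execute a full fence. */
static int toy_wait_for_completion(void)
{
	while (!atomic_load_explicit(&toy_done, memory_order_acquire))
		continue;
	/* Order everything after the wait against everything before it. */
	atomic_thread_fence(memory_order_seq_cst);
	return toy_result;
}
```

The acquire load alone already makes ``toy_result`` visible to the
waiter. The additional full fence illustrates the stronger ordering
that the text asks for on the waiter's side.
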
+-----------------------------------------------------------------------+
| **Quick Quiz**:                                                       |
+-----------------------------------------------------------------------+
| What? Where??? I don't see any ``smp_mb()`` after the return from     |
| ``wait_for_completion()``!!!                                          |
+-----------------------------------------------------------------------+
| **Answer**:                                                           |
+-----------------------------------------------------------------------+
| That would be because I spotted the need for that ``smp_mb()`` during |
| the creation of this documentation, and it is therefore unlikely to   |
| hit mainline before v4.14. Kudos to Lance Roy, Will Deacon, Peter     |
| Zijlstra, and Jonathan Cameron for asking questions that sensitized   |
| me to the rather elaborate sequence of events that demonstrate the    |
| need for this memory barrier.                                         |
+-----------------------------------------------------------------------+

Tree RCU's grace-period memory-ordering guarantees rely most heavily on
the ``rcu_node`` structure's ``->lock`` field, so much so that it is
necessary to abbreviate this pattern in the diagrams in the next
section. For example, consider the ``rcu_prepare_for_idle()`` function
shown below, which is one of several functions that enforce ordering of
newly arrived RCU callbacks against future grace periods:

::

    1 static void rcu_prepare_for_idle(void)
    2 {
    3   bool needwake;
    4   struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
    5   struct rcu_node *rnp;
    6   int tne;
    7
    8   lockdep_assert_irqs_disabled();
    9   if (rcu_rdp_is_offloaded(rdp))
   10     return;
   11
   12   /* Handle nohz enablement switches conservatively. */
   13   tne = READ_ONCE(tick_nohz_active);
   14   if (tne != rdp->tick_nohz_enabled_snap) {
   15     if (!rcu_segcblist_empty(&rdp->cblist))
   16       invoke_rcu_core(); /* force nohz to see update. */
   17     rdp->tick_nohz_enabled_snap = tne;
   18     return;
   19   }
   20   if (!tne)
   21     return;
   22
   23   /*
   24    * If we have not yet accelerated this jiffy, accelerate all
   25    * callbacks on this CPU.
   26    */
   27   if (rdp->last_accelerate == jiffies)
   28     return;
   29   rdp->last_accelerate = jiffies;
   30   if (rcu_segcblist_pend_cbs(&rdp->cblist)) {
   31     rnp = rdp->mynode;
   32     raw_spin_lock_rcu_node(rnp); /* irqs already disabled. */
   33     needwake = rcu_accelerate_cbs(rnp, rdp);
   34     raw_spin_unlock_rcu_node(rnp); /* irqs remain disabled. */
   35     if (needwake)
   36       rcu_gp_kthread_wake();
   37   }
   38 }

But the only part of ``rcu_prepare_for_idle()`` that really matters for
this discussion is lines 32–34. We will therefore abbreviate this
function as follows:

.. kernel-figure:: rcu_node-lock.svg

The box represents the ``rcu_node`` structure's ``->lock`` critical
section, with the double line on top representing the additional
``smp_mb__after_unlock_lock()``.

Tree RCU Grace Period Memory Ordering Components
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Tree RCU's grace-period memory-ordering guarantee is provided by a
number of RCU components:

#. `Callback Registry`_
#. `Grace-Period Initialization`_
#. `Self-Reported Quiescent States`_
#. `Dynamic Tick Interface`_
#. `CPU-Hotplug Interface`_
#. `Forcing Quiescent States`_
#. `Grace-Period Cleanup`_
#. `Callback Invocation`_

Each of the following sections looks at the corresponding component in
detail.

Callback Registry
^^^^^^^^^^^^^^^^^

If RCU's grace-period guarantee is to mean anything at all, any access
that happens before a given invocation of ``call_rcu()`` must also
happen before the corresponding grace period. The implementation of this
portion of RCU's grace period guarantee is shown in the following
figure:

.. kernel-figure:: TreeRCU-callback-registry.svg

Because ``call_rcu()`` normally acts only on CPU-local state, it
provides no ordering guarantees, either for itself or for phase one of
the update (which again will usually be removal of an element from an
RCU-protected data structure). It simply enqueues the ``rcu_head``
structure on a per-CPU list, which cannot become associated with a grace
period until a later call to ``rcu_accelerate_cbs()``, as shown in the
diagram above.

One set of code paths shown on the left invokes ``rcu_accelerate_cbs()``
via ``note_gp_changes()``, either directly from ``call_rcu()`` (if the
current CPU is inundated with queued ``rcu_head`` structures) or more
likely from an ``RCU_SOFTIRQ`` handler. Another code path in the middle
is taken only in kernels built with ``CONFIG_RCU_FAST_NO_HZ=y``, which
invokes ``rcu_accelerate_cbs()`` via ``rcu_prepare_for_idle()``. The
final code path on the right is taken only in kernels built with
``CONFIG_HOTPLUG_CPU=y``, which invokes ``rcu_accelerate_cbs()`` via
``rcu_advance_cbs()``, ``rcu_migrate_callbacks()``,
``rcutree_migrate_callbacks()``, and ``takedown_cpu()``, which in turn
is invoked on a surviving CPU after the outgoing CPU has been completely
offlined.

There are a few other code paths within grace-period processing that
opportunistically invoke ``rcu_accelerate_cbs()``. Whichever path is
taken, all of the CPU's recently queued ``rcu_head`` structures are
associated with a future grace-period number under the protection of the
CPU's leaf ``rcu_node`` structure's ``->lock``. In all cases, there is
full ordering against any prior critical section for that same
``rcu_node`` structure's ``->lock``, and also full ordering against any
of the current task's or CPU's prior critical sections for any
``rcu_node`` structure's ``->lock``.

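The split between CPU-local enqueueing and later association with a
grace period can be sketched as follows. This is a single-threaded toy:
the ``toy_*`` names are hypothetical, grace periods are numbered by a
simple counter instead of the kernel's sequence encoding, and the leaf
``->lock`` that the real code holds during acceleration is omitted.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_GP_UNASSIGNED 0UL

struct toy_head {
	struct toy_head *next;
	unsigned long gp_seq;	/* grace period this callback waits for */
};

struct toy_node {
	unsigned long gp_seq;	/* most recently started grace period */
};

struct toy_cpu {
	struct toy_head *head;	/* per-CPU callback list, newest first */
	struct toy_node *mynode;	/* this CPU's leaf node */
};

/* Like call_rcu() in spirit: touches only CPU-local state. */
static void toy_call(struct toy_cpu *cpu, struct toy_head *rhp)
{
	rhp->gp_seq = TOY_GP_UNASSIGNED;
	rhp->next = cpu->head;
	cpu->head = rhp;
}

/*
 * Associate every not-yet-assigned callback with a future grace period.
 * The grace period after the most recently started one is always safe,
 * because it must begin after these callbacks were queued.
 * Returns the number of callbacks newly assigned.
 */
static int toy_accelerate(struct toy_cpu *cpu)
{
	unsigned long want = cpu->mynode->gp_seq + 1;
	struct toy_head *p;
	int n = 0;

	for (p = cpu->head; p != NULL; p = p->next) {
		if (p->gp_seq == TOY_GP_UNASSIGNED) {
			p->gp_seq = want;
			n++;
		}
	}
	return n;
}
```

Callbacks that already carry a grace-period number keep it, so repeated
acceleration never postpones a callback.
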
The next section will show how this ordering ensures that any accesses
prior to the ``call_rcu()`` (particularly including phase one of the
update) happen before the start of the corresponding grace period.

+-----------------------------------------------------------------------+
| **Quick Quiz**:                                                       |
+-----------------------------------------------------------------------+
| But what about ``synchronize_rcu()``?                                 |
+-----------------------------------------------------------------------+
| **Answer**:                                                           |
+-----------------------------------------------------------------------+
| The ``synchronize_rcu()`` passes ``call_rcu()`` to ``wait_rcu_gp()``, |
| which invokes it. So either way, it eventually comes down to          |
| ``call_rcu()``.                                                       |
+-----------------------------------------------------------------------+

Grace-Period Initialization
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Grace-period initialization is carried out by the grace-period kernel
thread, which makes several passes over the ``rcu_node`` tree within the
``rcu_gp_init()`` function. This means that showing the full flow of
ordering through the grace-period computation will require duplicating
this tree. If you find this confusing, please note that the state of the
``rcu_node`` changes over time, just like Heraclitus's river. However,
to keep the ``rcu_node`` river tractable, the grace-period kernel
thread's traversals are presented in multiple parts, starting in this
section with the various phases of grace-period initialization.

The first ordering-related grace-period initialization action is to
advance the ``rcu_state`` structure's ``->gp_seq`` grace-period-number
counter, as shown below:

.. kernel-figure:: TreeRCU-gp-init-1.svg

The actual increment is carried out using ``smp_store_release()``, which
helps reject false-positive RCU CPU stall detection. Note that only the
root ``rcu_node`` structure is touched.

The first pass through the ``rcu_node`` tree updates bitmasks based on
CPUs having come online or gone offline since the start of the previous
grace period. In the common case where the number of online CPUs for
this ``rcu_node`` structure has not transitioned to or from zero, this
pass will scan only the leaf ``rcu_node`` structures. However, if the
number of online CPUs for a given leaf ``rcu_node`` structure has
transitioned from zero, ``rcu_init_new_rnp()`` will be invoked for the
first incoming CPU. Similarly, if the number of online CPUs for a given
leaf ``rcu_node`` structure has transitioned to zero,
``rcu_cleanup_dead_rnp()`` will be invoked for the last outgoing CPU.
The diagram below shows the path of ordering if the leftmost
``rcu_node`` structure onlines its first CPU and if the next
``rcu_node`` structure has no online CPUs (or, alternatively, if the
leftmost ``rcu_node`` structure offlines its last CPU and if the next
``rcu_node`` structure has no online CPUs).

.. kernel-figure:: TreeRCU-gp-init-2.svg

The final ``rcu_gp_init()`` pass through the ``rcu_node`` tree traverses
breadth-first, setting each ``rcu_node`` structure's ``->gp_seq`` field
to the newly advanced value from the ``rcu_state`` structure, as shown
in the following diagram.

.. kernel-figure:: TreeRCU-gp-init-3.svg

This change will also cause each CPU's next call to
``__note_gp_changes()`` to notice that a new grace period has started,
as described in the next section. But because the grace-period kthread
started the grace period at the root (with the advancing of the
``rcu_state`` structure's ``->gp_seq`` field) before setting each leaf
``rcu_node`` structure's ``->gp_seq`` field, each CPU's observation of
the start of the grace period will happen after the actual start of the
grace period.

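The ordering between the global advance and the per-node updates can be
sketched as follows. This is a single-threaded toy under several
assumptions: the ``toy_*`` names are hypothetical, the tree is a
three-element array stored in breadth-first order, the counter simply
increments by one, the hotplug bitmask pass is left out, and the
per-node ``->lock`` acquisitions of the real code are omitted.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define TOY_NUM_NODES 3	/* one root plus two leaves, breadth-first */

static atomic_ulong toy_state_gp_seq;	/* global grace-period counter */
static unsigned long toy_node_gp_seq[TOY_NUM_NODES];

static void toy_gp_init(void)
{
	unsigned long new_seq;
	int i;

	/* The grace period officially starts with this global advance. */
	new_seq = atomic_load_explicit(&toy_state_gp_seq,
				       memory_order_relaxed) + 1;
	atomic_store_explicit(&toy_state_gp_seq, new_seq,
			      memory_order_release);

	/* Final pass: array order is breadth-first order for this layout. */
	for (i = 0; i < TOY_NUM_NODES; i++)
		toy_node_gp_seq[i] = new_seq;
}

/* A CPU notices a new grace period by comparing against its leaf. */
static bool toy_cpu_sees_new_gp(unsigned long cpu_gp_seq, int leaf)
{
	return toy_node_gp_seq[leaf] != cpu_gp_seq;
}
```

Because the global counter is advanced before any leaf is updated, a
CPU that observes the new value in its leaf can only do so after the
grace period has actually started.
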
+-----------------------------------------------------------------------+
| **Quick Quiz**:                                                       |
+-----------------------------------------------------------------------+
| But what about the CPU that started the grace period? Why wouldn't it |
| see the start of the grace period right when it started that grace    |
| period?                                                               |
+-----------------------------------------------------------------------+
| **Answer**:                                                           |
+-----------------------------------------------------------------------+
| In some deep philosophical and overly anthropomorphized sense, yes,   |
| the CPU starting the grace period is immediately aware of having done |
| so. However, if we instead assume that RCU is not self-aware, then    |
| even the CPU starting the grace period does not really become aware   |
| of the start of this grace period until its first call to             |
| ``__note_gp_changes()``. On the other hand, this CPU potentially gets |
| early notification because it invokes ``__note_gp_changes()`` during  |
| its last ``rcu_gp_init()`` pass through its leaf ``rcu_node``         |
| structure.                                                            |
+-----------------------------------------------------------------------+

Self-Reported Quiescent States
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

When all entities that might block the grace period have reported
quiescent states (or as described in a later section, had quiescent
states reported on their behalf), the grace period can end. Online
non-idle CPUs report their own quiescent states, as shown in the
following diagram:

.. kernel-figure:: TreeRCU-qs.svg

This is for the last CPU to report a quiescent state, which signals the
end of the grace period. Earlier quiescent states would push up the
``rcu_node`` tree only until they encountered an ``rcu_node`` structure
that is waiting for additional quiescent states. However, ordering is
nevertheless preserved because some later quiescent state will acquire
that ``rcu_node`` structure's ``->lock``.

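The upward propagation can be sketched as a walk from a leaf toward the
root that stops at the first node still waiting on someone else. This
is a single-threaded toy: the ``toy_*`` names and the bitmask layout
are assumptions of the sketch, and each step that the real code
performs under that node's ``->lock`` is done here without locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_node {
	unsigned long qsmask;	/* CPUs or children still owing a report */
	unsigned long grpmask;	/* this node's bit in its parent's qsmask */
	struct toy_node *parent;
};

/*
 * Report quiescent states for the bits in mask at node rnp, then keep
 * moving upward while each node's qsmask becomes empty.
 * Returns true only if the root's qsmask becomes empty.
 */
static bool toy_report_qs(struct toy_node *rnp, unsigned long mask)
{
	for (;;) {
		rnp->qsmask &= ~mask;
		if (rnp->qsmask != 0)
			return false;	/* others still pending: stop here */
		if (rnp->parent == NULL)
			return true;	/* root is clear: grace period may end */
		mask = rnp->grpmask;
		rnp = rnp->parent;
	}
}
```

With two leaves of two CPUs each, only the fourth report reaches the
root and returns true. Every earlier report stops at the first node
that is still waiting.
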
42262306a36Sopenharmony_ciAny number of events can lead up to a CPU invoking ``note_gp_changes``
42362306a36Sopenharmony_ci(or alternatively, directly invoking ``__note_gp_changes()``), at which
42462306a36Sopenharmony_cipoint that CPU will notice the start of a new grace period while holding
42562306a36Sopenharmony_ciits leaf ``rcu_node`` lock. Therefore, all execution shown in this
42662306a36Sopenharmony_cidiagram happens after the start of the grace period. In addition, this
42762306a36Sopenharmony_ciCPU will consider any RCU read-side critical section that started before
42862306a36Sopenharmony_cithe invocation of ``__note_gp_changes()`` to have started before the
42962306a36Sopenharmony_cigrace period, and thus a critical section that the grace period must
43062306a36Sopenharmony_ciwait on.

+-----------------------------------------------------------------------+
| **Quick Quiz**:                                                       |
+-----------------------------------------------------------------------+
| But an RCU read-side critical section might have started after the    |
| beginning of the grace period (the advancing of ``->gp_seq`` from     |
| earlier), so why should the grace period wait on such a critical      |
| section?                                                              |
+-----------------------------------------------------------------------+
| **Answer**:                                                           |
+-----------------------------------------------------------------------+
| It is indeed not necessary for the grace period to wait on such a     |
| critical section. However, it is permissible to wait on it. And it is |
| furthermore important to wait on it, as this lazy approach is far     |
| more scalable than a “big bang” all-at-once grace-period start could  |
| possibly be.                                                          |
+-----------------------------------------------------------------------+

If the CPU does a context switch, a quiescent state will be noted by
``rcu_note_context_switch()`` on the left. On the other hand, if the CPU
takes a scheduler-clock interrupt while executing in usermode, a
quiescent state will be noted by ``rcu_sched_clock_irq()`` on the right.
Either way, the passage through a quiescent state will be noted in a
per-CPU variable.

The next time an ``RCU_SOFTIRQ`` handler executes on this CPU (for
example, after the next scheduler-clock interrupt), ``rcu_core()`` will
invoke ``rcu_check_quiescent_state()``, which will notice the recorded
quiescent state, and invoke ``rcu_report_qs_rdp()``. If
``rcu_report_qs_rdp()`` verifies that the quiescent state really does
apply to the current grace period, it invokes ``rcu_report_qs_rnp()``,
which traverses up the ``rcu_node`` tree as shown at the bottom of the
diagram, clearing bits from each ``rcu_node`` structure's ``->qsmask``
field, and propagating up the tree when the result is zero.

Note that traversal passes upwards out of a given ``rcu_node`` structure
only if the current CPU is reporting the last quiescent state for the
subtree headed by that ``rcu_node`` structure. A key point is that if a
CPU's traversal stops at a given ``rcu_node`` structure, then there will
be a later traversal by another CPU (or perhaps the same one) that
proceeds upwards from that point, and the ``rcu_node`` ``->lock``
guarantees that the first CPU's quiescent state happens before the
remainder of the second CPU's traversal. Applying this line of thought
repeatedly shows that all CPUs' quiescent states happen before the last
CPU traverses through the root ``rcu_node`` structure, the “last CPU”
being the one that clears the last bit in the root ``rcu_node``
structure's ``->qsmask`` field.
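
The upward traversal described above can be sketched in userspace C.
This is a minimal illustration, not the kernel's code: ``struct
toy_rcu_node``, ``toy_report_qs()`` and the ``grpmask`` field as used
here are hypothetical stand-ins, and the per-node ``->lock`` that
provides the ordering is elided because the sketch is single-threaded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for rcu_node; names are assumptions made for this sketch. */
struct toy_rcu_node {
    struct toy_rcu_node *parent; /* NULL for the root */
    unsigned long qsmask;        /* children or CPUs still owing a quiescent state */
    unsigned long grpmask;       /* this node's bit in parent->qsmask */
};

/*
 * Clear the bits in mask at rnp and keep moving upwards for as long as
 * each node's qsmask drops to zero.  Returns true only if the root's
 * qsmask reached zero, that is, if this was the last report.
 * In the kernel, each step would run under that node's ->lock.
 */
static bool toy_report_qs(struct toy_rcu_node *rnp, unsigned long mask)
{
    assert(rnp != NULL && mask != 0);
    for (;;) {
        rnp->qsmask &= ~mask;
        if (rnp->qsmask != 0)
            return false;       /* others still outstanding: stop here */
        if (rnp->parent == NULL)
            return true;        /* root is clear: grace period can end */
        mask = rnp->grpmask;    /* report this whole subtree to the parent */
        rnp = rnp->parent;
    }
}
```

A traversal that stops early leaves its cleared bit behind, and whichever
later call clears the remaining bits of that node is the one that
continues upwards from it.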

Dynamic Tick Interface
^^^^^^^^^^^^^^^^^^^^^^

Due to energy-efficiency considerations, RCU is forbidden from
disturbing idle CPUs. CPUs are therefore required to notify RCU when
entering or leaving idle state, which they do via fully ordered
value-returning atomic operations on a per-CPU variable. The ordering
effects are as shown below:

.. kernel-figure:: TreeRCU-dyntick.svg

The RCU grace-period kernel thread samples the per-CPU idleness variable
while holding the corresponding CPU's leaf ``rcu_node`` structure's
``->lock``. This means that any RCU read-side critical sections that
precede the idle period (the oval near the top of the diagram above)
will happen before the end of the current grace period. Similarly, the
beginning of the current grace period will happen before any RCU
read-side critical sections that follow the idle period (the oval near
the bottom of the diagram above).
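
One way to picture the per-CPU idleness variable is as a counter that is
atomically incremented on every idle entry and exit. The sketch below
uses C11 sequentially consistent atomics in place of the kernel's fully
ordered value-returning atomics. The ``toy_`` names and the convention
that an even value means idle are assumptions made for illustration, and
the kernel's actual encoding may differ.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-CPU state: even counter means idle, odd means non-idle. */
struct toy_cpu {
    atomic_ulong dynticks;
};

/* Idle entry: fully ordered (seq_cst) value-returning increment. */
static void toy_idle_enter(struct toy_cpu *cpu)
{
    unsigned long old = atomic_fetch_add(&cpu->dynticks, 1);

    assert(old & 1);        /* the CPU must have been non-idle */
}

/* Idle exit: the same operation in the other direction. */
static void toy_idle_exit(struct toy_cpu *cpu)
{
    unsigned long old = atomic_fetch_add(&cpu->dynticks, 1);

    assert(!(old & 1));     /* the CPU must have been idle */
}

/* Grace-period thread: sample the counter (under the leaf lock in the kernel). */
static unsigned long toy_snapshot(struct toy_cpu *cpu)
{
    return atomic_load(&cpu->dynticks);
}

/*
 * The CPU has been in a quiescent state since the snapshot if it was
 * idle when sampled, or if the counter has changed since then.
 */
static bool toy_in_qs_since(struct toy_cpu *cpu, unsigned long snap)
{
    return !(snap & 1) || atomic_load(&cpu->dynticks) != snap;
}
```

The grace-period thread never has to wake the idle CPU: comparing a later
sample against the snapshot is enough to tell that an idle period
intervened.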

Plumbing this into the full grace-period execution is described
`below <Forcing Quiescent States_>`__.

CPU-Hotplug Interface
^^^^^^^^^^^^^^^^^^^^^

RCU is also forbidden from disturbing offline CPUs, which might well be
powered off and removed from the system completely. CPUs are therefore
required to notify RCU of their comings and goings as part of the
corresponding CPU hotplug operations. The ordering effects are shown
below:

.. kernel-figure:: TreeRCU-hotplug.svg

Because CPU hotplug operations are much less frequent than idle
transitions, they are heavier weight, and thus acquire the CPU's leaf
``rcu_node`` structure's ``->lock`` and update this structure's
``->qsmaskinitnext``. The RCU grace-period kernel thread samples this
mask to detect CPUs having gone offline since the beginning of this
grace period.
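
The interplay between hotplug updates and the grace-period thread's
sampling can be sketched as follows. This is a simplified illustration:
``struct toy_leaf`` and the ``toy_`` functions are hypothetical, the
``locked`` flag merely stands in for ``->lock`` in a single-threaded
setting, and grace-period start copies ``qsmaskinitnext`` straight into
``qsmask`` rather than going through any intermediate mask.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy leaf node; field names follow the text, the rest is assumed. */
struct toy_leaf {
    bool locked;                  /* stand-in for ->lock */
    unsigned long qsmaskinitnext; /* CPUs online as of the latest hotplug op */
    unsigned long qsmask;         /* CPUs the current grace period waits on */
};

static void toy_lock(struct toy_leaf *leaf)
{
    assert(!leaf->locked);
    leaf->locked = true;
}

static void toy_unlock(struct toy_leaf *leaf)
{
    assert(leaf->locked);
    leaf->locked = false;
}

/* Hotplug side: each transition updates the mask under the leaf lock. */
static void toy_cpu_online(struct toy_leaf *leaf, int cpu)
{
    toy_lock(leaf);
    leaf->qsmaskinitnext |= 1UL << cpu;
    toy_unlock(leaf);
}

static void toy_cpu_offline(struct toy_leaf *leaf, int cpu)
{
    toy_lock(leaf);
    leaf->qsmaskinitnext &= ~(1UL << cpu);
    toy_unlock(leaf);
}

/* Grace-period start: wait on every CPU that is online right now. */
static void toy_gp_start(struct toy_leaf *leaf)
{
    toy_lock(leaf);
    leaf->qsmask = leaf->qsmaskinitnext;
    toy_unlock(leaf);
}

/* Grace-period thread: CPUs still waited on that have since gone offline. */
static unsigned long toy_offlined_since_gp_start(struct toy_leaf *leaf)
{
    unsigned long gone;

    toy_lock(leaf);
    gone = leaf->qsmask & ~leaf->qsmaskinitnext;
    toy_unlock(leaf);
    return gone;
}
```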

Plumbing this into the full grace-period execution is described
`below <Forcing Quiescent States_>`__.

Forcing Quiescent States
^^^^^^^^^^^^^^^^^^^^^^^^

As noted above, idle and offline CPUs cannot report their own quiescent
states, and therefore the grace-period kernel thread must do the
reporting on their behalf. This process is called “forcing quiescent
states”. It is repeated every few jiffies, and its ordering effects are
shown below:

.. kernel-figure:: TreeRCU-gp-fqs.svg

Each pass of quiescent state forcing is guaranteed to traverse the leaf
``rcu_node`` structures, and if there are no new quiescent states due to
recently idled and/or offlined CPUs, then only the leaves are traversed.
However, if there is a newly offlined CPU as illustrated on the left or
a newly idled CPU as illustrated on the right, the corresponding
quiescent state will be driven up towards the root. As with
self-reported quiescent states, the upwards driving stops once it
reaches an ``rcu_node`` structure that has quiescent states outstanding
from other CPUs.
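
A single forcing pass over the leaves might look like the sketch below.
It is an illustration under stated assumptions: ``struct toy_fqs_leaf``
and ``toy_fqs_pass()`` are hypothetical, the ``idle`` and ``offline``
masks are assumed to have been sampled already, and the upward drive
through interior nodes is reduced to a count of the leaves that became
fully quiescent.

```c
#include <assert.h>
#include <stddef.h>

/* Toy leaf for one forcing pass; all names are assumptions. */
struct toy_fqs_leaf {
    unsigned long qsmask;  /* CPUs still owing a quiescent state */
    unsigned long idle;    /* CPUs observed to be idle */
    unsigned long offline; /* CPUs observed to be offline */
};

/*
 * Visit every leaf and report quiescent states on behalf of idle and
 * offline CPUs.  Stores the number of leaves visited in *visited and
 * returns how many leaves went fully quiescent during this pass, each
 * of which would start a drive towards the root.
 */
static size_t toy_fqs_pass(struct toy_fqs_leaf *leaves, size_t n,
                           size_t *visited)
{
    size_t drives = 0;

    assert(leaves != NULL && visited != NULL);
    *visited = 0;
    for (size_t i = 0; i < n; i++) {
        unsigned long mask =
            leaves[i].qsmask & (leaves[i].idle | leaves[i].offline);

        (*visited)++;           /* the leaf is traversed regardless */
        if (mask == 0)
            continue;           /* nothing new to report here */
        leaves[i].qsmask &= ~mask;
        if (leaves[i].qsmask == 0)
            drives++;           /* leaf done: propagate upwards */
    }
    return drives;
}
```

A pass that finds no newly idle or offline CPUs still visits every leaf
but returns zero, matching the case in which only the leaves are
traversed.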

+-----------------------------------------------------------------------+
| **Quick Quiz**:                                                       |
+-----------------------------------------------------------------------+
| The leftmost drive to root stopped before it reached the root         |
| ``rcu_node`` structure, which means that there are still CPUs         |
| subordinate to that structure on which the current grace period is    |
| waiting. Given that, how is it possible that the rightmost drive to   |
| root ended the grace period?                                          |
+-----------------------------------------------------------------------+
| **Answer**:                                                           |
+-----------------------------------------------------------------------+
| Good analysis! It is in fact impossible in the absence of bugs in     |
| RCU. But this diagram is complex enough as it is, so simplicity       |
| overrode accuracy. You can think of it as poetic license, or you can  |
| think of it as misdirection that is resolved in the                   |
| `stitched-together diagram <Putting It All Together_>`__.             |
+-----------------------------------------------------------------------+

Grace-Period Cleanup
^^^^^^^^^^^^^^^^^^^^

Grace-period cleanup first scans the ``rcu_node`` tree breadth-first
advancing all the ``->gp_seq`` fields, then it advances the
``rcu_state`` structure's ``->gp_seq`` field. The ordering effects are
shown below:

.. kernel-figure:: TreeRCU-gp-cleanup.svg

As indicated by the oval at the bottom of the diagram, once grace-period
cleanup is complete, the next grace period can begin.
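
The order of the ``->gp_seq`` updates can be sketched with a toy tree
stored in breadth-first order. The sketch assumes a sequence-number
encoding in which the low two bits hold the grace-period state and the
remaining bits hold a counter. The ``TOY_`` macros, ``struct toy_tree``
and ``toy_gp_cleanup()`` are hypothetical names, and locking is elided.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_SEQ_CTR_SHIFT  2
#define TOY_SEQ_STATE_MASK ((1UL << TOY_SEQ_CTR_SHIFT) - 1)
#define TOY_MAX_NODES      8

/* Value of an in-progress sequence number once its grace period ends. */
static unsigned long toy_seq_end(unsigned long s)
{
    assert(s & TOY_SEQ_STATE_MASK);   /* must be in progress */
    return (s | TOY_SEQ_STATE_MASK) + 1;
}

/* Toy tree: node_gp_seq[] is in breadth-first order, root first. */
struct toy_tree {
    unsigned long state_gp_seq;
    unsigned long node_gp_seq[TOY_MAX_NODES];
    size_t nnodes;
};

/*
 * Advance every node's gp_seq breadth-first, and only then advance the
 * global one.  In the kernel each node update runs under that node's
 * ->lock; this sketch is single-threaded.
 */
static void toy_gp_cleanup(struct toy_tree *t)
{
    unsigned long new_seq = toy_seq_end(t->state_gp_seq);

    assert(t->nnodes <= TOY_MAX_NODES);
    for (size_t i = 0; i < t->nnodes; i++)
        t->node_gp_seq[i] = new_seq;
    t->state_gp_seq = new_seq;        /* global value goes last */
}
```

Because the global value is written last, a CPU can observe its leaf's
new ``->gp_seq`` before the global field has changed.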

+-----------------------------------------------------------------------+
| **Quick Quiz**:                                                       |
+-----------------------------------------------------------------------+
| But when precisely does the grace period end?                         |
+-----------------------------------------------------------------------+
| **Answer**:                                                           |
+-----------------------------------------------------------------------+
| There is no useful single point at which the grace period can be said |
| to end. The earliest reasonable candidate is as soon as the last CPU  |
| has reported its quiescent state, but it may be some milliseconds     |
| before RCU becomes aware of this. The latest reasonable candidate is  |
| once the ``rcu_state`` structure's ``->gp_seq`` field has been        |
| updated, but it is quite possible that some CPUs have already         |
| completed phase two of their updates by that time. In short, if you   |
| are going to work with RCU, you need to learn to embrace uncertainty. |
+-----------------------------------------------------------------------+

Callback Invocation
^^^^^^^^^^^^^^^^^^^

Once a given CPU's leaf ``rcu_node`` structure's ``->gp_seq`` field has
been updated, that CPU can begin invoking its RCU callbacks that were
waiting for this grace period to end. These callbacks are identified by
``rcu_advance_cbs()``, which is usually invoked by
``__note_gp_changes()``. As shown in the diagram below, this invocation
can be triggered by the scheduling-clock interrupt
(``rcu_sched_clock_irq()`` on the left) or by idle entry
(``rcu_cleanup_after_idle()`` on the right, but only for kernels built
with ``CONFIG_RCU_FAST_NO_HZ=y``). Either way, ``RCU_SOFTIRQ`` is
raised, which results in ``rcu_do_batch()`` invoking the callbacks,
which in turn allows those callbacks to carry out (either directly or
indirectly via wakeup) the needed phase-two processing for each update.

.. kernel-figure:: TreeRCU-callback-invocation.svg

Please note that callback invocation can also be prompted by any number
of corner-case code paths, for example, when a CPU notes that it has
excessive numbers of callbacks queued. In all cases, the CPU acquires
its leaf ``rcu_node`` structure's ``->lock`` before invoking callbacks,
which preserves the required ordering against the newly completed grace
period.
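
The idea of identifying callbacks whose grace period has ended can be
sketched by tagging each callback with the ``->gp_seq`` value it waits
for. This is a deliberately flat illustration and not the kernel's
callback-list layout: ``struct toy_cb``, ``toy_seq_done()``,
``toy_mark_done()`` and ``toy_invoke_ready()`` are hypothetical, and the
leaf ``->lock`` acquisition is elided.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy callback tagged with the gp_seq value it is waiting for. */
struct toy_cb {
    unsigned long wait_gp_seq;
    void (*func)(struct toy_cb *cb);
    bool done;
};

/* Wraparound-safe check that cur has reached or passed wanted. */
static bool toy_seq_done(unsigned long cur, unsigned long wanted)
{
    return ULONG_MAX / 2 >= cur - wanted;
}

/* Example callback body: just record that phase two ran. */
static void toy_mark_done(struct toy_cb *cb)
{
    cb->done = true;
}

/*
 * Invoke every not-yet-invoked callback whose grace period has ended
 * according to the leaf's gp_seq, and return how many were invoked.
 */
static size_t toy_invoke_ready(struct toy_cb *cbs, size_t n,
                               unsigned long leaf_gp_seq)
{
    size_t invoked = 0;

    assert(cbs != NULL);
    for (size_t i = 0; i < n; i++) {
        if (cbs[i].done || !toy_seq_done(leaf_gp_seq, cbs[i].wait_gp_seq))
            continue;
        cbs[i].func(&cbs[i]);
        invoked++;
    }
    return invoked;
}
```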

However, if the callback function communicates to other CPUs, for
example, doing a wakeup, then it is that function's responsibility to
maintain ordering. For example, if the callback function wakes up a task
that runs on some other CPU, proper ordering must be in place in both
the callback function and the task being awakened. To see why this is
important, consider the top half of the `grace-period cleanup`_
diagram. The callback might be
running on a CPU corresponding to the leftmost leaf ``rcu_node``
structure, and awaken a task that is to run on a CPU corresponding to
the rightmost leaf ``rcu_node`` structure, and the grace-period kernel
thread might not yet have reached the rightmost leaf. In this case, the
grace period's memory ordering might not yet have reached that CPU, so
again the callback function and the awakened task must supply proper
ordering.

Putting It All Together
~~~~~~~~~~~~~~~~~~~~~~~

A stitched-together diagram is here:

.. kernel-figure:: TreeRCU-gp.svg

Legal Statement
~~~~~~~~~~~~~~~

This work represents the view of the author and does not necessarily
represent the view of IBM.

Linux is a registered trademark of Linus Torvalds.

Other company, product, and service names may be trademarks or service
marks of others.