162306a36Sopenharmony_ci.. _kernel_hacking_lock:
262306a36Sopenharmony_ci
362306a36Sopenharmony_ci===========================
462306a36Sopenharmony_ciUnreliable Guide To Locking
562306a36Sopenharmony_ci===========================
662306a36Sopenharmony_ci
762306a36Sopenharmony_ci:Author: Rusty Russell
862306a36Sopenharmony_ci
962306a36Sopenharmony_ciIntroduction
1062306a36Sopenharmony_ci============
1162306a36Sopenharmony_ci
1262306a36Sopenharmony_ciWelcome, to Rusty's Remarkably Unreliable Guide to Kernel Locking
1362306a36Sopenharmony_ciissues. This document describes the locking systems in the Linux Kernel
1462306a36Sopenharmony_ciin 2.6.
1562306a36Sopenharmony_ci
1662306a36Sopenharmony_ciWith the wide availability of HyperThreading, and preemption in the
1762306a36Sopenharmony_ciLinux Kernel, everyone hacking on the kernel needs to know the
1862306a36Sopenharmony_cifundamentals of concurrency and locking for SMP.
1962306a36Sopenharmony_ci
2062306a36Sopenharmony_ciThe Problem With Concurrency
2162306a36Sopenharmony_ci============================
2262306a36Sopenharmony_ci
2362306a36Sopenharmony_ci(Skip this if you know what a Race Condition is).
2462306a36Sopenharmony_ci
2562306a36Sopenharmony_ciIn a normal program, you can increment a counter like so:
2662306a36Sopenharmony_ci
2762306a36Sopenharmony_ci::
2862306a36Sopenharmony_ci
2962306a36Sopenharmony_ci          very_important_count++;
3062306a36Sopenharmony_ci
3162306a36Sopenharmony_ci
3262306a36Sopenharmony_ciThis is what they would expect to happen:
3362306a36Sopenharmony_ci
3462306a36Sopenharmony_ci
3562306a36Sopenharmony_ci.. table:: Expected Results
3662306a36Sopenharmony_ci
3762306a36Sopenharmony_ci  +------------------------------------+------------------------------------+
3862306a36Sopenharmony_ci  | Instance 1                         | Instance 2                         |
3962306a36Sopenharmony_ci  +====================================+====================================+
4062306a36Sopenharmony_ci  | read very_important_count (5)      |                                    |
4162306a36Sopenharmony_ci  +------------------------------------+------------------------------------+
4262306a36Sopenharmony_ci  | add 1 (6)                          |                                    |
4362306a36Sopenharmony_ci  +------------------------------------+------------------------------------+
4462306a36Sopenharmony_ci  | write very_important_count (6)     |                                    |
4562306a36Sopenharmony_ci  +------------------------------------+------------------------------------+
4662306a36Sopenharmony_ci  |                                    | read very_important_count (6)      |
4762306a36Sopenharmony_ci  +------------------------------------+------------------------------------+
4862306a36Sopenharmony_ci  |                                    | add 1 (7)                          |
4962306a36Sopenharmony_ci  +------------------------------------+------------------------------------+
5062306a36Sopenharmony_ci  |                                    | write very_important_count (7)     |
5162306a36Sopenharmony_ci  +------------------------------------+------------------------------------+
5262306a36Sopenharmony_ci
5362306a36Sopenharmony_ciThis is what might happen:
5462306a36Sopenharmony_ci
5562306a36Sopenharmony_ci.. table:: Possible Results
5662306a36Sopenharmony_ci
5762306a36Sopenharmony_ci  +------------------------------------+------------------------------------+
5862306a36Sopenharmony_ci  | Instance 1                         | Instance 2                         |
5962306a36Sopenharmony_ci  +====================================+====================================+
6062306a36Sopenharmony_ci  | read very_important_count (5)      |                                    |
6162306a36Sopenharmony_ci  +------------------------------------+------------------------------------+
6262306a36Sopenharmony_ci  |                                    | read very_important_count (5)      |
6362306a36Sopenharmony_ci  +------------------------------------+------------------------------------+
6462306a36Sopenharmony_ci  | add 1 (6)                          |                                    |
6562306a36Sopenharmony_ci  +------------------------------------+------------------------------------+
6662306a36Sopenharmony_ci  |                                    | add 1 (6)                          |
6762306a36Sopenharmony_ci  +------------------------------------+------------------------------------+
6862306a36Sopenharmony_ci  | write very_important_count (6)     |                                    |
6962306a36Sopenharmony_ci  +------------------------------------+------------------------------------+
7062306a36Sopenharmony_ci  |                                    | write very_important_count (6)     |
7162306a36Sopenharmony_ci  +------------------------------------+------------------------------------+
7262306a36Sopenharmony_ci
7362306a36Sopenharmony_ci
7462306a36Sopenharmony_ciRace Conditions and Critical Regions
7562306a36Sopenharmony_ci------------------------------------
7662306a36Sopenharmony_ci
7762306a36Sopenharmony_ciThis overlap, where the result depends on the relative timing of
7862306a36Sopenharmony_cimultiple tasks, is called a race condition. The piece of code containing
7962306a36Sopenharmony_cithe concurrency issue is called a critical region. And especially since
8062306a36Sopenharmony_ciLinux starting running on SMP machines, they became one of the major
8162306a36Sopenharmony_ciissues in kernel design and implementation.
8262306a36Sopenharmony_ci
8362306a36Sopenharmony_ciPreemption can have the same effect, even if there is only one CPU: by
8462306a36Sopenharmony_cipreempting one task during the critical region, we have exactly the same
8562306a36Sopenharmony_cirace condition. In this case the thread which preempts might run the
8662306a36Sopenharmony_cicritical region itself.
8762306a36Sopenharmony_ci
8862306a36Sopenharmony_ciThe solution is to recognize when these simultaneous accesses occur, and
8962306a36Sopenharmony_ciuse locks to make sure that only one instance can enter the critical
9062306a36Sopenharmony_ciregion at any time. There are many friendly primitives in the Linux
9162306a36Sopenharmony_cikernel to help you do this. And then there are the unfriendly
9262306a36Sopenharmony_ciprimitives, but I'll pretend they don't exist.
9362306a36Sopenharmony_ci
9462306a36Sopenharmony_ciLocking in the Linux Kernel
9562306a36Sopenharmony_ci===========================
9662306a36Sopenharmony_ci
9762306a36Sopenharmony_ciIf I could give you one piece of advice on locking: **keep it simple**.
9862306a36Sopenharmony_ci
9962306a36Sopenharmony_ciBe reluctant to introduce new locks.
10062306a36Sopenharmony_ci
10162306a36Sopenharmony_ciTwo Main Types of Kernel Locks: Spinlocks and Mutexes
10262306a36Sopenharmony_ci-----------------------------------------------------
10362306a36Sopenharmony_ci
10462306a36Sopenharmony_ciThere are two main types of kernel locks. The fundamental type is the
10562306a36Sopenharmony_cispinlock (``include/asm/spinlock.h``), which is a very simple
10662306a36Sopenharmony_cisingle-holder lock: if you can't get the spinlock, you keep trying
10762306a36Sopenharmony_ci(spinning) until you can. Spinlocks are very small and fast, and can be
10862306a36Sopenharmony_ciused anywhere.
10962306a36Sopenharmony_ci
11062306a36Sopenharmony_ciThe second type is a mutex (``include/linux/mutex.h``): it is like a
11162306a36Sopenharmony_cispinlock, but you may block holding a mutex. If you can't lock a mutex,
11262306a36Sopenharmony_ciyour task will suspend itself, and be woken up when the mutex is
11362306a36Sopenharmony_cireleased. This means the CPU can do something else while you are
11462306a36Sopenharmony_ciwaiting. There are many cases when you simply can't sleep (see
11562306a36Sopenharmony_ci`What Functions Are Safe To Call From Interrupts?`_),
11662306a36Sopenharmony_ciand so have to use a spinlock instead.
11762306a36Sopenharmony_ci
11862306a36Sopenharmony_ciNeither type of lock is recursive: see
11962306a36Sopenharmony_ci`Deadlock: Simple and Advanced`_.
12062306a36Sopenharmony_ci
12162306a36Sopenharmony_ciLocks and Uniprocessor Kernels
12262306a36Sopenharmony_ci------------------------------
12362306a36Sopenharmony_ci
12462306a36Sopenharmony_ciFor kernels compiled without ``CONFIG_SMP``, and without
12562306a36Sopenharmony_ci``CONFIG_PREEMPT`` spinlocks do not exist at all. This is an excellent
12662306a36Sopenharmony_cidesign decision: when no-one else can run at the same time, there is no
12762306a36Sopenharmony_cireason to have a lock.
12862306a36Sopenharmony_ci
12962306a36Sopenharmony_ciIf the kernel is compiled without ``CONFIG_SMP``, but ``CONFIG_PREEMPT``
13062306a36Sopenharmony_ciis set, then spinlocks simply disable preemption, which is sufficient to
13162306a36Sopenharmony_ciprevent any races. For most purposes, we can think of preemption as
13262306a36Sopenharmony_ciequivalent to SMP, and not worry about it separately.
13362306a36Sopenharmony_ci
13462306a36Sopenharmony_ciYou should always test your locking code with ``CONFIG_SMP`` and
13562306a36Sopenharmony_ci``CONFIG_PREEMPT`` enabled, even if you don't have an SMP test box,
13662306a36Sopenharmony_cibecause it will still catch some kinds of locking bugs.
13762306a36Sopenharmony_ci
13862306a36Sopenharmony_ciMutexes still exist, because they are required for synchronization
13962306a36Sopenharmony_cibetween user contexts, as we will see below.
14062306a36Sopenharmony_ci
14162306a36Sopenharmony_ciLocking Only In User Context
14262306a36Sopenharmony_ci----------------------------
14362306a36Sopenharmony_ci
14462306a36Sopenharmony_ciIf you have a data structure which is only ever accessed from user
14562306a36Sopenharmony_cicontext, then you can use a simple mutex (``include/linux/mutex.h``) to
14662306a36Sopenharmony_ciprotect it. This is the most trivial case: you initialize the mutex.
14762306a36Sopenharmony_ciThen you can call mutex_lock_interruptible() to grab the
14862306a36Sopenharmony_cimutex, and mutex_unlock() to release it. There is also a
14962306a36Sopenharmony_cimutex_lock(), which should be avoided, because it will
15062306a36Sopenharmony_cinot return if a signal is received.
15162306a36Sopenharmony_ci
15262306a36Sopenharmony_ciExample: ``net/netfilter/nf_sockopt.c`` allows registration of new
15362306a36Sopenharmony_cisetsockopt() and getsockopt() calls, with
15462306a36Sopenharmony_cinf_register_sockopt(). Registration and de-registration
15562306a36Sopenharmony_ciare only done on module load and unload (and boot time, where there is
15662306a36Sopenharmony_cino concurrency), and the list of registrations is only consulted for an
15762306a36Sopenharmony_ciunknown setsockopt() or getsockopt() system
15862306a36Sopenharmony_cicall. The ``nf_sockopt_mutex`` is perfect to protect this, especially
15962306a36Sopenharmony_cisince the setsockopt and getsockopt calls may well sleep.
16062306a36Sopenharmony_ci
16162306a36Sopenharmony_ciLocking Between User Context and Softirqs
16262306a36Sopenharmony_ci-----------------------------------------
16362306a36Sopenharmony_ci
16462306a36Sopenharmony_ciIf a softirq shares data with user context, you have two problems.
16562306a36Sopenharmony_ciFirstly, the current user context can be interrupted by a softirq, and
16662306a36Sopenharmony_cisecondly, the critical region could be entered from another CPU. This is
16762306a36Sopenharmony_ciwhere spin_lock_bh() (``include/linux/spinlock.h``) is
16862306a36Sopenharmony_ciused. It disables softirqs on that CPU, then grabs the lock.
16962306a36Sopenharmony_cispin_unlock_bh() does the reverse. (The '_bh' suffix is
17062306a36Sopenharmony_cia historical reference to "Bottom Halves", the old name for software
17162306a36Sopenharmony_ciinterrupts. It should really be called spin_lock_softirq()' in a
17262306a36Sopenharmony_ciperfect world).
17362306a36Sopenharmony_ci
17462306a36Sopenharmony_ciNote that you can also use spin_lock_irq() or
17562306a36Sopenharmony_cispin_lock_irqsave() here, which stop hardware interrupts
17662306a36Sopenharmony_cias well: see `Hard IRQ Context`_.
17762306a36Sopenharmony_ci
17862306a36Sopenharmony_ciThis works perfectly for UP as well: the spin lock vanishes, and this
17962306a36Sopenharmony_cimacro simply becomes local_bh_disable()
18062306a36Sopenharmony_ci(``include/linux/interrupt.h``), which protects you from the softirq
18162306a36Sopenharmony_cibeing run.
18262306a36Sopenharmony_ci
18362306a36Sopenharmony_ciLocking Between User Context and Tasklets
18462306a36Sopenharmony_ci-----------------------------------------
18562306a36Sopenharmony_ci
18662306a36Sopenharmony_ciThis is exactly the same as above, because tasklets are actually run
18762306a36Sopenharmony_cifrom a softirq.
18862306a36Sopenharmony_ci
18962306a36Sopenharmony_ciLocking Between User Context and Timers
19062306a36Sopenharmony_ci---------------------------------------
19162306a36Sopenharmony_ci
19262306a36Sopenharmony_ciThis, too, is exactly the same as above, because timers are actually run
19362306a36Sopenharmony_cifrom a softirq. From a locking point of view, tasklets and timers are
19462306a36Sopenharmony_ciidentical.
19562306a36Sopenharmony_ci
19662306a36Sopenharmony_ciLocking Between Tasklets/Timers
19762306a36Sopenharmony_ci-------------------------------
19862306a36Sopenharmony_ci
19962306a36Sopenharmony_ciSometimes a tasklet or timer might want to share data with another
20062306a36Sopenharmony_citasklet or timer.
20162306a36Sopenharmony_ci
20262306a36Sopenharmony_ciThe Same Tasklet/Timer
20362306a36Sopenharmony_ci~~~~~~~~~~~~~~~~~~~~~~
20462306a36Sopenharmony_ci
20562306a36Sopenharmony_ciSince a tasklet is never run on two CPUs at once, you don't need to
20662306a36Sopenharmony_ciworry about your tasklet being reentrant (running twice at once), even
20762306a36Sopenharmony_cion SMP.
20862306a36Sopenharmony_ci
20962306a36Sopenharmony_ciDifferent Tasklets/Timers
21062306a36Sopenharmony_ci~~~~~~~~~~~~~~~~~~~~~~~~~
21162306a36Sopenharmony_ci
21262306a36Sopenharmony_ciIf another tasklet/timer wants to share data with your tasklet or timer
21362306a36Sopenharmony_ci, you will both need to use spin_lock() and
21462306a36Sopenharmony_cispin_unlock() calls. spin_lock_bh() is
21562306a36Sopenharmony_ciunnecessary here, as you are already in a tasklet, and none will be run
21662306a36Sopenharmony_cion the same CPU.
21762306a36Sopenharmony_ci
21862306a36Sopenharmony_ciLocking Between Softirqs
21962306a36Sopenharmony_ci------------------------
22062306a36Sopenharmony_ci
22162306a36Sopenharmony_ciOften a softirq might want to share data with itself or a tasklet/timer.
22262306a36Sopenharmony_ci
22362306a36Sopenharmony_ciThe Same Softirq
22462306a36Sopenharmony_ci~~~~~~~~~~~~~~~~
22562306a36Sopenharmony_ci
22662306a36Sopenharmony_ciThe same softirq can run on the other CPUs: you can use a per-CPU array
22762306a36Sopenharmony_ci(see `Per-CPU Data`_) for better performance. If you're
22862306a36Sopenharmony_cigoing so far as to use a softirq, you probably care about scalable
22962306a36Sopenharmony_ciperformance enough to justify the extra complexity.
23062306a36Sopenharmony_ci
23162306a36Sopenharmony_ciYou'll need to use spin_lock() and
23262306a36Sopenharmony_cispin_unlock() for shared data.
23362306a36Sopenharmony_ci
23462306a36Sopenharmony_ciDifferent Softirqs
23562306a36Sopenharmony_ci~~~~~~~~~~~~~~~~~~
23662306a36Sopenharmony_ci
23762306a36Sopenharmony_ciYou'll need to use spin_lock() and
23862306a36Sopenharmony_cispin_unlock() for shared data, whether it be a timer,
23962306a36Sopenharmony_citasklet, different softirq or the same or another softirq: any of them
24062306a36Sopenharmony_cicould be running on a different CPU.
24162306a36Sopenharmony_ci
24262306a36Sopenharmony_ciHard IRQ Context
24362306a36Sopenharmony_ci================
24462306a36Sopenharmony_ci
24562306a36Sopenharmony_ciHardware interrupts usually communicate with a tasklet or softirq.
24662306a36Sopenharmony_ciFrequently this involves putting work in a queue, which the softirq will
24762306a36Sopenharmony_citake out.
24862306a36Sopenharmony_ci
24962306a36Sopenharmony_ciLocking Between Hard IRQ and Softirqs/Tasklets
25062306a36Sopenharmony_ci----------------------------------------------
25162306a36Sopenharmony_ci
25262306a36Sopenharmony_ciIf a hardware irq handler shares data with a softirq, you have two
25362306a36Sopenharmony_ciconcerns. Firstly, the softirq processing can be interrupted by a
25462306a36Sopenharmony_cihardware interrupt, and secondly, the critical region could be entered
25562306a36Sopenharmony_ciby a hardware interrupt on another CPU. This is where
25662306a36Sopenharmony_cispin_lock_irq() is used. It is defined to disable
25762306a36Sopenharmony_ciinterrupts on that cpu, then grab the lock.
25862306a36Sopenharmony_cispin_unlock_irq() does the reverse.
25962306a36Sopenharmony_ci
26062306a36Sopenharmony_ciThe irq handler does not need to use spin_lock_irq(), because
26162306a36Sopenharmony_cithe softirq cannot run while the irq handler is running: it can use
26262306a36Sopenharmony_cispin_lock(), which is slightly faster. The only exception
26362306a36Sopenharmony_ciwould be if a different hardware irq handler uses the same lock:
26462306a36Sopenharmony_cispin_lock_irq() will stop that from interrupting us.
26562306a36Sopenharmony_ci
26662306a36Sopenharmony_ciThis works perfectly for UP as well: the spin lock vanishes, and this
26762306a36Sopenharmony_cimacro simply becomes local_irq_disable()
26862306a36Sopenharmony_ci(``include/asm/smp.h``), which protects you from the softirq/tasklet/BH
26962306a36Sopenharmony_cibeing run.
27062306a36Sopenharmony_ci
27162306a36Sopenharmony_cispin_lock_irqsave() (``include/linux/spinlock.h``) is a
27262306a36Sopenharmony_civariant which saves whether interrupts were on or off in a flags word,
27362306a36Sopenharmony_ciwhich is passed to spin_unlock_irqrestore(). This means
27462306a36Sopenharmony_cithat the same code can be used inside an hard irq handler (where
27562306a36Sopenharmony_ciinterrupts are already off) and in softirqs (where the irq disabling is
27662306a36Sopenharmony_cirequired).
27762306a36Sopenharmony_ci
27862306a36Sopenharmony_ciNote that softirqs (and hence tasklets and timers) are run on return
27962306a36Sopenharmony_cifrom hardware interrupts, so spin_lock_irq() also stops
28062306a36Sopenharmony_cithese. In that sense, spin_lock_irqsave() is the most
28162306a36Sopenharmony_cigeneral and powerful locking function.
28262306a36Sopenharmony_ci
28362306a36Sopenharmony_ciLocking Between Two Hard IRQ Handlers
28462306a36Sopenharmony_ci-------------------------------------
28562306a36Sopenharmony_ci
28662306a36Sopenharmony_ciIt is rare to have to share data between two IRQ handlers, but if you
28762306a36Sopenharmony_cido, spin_lock_irqsave() should be used: it is
28862306a36Sopenharmony_ciarchitecture-specific whether all interrupts are disabled inside irq
28962306a36Sopenharmony_cihandlers themselves.
29062306a36Sopenharmony_ci
29162306a36Sopenharmony_ciCheat Sheet For Locking
29262306a36Sopenharmony_ci=======================
29362306a36Sopenharmony_ci
29462306a36Sopenharmony_ciPete Zaitcev gives the following summary:
29562306a36Sopenharmony_ci
29662306a36Sopenharmony_ci-  If you are in a process context (any syscall) and want to lock other
29762306a36Sopenharmony_ci   process out, use a mutex. You can take a mutex and sleep
29862306a36Sopenharmony_ci   (``copy_from_user()`` or ``kmalloc(x,GFP_KERNEL)``).
29962306a36Sopenharmony_ci
30062306a36Sopenharmony_ci-  Otherwise (== data can be touched in an interrupt), use
30162306a36Sopenharmony_ci   spin_lock_irqsave() and
30262306a36Sopenharmony_ci   spin_unlock_irqrestore().
30362306a36Sopenharmony_ci
30462306a36Sopenharmony_ci-  Avoid holding spinlock for more than 5 lines of code and across any
30562306a36Sopenharmony_ci   function call (except accessors like readb()).
30662306a36Sopenharmony_ci
30762306a36Sopenharmony_ciTable of Minimum Requirements
30862306a36Sopenharmony_ci-----------------------------
30962306a36Sopenharmony_ci
31062306a36Sopenharmony_ciThe following table lists the **minimum** locking requirements between
31162306a36Sopenharmony_civarious contexts. In some cases, the same context can only be running on
31262306a36Sopenharmony_cione CPU at a time, so no locking is required for that context (eg. a
31362306a36Sopenharmony_ciparticular thread can only run on one CPU at a time, but if it needs
31462306a36Sopenharmony_cishares data with another thread, locking is required).
31562306a36Sopenharmony_ci
31662306a36Sopenharmony_ciRemember the advice above: you can always use
31762306a36Sopenharmony_cispin_lock_irqsave(), which is a superset of all other
31862306a36Sopenharmony_cispinlock primitives.
31962306a36Sopenharmony_ci
32062306a36Sopenharmony_ci============== ============= ============= ========= ========= ========= ========= ======= ======= ============== ==============
32162306a36Sopenharmony_ci.              IRQ Handler A IRQ Handler B Softirq A Softirq B Tasklet A Tasklet B Timer A Timer B User Context A User Context B
32262306a36Sopenharmony_ci============== ============= ============= ========= ========= ========= ========= ======= ======= ============== ==============
32362306a36Sopenharmony_ciIRQ Handler A  None
32462306a36Sopenharmony_ciIRQ Handler B  SLIS          None
32562306a36Sopenharmony_ciSoftirq A      SLI           SLI           SL
32662306a36Sopenharmony_ciSoftirq B      SLI           SLI           SL        SL
32762306a36Sopenharmony_ciTasklet A      SLI           SLI           SL        SL        None
32862306a36Sopenharmony_ciTasklet B      SLI           SLI           SL        SL        SL        None
32962306a36Sopenharmony_ciTimer A        SLI           SLI           SL        SL        SL        SL        None
33062306a36Sopenharmony_ciTimer B        SLI           SLI           SL        SL        SL        SL        SL      None
33162306a36Sopenharmony_ciUser Context A SLI           SLI           SLBH      SLBH      SLBH      SLBH      SLBH    SLBH    None
33262306a36Sopenharmony_ciUser Context B SLI           SLI           SLBH      SLBH      SLBH      SLBH      SLBH    SLBH    MLI            None
33362306a36Sopenharmony_ci============== ============= ============= ========= ========= ========= ========= ======= ======= ============== ==============
33462306a36Sopenharmony_ci
33562306a36Sopenharmony_ciTable: Table of Locking Requirements
33662306a36Sopenharmony_ci
33762306a36Sopenharmony_ci+--------+----------------------------+
33862306a36Sopenharmony_ci| SLIS   | spin_lock_irqsave          |
33962306a36Sopenharmony_ci+--------+----------------------------+
34062306a36Sopenharmony_ci| SLI    | spin_lock_irq              |
34162306a36Sopenharmony_ci+--------+----------------------------+
34262306a36Sopenharmony_ci| SL     | spin_lock                  |
34362306a36Sopenharmony_ci+--------+----------------------------+
34462306a36Sopenharmony_ci| SLBH   | spin_lock_bh               |
34562306a36Sopenharmony_ci+--------+----------------------------+
34662306a36Sopenharmony_ci| MLI    | mutex_lock_interruptible   |
34762306a36Sopenharmony_ci+--------+----------------------------+
34862306a36Sopenharmony_ci
34962306a36Sopenharmony_ciTable: Legend for Locking Requirements Table
35062306a36Sopenharmony_ci
35162306a36Sopenharmony_ciThe trylock Functions
35262306a36Sopenharmony_ci=====================
35362306a36Sopenharmony_ci
35462306a36Sopenharmony_ciThere are functions that try to acquire a lock only once and immediately
35562306a36Sopenharmony_cireturn a value telling about success or failure to acquire the lock.
35662306a36Sopenharmony_ciThey can be used if you need no access to the data protected with the
35762306a36Sopenharmony_cilock when some other thread is holding the lock. You should acquire the
35862306a36Sopenharmony_cilock later if you then need access to the data protected with the lock.
35962306a36Sopenharmony_ci
36062306a36Sopenharmony_cispin_trylock() does not spin but returns non-zero if it
36162306a36Sopenharmony_ciacquires the spinlock on the first try or 0 if not. This function can be
36262306a36Sopenharmony_ciused in all contexts like spin_lock(): you must have
36362306a36Sopenharmony_cidisabled the contexts that might interrupt you and acquire the spin
36462306a36Sopenharmony_cilock.
36562306a36Sopenharmony_ci
36662306a36Sopenharmony_cimutex_trylock() does not suspend your task but returns
36762306a36Sopenharmony_cinon-zero if it could lock the mutex on the first try or 0 if not. This
36862306a36Sopenharmony_cifunction cannot be safely used in hardware or software interrupt
36962306a36Sopenharmony_cicontexts despite not sleeping.
37062306a36Sopenharmony_ci
37162306a36Sopenharmony_ciCommon Examples
37262306a36Sopenharmony_ci===============
37362306a36Sopenharmony_ci
37462306a36Sopenharmony_ciLet's step through a simple example: a cache of number to name mappings.
37562306a36Sopenharmony_ciThe cache keeps a count of how often each of the objects is used, and
37662306a36Sopenharmony_ciwhen it gets full, throws out the least used one.
37762306a36Sopenharmony_ci
37862306a36Sopenharmony_ciAll In User Context
37962306a36Sopenharmony_ci-------------------
38062306a36Sopenharmony_ci
38162306a36Sopenharmony_ciFor our first example, we assume that all operations are in user context
38262306a36Sopenharmony_ci(ie. from system calls), so we can sleep. This means we can use a mutex
38362306a36Sopenharmony_cito protect the cache and all the objects within it. Here's the code::
38462306a36Sopenharmony_ci
38562306a36Sopenharmony_ci    #include <linux/list.h>
38662306a36Sopenharmony_ci    #include <linux/slab.h>
38762306a36Sopenharmony_ci    #include <linux/string.h>
38862306a36Sopenharmony_ci    #include <linux/mutex.h>
38962306a36Sopenharmony_ci    #include <asm/errno.h>
39062306a36Sopenharmony_ci
39162306a36Sopenharmony_ci    struct object
39262306a36Sopenharmony_ci    {
39362306a36Sopenharmony_ci            struct list_head list;
39462306a36Sopenharmony_ci            int id;
39562306a36Sopenharmony_ci            char name[32];
39662306a36Sopenharmony_ci            int popularity;
39762306a36Sopenharmony_ci    };
39862306a36Sopenharmony_ci
39962306a36Sopenharmony_ci    /* Protects the cache, cache_num, and the objects within it */
40062306a36Sopenharmony_ci    static DEFINE_MUTEX(cache_lock);
40162306a36Sopenharmony_ci    static LIST_HEAD(cache);
40262306a36Sopenharmony_ci    static unsigned int cache_num = 0;
40362306a36Sopenharmony_ci    #define MAX_CACHE_SIZE 10
40462306a36Sopenharmony_ci
40562306a36Sopenharmony_ci    /* Must be holding cache_lock */
40662306a36Sopenharmony_ci    static struct object *__cache_find(int id)
40762306a36Sopenharmony_ci    {
40862306a36Sopenharmony_ci            struct object *i;
40962306a36Sopenharmony_ci
41062306a36Sopenharmony_ci            list_for_each_entry(i, &cache, list)
41162306a36Sopenharmony_ci                    if (i->id == id) {
41262306a36Sopenharmony_ci                            i->popularity++;
41362306a36Sopenharmony_ci                            return i;
41462306a36Sopenharmony_ci                    }
41562306a36Sopenharmony_ci            return NULL;
41662306a36Sopenharmony_ci    }
41762306a36Sopenharmony_ci
41862306a36Sopenharmony_ci    /* Must be holding cache_lock */
41962306a36Sopenharmony_ci    static void __cache_delete(struct object *obj)
42062306a36Sopenharmony_ci    {
42162306a36Sopenharmony_ci            BUG_ON(!obj);
42262306a36Sopenharmony_ci            list_del(&obj->list);
42362306a36Sopenharmony_ci            kfree(obj);
42462306a36Sopenharmony_ci            cache_num--;
42562306a36Sopenharmony_ci    }
42662306a36Sopenharmony_ci
42762306a36Sopenharmony_ci    /* Must be holding cache_lock */
42862306a36Sopenharmony_ci    static void __cache_add(struct object *obj)
42962306a36Sopenharmony_ci    {
43062306a36Sopenharmony_ci            list_add(&obj->list, &cache);
43162306a36Sopenharmony_ci            if (++cache_num > MAX_CACHE_SIZE) {
43262306a36Sopenharmony_ci                    struct object *i, *outcast = NULL;
43362306a36Sopenharmony_ci                    list_for_each_entry(i, &cache, list) {
43462306a36Sopenharmony_ci                            if (!outcast || i->popularity < outcast->popularity)
43562306a36Sopenharmony_ci                                    outcast = i;
43662306a36Sopenharmony_ci                    }
43762306a36Sopenharmony_ci                    __cache_delete(outcast);
43862306a36Sopenharmony_ci            }
43962306a36Sopenharmony_ci    }
44062306a36Sopenharmony_ci
44162306a36Sopenharmony_ci    int cache_add(int id, const char *name)
44262306a36Sopenharmony_ci    {
44362306a36Sopenharmony_ci            struct object *obj;
44462306a36Sopenharmony_ci
44562306a36Sopenharmony_ci            if ((obj = kmalloc(sizeof(*obj), GFP_KERNEL)) == NULL)
44662306a36Sopenharmony_ci                    return -ENOMEM;
44762306a36Sopenharmony_ci
44862306a36Sopenharmony_ci            strscpy(obj->name, name, sizeof(obj->name));
44962306a36Sopenharmony_ci            obj->id = id;
45062306a36Sopenharmony_ci            obj->popularity = 0;
45162306a36Sopenharmony_ci
45262306a36Sopenharmony_ci            mutex_lock(&cache_lock);
45362306a36Sopenharmony_ci            __cache_add(obj);
45462306a36Sopenharmony_ci            mutex_unlock(&cache_lock);
45562306a36Sopenharmony_ci            return 0;
45662306a36Sopenharmony_ci    }
45762306a36Sopenharmony_ci
45862306a36Sopenharmony_ci    void cache_delete(int id)
45962306a36Sopenharmony_ci    {
46062306a36Sopenharmony_ci            mutex_lock(&cache_lock);
46162306a36Sopenharmony_ci            __cache_delete(__cache_find(id));
46262306a36Sopenharmony_ci            mutex_unlock(&cache_lock);
46362306a36Sopenharmony_ci    }
46462306a36Sopenharmony_ci
46562306a36Sopenharmony_ci    int cache_find(int id, char *name)
46662306a36Sopenharmony_ci    {
46762306a36Sopenharmony_ci            struct object *obj;
46862306a36Sopenharmony_ci            int ret = -ENOENT;
46962306a36Sopenharmony_ci
47062306a36Sopenharmony_ci            mutex_lock(&cache_lock);
47162306a36Sopenharmony_ci            obj = __cache_find(id);
47262306a36Sopenharmony_ci            if (obj) {
47362306a36Sopenharmony_ci                    ret = 0;
47462306a36Sopenharmony_ci                    strcpy(name, obj->name);
47562306a36Sopenharmony_ci            }
47662306a36Sopenharmony_ci            mutex_unlock(&cache_lock);
47762306a36Sopenharmony_ci            return ret;
47862306a36Sopenharmony_ci    }
47962306a36Sopenharmony_ci
48062306a36Sopenharmony_ciNote that we always make sure we have the cache_lock when we add,
48162306a36Sopenharmony_cidelete, or look up the cache: both the cache infrastructure itself and
48262306a36Sopenharmony_cithe contents of the objects are protected by the lock. In this case it's
48362306a36Sopenharmony_cieasy, since we copy the data for the user, and never let them access the
48462306a36Sopenharmony_ciobjects directly.
48562306a36Sopenharmony_ci
48662306a36Sopenharmony_ciThere is a slight (and common) optimization here: in
48762306a36Sopenharmony_cicache_add() we set up the fields of the object before
48862306a36Sopenharmony_cigrabbing the lock. This is safe, as no-one else can access it until we
48962306a36Sopenharmony_ciput it in cache.
49062306a36Sopenharmony_ci
49162306a36Sopenharmony_ciAccessing From Interrupt Context
49262306a36Sopenharmony_ci--------------------------------
49362306a36Sopenharmony_ci
49462306a36Sopenharmony_ciNow consider the case where cache_find() can be called
49562306a36Sopenharmony_cifrom interrupt context: either a hardware interrupt or a softirq. An
49662306a36Sopenharmony_ciexample would be a timer which deletes object from the cache.
49762306a36Sopenharmony_ci
49862306a36Sopenharmony_ciThe change is shown below, in standard patch format: the ``-`` are lines
49962306a36Sopenharmony_ciwhich are taken away, and the ``+`` are lines which are added.
50062306a36Sopenharmony_ci
50162306a36Sopenharmony_ci::
50262306a36Sopenharmony_ci
50362306a36Sopenharmony_ci    --- cache.c.usercontext 2003-12-09 13:58:54.000000000 +1100
50462306a36Sopenharmony_ci    +++ cache.c.interrupt   2003-12-09 14:07:49.000000000 +1100
50562306a36Sopenharmony_ci    @@ -12,7 +12,7 @@
50662306a36Sopenharmony_ci             int popularity;
50762306a36Sopenharmony_ci     };
50862306a36Sopenharmony_ci
50962306a36Sopenharmony_ci    -static DEFINE_MUTEX(cache_lock);
51062306a36Sopenharmony_ci    +static DEFINE_SPINLOCK(cache_lock);
51162306a36Sopenharmony_ci     static LIST_HEAD(cache);
51262306a36Sopenharmony_ci     static unsigned int cache_num = 0;
51362306a36Sopenharmony_ci     #define MAX_CACHE_SIZE 10
51462306a36Sopenharmony_ci    @@ -55,6 +55,7 @@
51562306a36Sopenharmony_ci     int cache_add(int id, const char *name)
51662306a36Sopenharmony_ci     {
51762306a36Sopenharmony_ci             struct object *obj;
51862306a36Sopenharmony_ci    +        unsigned long flags;
51962306a36Sopenharmony_ci
52062306a36Sopenharmony_ci             if ((obj = kmalloc(sizeof(*obj), GFP_KERNEL)) == NULL)
52162306a36Sopenharmony_ci                     return -ENOMEM;
52262306a36Sopenharmony_ci    @@ -63,30 +64,33 @@
52362306a36Sopenharmony_ci             obj->id = id;
52462306a36Sopenharmony_ci             obj->popularity = 0;
52562306a36Sopenharmony_ci
52662306a36Sopenharmony_ci    -        mutex_lock(&cache_lock);
52762306a36Sopenharmony_ci    +        spin_lock_irqsave(&cache_lock, flags);
52862306a36Sopenharmony_ci             __cache_add(obj);
52962306a36Sopenharmony_ci    -        mutex_unlock(&cache_lock);
53062306a36Sopenharmony_ci    +        spin_unlock_irqrestore(&cache_lock, flags);
53162306a36Sopenharmony_ci             return 0;
53262306a36Sopenharmony_ci     }
53362306a36Sopenharmony_ci
53462306a36Sopenharmony_ci     void cache_delete(int id)
53562306a36Sopenharmony_ci     {
53662306a36Sopenharmony_ci    -        mutex_lock(&cache_lock);
53762306a36Sopenharmony_ci    +        unsigned long flags;
53862306a36Sopenharmony_ci    +
53962306a36Sopenharmony_ci    +        spin_lock_irqsave(&cache_lock, flags);
54062306a36Sopenharmony_ci             __cache_delete(__cache_find(id));
54162306a36Sopenharmony_ci    -        mutex_unlock(&cache_lock);
54262306a36Sopenharmony_ci    +        spin_unlock_irqrestore(&cache_lock, flags);
54362306a36Sopenharmony_ci     }
54462306a36Sopenharmony_ci
54562306a36Sopenharmony_ci     int cache_find(int id, char *name)
54662306a36Sopenharmony_ci     {
54762306a36Sopenharmony_ci             struct object *obj;
54862306a36Sopenharmony_ci             int ret = -ENOENT;
54962306a36Sopenharmony_ci    +        unsigned long flags;
55062306a36Sopenharmony_ci
55162306a36Sopenharmony_ci    -        mutex_lock(&cache_lock);
55262306a36Sopenharmony_ci    +        spin_lock_irqsave(&cache_lock, flags);
55362306a36Sopenharmony_ci             obj = __cache_find(id);
55462306a36Sopenharmony_ci             if (obj) {
55562306a36Sopenharmony_ci                     ret = 0;
55662306a36Sopenharmony_ci                     strcpy(name, obj->name);
55762306a36Sopenharmony_ci             }
55862306a36Sopenharmony_ci    -        mutex_unlock(&cache_lock);
55962306a36Sopenharmony_ci    +        spin_unlock_irqrestore(&cache_lock, flags);
56062306a36Sopenharmony_ci             return ret;
56162306a36Sopenharmony_ci     }
56262306a36Sopenharmony_ci
56362306a36Sopenharmony_ciNote that the spin_lock_irqsave() will turn off
56462306a36Sopenharmony_ciinterrupts if they are on, otherwise does nothing (if we are already in
56562306a36Sopenharmony_cian interrupt handler), hence these functions are safe to call from any
56662306a36Sopenharmony_cicontext.
56762306a36Sopenharmony_ci
56862306a36Sopenharmony_ciUnfortunately, cache_add() calls kmalloc()
56962306a36Sopenharmony_ciwith the ``GFP_KERNEL`` flag, which is only legal in user context. I
57062306a36Sopenharmony_cihave assumed that cache_add() is still only called in
57162306a36Sopenharmony_ciuser context, otherwise this should become a parameter to
57262306a36Sopenharmony_cicache_add().
57362306a36Sopenharmony_ci
57462306a36Sopenharmony_ciExposing Objects Outside This File
57562306a36Sopenharmony_ci----------------------------------
57662306a36Sopenharmony_ci
57762306a36Sopenharmony_ciIf our objects contained more information, it might not be sufficient to
57862306a36Sopenharmony_cicopy the information in and out: other parts of the code might want to
57962306a36Sopenharmony_cikeep pointers to these objects, for example, rather than looking up the
58062306a36Sopenharmony_ciid every time. This produces two problems.
58162306a36Sopenharmony_ci
58262306a36Sopenharmony_ciThe first problem is that we use the ``cache_lock`` to protect objects:
58362306a36Sopenharmony_ciwe'd need to make this non-static so the rest of the code can use it.
58462306a36Sopenharmony_ciThis makes locking trickier, as it is no longer all in one place.
58562306a36Sopenharmony_ci
58662306a36Sopenharmony_ciThe second problem is the lifetime problem: if another structure keeps a
58762306a36Sopenharmony_cipointer to an object, it presumably expects that pointer to remain
58862306a36Sopenharmony_civalid. Unfortunately, this is only guaranteed while you hold the lock,
58962306a36Sopenharmony_ciotherwise someone might call cache_delete() and even
59062306a36Sopenharmony_ciworse, add another object, re-using the same address.
59162306a36Sopenharmony_ci
59262306a36Sopenharmony_ciAs there is only one lock, you can't hold it forever: no-one else would
59362306a36Sopenharmony_ciget any work done.
59462306a36Sopenharmony_ci
59562306a36Sopenharmony_ciThe solution to this problem is to use a reference count: everyone who
59662306a36Sopenharmony_cihas a pointer to the object increases it when they first get the object,
59762306a36Sopenharmony_ciand drops the reference count when they're finished with it. Whoever
59862306a36Sopenharmony_cidrops it to zero knows it is unused, and can actually delete it.
59962306a36Sopenharmony_ci
60062306a36Sopenharmony_ciHere is the code::
60162306a36Sopenharmony_ci
60262306a36Sopenharmony_ci    --- cache.c.interrupt   2003-12-09 14:25:43.000000000 +1100
60362306a36Sopenharmony_ci    +++ cache.c.refcnt  2003-12-09 14:33:05.000000000 +1100
60462306a36Sopenharmony_ci    @@ -7,6 +7,7 @@
60562306a36Sopenharmony_ci     struct object
60662306a36Sopenharmony_ci     {
60762306a36Sopenharmony_ci             struct list_head list;
60862306a36Sopenharmony_ci    +        unsigned int refcnt;
60962306a36Sopenharmony_ci             int id;
61062306a36Sopenharmony_ci             char name[32];
61162306a36Sopenharmony_ci             int popularity;
61262306a36Sopenharmony_ci    @@ -17,6 +18,35 @@
61362306a36Sopenharmony_ci     static unsigned int cache_num = 0;
61462306a36Sopenharmony_ci     #define MAX_CACHE_SIZE 10
61562306a36Sopenharmony_ci
61662306a36Sopenharmony_ci    +static void __object_put(struct object *obj)
61762306a36Sopenharmony_ci    +{
61862306a36Sopenharmony_ci    +        if (--obj->refcnt == 0)
61962306a36Sopenharmony_ci    +                kfree(obj);
62062306a36Sopenharmony_ci    +}
62162306a36Sopenharmony_ci    +
62262306a36Sopenharmony_ci    +static void __object_get(struct object *obj)
62362306a36Sopenharmony_ci    +{
62462306a36Sopenharmony_ci    +        obj->refcnt++;
62562306a36Sopenharmony_ci    +}
62662306a36Sopenharmony_ci    +
62762306a36Sopenharmony_ci    +void object_put(struct object *obj)
62862306a36Sopenharmony_ci    +{
62962306a36Sopenharmony_ci    +        unsigned long flags;
63062306a36Sopenharmony_ci    +
63162306a36Sopenharmony_ci    +        spin_lock_irqsave(&cache_lock, flags);
63262306a36Sopenharmony_ci    +        __object_put(obj);
63362306a36Sopenharmony_ci    +        spin_unlock_irqrestore(&cache_lock, flags);
63462306a36Sopenharmony_ci    +}
63562306a36Sopenharmony_ci    +
63662306a36Sopenharmony_ci    +void object_get(struct object *obj)
63762306a36Sopenharmony_ci    +{
63862306a36Sopenharmony_ci    +        unsigned long flags;
63962306a36Sopenharmony_ci    +
64062306a36Sopenharmony_ci    +        spin_lock_irqsave(&cache_lock, flags);
64162306a36Sopenharmony_ci    +        __object_get(obj);
64262306a36Sopenharmony_ci    +        spin_unlock_irqrestore(&cache_lock, flags);
64362306a36Sopenharmony_ci    +}
64462306a36Sopenharmony_ci    +
64562306a36Sopenharmony_ci     /* Must be holding cache_lock */
64662306a36Sopenharmony_ci     static struct object *__cache_find(int id)
64762306a36Sopenharmony_ci     {
64862306a36Sopenharmony_ci    @@ -35,6 +65,7 @@
64962306a36Sopenharmony_ci     {
65062306a36Sopenharmony_ci             BUG_ON(!obj);
65162306a36Sopenharmony_ci             list_del(&obj->list);
65262306a36Sopenharmony_ci    +        __object_put(obj);
65362306a36Sopenharmony_ci             cache_num--;
65462306a36Sopenharmony_ci     }
65562306a36Sopenharmony_ci
65662306a36Sopenharmony_ci    @@ -63,6 +94,7 @@
65762306a36Sopenharmony_ci             strscpy(obj->name, name, sizeof(obj->name));
65862306a36Sopenharmony_ci             obj->id = id;
65962306a36Sopenharmony_ci             obj->popularity = 0;
66062306a36Sopenharmony_ci    +        obj->refcnt = 1; /* The cache holds a reference */
66162306a36Sopenharmony_ci
66262306a36Sopenharmony_ci             spin_lock_irqsave(&cache_lock, flags);
66362306a36Sopenharmony_ci             __cache_add(obj);
66462306a36Sopenharmony_ci    @@ -79,18 +111,15 @@
66562306a36Sopenharmony_ci             spin_unlock_irqrestore(&cache_lock, flags);
66662306a36Sopenharmony_ci     }
66762306a36Sopenharmony_ci
66862306a36Sopenharmony_ci    -int cache_find(int id, char *name)
66962306a36Sopenharmony_ci    +struct object *cache_find(int id)
67062306a36Sopenharmony_ci     {
67162306a36Sopenharmony_ci             struct object *obj;
67262306a36Sopenharmony_ci    -        int ret = -ENOENT;
67362306a36Sopenharmony_ci             unsigned long flags;
67462306a36Sopenharmony_ci
67562306a36Sopenharmony_ci             spin_lock_irqsave(&cache_lock, flags);
67662306a36Sopenharmony_ci             obj = __cache_find(id);
67762306a36Sopenharmony_ci    -        if (obj) {
67862306a36Sopenharmony_ci    -                ret = 0;
67962306a36Sopenharmony_ci    -                strcpy(name, obj->name);
68062306a36Sopenharmony_ci    -        }
68162306a36Sopenharmony_ci    +        if (obj)
68262306a36Sopenharmony_ci    +                __object_get(obj);
68362306a36Sopenharmony_ci             spin_unlock_irqrestore(&cache_lock, flags);
68462306a36Sopenharmony_ci    -        return ret;
68562306a36Sopenharmony_ci    +        return obj;
68662306a36Sopenharmony_ci     }
68762306a36Sopenharmony_ci
68862306a36Sopenharmony_ciWe encapsulate the reference counting in the standard 'get' and 'put'
68962306a36Sopenharmony_cifunctions. Now we can return the object itself from
69062306a36Sopenharmony_cicache_find() which has the advantage that the user can
69162306a36Sopenharmony_cinow sleep holding the object (eg. to copy_to_user() to
69262306a36Sopenharmony_ciname to userspace).
69362306a36Sopenharmony_ci
69462306a36Sopenharmony_ciThe other point to note is that I said a reference should be held for
69562306a36Sopenharmony_cievery pointer to the object: thus the reference count is 1 when first
69662306a36Sopenharmony_ciinserted into the cache. In some versions the framework does not hold a
69762306a36Sopenharmony_cireference count, but they are more complicated.
69862306a36Sopenharmony_ci
69962306a36Sopenharmony_ciUsing Atomic Operations For The Reference Count
70062306a36Sopenharmony_ci~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
70162306a36Sopenharmony_ci
70262306a36Sopenharmony_ciIn practice, :c:type:`atomic_t` would usually be used for refcnt. There are a
70362306a36Sopenharmony_cinumber of atomic operations defined in ``include/asm/atomic.h``: these
70462306a36Sopenharmony_ciare guaranteed to be seen atomically from all CPUs in the system, so no
70562306a36Sopenharmony_cilock is required. In this case, it is simpler than using spinlocks,
70662306a36Sopenharmony_cialthough for anything non-trivial using spinlocks is clearer. The
70762306a36Sopenharmony_ciatomic_inc() and atomic_dec_and_test()
70862306a36Sopenharmony_ciare used instead of the standard increment and decrement operators, and
70962306a36Sopenharmony_cithe lock is no longer used to protect the reference count itself.
71062306a36Sopenharmony_ci
71162306a36Sopenharmony_ci::
71262306a36Sopenharmony_ci
71362306a36Sopenharmony_ci    --- cache.c.refcnt  2003-12-09 15:00:35.000000000 +1100
71462306a36Sopenharmony_ci    +++ cache.c.refcnt-atomic   2003-12-11 15:49:42.000000000 +1100
71562306a36Sopenharmony_ci    @@ -7,7 +7,7 @@
71662306a36Sopenharmony_ci     struct object
71762306a36Sopenharmony_ci     {
71862306a36Sopenharmony_ci             struct list_head list;
71962306a36Sopenharmony_ci    -        unsigned int refcnt;
72062306a36Sopenharmony_ci    +        atomic_t refcnt;
72162306a36Sopenharmony_ci             int id;
72262306a36Sopenharmony_ci             char name[32];
72362306a36Sopenharmony_ci             int popularity;
72462306a36Sopenharmony_ci    @@ -18,33 +18,15 @@
72562306a36Sopenharmony_ci     static unsigned int cache_num = 0;
72662306a36Sopenharmony_ci     #define MAX_CACHE_SIZE 10
72762306a36Sopenharmony_ci
72862306a36Sopenharmony_ci    -static void __object_put(struct object *obj)
72962306a36Sopenharmony_ci    -{
73062306a36Sopenharmony_ci    -        if (--obj->refcnt == 0)
73162306a36Sopenharmony_ci    -                kfree(obj);
73262306a36Sopenharmony_ci    -}
73362306a36Sopenharmony_ci    -
73462306a36Sopenharmony_ci    -static void __object_get(struct object *obj)
73562306a36Sopenharmony_ci    -{
73662306a36Sopenharmony_ci    -        obj->refcnt++;
73762306a36Sopenharmony_ci    -}
73862306a36Sopenharmony_ci    -
73962306a36Sopenharmony_ci     void object_put(struct object *obj)
74062306a36Sopenharmony_ci     {
74162306a36Sopenharmony_ci    -        unsigned long flags;
74262306a36Sopenharmony_ci    -
74362306a36Sopenharmony_ci    -        spin_lock_irqsave(&cache_lock, flags);
74462306a36Sopenharmony_ci    -        __object_put(obj);
74562306a36Sopenharmony_ci    -        spin_unlock_irqrestore(&cache_lock, flags);
74662306a36Sopenharmony_ci    +        if (atomic_dec_and_test(&obj->refcnt))
74762306a36Sopenharmony_ci    +                kfree(obj);
74862306a36Sopenharmony_ci     }
74962306a36Sopenharmony_ci
75062306a36Sopenharmony_ci     void object_get(struct object *obj)
75162306a36Sopenharmony_ci     {
75262306a36Sopenharmony_ci    -        unsigned long flags;
75362306a36Sopenharmony_ci    -
75462306a36Sopenharmony_ci    -        spin_lock_irqsave(&cache_lock, flags);
75562306a36Sopenharmony_ci    -        __object_get(obj);
75662306a36Sopenharmony_ci    -        spin_unlock_irqrestore(&cache_lock, flags);
75762306a36Sopenharmony_ci    +        atomic_inc(&obj->refcnt);
75862306a36Sopenharmony_ci     }
75962306a36Sopenharmony_ci
76062306a36Sopenharmony_ci     /* Must be holding cache_lock */
76162306a36Sopenharmony_ci    @@ -65,7 +47,7 @@
76262306a36Sopenharmony_ci     {
76362306a36Sopenharmony_ci             BUG_ON(!obj);
76462306a36Sopenharmony_ci             list_del(&obj->list);
76562306a36Sopenharmony_ci    -        __object_put(obj);
76662306a36Sopenharmony_ci    +        object_put(obj);
76762306a36Sopenharmony_ci             cache_num--;
76862306a36Sopenharmony_ci     }
76962306a36Sopenharmony_ci
77062306a36Sopenharmony_ci    @@ -94,7 +76,7 @@
77162306a36Sopenharmony_ci             strscpy(obj->name, name, sizeof(obj->name));
77262306a36Sopenharmony_ci             obj->id = id;
77362306a36Sopenharmony_ci             obj->popularity = 0;
77462306a36Sopenharmony_ci    -        obj->refcnt = 1; /* The cache holds a reference */
77562306a36Sopenharmony_ci    +        atomic_set(&obj->refcnt, 1); /* The cache holds a reference */
77662306a36Sopenharmony_ci
77762306a36Sopenharmony_ci             spin_lock_irqsave(&cache_lock, flags);
77862306a36Sopenharmony_ci             __cache_add(obj);
77962306a36Sopenharmony_ci    @@ -119,7 +101,7 @@
78062306a36Sopenharmony_ci             spin_lock_irqsave(&cache_lock, flags);
78162306a36Sopenharmony_ci             obj = __cache_find(id);
78262306a36Sopenharmony_ci             if (obj)
78362306a36Sopenharmony_ci    -                __object_get(obj);
78462306a36Sopenharmony_ci    +                object_get(obj);
78562306a36Sopenharmony_ci             spin_unlock_irqrestore(&cache_lock, flags);
78662306a36Sopenharmony_ci             return obj;
78762306a36Sopenharmony_ci     }
78862306a36Sopenharmony_ci
78962306a36Sopenharmony_ciProtecting The Objects Themselves
79062306a36Sopenharmony_ci---------------------------------
79162306a36Sopenharmony_ci
79262306a36Sopenharmony_ciIn these examples, we assumed that the objects (except the reference
79362306a36Sopenharmony_cicounts) never changed once they are created. If we wanted to allow the
79462306a36Sopenharmony_ciname to change, there are three possibilities:
79562306a36Sopenharmony_ci
79662306a36Sopenharmony_ci-  You can make ``cache_lock`` non-static, and tell people to grab that
79762306a36Sopenharmony_ci   lock before changing the name in any object.
79862306a36Sopenharmony_ci
79962306a36Sopenharmony_ci-  You can provide a cache_obj_rename() which grabs this
80062306a36Sopenharmony_ci   lock and changes the name for the caller, and tell everyone to use
80162306a36Sopenharmony_ci   that function.
80262306a36Sopenharmony_ci
80362306a36Sopenharmony_ci-  You can make the ``cache_lock`` protect only the cache itself, and
80462306a36Sopenharmony_ci   use another lock to protect the name.
80562306a36Sopenharmony_ci
80662306a36Sopenharmony_ciTheoretically, you can make the locks as fine-grained as one lock for
80762306a36Sopenharmony_cievery field, for every object. In practice, the most common variants
80862306a36Sopenharmony_ciare:
80962306a36Sopenharmony_ci
81062306a36Sopenharmony_ci-  One lock which protects the infrastructure (the ``cache`` list in
81162306a36Sopenharmony_ci   this example) and all the objects. This is what we have done so far.
81262306a36Sopenharmony_ci
81362306a36Sopenharmony_ci-  One lock which protects the infrastructure (including the list
81462306a36Sopenharmony_ci   pointers inside the objects), and one lock inside the object which
81562306a36Sopenharmony_ci   protects the rest of that object.
81662306a36Sopenharmony_ci
81762306a36Sopenharmony_ci-  Multiple locks to protect the infrastructure (eg. one lock per hash
81862306a36Sopenharmony_ci   chain), possibly with a separate per-object lock.
81962306a36Sopenharmony_ci
82062306a36Sopenharmony_ciHere is the "lock-per-object" implementation:
82162306a36Sopenharmony_ci
82262306a36Sopenharmony_ci::
82362306a36Sopenharmony_ci
82462306a36Sopenharmony_ci    --- cache.c.refcnt-atomic   2003-12-11 15:50:54.000000000 +1100
82562306a36Sopenharmony_ci    +++ cache.c.perobjectlock   2003-12-11 17:15:03.000000000 +1100
82662306a36Sopenharmony_ci    @@ -6,11 +6,17 @@
82762306a36Sopenharmony_ci
82862306a36Sopenharmony_ci     struct object
82962306a36Sopenharmony_ci     {
83062306a36Sopenharmony_ci    +        /* These two protected by cache_lock. */
83162306a36Sopenharmony_ci             struct list_head list;
83262306a36Sopenharmony_ci    +        int popularity;
83362306a36Sopenharmony_ci    +
83462306a36Sopenharmony_ci             atomic_t refcnt;
83562306a36Sopenharmony_ci    +
83662306a36Sopenharmony_ci    +        /* Doesn't change once created. */
83762306a36Sopenharmony_ci             int id;
83862306a36Sopenharmony_ci    +
83962306a36Sopenharmony_ci    +        spinlock_t lock; /* Protects the name */
84062306a36Sopenharmony_ci             char name[32];
84162306a36Sopenharmony_ci    -        int popularity;
84262306a36Sopenharmony_ci     };
84362306a36Sopenharmony_ci
84462306a36Sopenharmony_ci     static DEFINE_SPINLOCK(cache_lock);
84562306a36Sopenharmony_ci    @@ -77,6 +84,7 @@
84662306a36Sopenharmony_ci             obj->id = id;
84762306a36Sopenharmony_ci             obj->popularity = 0;
84862306a36Sopenharmony_ci             atomic_set(&obj->refcnt, 1); /* The cache holds a reference */
84962306a36Sopenharmony_ci    +        spin_lock_init(&obj->lock);
85062306a36Sopenharmony_ci
85162306a36Sopenharmony_ci             spin_lock_irqsave(&cache_lock, flags);
85262306a36Sopenharmony_ci             __cache_add(obj);
85362306a36Sopenharmony_ci
85462306a36Sopenharmony_ciNote that I decide that the popularity count should be protected by the
85562306a36Sopenharmony_ci``cache_lock`` rather than the per-object lock: this is because it (like
85662306a36Sopenharmony_cithe :c:type:`struct list_head <list_head>` inside the object)
85762306a36Sopenharmony_ciis logically part of the infrastructure. This way, I don't need to grab
85862306a36Sopenharmony_cithe lock of every object in __cache_add() when seeking
85962306a36Sopenharmony_cithe least popular.
86062306a36Sopenharmony_ci
86162306a36Sopenharmony_ciI also decided that the id member is unchangeable, so I don't need to
86262306a36Sopenharmony_cigrab each object lock in __cache_find() to examine the
86362306a36Sopenharmony_ciid: the object lock is only used by a caller who wants to read or write
86462306a36Sopenharmony_cithe name field.
86562306a36Sopenharmony_ci
86662306a36Sopenharmony_ciNote also that I added a comment describing what data was protected by
86762306a36Sopenharmony_ciwhich locks. This is extremely important, as it describes the runtime
86862306a36Sopenharmony_cibehavior of the code, and can be hard to gain from just reading. And as
86962306a36Sopenharmony_ciAlan Cox says, “Lock data, not code”.
87062306a36Sopenharmony_ci
87162306a36Sopenharmony_ciCommon Problems
87262306a36Sopenharmony_ci===============
87362306a36Sopenharmony_ci
87462306a36Sopenharmony_ciDeadlock: Simple and Advanced
87562306a36Sopenharmony_ci-----------------------------
87662306a36Sopenharmony_ci
87762306a36Sopenharmony_ciThere is a coding bug where a piece of code tries to grab a spinlock
87862306a36Sopenharmony_citwice: it will spin forever, waiting for the lock to be released
87962306a36Sopenharmony_ci(spinlocks, rwlocks and mutexes are not recursive in Linux). This is
88062306a36Sopenharmony_citrivial to diagnose: not a
88162306a36Sopenharmony_cistay-up-five-nights-talk-to-fluffy-code-bunnies kind of problem.
88262306a36Sopenharmony_ci
88362306a36Sopenharmony_ciFor a slightly more complex case, imagine you have a region shared by a
88462306a36Sopenharmony_cisoftirq and user context. If you use a spin_lock() call
88562306a36Sopenharmony_cito protect it, it is possible that the user context will be interrupted
88662306a36Sopenharmony_ciby the softirq while it holds the lock, and the softirq will then spin
88762306a36Sopenharmony_ciforever trying to get the same lock.
88862306a36Sopenharmony_ci
88962306a36Sopenharmony_ciBoth of these are called deadlock, and as shown above, it can occur even
89062306a36Sopenharmony_ciwith a single CPU (although not on UP compiles, since spinlocks vanish
89162306a36Sopenharmony_cion kernel compiles with ``CONFIG_SMP``\ =n. You'll still get data
89262306a36Sopenharmony_cicorruption in the second example).
89362306a36Sopenharmony_ci
89462306a36Sopenharmony_ciThis complete lockup is easy to diagnose: on SMP boxes the watchdog
89562306a36Sopenharmony_citimer or compiling with ``DEBUG_SPINLOCK`` set
89662306a36Sopenharmony_ci(``include/linux/spinlock.h``) will show this up immediately when it
89762306a36Sopenharmony_cihappens.
89862306a36Sopenharmony_ci
89962306a36Sopenharmony_ciA more complex problem is the so-called 'deadly embrace', involving two
90062306a36Sopenharmony_cior more locks. Say you have a hash table: each entry in the table is a
90162306a36Sopenharmony_cispinlock, and a chain of hashed objects. Inside a softirq handler, you
90262306a36Sopenharmony_cisometimes want to alter an object from one place in the hash to another:
90362306a36Sopenharmony_ciyou grab the spinlock of the old hash chain and the spinlock of the new
90462306a36Sopenharmony_cihash chain, and delete the object from the old one, and insert it in the
90562306a36Sopenharmony_cinew one.
90662306a36Sopenharmony_ci
90762306a36Sopenharmony_ciThere are two problems here. First, if your code ever tries to move the
90862306a36Sopenharmony_ciobject to the same chain, it will deadlock with itself as it tries to
90962306a36Sopenharmony_cilock it twice. Secondly, if the same softirq on another CPU is trying to
91062306a36Sopenharmony_cimove another object in the reverse direction, the following could
91162306a36Sopenharmony_cihappen:
91262306a36Sopenharmony_ci
91362306a36Sopenharmony_ci+-----------------------+-----------------------+
91462306a36Sopenharmony_ci| CPU 1                 | CPU 2                 |
91562306a36Sopenharmony_ci+=======================+=======================+
91662306a36Sopenharmony_ci| Grab lock A -> OK     | Grab lock B -> OK     |
91762306a36Sopenharmony_ci+-----------------------+-----------------------+
91862306a36Sopenharmony_ci| Grab lock B -> spin   | Grab lock A -> spin   |
91962306a36Sopenharmony_ci+-----------------------+-----------------------+
92062306a36Sopenharmony_ci
92162306a36Sopenharmony_ciTable: Consequences
92262306a36Sopenharmony_ci
92362306a36Sopenharmony_ciThe two CPUs will spin forever, waiting for the other to give up their
92462306a36Sopenharmony_cilock. It will look, smell, and feel like a crash.
92562306a36Sopenharmony_ci
92662306a36Sopenharmony_ciPreventing Deadlock
92762306a36Sopenharmony_ci-------------------
92862306a36Sopenharmony_ci
92962306a36Sopenharmony_ciTextbooks will tell you that if you always lock in the same order, you
93062306a36Sopenharmony_ciwill never get this kind of deadlock. Practice will tell you that this
93162306a36Sopenharmony_ciapproach doesn't scale: when I create a new lock, I don't understand
93262306a36Sopenharmony_cienough of the kernel to figure out where in the 5000 lock hierarchy it
93362306a36Sopenharmony_ciwill fit.
93462306a36Sopenharmony_ci
93562306a36Sopenharmony_ciThe best locks are encapsulated: they never get exposed in headers, and
93662306a36Sopenharmony_ciare never held around calls to non-trivial functions outside the same
93762306a36Sopenharmony_cifile. You can read through this code and see that it will never
93862306a36Sopenharmony_cideadlock, because it never tries to grab another lock while it has that
93962306a36Sopenharmony_cione. People using your code don't even need to know you are using a
94062306a36Sopenharmony_cilock.
94162306a36Sopenharmony_ci
94262306a36Sopenharmony_ciA classic problem here is when you provide callbacks or hooks: if you
94362306a36Sopenharmony_cicall these with the lock held, you risk simple deadlock, or a deadly
94462306a36Sopenharmony_ciembrace (who knows what the callback will do?).
94562306a36Sopenharmony_ci
94662306a36Sopenharmony_ciOverzealous Prevention Of Deadlocks
94762306a36Sopenharmony_ci~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
94862306a36Sopenharmony_ci
94962306a36Sopenharmony_ciDeadlocks are problematic, but not as bad as data corruption. Code which
95062306a36Sopenharmony_cigrabs a read lock, searches a list, fails to find what it wants, drops
95162306a36Sopenharmony_cithe read lock, grabs a write lock and inserts the object has a race
95262306a36Sopenharmony_cicondition.
95362306a36Sopenharmony_ci
95462306a36Sopenharmony_ciRacing Timers: A Kernel Pastime
95562306a36Sopenharmony_ci-------------------------------
95662306a36Sopenharmony_ci
95762306a36Sopenharmony_ciTimers can produce their own special problems with races. Consider a
95862306a36Sopenharmony_cicollection of objects (list, hash, etc) where each object has a timer
95962306a36Sopenharmony_ciwhich is due to destroy it.
96062306a36Sopenharmony_ci
96162306a36Sopenharmony_ciIf you want to destroy the entire collection (say on module removal),
96262306a36Sopenharmony_ciyou might do the following::
96362306a36Sopenharmony_ci
96462306a36Sopenharmony_ci            /* THIS CODE BAD BAD BAD BAD: IF IT WAS ANY WORSE IT WOULD USE
96562306a36Sopenharmony_ci               HUNGARIAN NOTATION */
96662306a36Sopenharmony_ci            spin_lock_bh(&list_lock);
96762306a36Sopenharmony_ci
96862306a36Sopenharmony_ci            while (list) {
96962306a36Sopenharmony_ci                    struct foo *next = list->next;
97062306a36Sopenharmony_ci                    timer_delete(&list->timer);
97162306a36Sopenharmony_ci                    kfree(list);
97262306a36Sopenharmony_ci                    list = next;
97362306a36Sopenharmony_ci            }
97462306a36Sopenharmony_ci
97562306a36Sopenharmony_ci            spin_unlock_bh(&list_lock);
97662306a36Sopenharmony_ci
97762306a36Sopenharmony_ci
97862306a36Sopenharmony_ciSooner or later, this will crash on SMP, because a timer can have just
97962306a36Sopenharmony_cigone off before the spin_lock_bh(), and it will only get
98062306a36Sopenharmony_cithe lock after we spin_unlock_bh(), and then try to free
98162306a36Sopenharmony_cithe element (which has already been freed!).
98262306a36Sopenharmony_ci
98362306a36Sopenharmony_ciThis can be avoided by checking the result of
98462306a36Sopenharmony_citimer_delete(): if it returns 1, the timer has been deleted.
98562306a36Sopenharmony_ciIf 0, it means (in this case) that it is currently running, so we can
98662306a36Sopenharmony_cido::
98762306a36Sopenharmony_ci
98862306a36Sopenharmony_ci            retry:
98962306a36Sopenharmony_ci                    spin_lock_bh(&list_lock);
99062306a36Sopenharmony_ci
99162306a36Sopenharmony_ci                    while (list) {
99262306a36Sopenharmony_ci                            struct foo *next = list->next;
99362306a36Sopenharmony_ci                            if (!timer_delete(&list->timer)) {
99462306a36Sopenharmony_ci                                    /* Give timer a chance to delete this */
99562306a36Sopenharmony_ci                                    spin_unlock_bh(&list_lock);
99662306a36Sopenharmony_ci                                    goto retry;
99762306a36Sopenharmony_ci                            }
99862306a36Sopenharmony_ci                            kfree(list);
99962306a36Sopenharmony_ci                            list = next;
100062306a36Sopenharmony_ci                    }
100162306a36Sopenharmony_ci
100262306a36Sopenharmony_ci                    spin_unlock_bh(&list_lock);
100362306a36Sopenharmony_ci
100462306a36Sopenharmony_ci
100562306a36Sopenharmony_ciAnother common problem is deleting timers which restart themselves (by
100662306a36Sopenharmony_cicalling add_timer() at the end of their timer function).
100762306a36Sopenharmony_ciBecause this is a fairly common case which is prone to races, you should
100862306a36Sopenharmony_ciuse timer_delete_sync() (``include/linux/timer.h``) to handle this case.
100962306a36Sopenharmony_ci
101062306a36Sopenharmony_ciBefore freeing a timer, timer_shutdown() or timer_shutdown_sync() should be
101162306a36Sopenharmony_cicalled which will keep it from being rearmed. Any subsequent attempt to
101262306a36Sopenharmony_cirearm the timer will be silently ignored by the core code.
101362306a36Sopenharmony_ci
101462306a36Sopenharmony_ci
101562306a36Sopenharmony_ciLocking Speed
101662306a36Sopenharmony_ci=============
101762306a36Sopenharmony_ci
101862306a36Sopenharmony_ciThere are three main things to worry about when considering speed of
101962306a36Sopenharmony_cisome code which does locking. First is concurrency: how many things are
102062306a36Sopenharmony_cigoing to be waiting while someone else is holding a lock. Second is the
102162306a36Sopenharmony_citime taken to actually acquire and release an uncontended lock. Third is
102262306a36Sopenharmony_ciusing fewer, or smarter locks. I'm assuming that the lock is used fairly
102362306a36Sopenharmony_cioften: otherwise, you wouldn't be concerned about efficiency.
102462306a36Sopenharmony_ci
102562306a36Sopenharmony_ciConcurrency depends on how long the lock is usually held: you should
102662306a36Sopenharmony_cihold the lock for as long as needed, but no longer. In the cache
102762306a36Sopenharmony_ciexample, we always create the object without the lock held, and then
102862306a36Sopenharmony_cigrab the lock only when we are ready to insert it in the list.
102962306a36Sopenharmony_ci
103062306a36Sopenharmony_ciAcquisition times depend on how much damage the lock operations do to
103162306a36Sopenharmony_cithe pipeline (pipeline stalls) and how likely it is that this CPU was
103262306a36Sopenharmony_cithe last one to grab the lock (ie. is the lock cache-hot for this CPU):
103362306a36Sopenharmony_cion a machine with more CPUs, this likelihood drops fast. Consider a
103462306a36Sopenharmony_ci700MHz Intel Pentium III: an instruction takes about 0.7ns, an atomic
103562306a36Sopenharmony_ciincrement takes about 58ns, a lock which is cache-hot on this CPU takes
103662306a36Sopenharmony_ci160ns, and a cacheline transfer from another CPU takes an additional 170
103762306a36Sopenharmony_cito 360ns. (These figures from Paul McKenney's `Linux Journal RCU
103862306a36Sopenharmony_ciarticle <http://www.linuxjournal.com/article.php?sid=6993>`__).
103962306a36Sopenharmony_ci
104062306a36Sopenharmony_ciThese two aims conflict: holding a lock for a short time might be done
104162306a36Sopenharmony_ciby splitting locks into parts (such as in our final per-object-lock
104262306a36Sopenharmony_ciexample), but this increases the number of lock acquisitions, and the
104362306a36Sopenharmony_ciresults are often slower than having a single lock. This is another
104462306a36Sopenharmony_cireason to advocate locking simplicity.
104562306a36Sopenharmony_ci
104662306a36Sopenharmony_ciThe third concern is addressed below: there are some methods to reduce
104762306a36Sopenharmony_cithe amount of locking which needs to be done.
104862306a36Sopenharmony_ci
104962306a36Sopenharmony_ciRead/Write Lock Variants
105062306a36Sopenharmony_ci------------------------
105162306a36Sopenharmony_ci
105262306a36Sopenharmony_ciBoth spinlocks and mutexes have read/write variants: ``rwlock_t`` and
105362306a36Sopenharmony_ci:c:type:`struct rw_semaphore <rw_semaphore>`. These divide
105462306a36Sopenharmony_ciusers into two classes: the readers and the writers. If you are only
105562306a36Sopenharmony_cireading the data, you can get a read lock, but to write to the data you
105662306a36Sopenharmony_cineed the write lock. Many people can hold a read lock, but a writer must
105762306a36Sopenharmony_cibe sole holder.
105862306a36Sopenharmony_ci
105962306a36Sopenharmony_ciIf your code divides neatly along reader/writer lines (as our cache code
106062306a36Sopenharmony_cidoes), and the lock is held by readers for significant lengths of time,
106162306a36Sopenharmony_ciusing these locks can help. They are slightly slower than the normal
106262306a36Sopenharmony_cilocks though, so in practice ``rwlock_t`` is not usually worthwhile.
106362306a36Sopenharmony_ci
106462306a36Sopenharmony_ciAvoiding Locks: Read Copy Update
106562306a36Sopenharmony_ci--------------------------------
106662306a36Sopenharmony_ci
106762306a36Sopenharmony_ciThere is a special method of read/write locking called Read Copy Update.
106862306a36Sopenharmony_ciUsing RCU, the readers can avoid taking a lock altogether: as we expect
106962306a36Sopenharmony_ciour cache to be read more often than updated (otherwise the cache is a
107062306a36Sopenharmony_ciwaste of time), it is a candidate for this optimization.
107162306a36Sopenharmony_ci
107262306a36Sopenharmony_ciHow do we get rid of read locks? Getting rid of read locks means that
107362306a36Sopenharmony_ciwriters may be changing the list underneath the readers. That is
107462306a36Sopenharmony_ciactually quite simple: we can read a linked list while an element is
107562306a36Sopenharmony_cibeing added if the writer adds the element very carefully. For example,
107662306a36Sopenharmony_ciadding ``new`` to a single linked list called ``list``::
107762306a36Sopenharmony_ci
107862306a36Sopenharmony_ci            new->next = list->next;
107962306a36Sopenharmony_ci            wmb();
108062306a36Sopenharmony_ci            list->next = new;
108162306a36Sopenharmony_ci
108262306a36Sopenharmony_ci
108362306a36Sopenharmony_ciThe wmb() is a write memory barrier. It ensures that the
108462306a36Sopenharmony_cifirst operation (setting the new element's ``next`` pointer) is complete
108562306a36Sopenharmony_ciand will be seen by all CPUs, before the second operation is (putting
108662306a36Sopenharmony_cithe new element into the list). This is important, since modern
108762306a36Sopenharmony_cicompilers and modern CPUs can both reorder instructions unless told
108862306a36Sopenharmony_ciotherwise: we want a reader to either not see the new element at all, or
108962306a36Sopenharmony_cisee the new element with the ``next`` pointer correctly pointing at the
109062306a36Sopenharmony_cirest of the list.
109162306a36Sopenharmony_ci
109262306a36Sopenharmony_ciFortunately, there is a function to do this for standard
109362306a36Sopenharmony_ci:c:type:`struct list_head <list_head>` lists:
109462306a36Sopenharmony_cilist_add_rcu() (``include/linux/list.h``).
109562306a36Sopenharmony_ci
109662306a36Sopenharmony_ciRemoving an element from the list is even simpler: we replace the
109762306a36Sopenharmony_cipointer to the old element with a pointer to its successor, and readers
109862306a36Sopenharmony_ciwill either see it, or skip over it.
109962306a36Sopenharmony_ci
110062306a36Sopenharmony_ci::
110162306a36Sopenharmony_ci
110262306a36Sopenharmony_ci            list->next = old->next;
110362306a36Sopenharmony_ci
110462306a36Sopenharmony_ci
110562306a36Sopenharmony_ciThere is list_del_rcu() (``include/linux/list.h``) which
110662306a36Sopenharmony_cidoes this (the normal version poisons the old object, which we don't
110762306a36Sopenharmony_ciwant).
110862306a36Sopenharmony_ci
110962306a36Sopenharmony_ciThe reader must also be careful: some CPUs can look through the ``next``
111062306a36Sopenharmony_cipointer to start reading the contents of the next element early, but
111162306a36Sopenharmony_cidon't realize that the pre-fetched contents is wrong when the ``next``
111262306a36Sopenharmony_cipointer changes underneath them. Once again, there is a
111362306a36Sopenharmony_cilist_for_each_entry_rcu() (``include/linux/list.h``)
111462306a36Sopenharmony_cito help you. Of course, writers can just use
111562306a36Sopenharmony_cilist_for_each_entry(), since there cannot be two
111662306a36Sopenharmony_cisimultaneous writers.
111762306a36Sopenharmony_ci
111862306a36Sopenharmony_ciOur final dilemma is this: when can we actually destroy the removed
111962306a36Sopenharmony_cielement? Remember, a reader might be stepping through this element in
112062306a36Sopenharmony_cithe list right now: if we free this element and the ``next`` pointer
112162306a36Sopenharmony_cichanges, the reader will jump off into garbage and crash. We need to
112262306a36Sopenharmony_ciwait until we know that all the readers who were traversing the list
112362306a36Sopenharmony_ciwhen we deleted the element are finished. We use
112462306a36Sopenharmony_cicall_rcu() to register a callback which will actually
112562306a36Sopenharmony_cidestroy the object once all pre-existing readers are finished.
112662306a36Sopenharmony_ciAlternatively, synchronize_rcu() may be used to block
112762306a36Sopenharmony_ciuntil all pre-existing are finished.
112862306a36Sopenharmony_ci
112962306a36Sopenharmony_ciBut how does Read Copy Update know when the readers are finished? The
113062306a36Sopenharmony_cimethod is this: firstly, the readers always traverse the list inside
113162306a36Sopenharmony_circu_read_lock()/rcu_read_unlock() pairs:
113262306a36Sopenharmony_cithese simply disable preemption so the reader won't go to sleep while
113362306a36Sopenharmony_cireading the list.
113462306a36Sopenharmony_ci
113562306a36Sopenharmony_ciRCU then waits until every other CPU has slept at least once: since
113662306a36Sopenharmony_cireaders cannot sleep, we know that any readers which were traversing the
113762306a36Sopenharmony_cilist during the deletion are finished, and the callback is triggered.
113862306a36Sopenharmony_ciThe real Read Copy Update code is a little more optimized than this, but
113962306a36Sopenharmony_cithis is the fundamental idea.
114062306a36Sopenharmony_ci
114162306a36Sopenharmony_ci::
114262306a36Sopenharmony_ci
114362306a36Sopenharmony_ci    --- cache.c.perobjectlock   2003-12-11 17:15:03.000000000 +1100
114462306a36Sopenharmony_ci    +++ cache.c.rcupdate    2003-12-11 17:55:14.000000000 +1100
114562306a36Sopenharmony_ci    @@ -1,15 +1,18 @@
114662306a36Sopenharmony_ci     #include <linux/list.h>
114762306a36Sopenharmony_ci     #include <linux/slab.h>
114862306a36Sopenharmony_ci     #include <linux/string.h>
114962306a36Sopenharmony_ci    +#include <linux/rcupdate.h>
115062306a36Sopenharmony_ci     #include <linux/mutex.h>
115162306a36Sopenharmony_ci     #include <asm/errno.h>
115262306a36Sopenharmony_ci
115362306a36Sopenharmony_ci     struct object
115462306a36Sopenharmony_ci     {
115562306a36Sopenharmony_ci    -        /* These two protected by cache_lock. */
115662306a36Sopenharmony_ci    +        /* This is protected by RCU */
115762306a36Sopenharmony_ci             struct list_head list;
115862306a36Sopenharmony_ci             int popularity;
115962306a36Sopenharmony_ci
116062306a36Sopenharmony_ci    +        struct rcu_head rcu;
116162306a36Sopenharmony_ci    +
116262306a36Sopenharmony_ci             atomic_t refcnt;
116362306a36Sopenharmony_ci
116462306a36Sopenharmony_ci             /* Doesn't change once created. */
116562306a36Sopenharmony_ci    @@ -40,7 +43,7 @@
116662306a36Sopenharmony_ci     {
116762306a36Sopenharmony_ci             struct object *i;
116862306a36Sopenharmony_ci
116962306a36Sopenharmony_ci    -        list_for_each_entry(i, &cache, list) {
117062306a36Sopenharmony_ci    +        list_for_each_entry_rcu(i, &cache, list) {
117162306a36Sopenharmony_ci                     if (i->id == id) {
117262306a36Sopenharmony_ci                             i->popularity++;
117362306a36Sopenharmony_ci                             return i;
117462306a36Sopenharmony_ci    @@ -49,19 +52,25 @@
117562306a36Sopenharmony_ci             return NULL;
117662306a36Sopenharmony_ci     }
117762306a36Sopenharmony_ci
117862306a36Sopenharmony_ci    +/* Final discard done once we know no readers are looking. */
117962306a36Sopenharmony_ci    +static void cache_delete_rcu(void *arg)
118062306a36Sopenharmony_ci    +{
118162306a36Sopenharmony_ci    +        object_put(arg);
118262306a36Sopenharmony_ci    +}
118362306a36Sopenharmony_ci    +
118462306a36Sopenharmony_ci     /* Must be holding cache_lock */
118562306a36Sopenharmony_ci     static void __cache_delete(struct object *obj)
118662306a36Sopenharmony_ci     {
118762306a36Sopenharmony_ci             BUG_ON(!obj);
118862306a36Sopenharmony_ci    -        list_del(&obj->list);
118962306a36Sopenharmony_ci    -        object_put(obj);
119062306a36Sopenharmony_ci    +        list_del_rcu(&obj->list);
119162306a36Sopenharmony_ci             cache_num--;
119262306a36Sopenharmony_ci    +        call_rcu(&obj->rcu, cache_delete_rcu);
119362306a36Sopenharmony_ci     }
119462306a36Sopenharmony_ci
119562306a36Sopenharmony_ci     /* Must be holding cache_lock */
119662306a36Sopenharmony_ci     static void __cache_add(struct object *obj)
119762306a36Sopenharmony_ci     {
119862306a36Sopenharmony_ci    -        list_add(&obj->list, &cache);
119962306a36Sopenharmony_ci    +        list_add_rcu(&obj->list, &cache);
120062306a36Sopenharmony_ci             if (++cache_num > MAX_CACHE_SIZE) {
120162306a36Sopenharmony_ci                     struct object *i, *outcast = NULL;
120262306a36Sopenharmony_ci                     list_for_each_entry(i, &cache, list) {
120362306a36Sopenharmony_ci    @@ -104,12 +114,11 @@
120462306a36Sopenharmony_ci     struct object *cache_find(int id)
120562306a36Sopenharmony_ci     {
120662306a36Sopenharmony_ci             struct object *obj;
120762306a36Sopenharmony_ci    -        unsigned long flags;
120862306a36Sopenharmony_ci
120962306a36Sopenharmony_ci    -        spin_lock_irqsave(&cache_lock, flags);
121062306a36Sopenharmony_ci    +        rcu_read_lock();
121162306a36Sopenharmony_ci             obj = __cache_find(id);
121262306a36Sopenharmony_ci             if (obj)
121362306a36Sopenharmony_ci                     object_get(obj);
121462306a36Sopenharmony_ci    -        spin_unlock_irqrestore(&cache_lock, flags);
121562306a36Sopenharmony_ci    +        rcu_read_unlock();
121662306a36Sopenharmony_ci             return obj;
121762306a36Sopenharmony_ci     }
121862306a36Sopenharmony_ci
121962306a36Sopenharmony_ciNote that the reader will alter the popularity member in
122062306a36Sopenharmony_ci__cache_find(), and now it doesn't hold a lock. One
122162306a36Sopenharmony_cisolution would be to make it an ``atomic_t``, but for this usage, we
122262306a36Sopenharmony_cidon't really care about races: an approximate result is good enough, so
122362306a36Sopenharmony_ciI didn't change it.
122462306a36Sopenharmony_ci
122562306a36Sopenharmony_ciThe result is that cache_find() requires no
122662306a36Sopenharmony_cisynchronization with any other functions, so is almost as fast on SMP as
122762306a36Sopenharmony_ciit would be on UP.
122862306a36Sopenharmony_ci
122962306a36Sopenharmony_ciThere is a further optimization possible here: remember our original
123062306a36Sopenharmony_cicache code, where there were no reference counts and the caller simply
123162306a36Sopenharmony_ciheld the lock whenever using the object? This is still possible: if you
123262306a36Sopenharmony_cihold the lock, no one can delete the object, so you don't need to get
123362306a36Sopenharmony_ciand put the reference count.
123462306a36Sopenharmony_ci
123562306a36Sopenharmony_ciNow, because the 'read lock' in RCU is simply disabling preemption, a
123662306a36Sopenharmony_cicaller which always has preemption disabled between calling
123762306a36Sopenharmony_cicache_find() and object_put() does not
123862306a36Sopenharmony_cineed to actually get and put the reference count: we could expose
123962306a36Sopenharmony_ci__cache_find() by making it non-static, and such
124062306a36Sopenharmony_cicallers could simply call that.
124162306a36Sopenharmony_ci
124262306a36Sopenharmony_ciThe benefit here is that the reference count is not written to: the
124362306a36Sopenharmony_ciobject is not altered in any way, which is much faster on SMP machines
124462306a36Sopenharmony_cidue to caching.
124562306a36Sopenharmony_ci
124662306a36Sopenharmony_ciPer-CPU Data
124762306a36Sopenharmony_ci------------
124862306a36Sopenharmony_ci
124962306a36Sopenharmony_ciAnother technique for avoiding locking which is used fairly widely is to
125062306a36Sopenharmony_ciduplicate information for each CPU. For example, if you wanted to keep a
125162306a36Sopenharmony_cicount of a common condition, you could use a spin lock and a single
125262306a36Sopenharmony_cicounter. Nice and simple.
125362306a36Sopenharmony_ci
125462306a36Sopenharmony_ciIf that was too slow (it's usually not, but if you've got a really big
125562306a36Sopenharmony_cimachine to test on and can show that it is), you could instead use a
125662306a36Sopenharmony_cicounter for each CPU, then none of them need an exclusive lock. See
125762306a36Sopenharmony_ciDEFINE_PER_CPU(), get_cpu_var() and
125862306a36Sopenharmony_ciput_cpu_var() (``include/linux/percpu.h``).
125962306a36Sopenharmony_ci
126062306a36Sopenharmony_ciOf particular use for simple per-cpu counters is the ``local_t`` type,
126162306a36Sopenharmony_ciand the cpu_local_inc() and related functions, which are
126262306a36Sopenharmony_cimore efficient than simple code on some architectures
126362306a36Sopenharmony_ci(``include/asm/local.h``).
126462306a36Sopenharmony_ci
126562306a36Sopenharmony_ciNote that there is no simple, reliable way of getting an exact value of
126662306a36Sopenharmony_cisuch a counter, without introducing more locks. This is not a problem
126762306a36Sopenharmony_cifor some uses.
126862306a36Sopenharmony_ci
126962306a36Sopenharmony_ciData Which Mostly Used By An IRQ Handler
127062306a36Sopenharmony_ci----------------------------------------
127162306a36Sopenharmony_ci
127262306a36Sopenharmony_ciIf data is always accessed from within the same IRQ handler, you don't
127362306a36Sopenharmony_cineed a lock at all: the kernel already guarantees that the irq handler
127462306a36Sopenharmony_ciwill not run simultaneously on multiple CPUs.
127562306a36Sopenharmony_ci
127662306a36Sopenharmony_ciManfred Spraul points out that you can still do this, even if the data
127762306a36Sopenharmony_ciis very occasionally accessed in user context or softirqs/tasklets. The
127862306a36Sopenharmony_ciirq handler doesn't use a lock, and all other accesses are done as so::
127962306a36Sopenharmony_ci
128062306a36Sopenharmony_ci        mutex_lock(&lock);
128162306a36Sopenharmony_ci        disable_irq(irq);
128262306a36Sopenharmony_ci        ...
128362306a36Sopenharmony_ci        enable_irq(irq);
128462306a36Sopenharmony_ci        mutex_unlock(&lock);
128562306a36Sopenharmony_ci
128662306a36Sopenharmony_ciThe disable_irq() prevents the irq handler from running
128762306a36Sopenharmony_ci(and waits for it to finish if it's currently running on other CPUs).
128862306a36Sopenharmony_ciThe spinlock prevents any other accesses happening at the same time.
128962306a36Sopenharmony_ciNaturally, this is slower than just a spin_lock_irq()
129062306a36Sopenharmony_cicall, so it only makes sense if this type of access happens extremely
129162306a36Sopenharmony_cirarely.
129262306a36Sopenharmony_ci
129362306a36Sopenharmony_ciWhat Functions Are Safe To Call From Interrupts?
129462306a36Sopenharmony_ci================================================
129562306a36Sopenharmony_ci
129662306a36Sopenharmony_ciMany functions in the kernel sleep (ie. call schedule()) directly or
129762306a36Sopenharmony_ciindirectly: you can never call them while holding a spinlock, or with
129862306a36Sopenharmony_cipreemption disabled. This also means you need to be in user context:
129962306a36Sopenharmony_cicalling them from an interrupt is illegal.
130062306a36Sopenharmony_ci
130162306a36Sopenharmony_ciSome Functions Which Sleep
130262306a36Sopenharmony_ci--------------------------
130362306a36Sopenharmony_ci
130462306a36Sopenharmony_ciThe most common ones are listed below, but you usually have to read the
130562306a36Sopenharmony_cicode to find out if other calls are safe. If everyone else who calls it
130662306a36Sopenharmony_cican sleep, you probably need to be able to sleep, too. In particular,
130762306a36Sopenharmony_ciregistration and deregistration functions usually expect to be called
130862306a36Sopenharmony_cifrom user context, and can sleep.
130962306a36Sopenharmony_ci
131062306a36Sopenharmony_ci-  Accesses to userspace:
131162306a36Sopenharmony_ci
131262306a36Sopenharmony_ci   -  copy_from_user()
131362306a36Sopenharmony_ci
131462306a36Sopenharmony_ci   -  copy_to_user()
131562306a36Sopenharmony_ci
131662306a36Sopenharmony_ci   -  get_user()
131762306a36Sopenharmony_ci
131862306a36Sopenharmony_ci   -  put_user()
131962306a36Sopenharmony_ci
132062306a36Sopenharmony_ci-  kmalloc(GP_KERNEL) <kmalloc>`
132162306a36Sopenharmony_ci
132262306a36Sopenharmony_ci-  mutex_lock_interruptible() and
132362306a36Sopenharmony_ci   mutex_lock()
132462306a36Sopenharmony_ci
132562306a36Sopenharmony_ci   There is a mutex_trylock() which does not sleep.
132662306a36Sopenharmony_ci   Still, it must not be used inside interrupt context since its
132762306a36Sopenharmony_ci   implementation is not safe for that. mutex_unlock()
132862306a36Sopenharmony_ci   will also never sleep. It cannot be used in interrupt context either
132962306a36Sopenharmony_ci   since a mutex must be released by the same task that acquired it.
133062306a36Sopenharmony_ci
133162306a36Sopenharmony_ciSome Functions Which Don't Sleep
133262306a36Sopenharmony_ci--------------------------------
133362306a36Sopenharmony_ci
133462306a36Sopenharmony_ciSome functions are safe to call from any context, or holding almost any
133562306a36Sopenharmony_cilock.
133662306a36Sopenharmony_ci
133762306a36Sopenharmony_ci-  printk()
133862306a36Sopenharmony_ci
133962306a36Sopenharmony_ci-  kfree()
134062306a36Sopenharmony_ci
134162306a36Sopenharmony_ci-  add_timer() and timer_delete()
134262306a36Sopenharmony_ci
134362306a36Sopenharmony_ciMutex API reference
134462306a36Sopenharmony_ci===================
134562306a36Sopenharmony_ci
134662306a36Sopenharmony_ci.. kernel-doc:: include/linux/mutex.h
134762306a36Sopenharmony_ci   :internal:
134862306a36Sopenharmony_ci
134962306a36Sopenharmony_ci.. kernel-doc:: kernel/locking/mutex.c
135062306a36Sopenharmony_ci   :export:
135162306a36Sopenharmony_ci
135262306a36Sopenharmony_ciFutex API reference
135362306a36Sopenharmony_ci===================
135462306a36Sopenharmony_ci
135562306a36Sopenharmony_ci.. kernel-doc:: kernel/futex/core.c
135662306a36Sopenharmony_ci   :internal:
135762306a36Sopenharmony_ci
135862306a36Sopenharmony_ci.. kernel-doc:: kernel/futex/futex.h
135962306a36Sopenharmony_ci   :internal:
136062306a36Sopenharmony_ci
136162306a36Sopenharmony_ci.. kernel-doc:: kernel/futex/pi.c
136262306a36Sopenharmony_ci   :internal:
136362306a36Sopenharmony_ci
136462306a36Sopenharmony_ci.. kernel-doc:: kernel/futex/requeue.c
136562306a36Sopenharmony_ci   :internal:
136662306a36Sopenharmony_ci
136762306a36Sopenharmony_ci.. kernel-doc:: kernel/futex/waitwake.c
136862306a36Sopenharmony_ci   :internal:
136962306a36Sopenharmony_ci
137062306a36Sopenharmony_ciFurther reading
137162306a36Sopenharmony_ci===============
137262306a36Sopenharmony_ci
137362306a36Sopenharmony_ci-  ``Documentation/locking/spinlocks.rst``: Linus Torvalds' spinlocking
137462306a36Sopenharmony_ci   tutorial in the kernel sources.
137562306a36Sopenharmony_ci
137662306a36Sopenharmony_ci-  Unix Systems for Modern Architectures: Symmetric Multiprocessing and
137762306a36Sopenharmony_ci   Caching for Kernel Programmers:
137862306a36Sopenharmony_ci
137962306a36Sopenharmony_ci   Curt Schimmel's very good introduction to kernel level locking (not
138062306a36Sopenharmony_ci   written for Linux, but nearly everything applies). The book is
138162306a36Sopenharmony_ci   expensive, but really worth every penny to understand SMP locking.
138262306a36Sopenharmony_ci   [ISBN: 0201633388]
138362306a36Sopenharmony_ci
138462306a36Sopenharmony_ciThanks
138562306a36Sopenharmony_ci======
138662306a36Sopenharmony_ci
138762306a36Sopenharmony_ciThanks to Telsa Gwynne for DocBooking, neatening and adding style.
138862306a36Sopenharmony_ci
138962306a36Sopenharmony_ciThanks to Martin Pool, Philipp Rumpf, Stephen Rothwell, Paul Mackerras,
139062306a36Sopenharmony_ciRuedi Aschwanden, Alan Cox, Manfred Spraul, Tim Waugh, Pete Zaitcev,
139162306a36Sopenharmony_ciJames Morris, Robert Love, Paul McKenney, John Ashby for proofreading,
139262306a36Sopenharmony_cicorrecting, flaming, commenting.
139362306a36Sopenharmony_ci
139462306a36Sopenharmony_ciThanks to the cabal for having no influence on this document.
139562306a36Sopenharmony_ci
139662306a36Sopenharmony_ciGlossary
139762306a36Sopenharmony_ci========
139862306a36Sopenharmony_ci
139962306a36Sopenharmony_cipreemption
140062306a36Sopenharmony_ci  Prior to 2.5, or when ``CONFIG_PREEMPT`` is unset, processes in user
140162306a36Sopenharmony_ci  context inside the kernel would not preempt each other (ie. you had that
140262306a36Sopenharmony_ci  CPU until you gave it up, except for interrupts). With the addition of
140362306a36Sopenharmony_ci  ``CONFIG_PREEMPT`` in 2.5.4, this changed: when in user context, higher
140462306a36Sopenharmony_ci  priority tasks can "cut in": spinlocks were changed to disable
140562306a36Sopenharmony_ci  preemption, even on UP.
140662306a36Sopenharmony_ci
140762306a36Sopenharmony_cibh
140862306a36Sopenharmony_ci  Bottom Half: for historical reasons, functions with '_bh' in them often
140962306a36Sopenharmony_ci  now refer to any software interrupt, e.g. spin_lock_bh()
141062306a36Sopenharmony_ci  blocks any software interrupt on the current CPU. Bottom halves are
141162306a36Sopenharmony_ci  deprecated, and will eventually be replaced by tasklets. Only one bottom
141262306a36Sopenharmony_ci  half will be running at any time.
141362306a36Sopenharmony_ci
141462306a36Sopenharmony_ciHardware Interrupt / Hardware IRQ
141562306a36Sopenharmony_ci  Hardware interrupt request. in_hardirq() returns true in a
141662306a36Sopenharmony_ci  hardware interrupt handler.
141762306a36Sopenharmony_ci
141862306a36Sopenharmony_ciInterrupt Context
141962306a36Sopenharmony_ci  Not user context: processing a hardware irq or software irq. Indicated
142062306a36Sopenharmony_ci  by the in_interrupt() macro returning true.
142162306a36Sopenharmony_ci
142262306a36Sopenharmony_ciSMP
142362306a36Sopenharmony_ci  Symmetric Multi-Processor: kernels compiled for multiple-CPU machines.
142462306a36Sopenharmony_ci  (``CONFIG_SMP=y``).
142562306a36Sopenharmony_ci
142662306a36Sopenharmony_ciSoftware Interrupt / softirq
142762306a36Sopenharmony_ci  Software interrupt handler. in_hardirq() returns false;
142862306a36Sopenharmony_ci  in_softirq() returns true. Tasklets and softirqs both
142962306a36Sopenharmony_ci  fall into the category of 'software interrupts'.
143062306a36Sopenharmony_ci
143162306a36Sopenharmony_ci  Strictly speaking a softirq is one of up to 32 enumerated software
143262306a36Sopenharmony_ci  interrupts which can run on multiple CPUs at once. Sometimes used to
143362306a36Sopenharmony_ci  refer to tasklets as well (ie. all software interrupts).
143462306a36Sopenharmony_ci
143562306a36Sopenharmony_citasklet
143662306a36Sopenharmony_ci  A dynamically-registrable software interrupt, which is guaranteed to
143762306a36Sopenharmony_ci  only run on one CPU at a time.
143862306a36Sopenharmony_ci
143962306a36Sopenharmony_citimer
144062306a36Sopenharmony_ci  A dynamically-registrable software interrupt, which is run at (or close
144162306a36Sopenharmony_ci  to) a given time. When running, it is just like a tasklet (in fact, they
144262306a36Sopenharmony_ci  are called from the ``TIMER_SOFTIRQ``).
144362306a36Sopenharmony_ci
144462306a36Sopenharmony_ciUP
144562306a36Sopenharmony_ci  Uni-Processor: Non-SMP. (``CONFIG_SMP=n``).
144662306a36Sopenharmony_ci
144762306a36Sopenharmony_ciUser Context
144862306a36Sopenharmony_ci  The kernel executing on behalf of a particular process (ie. a system
144962306a36Sopenharmony_ci  call or trap) or kernel thread. You can tell which process with the
145062306a36Sopenharmony_ci  ``current`` macro.) Not to be confused with userspace. Can be
145162306a36Sopenharmony_ci  interrupted by software or hardware interrupts.
145262306a36Sopenharmony_ci
145362306a36Sopenharmony_ciUserspace
145462306a36Sopenharmony_ci  A process executing its own code outside the kernel.
1455