==============
Control Groups
==============

Written by Paul Menage <menage@google.com> based on
Documentation/admin-guide/cgroup-v1/cpusets.rst

Original copyright statements from cpusets.txt:

Portions Copyright (C) 2004 BULL SA.

Portions Copyright (c) 2004-2006 Silicon Graphics, Inc.

Modified by Paul Jackson <pj@sgi.com>

Modified by Christoph Lameter <cl@linux.com>

.. CONTENTS:

	1. Control Groups
	1.1 What are cgroups ?
	1.2 Why are cgroups needed ?
	1.3 How are cgroups implemented ?
	1.4 What does notify_on_release do ?
	1.5 What does clone_children do ?
	1.6 How do I use cgroups ?
	2. Usage Examples and Syntax
	2.1 Basic Usage
	2.2 Attaching processes
	2.3 Mounting hierarchies by name
	3. Kernel API
	3.1 Overview
	3.2 Synchronization
	3.3 Subsystem API
	4. Extended attributes usage
	5. Questions

1. Control Groups
=================

1.1 What are cgroups ?
----------------------

Control Groups provide a mechanism for aggregating/partitioning sets of
tasks, and all their future children, into hierarchical groups with
specialized behaviour.

Definitions:

A *cgroup* associates a set of tasks with a set of parameters for one
or more subsystems.

A *subsystem* is a module that makes use of the task grouping
facilities provided by cgroups to treat groups of tasks in
particular ways. A subsystem is typically a "resource controller" that
schedules a resource or applies per-cgroup limits, but it may be
anything that wants to act on a group of processes, e.g. a
virtualization subsystem.

A *hierarchy* is a set of cgroups arranged in a tree, such that
every task in the system is in exactly one of the cgroups in the
hierarchy, and a set of subsystems; each subsystem has system-specific
state attached to each cgroup in the hierarchy.  Each hierarchy has
an instance of the cgroup virtual filesystem associated with it.

At any one time there may be multiple active hierarchies of task
cgroups. Each hierarchy is a partition of all tasks in the system.

User-level code may create and destroy cgroups by name in an
instance of the cgroup virtual file system, specify and query to
which cgroup a task is assigned, and list the task PIDs assigned to
a cgroup. Those creations and assignments only affect the hierarchy
associated with that instance of the cgroup file system.

On their own, the only use for cgroups is for simple job
tracking. The intention is that other subsystems hook into the generic
cgroup support to provide new attributes for cgroups, such as
accounting/limiting the resources which processes in a cgroup can
access. For example, cpusets (see Documentation/admin-guide/cgroup-v1/cpusets.rst) allow
you to associate a set of CPUs and a set of memory nodes with the
tasks in each cgroup.

.. _cgroups-why-needed:

1.2 Why are cgroups needed ?
----------------------------

There are multiple efforts to provide process aggregations in the
Linux kernel, mainly for resource-tracking purposes. Such efforts
include cpusets, CKRM/ResGroups, UserBeanCounters, and virtual server
namespaces. These all require the basic notion of a
grouping/partitioning of processes, with newly forked processes ending
up in the same group (cgroup) as their parent process.

The kernel cgroup patch provides the minimum essential kernel
mechanisms required to efficiently implement such groups. It has
minimal impact on the system fast paths, and provides hooks for
specific subsystems such as cpusets to provide additional behaviour as
desired.

Multiple hierarchy support is provided to allow for situations where
the division of tasks into cgroups is distinctly different for
different subsystems - having parallel hierarchies allows each
hierarchy to be a natural division of tasks, without having to handle
complex combinations of tasks that would be present if several
unrelated subsystems needed to be forced into the same tree of
cgroups.

At one extreme, each resource controller or subsystem could be in a
separate hierarchy; at the other extreme, all subsystems
would be attached to the same hierarchy.

As an example of a scenario (originally proposed by vatsa@in.ibm.com)
that can benefit from multiple hierarchies, consider a large
university server with various users - students, professors, system
tasks etc. The resource planning for this server could be along the
following lines::

       CPU :          "Top cpuset"
                       /       \
               CPUSet1         CPUSet2
                  |               |
               (Professors)    (Students)

               In addition (system tasks) are attached to topcpuset (so
               that they can run anywhere) with a limit of 20%

       Memory : Professors (50%), Students (30%), system (20%)

       Disk : Professors (50%), Students (30%), system (20%)

       Network : WWW browsing (20%), Network File System (60%), others (20%)
                               / \
               Professors (15%)  students (5%)

Browsers like Firefox/Lynx go into the WWW network class, while (k)nfsd goes
into the NFS network class.

At the same time Firefox/Lynx will share an appropriate CPU/Memory class
depending on who launched it (prof/student).

With the ability to classify tasks differently for different resources
(by putting those resource subsystems in different hierarchies),
the admin can easily set up a script which receives exec notifications
and, depending on who is launching the browser, can::

    # echo browser_pid > /sys/fs/cgroup/<restype>/<userclass>/tasks

With only a single hierarchy, the admin would potentially have to create
a separate cgroup for every browser launched and associate it with
the appropriate network and other resource classes.  This may lead to
a proliferation of such cgroups.

Also let's say that the administrator would like to give enhanced network
access temporarily to a student's browser (since it is night and the user
wants to do online gaming :))  OR give one of the student's simulation
apps enhanced CPU power.

With the ability to write PIDs directly to resource classes, it's just a
matter of::

       # echo pid > /sys/fs/cgroup/network/<new_class>/tasks
       (after some time)
       # echo pid > /sys/fs/cgroup/network/<orig_class>/tasks

Without this ability, the administrator would have to split the cgroup into
multiple separate ones and then associate the new cgroups with the
new resource classes.


1.3 How are cgroups implemented ?
---------------------------------

Control Groups extend the kernel as follows:

 - Each task in the system has a reference-counted pointer to a
   css_set.

 - A css_set contains a set of reference-counted pointers to
   cgroup_subsys_state objects, one for each cgroup subsystem
   registered in the system. There is no direct link from a task to
   the cgroup of which it's a member in each hierarchy, but this
   can be determined by following pointers through the
   cgroup_subsys_state objects. This is because accessing the
   subsystem state is something that's expected to happen frequently
   and in performance-critical code, whereas operations that require a
   task's actual cgroup assignments (in particular, moving between
   cgroups) are less common. A linked list runs through the cg_list
   field of each task_struct using the css_set, anchored at
   css_set->tasks.

 - A cgroup hierarchy filesystem can be mounted for browsing and
   manipulation from user space.

 - You can list all the tasks (by PID) attached to any cgroup.

The implementation of cgroups requires a few simple hooks
into the rest of the kernel, none in performance-critical paths:

 - in init/main.c, to initialize the root cgroups and initial
   css_set at system boot.

 - in fork and exit, to attach and detach a task from its css_set.

In addition, a new file system of type "cgroup" may be mounted, to
enable browsing and modifying the cgroups presently known to the
kernel.  When mounting a cgroup hierarchy, you may specify a
comma-separated list of subsystems to mount as the filesystem mount
options.  By default, mounting the cgroup filesystem attempts to
mount a hierarchy containing all registered subsystems.

If an active hierarchy with exactly the same set of subsystems already
exists, it will be reused for the new mount. If no existing hierarchy
matches, and any of the requested subsystems are in use in an existing
hierarchy, the mount will fail with -EBUSY. Otherwise, a new hierarchy
is activated, associated with the requested subsystems.
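
The three outcomes described above (reuse, -EBUSY, new hierarchy) can be
sketched as a small decision function. This is only an illustrative
user-space model, not kernel code: the function name resolve_mount and the
representation of hierarchies as sets of subsystem names are assumptions
made for this example.

```python
def resolve_mount(active_hierarchies, requested):
    """Model the outcome of mounting a cgroup hierarchy (illustrative only).

    active_hierarchies: iterable of sets of subsystem names, one set per
        currently active hierarchy (hypothetical representation).
    requested: iterable of subsystem names passed as mount options.
    Returns "reuse", "EBUSY" or "new".
    """
    requested = frozenset(requested)
    active = [frozenset(h) for h in active_hierarchies]

    # An active hierarchy with exactly the same subsystems is reused.
    if requested in active:
        return "reuse"

    # No exact match: any overlap with an existing hierarchy is an error.
    for hierarchy in active:
        if hierarchy & requested:
            return "EBUSY"

    # Otherwise a new hierarchy is activated.
    return "new"
```

For instance, with active hierarchies {cpuset} and {memory, cpu}, requesting
"cpuset" alone is a reuse, while requesting "cpuset,blkio" is refused because
cpuset is already bound elsewhere.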

It's not currently possible to bind a new subsystem to an active
cgroup hierarchy, or to unbind a subsystem from an active cgroup
hierarchy. This may be possible in future, but is fraught with nasty
error-recovery issues.

When a cgroup filesystem is unmounted, if there are any
child cgroups created below the top-level cgroup, that hierarchy
will remain active even though unmounted; if there are no
child cgroups then the hierarchy will be deactivated.

No new system calls are added for cgroups - all support for
querying and modifying cgroups is via this cgroup file system.

Each task under /proc has an added file named 'cgroup' displaying,
for each active hierarchy, the subsystem names and the cgroup name
as the path relative to the root of the cgroup file system.
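
As a rough sketch of how user space might consume that file, the following
parses its contents into one record per hierarchy. It assumes the
colon-separated line layout "hierarchy-ID:subsystem-list:cgroup-path"
described in proc(5); the names CgroupEntry and parse_proc_cgroup are
hypothetical and exist only for this example.

```python
from typing import List, NamedTuple


class CgroupEntry(NamedTuple):
    hierarchy_id: int
    subsystems: List[str]
    path: str


def parse_proc_cgroup(text: str) -> List[CgroupEntry]:
    """Parse the contents of /proc/<pid>/cgroup into a list of entries."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split on the first two colons only, so the path is kept intact.
        hierarchy_id, subsystems, path = line.split(":", 2)
        names = subsystems.split(",") if subsystems else []
        entries.append(CgroupEntry(int(hierarchy_id), names, path))
    return entries
```

A caller would typically pass in the result of reading /proc/self/cgroup and
then look for the entry whose subsystem list contains the controller of
interest.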

Each cgroup is represented by a directory in the cgroup file system
containing the following files describing that cgroup:

 - tasks: list of tasks (by PID) attached to that cgroup.  This list
   is not guaranteed to be sorted.  Writing a thread ID into this file
   moves the thread into this cgroup.
 - cgroup.procs: list of thread group IDs in the cgroup.  This list is
   not guaranteed to be sorted or free of duplicate TGIDs, and userspace
   should sort/uniquify the list if this property is required.
   Writing a thread group ID into this file moves all threads in that
   group into this cgroup.
 - notify_on_release flag: run the release agent on exit?
 - release_agent: the path to use for release notifications (this file
   exists in the top cgroup only)
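
Since cgroup.procs may list TGIDs unsorted and with duplicates, a reader
that needs a clean list has to post-process it. A minimal sketch, assuming
the file holds one decimal TGID per line; the helper name unique_tgids is
hypothetical:

```python
def unique_tgids(procs_text):
    """Return the TGIDs found in cgroup.procs contents, sorted and de-duplicated."""
    # A set drops duplicates; sorted() then gives a stable ascending order.
    return sorted({int(token) for token in procs_text.split()})
```

It would be called on the text read from a cgroup's cgroup.procs file, for
example the contents of /sys/fs/cgroup/cpuset/Charlie/cgroup.procs in the
setup used later in this document.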

Other subsystems such as cpusets may add additional files in each
cgroup dir.

New cgroups are created using the mkdir system call or shell
command.  The properties of a cgroup, such as its flags, are
modified by writing to the appropriate file in that cgroup's
directory, as listed above.

The named hierarchical structure of nested cgroups allows partitioning
a large system into nested, dynamically changeable, "soft-partitions".

The attachment of each task, automatically inherited at fork by any
children of that task, to a cgroup allows organizing the work load
on a system into related sets of tasks.  A task may be re-attached to
any other cgroup, if allowed by the permissions on the necessary
cgroup file system directories.

When a task is moved from one cgroup to another, it gets a new
css_set pointer - if there's an already existing css_set with the
desired collection of cgroups then that css_set is reused, otherwise a new
css_set is allocated. The appropriate existing css_set is located by
looking into a hash table.

To allow access from a cgroup to the css_sets (and hence tasks)
that comprise it, a set of cg_cgroup_link objects form a lattice;
each cg_cgroup_link is linked into a list of cg_cgroup_links for
a single cgroup on its cgrp_link_list field, and a list of
cg_cgroup_links for a single css_set on its cg_link_list.

Thus the set of tasks in a cgroup can be listed by iterating over
each css_set that references the cgroup, and sub-iterating over
each css_set's task set.
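
The css_set reuse and the two-level task listing can be illustrated with a
toy user-space model. It is a simplification under stated assumptions, not
the kernel's data structures: a Python dict stands in for the hash table,
a full scan of all css_sets stands in for the cg_cgroup_link lattice, and
reference counting and freeing of empty css_sets are omitted. The class and
method names are hypothetical.

```python
class CssSet:
    """One combination of cgroups (one path per hierarchy) shared by tasks."""

    def __init__(self, cgroups):
        self.cgroups = cgroups  # tuple of cgroup paths, indexed by hierarchy
        self.tasks = set()      # PIDs currently using this css_set


class CgroupModel:
    def __init__(self, n_hierarchies):
        self.root = ("/",) * n_hierarchies
        # Hash table keyed by the combination of cgroups.
        self.css_sets = {self.root: CssSet(self.root)}
        self.task_css = {}      # pid -> CssSet (the task's css_set pointer)

    def _find_or_create(self, cgroups):
        # Reuse an existing css_set if one matches, else allocate a new one.
        if cgroups not in self.css_sets:
            self.css_sets[cgroups] = CssSet(cgroups)
        return self.css_sets[cgroups]

    def add_task(self, pid):
        css = self._find_or_create(self.root)
        css.tasks.add(pid)
        self.task_css[pid] = css

    def move(self, pid, hierarchy, path):
        old = self.task_css[pid]
        wanted = old.cgroups[:hierarchy] + (path,) + old.cgroups[hierarchy + 1:]
        new = self._find_or_create(wanted)
        old.tasks.discard(pid)
        new.tasks.add(pid)
        self.task_css[pid] = new

    def tasks_in(self, hierarchy, path):
        # Iterate over each css_set referencing the cgroup, then over its tasks.
        pids = []
        for css in self.css_sets.values():
            if css.cgroups[hierarchy] == path:
                pids.extend(css.tasks)
        return sorted(pids)
```

In this model, two tasks that end up with the same combination of cgroups
across all hierarchies share a single CssSet object, which mirrors why the
kernel looks up an existing css_set before allocating a new one.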

The use of a Linux virtual file system (vfs) to represent the
cgroup hierarchy provides for a familiar permission and name space
for cgroups, with a minimum of additional kernel code.

1.4 What does notify_on_release do ?
------------------------------------

If the notify_on_release flag is enabled (1) in a cgroup, then
whenever the last task in the cgroup leaves (exits or attaches to
some other cgroup) and the last child cgroup of that cgroup
is removed, the kernel runs the command specified by the contents
of the "release_agent" file in that hierarchy's root directory,
supplying the pathname (relative to the mount point of the cgroup
file system) of the abandoned cgroup.  This enables automatic
removal of abandoned cgroups.  The default value of
notify_on_release in the root cgroup at system boot is disabled
(0).  The default value of other cgroups at creation is the current
value of their parent's notify_on_release setting. The default value of
a cgroup hierarchy's release_agent path is empty.
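
A release agent used for automatic removal could be as small as the sketch
below. It assumes the agent already knows the mount point of the hierarchy
(the kernel only supplies the relative pathname), and the function name
remove_abandoned_cgroup is hypothetical.

```python
import os


def remove_abandoned_cgroup(mount_point, released_path):
    """Remove the cgroup directory named in a release notification.

    released_path is relative to the mount point, e.g. "/Charlie".
    Returns True if the directory was removed, False otherwise.
    """
    target = os.path.join(mount_point, released_path.lstrip("/"))
    try:
        # rmdir fails if the cgroup has been populated again in the meantime.
        os.rmdir(target)
        return True
    except OSError:
        return False
```

A script installed as the release agent would call this with the pathname
it receives as its first argument, for example
remove_abandoned_cgroup("/sys/fs/cgroup/cpuset", sys.argv[1]), where the
mount point shown is the one used in the examples of this document.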

1.5 What does clone_children do ?
---------------------------------

This flag only affects the cpuset controller. If the clone_children
flag is enabled (1) in a cgroup, a new cpuset cgroup will copy its
configuration from the parent during initialization.

1.6 How do I use cgroups ?
--------------------------

To start a new job that is to be contained within a cgroup, using
the "cpuset" cgroup subsystem, the steps are something like::

 1) mount -t tmpfs cgroup_root /sys/fs/cgroup
 2) mkdir /sys/fs/cgroup/cpuset
 3) mount -t cgroup -ocpuset cpuset /sys/fs/cgroup/cpuset
 4) Create the new cgroup by doing mkdir's and write's (or echo's) in
    the /sys/fs/cgroup/cpuset virtual file system.
 5) Start a task that will be the "founding father" of the new job.
 6) Attach that task to the new cgroup by writing its PID to the
    /sys/fs/cgroup/cpuset tasks file for that cgroup.
 7) fork, exec or clone the job tasks from this founding father task.

For example, the following sequence of commands will set up a cgroup
named "Charlie", containing just CPUs 2 and 3, and Memory Node 1,
and then start a subshell 'sh' in that cgroup::

  mount -t tmpfs cgroup_root /sys/fs/cgroup
  mkdir /sys/fs/cgroup/cpuset
  mount -t cgroup -ocpuset cpuset /sys/fs/cgroup/cpuset
  cd /sys/fs/cgroup/cpuset
  mkdir Charlie
  cd Charlie
  /bin/echo 2-3 > cpuset.cpus
  /bin/echo 1 > cpuset.mems
  /bin/echo $$ > tasks
  sh
  # The subshell 'sh' is now running in cgroup Charlie
  # The next line should show the cpuset hierarchy with path '/Charlie'
  cat /proc/self/cgroup

2. Usage Examples and Syntax
============================

2.1 Basic Usage
---------------

Creating, modifying, and using cgroups can be done through the cgroup
virtual filesystem.

To mount a cgroup hierarchy with all available subsystems, type::

  # mount -t cgroup xxx /sys/fs/cgroup

The "xxx" is not interpreted by the cgroup code, but will appear in
/proc/mounts so may be any useful identifying string that you like.

Note: Some subsystems do not work without some user input first.  For instance,
if cpusets are enabled the user will have to populate the cpus and mems files
for each new cgroup created before that group can be used.

As explained in section `1.2 Why are cgroups needed?` you should create
different hierarchies of cgroups for each single resource or group of
resources you want to control. Therefore, you should mount a tmpfs on
/sys/fs/cgroup and create directories for each cgroup resource or resource
group::

  # mount -t tmpfs cgroup_root /sys/fs/cgroup
  # mkdir /sys/fs/cgroup/rg1

To mount a cgroup hierarchy with just the cpuset and memory
subsystems, type::

  # mount -t cgroup -o cpuset,memory hier1 /sys/fs/cgroup/rg1

While remounting cgroups is currently supported, it is not recommended
to use it. Remounting allows changing bound subsystems and
release_agent. Rebinding is hardly useful as it only works when the
hierarchy is empty and release_agent itself should be replaced with
conventional fsnotify. The support for remounting will be removed in
the future.

To specify a hierarchy's release_agent::

  # mount -t cgroup -o cpuset,release_agent="/sbin/cpuset_release_agent" \
    xxx /sys/fs/cgroup/rg1

Note that specifying 'release_agent' more than once will return failure.

Note that changing the set of subsystems is currently only supported
when the hierarchy consists of a single (root) cgroup. Supporting
the ability to arbitrarily bind/unbind subsystems from an existing
cgroup hierarchy is intended to be implemented in the future.

Then under /sys/fs/cgroup/rg1 you can find a tree that corresponds to the
tree of the cgroups in the system. For instance, /sys/fs/cgroup/rg1
is the cgroup that holds the whole system.

If you want to change the value of release_agent::

  # echo "/sbin/new_release_agent" > /sys/fs/cgroup/rg1/release_agent

It can also be changed via remount.

If you want to create a new cgroup under /sys/fs/cgroup/rg1::

  # cd /sys/fs/cgroup/rg1
  # mkdir my_cgroup

Now you want to do something with this cgroup::

  # cd my_cgroup

In this directory you can find several files::

  # ls
  cgroup.procs notify_on_release tasks
  (plus whatever files are added by the attached subsystems)

Now attach your shell to this cgroup::

  # /bin/echo $$ > tasks

You can also create cgroups inside your cgroup by using mkdir in this
directory::

  # mkdir my_sub_cs

To remove a cgroup, just use rmdir::

  # rmdir my_sub_cs

This will fail if the cgroup is in use (has cgroups inside, or
has processes attached, or is held alive by other subsystem-specific
reference).

2.2 Attaching processes
-----------------------

::

  # /bin/echo PID > tasks

Note that it is PID, not PIDs. You can only attach ONE task at a time.
If you have several tasks to attach, you have to do it one after another::

  # /bin/echo PID1 > tasks
  # /bin/echo PID2 > tasks
	  ...
  # /bin/echo PIDn > tasks

You can attach the current shell task by echoing 0::

  # echo 0 > tasks

You can use the cgroup.procs file instead of the tasks file to move all
threads in a threadgroup at once. Echoing the PID of any task in a
threadgroup to cgroup.procs causes all tasks in that threadgroup to be
attached to the cgroup. Writing 0 to cgroup.procs moves all tasks
in the writing task's threadgroup.

Note: Since every task is always a member of exactly one cgroup in each
mounted hierarchy, to remove a task from its current cgroup you must
move it into a new cgroup (possibly the root cgroup) by writing to the
new cgroup's tasks file.

Note: Due to some restrictions enforced by some cgroup subsystems, moving
a process to another cgroup can fail.
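
Because only one task can be attached per write and any individual move can
fail, a program attaching several tasks has to loop and check each write.
A minimal sketch, in which the helper name attach_tasks is hypothetical and
the file is reopened for every PID, much as the repeated echo commands above
do:

```python
def attach_tasks(tasks_path, pids):
    """Write each PID to a cgroup 'tasks' file, one write per PID.

    Returns a dict mapping every PID that could not be attached to the
    OSError raised for it; an empty dict means all writes succeeded.
    """
    failures = {}
    for pid in pids:
        try:
            # Reopen per PID so each PID is delivered in its own write.
            with open(tasks_path, "w") as tasks_file:
                tasks_file.write(f"{pid}\n")
        except OSError as exc:
            # A subsystem may refuse the move; record it and continue.
            failures[pid] = exc
    return failures
```

Passing the path of a cgroup.procs file instead would move whole thread
groups, as described above.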

2.3 Mounting hierarchies by name
--------------------------------

Passing the name=<x> option when mounting a cgroups hierarchy
associates the given name with the hierarchy.  This can be used when
mounting a pre-existing hierarchy, in order to refer to it by name
rather than by its set of active subsystems.  Each hierarchy is either
nameless, or has a unique name.

The name should match [\w.-]+
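
A script that builds mount options can check a candidate name against that
pattern before calling mount. The sketch below assumes that \w is meant in
the ASCII sense (letters, digits and underscore); the helper name
is_valid_hierarchy_name is hypothetical.

```python
import re

# [\w.-]+ from the text above; re.ASCII restricts \w to [A-Za-z0-9_],
# which is an assumption about the intended character set.
_HIERARCHY_NAME = re.compile(r"[\w.-]+", re.ASCII)


def is_valid_hierarchy_name(name):
    """Return True if the whole name matches [\\w.-]+."""
    return _HIERARCHY_NAME.fullmatch(name) is not None
```

Using fullmatch rather than match ensures that a name with a valid prefix
but an invalid tail, such as one containing a space or a slash, is rejected.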
48362306a36Sopenharmony_ci
48462306a36Sopenharmony_ciWhen passing a name=<x> option for a new hierarchy, you need to
48562306a36Sopenharmony_cispecify subsystems manually; the legacy behaviour of mounting all
48662306a36Sopenharmony_cisubsystems when none are explicitly specified is not supported when
48762306a36Sopenharmony_ciyou give a subsystem a name.
48862306a36Sopenharmony_ci
48962306a36Sopenharmony_ciThe name of the subsystem appears as part of the hierarchy description
49062306a36Sopenharmony_ciin /proc/mounts and /proc/<pid>/cgroups.


3. Kernel API
=============

3.1 Overview
------------

Each kernel subsystem that wants to hook into the generic cgroup
system needs to create a cgroup_subsys object. This contains
various methods, which are callbacks from the cgroup system, along
with a subsystem ID which will be assigned by the cgroup system.

Other fields in the cgroup_subsys object include:

- subsys_id: a unique array index for the subsystem, indicating which
  entry in cgroup->subsys[] this subsystem should be managing.

- name: should be initialized to a unique subsystem name. Should be
  no longer than MAX_CGROUP_TYPE_NAMELEN.

- early_init: indicates whether the subsystem needs early initialization
  at system boot.

Each cgroup object created by the system has an array of pointers,
indexed by subsystem ID; each pointer is entirely managed by the
corresponding subsystem; the generic cgroup code will never touch it.

3.2 Synchronization
-------------------

There is a global mutex, cgroup_mutex, used by the cgroup
system. This should be taken by anything that wants to modify a
cgroup. It may also be taken to prevent cgroups from being
modified, but more specific locks may be more appropriate in that
situation.

See kernel/cgroup.c for more details.

Subsystems can take/release the cgroup_mutex via the functions
cgroup_lock()/cgroup_unlock().

Accessing a task's cgroup pointer may be done in the following ways:

- while holding cgroup_mutex
- while holding the task's alloc_lock (via task_lock())
- inside an rcu_read_lock() section via rcu_dereference()

3.3 Subsystem API
-----------------

Each subsystem should:

- add an entry in linux/cgroup_subsys.h
- define a cgroup_subsys object called <name>_cgrp_subsys

Each subsystem may export the following methods. The only mandatory
methods are css_alloc/free. Any others that are null are presumed to
be successful no-ops.

``struct cgroup_subsys_state *css_alloc(struct cgroup *cgrp)``
(cgroup_mutex held by caller)

Called to allocate a subsystem state object for a cgroup. The
subsystem should allocate its subsystem state object for the passed
cgroup, returning a pointer to the new object on success or an
ERR_PTR() value on failure. On success, the subsystem pointer should
point to a structure of type cgroup_subsys_state (typically embedded in a
larger subsystem-specific object), which will be initialized by the
cgroup system. Note that this will be called at initialization to
create the root subsystem state for this subsystem; this case can be
identified by the passed cgroup object having a NULL parent (since
it's the root of the hierarchy) and may be an appropriate place for
initialization code.
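
The embedding and error-pointer conventions described above can be
sketched in userspace as follows. This is a model only: struct cgroup,
struct cgroup_subsys_state and the ERR_PTR() helpers are simplified
stand-ins for the kernel definitions, and struct demo_css,
demo_css_alloc() and css_to_demo() are hypothetical names.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace stand-ins for the kernel's error-pointer helpers (simplified). */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)(-MAX_ERRNO);
}

/* Simplified models of the kernel structures; the real ones have more fields. */
struct cgroup { struct cgroup *parent; };
struct cgroup_subsys_state { struct cgroup *cgroup; };

/* Hypothetical subsystem-specific object that embeds the generic state. */
struct demo_css {
	struct cgroup_subsys_state css;
	long limit;
};

static int demo_root_initialized;

struct cgroup_subsys_state *demo_css_alloc(struct cgroup *cgrp)
{
	struct demo_css *d = calloc(1, sizeof(*d));

	if (d == NULL)
		return ERR_PTR(-ENOMEM);

	if (cgrp->parent == NULL)
		demo_root_initialized = 1;	/* one-time setup for the root */

	d->limit = -1;			/* hypothetical "no limit" default */
	return &d->css;			/* the embedded css is what gets returned */
}

/* Recover the enclosing object from the embedded css (container_of pattern). */
struct demo_css *css_to_demo(struct cgroup_subsys_state *css)
{
	return (struct demo_css *)((char *)css - offsetof(struct demo_css, css));
}
```

Returning the embedded member rather than the outer object lets the
generic code work with one common type while each subsystem keeps its
private fields next to it.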

``int css_online(struct cgroup *cgrp)``
(cgroup_mutex held by caller)

Called after @cgrp has successfully completed all allocations and been
made visible to cgroup_for_each_child/descendant_*() iterators. The
subsystem may choose to fail creation by returning -errno. This
callback can be used to implement reliable state sharing and
propagation along the hierarchy. See the comment on
cgroup_for_each_descendant_pre() for details.

``void css_offline(struct cgroup *cgrp)``
(cgroup_mutex held by caller)

This is the counterpart of css_online() and is called iff css_online()
has succeeded on @cgrp. This signifies the beginning of the end of
@cgrp. @cgrp is being removed and the subsystem should start dropping
all references it's holding on @cgrp. When all references are dropped,
cgroup removal will proceed to the next step - css_free(). After this
callback, @cgrp should be considered dead to the subsystem.

``void css_free(struct cgroup *cgrp)``
(cgroup_mutex held by caller)

The cgroup system is about to free @cgrp; the subsystem should free
its subsystem state object. By the time this method is called, @cgrp
is completely unused; @cgrp->parent is still valid. (Note - can also
be called for a newly-created cgroup if an error occurs after this
subsystem's css_alloc() method has been called for the new cgroup).
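
The ordering of the four lifecycle callbacks can be summarized with a
small userspace model. It does not call any kernel code;
simulate_lifecycle() and the trace buffer are hypothetical and only
record which callback would run at each step.

```c
#include <stdio.h>
#include <string.h>

static char trace[128];		/* records the order of the callbacks */

static void record(const char *step)
{
	size_t used = strlen(trace);

	snprintf(trace + used, sizeof(trace) - used, "%s%s",
		 used ? " " : "", step);
}

/*
 * Model of one create-then-remove cycle. online_result is the value the
 * simulated css_online() returns: 0 for success, negative for failure.
 * Returns 0 if the cgroup went online, otherwise the css_online() error.
 */
int simulate_lifecycle(int online_result)
{
	trace[0] = '\0';

	record("css_alloc");
	record("css_online");
	if (online_result != 0) {
		/* Creation failed after css_alloc(): css_free() without css_offline(). */
		record("css_free");
		return online_result;
	}

	/* ... the cgroup is in use, and is removed some time later ... */
	record("css_offline");	/* called iff css_online() succeeded */
	record("css_free");	/* runs once all references are dropped */
	return 0;
}
```

The failure path is the case mentioned in the css_free() note: a state
object that was allocated but never came online is still freed.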

``int can_attach(struct cgroup *cgrp, struct cgroup_taskset *tset)``
(cgroup_mutex held by caller)

Called prior to moving one or more tasks into a cgroup; if the
subsystem returns an error, this will abort the attach operation.
@tset contains the tasks to be attached and is guaranteed to have at
least one task in it.

If there are multiple tasks in the taskset, then:

- it's guaranteed that all are from the same thread group
- @tset contains all tasks from the thread group whether or not
  they're switching cgroups
- the first task is the leader

Each @tset entry also contains the task's old cgroup, and tasks which
aren't switching cgroup can be skipped easily using the
cgroup_taskset_for_each() iterator. Note that this isn't called on a
fork. If this method returns 0 (success), the result should remain valid
while the caller holds cgroup_mutex, and it is guaranteed that either
attach() or cancel_attach() will be called later.

``void css_reset(struct cgroup_subsys_state *css)``
(cgroup_mutex held by caller)

An optional operation which should restore @css's configuration to the
initial state.  This is currently only used on the unified hierarchy
when a subsystem is disabled on a cgroup through
"cgroup.subtree_control" but should remain enabled because other
subsystems depend on it.  cgroup core makes such a css invisible by
removing the associated interface files and invokes this callback so
that the hidden subsystem can return to the initial neutral state.
This prevents unexpected resource control from a hidden css and
ensures that the configuration is in the initial state when it is made
visible again later.

``void cancel_attach(struct cgroup *cgrp, struct cgroup_taskset *tset)``
(cgroup_mutex held by caller)

Called when a task attach operation has failed after can_attach() has succeeded.
A subsystem whose can_attach() has side effects should provide this
function so that it can implement a rollback; otherwise it is not necessary.
This is called only for subsystems whose can_attach() operation has
succeeded. The parameters are identical to can_attach().

``void attach(struct cgroup *cgrp, struct cgroup_taskset *tset)``
(cgroup_mutex held by caller)

Called after the task has been attached to the cgroup, to allow any
post-attachment activity that requires memory allocations or blocking.
The parameters are identical to can_attach().
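
The relationship between can_attach(), cancel_attach() and attach() can
be sketched as a two-phase sequence. The following is a userspace model
under simplifying assumptions: struct demo_subsys, simulate_attach() and
the demo_* callbacks are hypothetical, and the callbacks take no
arguments whereas the real methods receive a cgroup and a taskset.

```c
#include <stddef.h>

/* Hypothetical, simplified callback table. */
struct demo_subsys {
	int  (*can_attach)(void);	/* may be NULL: treated as success */
	void (*cancel_attach)(void);	/* may be NULL: nothing to roll back */
	void (*attach)(void);		/* may be NULL: no post-attach work */
};

/*
 * Model of the attach sequence: ask every subsystem first, roll back the
 * ones that already agreed if any refuses, otherwise commit everywhere.
 * Returns 0 on success or the first can_attach() error.
 */
int simulate_attach(const struct demo_subsys *ss, size_t nr)
{
	for (size_t i = 0; i < nr; i++) {
		int ret = ss[i].can_attach ? ss[i].can_attach() : 0;

		if (ret != 0) {
			/* Only subsystems 0..i-1 succeeded, so only they are cancelled. */
			for (size_t j = 0; j < i; j++) {
				if (ss[j].cancel_attach)
					ss[j].cancel_attach();
			}
			return ret;
		}
	}

	/* (the tasks would be moved to the new cgroup here) */
	for (size_t i = 0; i < nr; i++) {
		if (ss[i].attach)
			ss[i].attach();
	}
	return 0;
}

/* Demo callbacks used to exercise the model. */
static int reserved, committed;
int  demo_can_attach_ok(void)   { reserved++; return 0; }
int  demo_can_attach_fail(void) { return -1; }
void demo_cancel_attach(void)   { reserved--; }
void demo_attach(void)          { committed++; }
```

In this model a subsystem that reserves something in can_attach()
releases it again in cancel_attach(), which is the rollback case the
text describes.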

``void fork(struct task_struct *task)``

Called when a task is forked into a cgroup.

``void exit(struct task_struct *task)``

Called during task exit.

``void free(struct task_struct *task)``

Called when the task_struct is freed.

``void bind(struct cgroup *root)``
(cgroup_mutex held by caller)

Called when a cgroup subsystem is rebound to a different hierarchy
and root cgroup. Currently this will only involve movement between
the default hierarchy (which never has sub-cgroups) and a hierarchy
that is being created/destroyed (and hence has no sub-cgroups).

4. Extended attribute usage
===========================

The cgroup filesystem supports certain types of extended attributes in its
directories and files.  The currently supported types are:

	- Trusted (XATTR_TRUSTED)
	- Security (XATTR_SECURITY)

Both require CAP_SYS_ADMIN capability to set.

Like in tmpfs, the extended attributes in the cgroup filesystem are stored
using kernel memory and it's advised to keep the usage to a minimum.  This
is the reason why user-defined extended attributes are not supported, since
any user can set them and there's no limit on the value size.

The current known users of this feature are SELinux, to limit cgroup usage
in containers, and systemd, for assorted metadata like the main PID in a cgroup
(systemd creates a cgroup per service).
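
From userspace, such attributes are set with the regular setxattr(2)
system call. The sketch below is a Linux-only illustration;
trusted_xattr_name() and cgroup_set_trusted_xattr() are hypothetical
helper names, and the attribute key used by a caller is up to the
application.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/xattr.h>

/*
 * Hypothetical helper: builds "trusted.<key>" into buf.
 * Returns 0 on success, -1 if key is empty or the result does not fit.
 */
int trusted_xattr_name(char *buf, size_t buflen, const char *key)
{
	int n;

	if (key == NULL || *key == '\0')
		return -1;
	n = snprintf(buf, buflen, "trusted.%s", key);
	return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}

/*
 * Sets trusted.<key> = value on a cgroup directory or file.
 * Returns 0 on success, -1 with errno set on failure (typically EPERM
 * when the caller lacks CAP_SYS_ADMIN).
 */
int cgroup_set_trusted_xattr(const char *path, const char *key,
			     const char *value)
{
	char name[256];

	if (trusted_xattr_name(name, sizeof(name), key) != 0) {
		errno = EINVAL;
		return -1;
	}
	return setxattr(path, name, value, strlen(value), 0);
}
```

Since the values live in kernel memory, callers should keep them short,
in line with the advice above.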

5. Questions
============

::

  Q: what's up with this '/bin/echo' ?
  A: bash's builtin 'echo' command does not check calls to write() for
     errors. If you use it in the cgroup file system, you won't be
     able to tell whether a command succeeded or failed.

  Q: When I attach processes, only the first one on the line really gets attached !
  A: We can only return one error code per call to write(). So you should
     write only ONE PID per call.
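
Both answers can be combined in a small userspace sketch that issues one
write() per PID and checks every result. attach_pids() is a hypothetical
helper name; the path of the tasks file is supplied by the caller.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: writes each PID to the given tasks file with its
 * own write() call, so that every PID gets its own error report.
 * Returns the number of PIDs written successfully, or -1 if the file
 * could not be opened. errs[i] receives 0 or an errno value for pids[i].
 */
int attach_pids(const char *tasks_path, const pid_t *pids, size_t nr,
		int *errs)
{
	int fd = open(tasks_path, O_WRONLY);
	int ok = 0;

	if (fd < 0)
		return -1;

	for (size_t i = 0; i < nr; i++) {
		char buf[32];
		int len = snprintf(buf, sizeof(buf), "%ld\n", (long)pids[i]);
		ssize_t written = write(fd, buf, (size_t)len);

		if (written == (ssize_t)len) {
			errs[i] = 0;
			ok++;
		} else {
			/* A short write does not set errno, so report EIO for it. */
			errs[i] = (written < 0) ? errno : EIO;
		}
	}
	close(fd);
	return ok;
}
```

Because each PID is written separately, a failure for one task does not
hide the outcome for the others.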