==============
Control Groups
==============

Written by Paul Menage <menage@google.com> based on
Documentation/admin-guide/cgroup-v1/cpusets.rst

Original copyright statements from cpusets.txt:

Portions Copyright (C) 2004 BULL SA.

Portions Copyright (c) 2004-2006 Silicon Graphics, Inc.

Modified by Paul Jackson <pj@sgi.com>

Modified by Christoph Lameter <cl@linux.com>

.. CONTENTS:

   1. Control Groups
     1.1 What are cgroups ?
     1.2 Why are cgroups needed ?
     1.3 How are cgroups implemented ?
     1.4 What does notify_on_release do ?
     1.5 What does clone_children do ?
     1.6 How do I use cgroups ?
   2. Usage Examples and Syntax
     2.1 Basic Usage
     2.2 Attaching processes
     2.3 Mounting hierarchies by name
   3. Kernel API
     3.1 Overview
     3.2 Synchronization
     3.3 Subsystem API
   4. Extended attributes usage
   5. Questions

1. Control Groups
=================

1.1 What are cgroups ?
----------------------

Control Groups provide a mechanism for aggregating/partitioning sets of
tasks, and all their future children, into hierarchical groups with
specialized behaviour.

Definitions:

A *cgroup* associates a set of tasks with a set of parameters for one
or more subsystems.

A *subsystem* is a module that makes use of the task grouping
facilities provided by cgroups to treat groups of tasks in
particular ways. A subsystem is typically a "resource controller" that
schedules a resource or applies per-cgroup limits, but it may be
anything that wants to act on a group of processes, e.g. a
virtualization subsystem.

A *hierarchy* is a set of cgroups arranged in a tree, such that
every task in the system is in exactly one of the cgroups in the
hierarchy, and a set of subsystems; each subsystem has system-specific
state attached to each cgroup in the hierarchy. Each hierarchy has
an instance of the cgroup virtual filesystem associated with it.

At any one time there may be multiple active hierarchies of task
cgroups. Each hierarchy is a partition of all tasks in the system.

User-level code may create and destroy cgroups by name in an
instance of the cgroup virtual file system, specify and query to
which cgroup a task is assigned, and list the task PIDs assigned to
a cgroup. Those creations and assignments only affect the hierarchy
associated with that instance of the cgroup file system.

On their own, the only use for cgroups is for simple job
tracking. The intention is that other subsystems hook into the generic
cgroup support to provide new attributes for cgroups, such as
accounting/limiting the resources which processes in a cgroup can
access. For example, cpusets (see Documentation/admin-guide/cgroup-v1/cpusets.rst) allow
you to associate a set of CPUs and a set of memory nodes with the
tasks in each cgroup.

1.2 Why are cgroups needed ?
----------------------------

There are multiple efforts to provide process aggregations in the
Linux kernel, mainly for resource-tracking purposes. Such efforts
include cpusets, CKRM/ResGroups, UserBeanCounters, and virtual server
namespaces. These all require the basic notion of a
grouping/partitioning of processes, with newly forked processes ending
up in the same group (cgroup) as their parent process.

The kernel cgroup patch provides the minimum essential kernel
mechanisms required to efficiently implement such groups.
It has minimal impact on the system fast paths, and provides hooks for
specific subsystems such as cpusets to provide additional behaviour as
desired.

Multiple hierarchy support is provided to allow for situations where
the division of tasks into cgroups is distinctly different for
different subsystems - having parallel hierarchies allows each
hierarchy to be a natural division of tasks, without having to handle
complex combinations of tasks that would be present if several
unrelated subsystems needed to be forced into the same tree of
cgroups.

At one extreme, each resource controller or subsystem could be in a
separate hierarchy; at the other extreme, all subsystems
would be attached to the same hierarchy.

As an example of a scenario (originally proposed by vatsa@in.ibm.com)
that can benefit from multiple hierarchies, consider a large
university server with various users - students, professors, system
tasks etc.
The resource planning for this server could be along the
following lines::

       CPU :          "Top cpuset"
                       /       \
               CPUSet1         CPUSet2
                  |               |
               (Professors)    (Students)

               In addition (system tasks) are attached to topcpuset (so
               that they can run anywhere) with a limit of 20%

       Memory : Professors (50%), Students (30%), system (20%)

       Disk : Professors (50%), Students (30%), system (20%)

       Network : WWW browsing (20%), Network File System (60%), others (20%)
                               / \
               Professors (15%)  students (5%)

Browsers like Firefox/Lynx go into the WWW network class, while (k)nfsd goes
into the NFS network class.

At the same time Firefox/Lynx will share an appropriate CPU/Memory class
depending on who launched it (prof/student).

With the ability to classify tasks differently for different resources
(by putting those resource subsystems in different hierarchies),
the admin can easily set up a script which receives exec notifications
and, depending on who is launching the browser, can run::

    # echo browser_pid > /sys/fs/cgroup/<restype>/<userclass>/tasks

With only a single hierarchy, the admin would potentially have to create
a separate cgroup for every browser launched and associate it with
the appropriate network and other resource classes. This may lead to
a proliferation of such cgroups.

Also let's say that the administrator would like to give enhanced network
access temporarily to a student's browser (since it is night and the user
wants to do online gaming :)) OR give one of the student's simulation
apps enhanced CPU power.

With the ability to write PIDs directly to resource classes, it's just a
matter of::

       # echo pid > /sys/fs/cgroup/network/<new_class>/tasks
       (after some time)
       # echo pid > /sys/fs/cgroup/network/<orig_class>/tasks

Without this ability, the administrator would have to split the cgroup into
multiple separate ones and then associate the new cgroups with the
new resource classes.
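
The move-and-restore sequence above can be wrapped in a small helper.
The following is only a minimal sketch, not something shipped with the
kernel: the function name ``move_task`` and its arguments are
hypothetical, and it assumes the target cgroup directory already exists
in a mounted hierarchy.

```shell
# Hypothetical helper: attach one task to the cgroup at directory $1.
# Usage: move_task <cgroup_dir> <pid>
move_task() {
    cgroup_dir=$1
    pid=$2
    # A cgroup is a directory; refuse to write anywhere else.
    [ -d "$cgroup_dir" ] || { echo "no such cgroup: $cgroup_dir" >&2; return 1; }
    # Writing a PID to the tasks file moves that task into the cgroup.
    echo "$pid" > "$cgroup_dir/tasks"
}
```

Under these assumptions, a temporary boost would call ``move_task`` once
with the ``<new_class>`` directory and, after some time, once more with
the ``<orig_class>`` directory.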



1.3 How are cgroups implemented ?
---------------------------------

Control Groups extend the kernel as follows:

 - Each task in the system has a reference-counted pointer to a
   css_set.

 - A css_set contains a set of reference-counted pointers to
   cgroup_subsys_state objects, one for each cgroup subsystem
   registered in the system. There is no direct link from a task to
   the cgroup of which it's a member in each hierarchy, but this
   can be determined by following pointers through the
   cgroup_subsys_state objects. This is because accessing the
   subsystem state is something that's expected to happen frequently
   and in performance-critical code, whereas operations that require a
   task's actual cgroup assignments (in particular, moving between
   cgroups) are less common. A linked list runs through the cg_list
   field of each task_struct using the css_set, anchored at
   css_set->tasks.

 - A cgroup hierarchy filesystem can be mounted for browsing and
   manipulation from user space.

 - You can list all the tasks (by PID) attached to any cgroup.

The implementation of cgroups requires a few, simple hooks
into the rest of the kernel, none in performance-critical paths:

 - in init/main.c, to initialize the root cgroups and initial
   css_set at system boot.

 - in fork and exit, to attach and detach a task from its css_set.

In addition, a new file system of type "cgroup" may be mounted, to
enable browsing and modifying the cgroups presently known to the
kernel. When mounting a cgroup hierarchy, you may specify a
comma-separated list of subsystems to mount as the filesystem mount
options. By default, mounting the cgroup filesystem attempts to
mount a hierarchy containing all registered subsystems.

If an active hierarchy with exactly the same set of subsystems already
exists, it will be reused for the new mount. If no existing hierarchy
matches, and any of the requested subsystems are in use in an existing
hierarchy, the mount will fail with -EBUSY. Otherwise, a new hierarchy
is activated, associated with the requested subsystems.

It's not currently possible to bind a new subsystem to an active
cgroup hierarchy, or to unbind a subsystem from an active cgroup
hierarchy. This may be possible in future, but is fraught with nasty
error-recovery issues.
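
Because a mount fails with -EBUSY when a requested subsystem is already
in use in a different hierarchy, it can help to check the current
bindings first. The sketch below assumes the kernel exposes
/proc/cgroups with the columns subsys_name, hierarchy, num_cgroups and
enabled, and that a hierarchy ID of 0 means the subsystem is not bound
to any cgroup v1 hierarchy. The function name and the optional file
argument are hypothetical and exist only to keep the example
self-contained.

```shell
# Hypothetical helper: print the hierarchy ID a subsystem is bound to.
# Usage: subsys_hierarchy <subsystem> [cgroups_file]
subsys_hierarchy() {
    subsys=$1
    file=${2:-/proc/cgroups}
    # The '#' header line never matches a subsystem name.
    # Column 1 is the subsystem name, column 2 the hierarchy ID.
    # Exit non-zero if the subsystem is not listed at all.
    awk -v s="$subsys" '$1 == s { print $2; found = 1 } END { exit !found }' "$file"
}
```

A non-zero ID printed for a subsystem suggests that a new mount naming
it will either reuse that hierarchy (if the requested subsystem set
matches exactly) or fail with -EBUSY.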

When a cgroup filesystem is unmounted, if there are any
child cgroups created below the top-level cgroup, that hierarchy
will remain active even though unmounted; if there are no
child cgroups then the hierarchy will be deactivated.

No new system calls are added for cgroups - all support for
querying and modifying cgroups is via this cgroup file system.

Each task under /proc has an added file named 'cgroup' displaying,
for each active hierarchy, the subsystem names and the cgroup name
as the path relative to the root of the cgroup file system.

Each cgroup is represented by a directory in the cgroup file system
containing the following files describing that cgroup:

 - tasks: list of tasks (by PID) attached to that cgroup. This list
   is not guaranteed to be sorted. Writing a thread ID into this file
   moves the thread into this cgroup.
 - cgroup.procs: list of thread group IDs in the cgroup. This list is
   not guaranteed to be sorted or free of duplicate TGIDs, and userspace
   should sort/uniquify the list if this property is required.
   Writing a thread group ID into this file moves all threads in that
   group into this cgroup.
 - notify_on_release flag: run the release agent on exit?
 - release_agent: the path to use for release notifications (this file
   exists in the top cgroup only)

Other subsystems such as cpusets may add additional files in each
cgroup dir.

New cgroups are created using the mkdir system call or shell
command. The properties of a cgroup, such as its flags, are
modified by writing to the appropriate file in that cgroup's
directory, as listed above.

The named hierarchical structure of nested cgroups allows partitioning
a large system into nested, dynamically changeable, "soft-partitions".

The attachment of each task, automatically inherited at fork by any
children of that task, to a cgroup allows organizing the work load
on a system into related sets of tasks. A task may be re-attached to
any other cgroup, if allowed by the permissions on the necessary
cgroup file system directories.

When a task is moved from one cgroup to another, it gets a new
css_set pointer - if there's an already existing css_set with the
desired collection of cgroups then that css_set is reused, otherwise a new
css_set is allocated. The appropriate existing css_set is located by
looking into a hash table.
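
As noted in the file list above, cgroup.procs is not guaranteed to be
sorted or free of duplicate TGIDs. A minimal sketch of the userspace
clean-up follows; the function name ``cgroup_tgids`` is hypothetical,
and it relies only on the standard ``sort`` utility.

```shell
# Hypothetical helper: list the thread group IDs in a cgroup, sorted
# numerically and with duplicates removed.
# Usage: cgroup_tgids <cgroup_dir>
cgroup_tgids() {
    sort -n -u "$1/cgroup.procs"
}
```

The result is a snapshot: tasks may enter or leave the cgroup between
reading the file and acting on the list.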

To allow access from a cgroup to the css_sets (and hence tasks)
that comprise it, a set of cg_cgroup_link objects form a lattice;
each cg_cgroup_link is linked into a list of cg_cgroup_links for
a single cgroup on its cgrp_link_list field, and a list of
cg_cgroup_links for a single css_set on its cg_link_list.

Thus the set of tasks in a cgroup can be listed by iterating over
each css_set that references the cgroup, and sub-iterating over
each css_set's task set.

The use of a Linux virtual file system (vfs) to represent the
cgroup hierarchy provides for a familiar permission and name space
for cgroups, with a minimum of additional kernel code.

1.4 What does notify_on_release do ?
------------------------------------

If the notify_on_release flag is enabled (1) in a cgroup, then
whenever the last task in the cgroup leaves (exits or attaches to
some other cgroup) and the last child cgroup of that cgroup
is removed, then the kernel runs the command specified by the contents
of the "release_agent" file in that hierarchy's root directory,
supplying the pathname (relative to the mount point of the cgroup
file system) of the abandoned cgroup. This enables automatic
removal of abandoned cgroups.
The default value of
notify_on_release in the root cgroup at system boot is disabled
(0). The default value for any other cgroup at creation is the current
value of its parent's notify_on_release setting. The default value of
a cgroup hierarchy's release_agent path is empty.

1.5 What does clone_children do ?
---------------------------------

This flag only affects the cpuset controller. If the clone_children
flag is enabled (1) in a cgroup, a new cpuset cgroup will copy its
configuration from the parent during initialization.

1.6 How do I use cgroups ?
--------------------------

To start a new job that is to be contained within a cgroup, using
the "cpuset" cgroup subsystem, the steps are something like::

 1) mount -t tmpfs cgroup_root /sys/fs/cgroup
 2) mkdir /sys/fs/cgroup/cpuset
 3) mount -t cgroup -ocpuset cpuset /sys/fs/cgroup/cpuset
 4) Create the new cgroup by doing mkdir's and write's (or echo's) in
    the /sys/fs/cgroup/cpuset virtual file system.
 5) Start a task that will be the "founding father" of the new job.
 6) Attach that task to the new cgroup by writing its PID to the
    /sys/fs/cgroup/cpuset tasks file for that cgroup.
 7) fork, exec or clone the job tasks from this founding father task.

For example, the following sequence of commands will set up a cgroup
named "Charlie", containing just CPUs 2 and 3, and Memory Node 1,
and then start a subshell 'sh' in that cgroup::

  mount -t tmpfs cgroup_root /sys/fs/cgroup
  mkdir /sys/fs/cgroup/cpuset
  mount -t cgroup -ocpuset cpuset /sys/fs/cgroup/cpuset
  cd /sys/fs/cgroup/cpuset
  mkdir Charlie
  cd Charlie
  /bin/echo 2-3 > cpuset.cpus
  /bin/echo 1 > cpuset.mems
  /bin/echo $$ > tasks
  sh
  # The subshell 'sh' is now running in cgroup Charlie
  # The next line should display a line ending in '/Charlie'
  cat /proc/self/cgroup

2. Usage Examples and Syntax
============================

2.1 Basic Usage
---------------

Creating, modifying, and using cgroups can be done through the cgroup
virtual filesystem.

To mount a cgroup hierarchy with all available subsystems, type::

  # mount -t cgroup xxx /sys/fs/cgroup

The "xxx" is not interpreted by the cgroup code, but will appear in
/proc/mounts so may be any useful identifying string that you like.

Note: Some subsystems do not work without some user input first.
For instance,
if cpusets are enabled the user will have to populate the cpus and mems files
for each new cgroup created before that group can be used.

As explained in section `1.2 Why are cgroups needed?` you should create
different hierarchies of cgroups for each single resource or group of
resources you want to control. Therefore, you should mount a tmpfs on
/sys/fs/cgroup and create directories for each cgroup resource or resource
group::

  # mount -t tmpfs cgroup_root /sys/fs/cgroup
  # mkdir /sys/fs/cgroup/rg1

To mount a cgroup hierarchy with just the cpuset and memory
subsystems, type::

  # mount -t cgroup -o cpuset,memory hier1 /sys/fs/cgroup/rg1

While remounting cgroups is currently supported, it is not
recommended. Remounting allows changing bound subsystems and
release_agent. Rebinding is hardly useful as it only works when the
hierarchy is empty, and release_agent itself should be replaced with
conventional fsnotify. The support for remounting will be removed in
the future.

To specify a hierarchy's release_agent::

  # mount -t cgroup -o cpuset,release_agent="/sbin/cpuset_release_agent" \
    xxx /sys/fs/cgroup/rg1

Note that specifying 'release_agent' more than once will return failure.

Note that changing the set of subsystems is currently only supported
when the hierarchy consists of a single (root) cgroup. Supporting
the ability to arbitrarily bind/unbind subsystems from an existing
cgroup hierarchy is intended to be implemented in the future.

Then under /sys/fs/cgroup/rg1 you can find a tree that corresponds to the
tree of the cgroups in the system. For instance, /sys/fs/cgroup/rg1
is the cgroup that holds the whole system.

If you want to change the value of release_agent::

  # echo "/sbin/new_release_agent" > /sys/fs/cgroup/rg1/release_agent

It can also be changed via remount.
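
Section 1.4 states that the kernel passes the release agent the path of
the abandoned cgroup, relative to the mount point, so that abandoned
cgroups can be removed automatically. The core of such an agent might
look like the sketch below. The function name
``remove_released_cgroup`` is hypothetical, and the mount point is
assumed to be known to the script, since the kernel supplies only the
relative path.

```shell
# Hypothetical release agent body: remove the abandoned cgroup directory.
# $1 = mount point of the hierarchy (assumed known to the script)
# $2 = cgroup path relative to that mount point, as passed to the agent
remove_released_cgroup() {
    mount_point=$1
    rel_path=$2
    # Never try to remove the root cgroup itself.
    [ -n "$rel_path" ] && [ "$rel_path" != "/" ] || return 1
    # rmdir fails if the cgroup still has tasks or child cgroups.
    rmdir "$mount_point/${rel_path#/}"
}
```

A script installed as the release agent for the hierarchy mounted on
/sys/fs/cgroup/rg1 could then call
``remove_released_cgroup /sys/fs/cgroup/rg1 "$1"``.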

If you want to create a new cgroup under /sys/fs/cgroup/rg1::

  # cd /sys/fs/cgroup/rg1
  # mkdir my_cgroup

Now you want to do something with this cgroup::

  # cd my_cgroup

In this directory you can find several files::

  # ls
  cgroup.procs notify_on_release tasks
  (plus whatever files are added by the attached subsystems)

Now attach your shell to this cgroup::

  # /bin/echo $$ > tasks

You can also create cgroups inside your cgroup by using mkdir in this
directory::

  # mkdir my_sub_cs

To remove a cgroup, just use rmdir::

  # rmdir my_sub_cs

This will fail if the cgroup is in use (has cgroups inside, or
has processes attached, or is held alive by other subsystem-specific
reference).

2.2 Attaching processes
-----------------------

::

  # /bin/echo PID > tasks

Note that it is PID, not PIDs. You can only attach ONE task at a time.
If you have several tasks to attach, you have to do it one after another::

  # /bin/echo PID1 > tasks
  # /bin/echo PID2 > tasks
  ...
  # /bin/echo PIDn > tasks

You can attach the current shell task by echoing 0::

  # echo 0 > tasks

You can use the cgroup.procs file instead of the tasks file to move all
threads in a threadgroup at once. Echoing the PID of any task in a
threadgroup to cgroup.procs causes all tasks in that threadgroup to be
attached to the cgroup. Writing 0 to cgroup.procs moves all tasks
in the writing task's threadgroup.

Note: Since every task is always a member of exactly one cgroup in each
mounted hierarchy, to remove a task from its current cgroup you must
move it into a new cgroup (possibly the root cgroup) by writing to the
new cgroup's tasks file.

Note: Due to some restrictions enforced by some cgroup subsystems, moving
a process to another cgroup can fail.

2.3 Mounting hierarchies by name
--------------------------------

Passing the name=<x> option when mounting a cgroups hierarchy
associates the given name with the hierarchy.
This can be used when
mounting a pre-existing hierarchy, in order to refer to it by name
rather than by its set of active subsystems. Each hierarchy is either
nameless, or has a unique name.

The name should match [\w.-]+

When passing a name=<x> option for a new hierarchy, you need to
specify subsystems manually; the legacy behaviour of mounting all
subsystems when none are explicitly specified is not supported when
you give a hierarchy a name.

The name of the hierarchy appears as part of the hierarchy description
in /proc/mounts and /proc/<pid>/cgroup.


3. Kernel API
=============

3.1 Overview
------------

Each kernel subsystem that wants to hook into the generic cgroup
system needs to create a cgroup_subsys object. This contains
various methods, which are callbacks from the cgroup system, along
with a subsystem ID which will be assigned by the cgroup system.

Other fields in the cgroup_subsys object include:

- subsys_id: a unique array index for the subsystem, indicating which
  entry in cgroup->subsys[] this subsystem should be managing.

- name: should be initialized to a unique subsystem name.
  Should be no longer than MAX_CGROUP_TYPE_NAMELEN.

- early_init: indicates whether the subsystem needs early initialization
  at system boot.

Each cgroup object created by the system has an array of pointers,
indexed by subsystem ID; each pointer is entirely managed by its
subsystem; the generic cgroup code will never touch it.

3.2 Synchronization
-------------------

There is a global mutex, cgroup_mutex, used by the cgroup
system. This should be taken by anything that wants to modify a
cgroup. It may also be taken to prevent cgroups from being
modified, but more specific locks may be more appropriate in that
situation.

See kernel/cgroup.c for more details.

Subsystems can take/release the cgroup_mutex via the functions
cgroup_lock()/cgroup_unlock().

Accessing a task's cgroup pointer may be done in the following ways:

- while holding cgroup_mutex
- while holding the task's alloc_lock (via task_lock())
- inside an rcu_read_lock() section via rcu_dereference()

3.3 Subsystem API
-----------------

Each subsystem should:

- add an entry in linux/cgroup_subsys.h
- define a cgroup_subsys object called <name>_cgrp_subsys

Each subsystem may export the following methods. The only mandatory
methods are css_alloc() and css_free(). Any others that are null are
presumed to be successful no-ops.

``struct cgroup_subsys_state *css_alloc(struct cgroup *cgrp)``
(cgroup_mutex held by caller)

Called to allocate a subsystem state object for a cgroup. The
subsystem should allocate its subsystem state object for the passed
cgroup, returning a pointer to the new object on success or an
ERR_PTR() value on failure. On success, the subsystem pointer should
point to a structure of type cgroup_subsys_state (typically embedded in a
larger subsystem-specific object), which will be initialized by the
cgroup system.
Note that this will be called at initialization to
create the root subsystem state for this subsystem; this case can be
identified by the passed cgroup object having a NULL parent (since
it's the root of the hierarchy) and may be an appropriate place for
initialization code.

``int css_online(struct cgroup *cgrp)``
(cgroup_mutex held by caller)

Called after @cgrp has successfully completed all allocations and been
made visible to cgroup_for_each_child/descendant_*() iterators. The
subsystem may choose to fail creation by returning -errno. This
callback can be used to implement reliable state sharing and
propagation along the hierarchy. See the comment on
cgroup_for_each_descendant_pre() for details.

``void css_offline(struct cgroup *cgrp)``
(cgroup_mutex held by caller)

This is the counterpart of css_online() and is called if and only if
css_online() has succeeded on @cgrp. This signifies the beginning of the
end of @cgrp. @cgrp is being removed and the subsystem should start dropping
all references it's holding on @cgrp. When all references are dropped,
cgroup removal will proceed to the next step - css_free(). After this
callback, @cgrp should be considered dead to the subsystem.

``void css_free(struct cgroup *cgrp)``
(cgroup_mutex held by caller)

The cgroup system is about to free @cgrp; the subsystem should free
its subsystem state object. By the time this method is called, @cgrp
is completely unused; @cgrp->parent is still valid. (Note - can also
be called for a newly-created cgroup if an error occurs after this
subsystem's css_alloc() method has been called for the new cgroup).

``int can_attach(struct cgroup *cgrp, struct cgroup_taskset *tset)``
(cgroup_mutex held by caller)

Called prior to moving one or more tasks into a cgroup; if the
subsystem returns an error, this will abort the attach operation.
@tset contains the tasks to be attached and is guaranteed to have at
least one task in it.

If there are multiple tasks in the taskset, then:
  - it's guaranteed that all are from the same thread group
  - @tset contains all tasks from the thread group whether or not
    they're switching cgroups
  - the first task is the leader

Each @tset entry also contains the task's old cgroup, and tasks which
aren't switching cgroups can be skipped easily using the
cgroup_taskset_for_each() iterator. Note that this isn't called on a
fork.
If this method returns 0 (success) then this should remain valid
while the caller holds cgroup_mutex and it is ensured that either
attach() or cancel_attach() will be called in future.

``void css_reset(struct cgroup_subsys_state *css)``
(cgroup_mutex held by caller)

An optional operation which should restore @css's configuration to the
initial state. This is currently only used on the unified hierarchy
when a subsystem is disabled on a cgroup through
"cgroup.subtree_control" but should remain enabled because other
subsystems depend on it. cgroup core makes such a css invisible by
removing the associated interface files and invokes this callback so
that the hidden subsystem can return to the initial neutral state.
This prevents unexpected resource control from a hidden css and
ensures that the configuration is in the initial state when it is made
visible again later.

``void cancel_attach(struct cgroup *cgrp, struct cgroup_taskset *tset)``
(cgroup_mutex held by caller)

Called when a task attach operation has failed after can_attach() has succeeded.
A subsystem whose can_attach() has some side-effects should provide this
function, so that the subsystem can implement a rollback. Otherwise it is
not needed. This is called only for subsystems whose can_attach() operation
has succeeded.
The parameters are identical to can_attach().

``void attach(struct cgroup *cgrp, struct cgroup_taskset *tset)``
(cgroup_mutex held by caller)

Called after the task has been attached to the cgroup, to allow any
post-attachment activity that requires memory allocations or blocking.
The parameters are identical to can_attach().

``void fork(struct task_struct *task)``

Called when a task is forked into a cgroup.

``void exit(struct task_struct *task)``

Called during task exit.

``void free(struct task_struct *task)``

Called when the task_struct is freed.

``void bind(struct cgroup *root)``
(cgroup_mutex held by caller)

Called when a cgroup subsystem is rebound to a different hierarchy
and root cgroup. Currently this will only involve movement between
the default hierarchy (which never has sub-cgroups) and a hierarchy
that is being created/destroyed (and hence has no sub-cgroups).

4. Extended attribute usage
===========================

cgroup filesystem supports certain types of extended attributes in its
directories and files.
The currently supported types are:

  - Trusted (XATTR_TRUSTED)
  - Security (XATTR_SECURITY)

Both require CAP_SYS_ADMIN capability to set.

Like in tmpfs, the extended attributes in cgroup filesystem are stored
using kernel memory and it's advised to keep the usage to a minimum. This
is the reason why user defined extended attributes are not supported, since
any user can set them and there is no limit on the value size.

The current known users for this feature are SELinux to limit cgroup usage
in containers and systemd for assorted meta data like main PID in a cgroup
(systemd creates a cgroup per service).

5. Questions
============

::

    Q: what's up with this '/bin/echo' ?
    A: bash's builtin 'echo' command does not check calls to write() against
       errors. If you use it in the cgroup file system, you won't be
       able to tell whether a command succeeded or failed.

    Q: When I attach processes, only the first of the line gets really attached !
    A: We can only return one error code per call to write(). So you should also
       put only ONE PID.
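
The one-PID-per-write rule above can be wrapped in a small shell
function. This is only a sketch under stated assumptions: attach_pids is
a hypothetical helper name, not part of any cgroup tooling; it assumes
/bin/echo is available and reports a failed write() through its exit
status, and that the first argument is the path to a cgroup's tasks
file::

    # Hypothetical helper (not part of cgroups): attach each PID with
    # its own write(), so that a failure can be attributed to one PID.
    # $1 is the path to a "tasks" file; the remaining arguments are PIDs.
    attach_pids() {
        tasks_file=$1
        shift
        failed=0
        for pid in "$@"; do
            # One /bin/echo per PID means one write() per PID.
            if ! /bin/echo "$pid" > "$tasks_file"; then
                echo "attach_pids: could not attach $pid" >&2
                failed=1
            fi
        done
        return "$failed"
    }

It could then be called as ``attach_pids tasks PID1 PID2``; a non-zero
exit status means that at least one PID was not attached.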