.. SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)

===========================
BPF_PROG_TYPE_CGROUP_SYSCTL
===========================

This document describes the ``BPF_PROG_TYPE_CGROUP_SYSCTL`` program type, which
provides a cgroup-bpf hook for sysctl.

The hook has to be attached to a cgroup and is called every time a process
inside that cgroup tries to read from or write to a sysctl knob in proc.

1. Attach type
**************

The ``BPF_CGROUP_SYSCTL`` attach type has to be used to attach a
``BPF_PROG_TYPE_CGROUP_SYSCTL`` program to a cgroup.
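
Attaching can be sketched from user space with the raw ``bpf(2)`` system call.
This is a minimal sketch, not the only way to attach: ``attach_sysctl_prog()``
is a hypothetical helper name, ``prog_fd`` is assumed to be the file descriptor
of an already loaded ``BPF_PROG_TYPE_CGROUP_SYSCTL`` program, and ``cgroup_fd``
is assumed to be a descriptor for the target cgroup directory. libbpf's
``bpf_prog_attach()`` issues the same command::

    #include <errno.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/syscall.h>
    #include <linux/bpf.h>

    /* Hypothetical helper: attach a loaded sysctl program to a cgroup.
     * Returns 0 on success or a negative errno value on failure.
     */
    static int attach_sysctl_prog(int prog_fd, int cgroup_fd)
    {
        union bpf_attr attr;

        memset(&attr, 0, sizeof(attr));
        attr.target_fd = cgroup_fd;            /* cgroup directory fd */
        attr.attach_bpf_fd = prog_fd;          /* loaded program fd */
        attr.attach_type = BPF_CGROUP_SYSCTL;
        attr.attach_flags = 0;                 /* no override/multi flags */

        if (syscall(__NR_bpf, BPF_PROG_ATTACH, &attr, sizeof(attr)) < 0)
            return -errno;
        return 0;
    }

A caller would typically obtain ``cgroup_fd`` with ``open(2)`` on the cgroup
directory using ``O_RDONLY | O_DIRECTORY`` and close it once the program is
attached.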

2. Context
**********

``BPF_PROG_TYPE_CGROUP_SYSCTL`` provides access to the following context from
a BPF program::

    struct bpf_sysctl {
        __u32 write;
        __u32 file_pos;
    };

* ``write`` indicates whether the sysctl value is being read (``0``) or
  written (``1``). This field is read-only.

* ``file_pos`` indicates the file position at which the sysctl is being
  accessed, for either a read or a write. This field is read-write. Writing to
  the field sets the starting position in the sysctl proc file that ``read(2)``
  will read from or ``write(2)`` will write to. Writing zero to the field can
  be used, for example, to override the whole sysctl value with
  ``bpf_sysctl_set_new_value()`` on ``write(2)`` even when user space called it
  at ``file_pos > 0``. Writing a non-zero value to the field can be used to
  access part of the sysctl value starting from the specified ``file_pos``.
  Not all sysctls support access with ``file_pos != 0``; for example, writes to
  numeric sysctl entries must always be at file position ``0``. See also the
  ``kernel.sysctl_writes_strict`` sysctl.

See `linux/bpf.h`_ for more details on how the context fields can be accessed.
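
The two context fields can be used as in the following minimal sketch. It
assumes libbpf's ``bpf/bpf_helpers.h`` for the ``SEC()`` macro and the
``cgroup/sysctl`` section name that libbpf maps to this program type. The
function name ``sysctl_write_from_start`` and the policy itself are
illustrative only: reads pass through untouched, and every write is made to
start at file position ``0``::

    #include <linux/bpf.h>
    #include <bpf/bpf_helpers.h>

    SEC("cgroup/sysctl")
    int sysctl_write_from_start(struct bpf_sysctl *ctx)
    {
        /* write is read-only: 0 for read(2), 1 for write(2). */
        if (!ctx->write)
            return 1;   /* proceed with the read unchanged */

        /* file_pos is read-write: start every write at offset 0. */
        if (ctx->file_pos != 0)
            ctx->file_pos = 0;

        return 1;       /* proceed with the write */
    }

    char _license[] SEC("license") = "GPL";

Such a program would normally be compiled with ``clang -target bpf`` and
loaded through libbpf before being attached to a cgroup.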

3. Return code
**************

A ``BPF_PROG_TYPE_CGROUP_SYSCTL`` program must return one of the following
return codes:

* ``0`` means "reject access to sysctl";
* ``1`` means "proceed with access".

If the program returns ``0``, user space will get ``-1`` from ``read(2)`` or
``write(2)`` and ``errno`` will be set to ``EPERM``.
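
From user space, a rejected access looks like any other failed ``read(2)`` or
``write(2)``. In the sketch below, ``write_sysctl()`` is a hypothetical helper
and the path is supplied by the caller. It writes a value and reports failure
as a negative ``errno`` value, so a rejection by the hook would show up as
``-EPERM``::

    #include <errno.h>
    #include <fcntl.h>
    #include <string.h>
    #include <unistd.h>

    /* Hypothetical helper: write a string to a sysctl file.
     * Returns 0 on success or a negative errno value; -EPERM is what
     * the caller sees when an attached program returned 0.
     */
    static int write_sysctl(const char *path, const char *value)
    {
        size_t len = strlen(value);
        ssize_t n;
        int ret = 0;
        int fd;

        fd = open(path, O_WRONLY);
        if (fd < 0)
            return -errno;

        /* One write(2) at file position 0, as numeric sysctls expect. */
        n = write(fd, value, len);
        if (n < 0)
            ret = -errno;
        else if ((size_t)n != len)
            ret = -EIO;     /* short write, reported as an I/O error */

        close(fd);
        return ret;
    }

Note that ``EPERM`` on its own does not prove that the hook rejected the
access, since other checks in the kernel can return the same error.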

4. Helpers
**********

Since a sysctl knob is represented by a name and a value, the sysctl-specific
BPF helpers focus on providing access to these properties:

* ``bpf_sysctl_get_name()`` to get the sysctl name, as it is visible in
  ``/proc/sys``, into a buffer provided by the BPF program;

* ``bpf_sysctl_get_current_value()`` to get the string value currently held by
  the sysctl into a buffer provided by the BPF program. This helper is
  available on both ``read(2)`` from and ``write(2)`` to the sysctl;

* ``bpf_sysctl_get_new_value()`` to get the new string value being written to
  the sysctl before the actual write happens. This helper can be used only
  when ``ctx->write == 1``;

* ``bpf_sysctl_set_new_value()`` to override the new string value being
  written to the sysctl before the actual write happens. The sysctl value will
  be overridden starting from the current ``ctx->file_pos``. If the whole value
  has to be overridden, the BPF program can set ``file_pos`` to zero before
  calling the helper. This helper can be used only when ``ctx->write == 1``.
  The new string value set by the helper is treated and verified by the kernel
  the same way as an equivalent string passed by user space.

A BPF program sees the sysctl value the same way as user space does in the
proc filesystem, i.e. as a string. Since many sysctl values represent an
integer or a vector of integers, the following helpers can be used to get a
numeric value from the string:

* ``bpf_strtol()`` to convert the initial part of the string to a long integer,
  similar to user space `strtol(3)`_;
* ``bpf_strtoul()`` to convert the initial part of the string to an unsigned
  long integer, similar to user space `strtoul(3)`_.

See `linux/bpf.h`_ for more details on the helpers described here.
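
Put together, the helpers can be used as in the sketch below. It is an
illustration written for this document, not code from the kernel tree, and it
assumes libbpf's ``bpf/bpf_helpers.h``. The program name
``sysctl_limit_somaxconn``, the choice of ``net/core/somaxconn`` and the limit
``MAX_ALLOWED`` are all hypothetical. Writes to that one sysctl are rejected
when the new value does not parse as a decimal integer within the limit; every
other access proceeds::

    #include <linux/bpf.h>
    #include <bpf/bpf_helpers.h>

    #define MAX_ALLOWED 4096    /* hypothetical upper bound */

    static const char target[] = "net/core/somaxconn";

    SEC("cgroup/sysctl")
    int sysctl_limit_somaxconn(struct bpf_sysctl *ctx)
    {
        char name[sizeof(target)] = { 0 };
        char value[16] = { 0 };
        long parsed = 0;
        unsigned int i;
        int ret;

        if (!ctx->write)
            return 1;   /* reads are always allowed */

        /* Name relative to /proc/sys; negative if it does not fit. */
        ret = bpf_sysctl_get_name(ctx, name, sizeof(name), 0);
        if (ret != sizeof(target) - 1)
            return 1;   /* some other sysctl */

        for (i = 0; i < sizeof(target); i++)
            if (name[i] != target[i])
                return 1;

        /* New value as written by user space, still a string.
         * Values that do not fit in the buffer are rejected.
         */
        ret = bpf_sysctl_get_new_value(ctx, value, sizeof(value));
        if (ret <= 0)
            return 0;

        /* The low bits of flags select the base; 10 means decimal. */
        ret = bpf_strtol(value, sizeof(value), 10, &parsed);
        if (ret <= 0 || parsed < 0 || parsed > MAX_ALLOWED)
            return 0;   /* reject: user space sees EPERM */

        return 1;       /* proceed with the write */
    }

    char _license[] SEC("license") = "GPL";

Instead of rejecting an out-of-range value, a program could clamp it by
passing a replacement string to ``bpf_sysctl_set_new_value()`` and returning
``1``.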

5. Examples
***********

See `test_sysctl_prog.c`_ for an example of a BPF program in C that accesses
the sysctl name and value, parses the string value to get a vector of integers
and uses the result to decide whether to allow or deny access to the sysctl.

6. Notes
********

``BPF_PROG_TYPE_CGROUP_SYSCTL`` is intended to be used in a **trusted** root
environment, for example to monitor sysctl usage or to catch unreasonable
values that an application, running as root in a separate cgroup, is trying to
set.

Since ``task_dfl_cgroup(current)`` is called at ``sys_read`` / ``sys_write``
time, it may return results different from those at ``sys_open`` time. That
is, the process that opened the sysctl file in the proc filesystem may differ
from the process that is trying to read from or write to it, and the two
processes may run in different cgroups. This means
``BPF_PROG_TYPE_CGROUP_SYSCTL`` should not be used as a security mechanism to
limit sysctl usage.

As with any cgroup-bpf program, additional care should be taken if an
application running as root in a cgroup should not be allowed to
detach/replace a BPF program attached by the administrator.

.. Links
.. _linux/bpf.h: ../../include/uapi/linux/bpf.h
.. _strtol(3): http://man7.org/linux/man-pages/man3/strtol.3p.html
.. _strtoul(3): http://man7.org/linux/man-pages/man3/strtoul.3p.html
.. _test_sysctl_prog.c:
   ../../tools/testing/selftests/bpf/progs/test_sysctl_prog.c