=============
BPF Iterators
=============


----------
Motivation
----------

There are a few existing ways to dump kernel data into user space. The most
popular one is the ``/proc`` system. For example, ``cat /proc/net/tcp6`` dumps
all tcp6 sockets in the system, and ``cat /proc/net/netlink`` dumps all netlink
sockets in the system. However, their output format tends to be fixed, and if
users want more information about these sockets, they have to patch the kernel,
which often takes time to publish upstream and release. The same is true for popular
tools like `ss <https://man7.org/linux/man-pages/man8/ss.8.html>`_ where any
additional information needs a kernel patch.

To solve this problem, the `drgn
<https://www.kernel.org/doc/html/latest/bpf/drgn.html>`_ tool is often used to
dig out the kernel data with no kernel change. However, the main drawback for
drgn is performance, as it cannot do pointer tracing inside the kernel. In
addition, drgn cannot validate a pointer value and may read invalid data if the
pointer becomes invalid inside the kernel.

The BPF iterator solves the above problem by providing flexibility on what data
(e.g., tasks, bpf_maps, etc.) to collect by calling BPF programs for each kernel
data object.

----------------------
How BPF Iterators Work
----------------------

A BPF iterator is a type of BPF program that allows users to iterate over
specific types of kernel objects. Unlike traditional BPF tracing programs that
allow users to define callbacks that are invoked at particular points of
execution in the kernel, BPF iterators allow users to define callbacks that
should be executed for every entry in a variety of kernel data structures.

For example, users can define a BPF iterator that iterates over every task on
the system and dumps the total amount of CPU runtime currently used by each of
them. Another BPF task iterator may instead dump the cgroup information for each
task. Such flexibility is the core value of BPF iterators.

A BPF program is always loaded into the kernel at the behest of a user space
process. A user space process loads a BPF program by opening and initializing
the program skeleton as required and then invoking a syscall to have the BPF
program verified and loaded by the kernel.

In traditional tracing programs, a program is activated by having user space
obtain a ``bpf_link`` to the program with ``bpf_program__attach()``. Once
activated, the program callback will be invoked whenever the tracepoint is
triggered in the main kernel. For BPF iterator programs, a ``bpf_link`` to the
program is obtained using ``bpf_link_create()``, and the program callback is
invoked by issuing system calls from user space.

Next, let us see how you can use the iterators to iterate on kernel objects and
read data.

------------------------
How to Use BPF Iterators
------------------------

BPF selftests are a great resource to illustrate how to use the iterators. In
this section, we’ll walk through a BPF selftest which shows how to load and use
a BPF iterator program. To begin, we’ll look at `bpf_iter.c
<https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/tree/tools/testing/selftests/bpf/prog_tests/bpf_iter.c>`_,
which illustrates how to load and trigger BPF iterators on the user space side.
Later, we’ll look at a BPF program that runs in kernel space.

Loading a BPF iterator in the kernel from user space typically involves the
following steps:

* The BPF program is loaded into the kernel through ``libbpf``. Once the kernel
  has verified and loaded the program, it returns a file descriptor (fd) to user
  space.
* Obtain a ``link_fd`` to the BPF program by calling ``bpf_link_create()``
  with the BPF program file descriptor received from the kernel.
* Next, obtain a BPF iterator file descriptor (``bpf_iter_fd``) by calling
  ``bpf_iter_create()`` with the ``link_fd`` obtained in the previous step.
* Trigger the iteration by calling ``read(bpf_iter_fd)`` until no data is
  available.
* Close the iterator fd using ``close(bpf_iter_fd)``.
* To reread the data, obtain a new ``bpf_iter_fd`` and read again.
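
The read-until-no-data step in the list above can be sketched in plain C. This
is a minimal sketch, not code taken from the selftest: ``drain_iter_fd()`` is a
hypothetical helper name, and the descriptor passed in is assumed to be the
``bpf_iter_fd`` returned by ``bpf_iter_create()``. Because the loop only relies
on ``read()``, it works on any readable file descriptor.

```c
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Read everything available from fd and copy it to out.
 * Returns the total number of bytes read, or -1 on error.
 * Hypothetical helper: fd stands in for the bpf_iter_fd of the steps above.
 */
static long drain_iter_fd(int fd, FILE *out)
{
        char buf[4096];
        long total = 0;
        ssize_t len;

        for (;;) {
                len = read(fd, buf, sizeof(buf));
                if (len == 0)           /* no more data: iteration is done */
                        break;
                if (len < 0) {
                        if (errno == EINTR || errno == EAGAIN)
                                continue;       /* transient, try again */
                        return -1;
                }
                if (fwrite(buf, 1, (size_t)len, out) != (size_t)len)
                        return -1;
                total += len;
        }
        return total;
}
```

A caller would typically invoke ``drain_iter_fd(bpf_iter_fd, stdout)`` and then
``close(bpf_iter_fd)``. Reading again requires a fresh iterator fd, as noted in
the last step of the list.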

The following are a few examples of selftest BPF iterator programs:

* `bpf_iter_tcp4.c <https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/tree/tools/testing/selftests/bpf/progs/bpf_iter_tcp4.c>`_
* `bpf_iter_task_vma.c <https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/tree/tools/testing/selftests/bpf/progs/bpf_iter_task_vma.c>`_
* `bpf_iter_task_file.c <https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/tree/tools/testing/selftests/bpf/progs/bpf_iter_task_file.c>`_

Let us look at ``bpf_iter_task_file.c``, which runs in kernel space.

Here is the definition of ``bpf_iter__task_file`` in `vmlinux.h
<https://facebookmicrosites.github.io/bpf/blog/2020/02/19/bpf-portability-and-co-re.html#btf>`_.
Any struct name in ``vmlinux.h`` in the format ``bpf_iter__<iter_name>``
represents a BPF iterator. The suffix ``<iter_name>`` represents the type of
iterator.

::

    struct bpf_iter__task_file {
            union {
                    struct bpf_iter_meta *meta;
            };
            union {
                    struct task_struct *task;
            };
            u32 fd;
            union {
                    struct file *file;
            };
    };

In the above code, the field ``meta`` contains the metadata, which is the same
for all BPF iterator programs. The rest of the fields are specific to different
iterators. For example, for task_file iterators, the kernel layer provides the
``task``, ``fd`` and ``file`` field values. The ``task`` and ``file`` are `reference
counted
<https://facebookmicrosites.github.io/bpf/blog/2018/08/31/object-lifetime.html#file-descriptors-and-reference-counters>`_,
so they won't go away while the BPF program runs.

Here is a snippet from the ``bpf_iter_task_file.c`` file:

::

  SEC("iter/task_file")
  int dump_task_file(struct bpf_iter__task_file *ctx)
  {
    struct seq_file *seq = ctx->meta->seq;
    struct task_struct *task = ctx->task;
    struct file *file = ctx->file;
    __u32 fd = ctx->fd;

    if (task == NULL || file == NULL)
      return 0;

    /* count, tgid, last_tgid and unique_tgid_count are globals declared
     * earlier in the file.
     */
    if (ctx->meta->seq_num == 0) {
      count = 0;
      BPF_SEQ_PRINTF(seq, "    tgid      gid       fd      file\n");
    }

    if (tgid == task->tgid && task->tgid != task->pid)
      count++;

    if (last_tgid != task->tgid) {
      last_tgid = task->tgid;
      unique_tgid_count++;
    }

    BPF_SEQ_PRINTF(seq, "%8d %8d %8d %lx\n", task->tgid, task->pid, fd,
            (long)file->f_op);
    return 0;
  }

In the above example, the section name ``SEC("iter/task_file")`` indicates that
the program is a BPF iterator program to iterate all files from all tasks. The
context of the program is the ``bpf_iter__task_file`` struct.

The user space program invokes the BPF iterator program running in the kernel
by issuing a ``read()`` syscall. Once invoked, the BPF
program can export data to user space using a variety of BPF helper functions.
You can use either ``bpf_seq_printf()`` (and the ``BPF_SEQ_PRINTF`` helper macro)
or ``bpf_seq_write()``, depending on whether you need formatted output or just
binary data, respectively. For binary-encoded data, the user space applications
can process the data from ``bpf_seq_write()`` as needed. For the formatted data,
you can use ``cat <path>`` to print the results similar to ``cat
/proc/net/netlink`` after pinning the BPF iterator to the bpffs mount. Later,
use ``rm -f <path>`` to remove the pinned iterator.
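
For the binary case, user space has to know the record layout that the BPF
program passes to ``bpf_seq_write()``. The following is a minimal sketch of the
user space side under an assumed layout: ``struct file_rec`` and
``parse_file_recs()`` are hypothetical names introduced here for illustration
and do not appear in the selftests.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical record layout. The BPF side is assumed to emit one record
 * per object with bpf_seq_write(seq, &rec, sizeof(rec)).
 */
struct file_rec {
        uint32_t tgid;
        uint32_t pid;
        uint32_t fd;
};

/*
 * Decode up to max complete records from buf into out.
 * Trailing bytes that do not form a whole record are ignored, so a caller
 * that reads in chunks should keep them and prepend them to the next chunk.
 * Returns the number of records stored in out.
 */
static size_t parse_file_recs(const unsigned char *buf, size_t len,
                              struct file_rec *out, size_t max)
{
        size_t n = 0;

        while (n < max && len >= sizeof(*out)) {
                /* memcpy avoids unaligned access into the byte buffer */
                memcpy(&out[n], buf, sizeof(*out));
                buf += sizeof(*out);
                len -= sizeof(*out);
                n++;
        }
        return n;
}
```

The buffer would be filled by ``read()`` calls on the iterator fd. Since a
``read()`` may return a partial record, the sketch deliberately decodes only
whole records.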

For example, you can use the following command to create a BPF iterator from the
``bpf_iter_ipv6_route.o`` object file and pin it to the ``/sys/fs/bpf/my_route``
path:

::

  $ bpftool iter pin ./bpf_iter_ipv6_route.o /sys/fs/bpf/my_route

And then print out the results using the following command:

::

  $ cat /sys/fs/bpf/my_route


-------------------------------------------------------
Implement Kernel Support for BPF Iterator Program Types
-------------------------------------------------------

To implement a BPF iterator in the kernel, the developer must make a one-time
change to the following key data structure defined in the `bpf.h
<https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/tree/include/linux/bpf.h>`_
file.

::

  struct bpf_iter_reg {
          const char *target;
          bpf_iter_attach_target_t attach_target;
          bpf_iter_detach_target_t detach_target;
          bpf_iter_show_fdinfo_t show_fdinfo;
          bpf_iter_fill_link_info_t fill_link_info;
          bpf_iter_get_func_proto_t get_func_proto;
          u32 ctx_arg_info_size;
          u32 feature;
          struct bpf_ctx_arg_aux ctx_arg_info[BPF_ITER_CTX_ARG_MAX];
          const struct bpf_iter_seq_info *seq_info;
  };

After filling the data structure fields, call ``bpf_iter_reg_target()`` to
register the iterator to the main BPF iterator subsystem.

The following is the breakdown for each field in struct ``bpf_iter_reg``.

.. list-table::
   :widths: 25 50
   :header-rows: 1

   * - Fields
     - Description
   * - target
     - Specifies the name of the BPF iterator. For example: ``bpf_map``,
       ``bpf_map_elem``. The name should be different from other ``bpf_iter``
       target names in the kernel.
   * - attach_target and detach_target
     - Allows for target-specific ``link_create`` action since some targets
       may need special processing. Called during the user space
       ``link_create`` stage.
   * - show_fdinfo and fill_link_info
     - Called to fill target-specific information when the user tries to get
       link info associated with the iterator.
   * - get_func_proto
     - Permits a BPF iterator to access BPF helpers specific to the iterator.
   * - ctx_arg_info_size and ctx_arg_info
     - Specifies the verifier states for BPF program arguments associated with
       the BPF iterator.
   * - feature
     - Specifies certain action requests in the kernel BPF iterator
       infrastructure. Currently, only ``BPF_ITER_RESCHED`` is supported. This
       means that the kernel function ``cond_resched()`` is called to keep
       other kernel subsystems (e.g., RCU) from misbehaving.
   * - seq_info
     - Specifies the set of seq operations for the BPF iterator and helpers to
       initialize/free the private data for the corresponding ``seq_file``.

See `this patch
<https://lore.kernel.org/bpf/20210212183107.50963-2-songliubraving@fb.com/>`_
for an implementation of the ``task_vma`` BPF iterator in the kernel.

---------------------------------
Parameterizing BPF Task Iterators
---------------------------------

By default, BPF iterators walk through all the objects of the specified types
(processes, cgroups, maps, etc.) across the entire system to read relevant
kernel data. But often, there are cases where we only care about a much smaller
subset of iterable kernel objects, such as only iterating tasks within a
specific process. Therefore, BPF iterator programs support filtering out objects
from iteration by allowing user space to configure the iterator program when it
is attached.

-------------------------
BPF Task Iterator Program
-------------------------

The following code is a BPF iterator program to print files and task information
through the ``seq_file`` of the iterator. It is a standard BPF iterator program
that visits every file the iterator yields. We will use this BPF program in our
example later.

::

  #include <vmlinux.h>
  #include <bpf/bpf_helpers.h>
  #include <bpf/bpf_tracing.h>  /* BPF_SEQ_PRINTF */

  char _license[] SEC("license") = "GPL";

  SEC("iter/task_file")
  int dump_task_file(struct bpf_iter__task_file *ctx)
  {
        struct seq_file *seq = ctx->meta->seq;
        struct task_struct *task = ctx->task;
        struct file *file = ctx->file;
        __u32 fd = ctx->fd;
        if (task == NULL || file == NULL)
                return 0;
        if (ctx->meta->seq_num == 0) {
                BPF_SEQ_PRINTF(seq, "    tgid      pid       fd      file\n");
        }
        BPF_SEQ_PRINTF(seq, "%8d %8d %8d %lx\n", task->tgid, task->pid, fd,
                        (long)file->f_op);
        return 0;
  }

----------------------------------------
Creating a File Iterator with Parameters
----------------------------------------

Now, let us look at how to create an iterator that includes only files of a
process.

First, fill the ``bpf_iter_attach_opts`` struct as shown below:

::

  LIBBPF_OPTS(bpf_iter_attach_opts, opts);
  union bpf_iter_link_info linfo;
  memset(&linfo, 0, sizeof(linfo));
  linfo.task.pid = getpid();
  opts.link_info = &linfo;
  opts.link_info_len = sizeof(linfo);

``linfo.task.pid``, if it is non-zero, directs the kernel to create an iterator
that only includes opened files for the process with the specified ``pid``. In
this example, we will only be iterating files for our process. If
``linfo.task.pid`` is zero, the iterator will visit every opened file of every
process. Similarly, ``linfo.task.tid`` directs the kernel to create an iterator
that visits opened files of a specific thread, not a process. In this example,
using ``linfo.task.tid`` would give a different result from ``linfo.task.pid``
only if the thread has a separate file descriptor table. In most circumstances,
all process threads share a single file descriptor table.

Now, in the user space program, pass a pointer to the struct to
``bpf_program__attach_iter()``.

::

  link = bpf_program__attach_iter(prog, &opts);
  iter_fd = bpf_iter_create(bpf_link__fd(link));

If both *tid* and *pid* are zero, an iterator created from this
``bpf_iter_attach_opts`` struct will include every opened file of every task in
the system (more precisely, in the current *pid* namespace). It is the same as
passing NULL as the second argument to ``bpf_program__attach_iter()``.

The whole program looks like the following code:

::

  #include <stdio.h>
  #include <string.h>
  #include <unistd.h>
  #include <bpf/bpf.h>
  #include <bpf/libbpf.h>
  #include "bpf_iter_task_ex.skel.h"

  static int do_read_opts(struct bpf_program *prog, struct bpf_iter_attach_opts *opts)
  {
        struct bpf_link *link;
        char buf[16] = {};
        int iter_fd = -1, len;
        int ret = 0;

        link = bpf_program__attach_iter(prog, opts);
        if (!link) {
                fprintf(stderr, "bpf_program__attach_iter() fails\n");
                return -1;
        }
        iter_fd = bpf_iter_create(bpf_link__fd(link));
        if (iter_fd < 0) {
                fprintf(stderr, "bpf_iter_create() fails\n");
                ret = -1;
                goto free_link;
        }
        /* print whatever the iterator produces; the contents are not checked */
        while ((len = read(iter_fd, buf, sizeof(buf) - 1)) > 0) {
                buf[len] = 0;
                printf("%s", buf);
        }
        printf("\n");
  free_link:
        if (iter_fd >= 0)
                close(iter_fd);
        bpf_link__destroy(link);
        return ret;
  }

  static void test_task_file(void)
  {
        LIBBPF_OPTS(bpf_iter_attach_opts, opts);
        struct bpf_iter_task_ex *skel;
        union bpf_iter_link_info linfo;
        skel = bpf_iter_task_ex__open_and_load();
        if (skel == NULL)
                return;
        memset(&linfo, 0, sizeof(linfo));
        linfo.task.pid = getpid();
        opts.link_info = &linfo;
        opts.link_info_len = sizeof(linfo);
        printf("PID %d\n", getpid());
        do_read_opts(skel->progs.dump_task_file, &opts);
        bpf_iter_task_ex__destroy(skel);
  }

  int main(int argc, const char * const * argv)
  {
        test_task_file();
        return 0;
  }

The following lines are the output of the program.

::

  PID 1859

     tgid      pid       fd      file
     1859     1859        0 ffffffff82270aa0
     1859     1859        1 ffffffff82270aa0
     1859     1859        2 ffffffff82270aa0
     1859     1859        3 ffffffff82272980
     1859     1859        4 ffffffff8225e120
     1859     1859        5 ffffffff82255120
     1859     1859        6 ffffffff82254f00
     1859     1859        7 ffffffff82254d80
     1859     1859        8 ffffffff8225abe0
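
A user space consumer can turn the formatted rows above back into numbers. The
following is a minimal sketch, assuming each row has already been reassembled
into one complete line (the program above reads in 16-byte chunks, so a row may
span several ``read()`` calls). ``struct task_file_row`` and
``parse_task_file_row()`` are hypothetical names used only for illustration.

```c
#include <stdio.h>

/* One row of the formatted output shown above. */
struct task_file_row {
        int tgid;
        int pid;
        int fd;
        unsigned long long f_op;        /* kernel address printed with %lx */
};

/*
 * Parse one complete output line into row.
 * Returns 0 on success, or -1 if the line does not hold four numeric
 * fields (for example, the "tgid pid fd file" header line).
 */
static int parse_task_file_row(const char *line, struct task_file_row *row)
{
        int n = sscanf(line, "%d %d %d %llx",
                       &row->tgid, &row->pid, &row->fd, &row->f_op);

        return n == 4 ? 0 : -1;
}
```

The header line is rejected because ``sscanf()`` cannot convert ``tgid`` to an
integer, which gives the caller a simple way to skip it.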

------------------
Without Parameters
------------------

Let us look at how a BPF iterator without parameters skips files of other
processes in the system. In this case, the BPF program has to check the pid or
the tid of tasks, or it will receive every opened file in the system (more
precisely, in the current *pid* namespace). So, we usually add a global variable
in the BPF program to pass a *pid* to the BPF program.

The BPF program would look like the following block.

::

    ......
    int target_pid = 0;

    SEC("iter/task_file")
    int dump_task_file(struct bpf_iter__task_file *ctx)
    {
          ......
          if (task->tgid != target_pid) /* Check task->pid instead to check thread IDs */
                  return 0;
          BPF_SEQ_PRINTF(seq, "%8d %8d %8d %lx\n", task->tgid, task->pid, fd,
                          (long)file->f_op);
          return 0;
    }

The user space program would look like the following block:

::

    ......
    static void test_task_file(void)
    {
          ......
          skel = bpf_iter_task_ex__open_and_load();
          if (skel == NULL)
                  return;
          skel->bss->target_pid = getpid(); /* process ID.  For thread id, use gettid() */
          memset(&linfo, 0, sizeof(linfo));
          linfo.task.pid = getpid();
          opts.link_info = &linfo;
          opts.link_info_len = sizeof(linfo);
          ......
    }

``target_pid`` is a global variable in the BPF program. The user space program
should initialize the variable with a process ID to skip opened files of other
processes in the BPF program. When you parameterize a BPF iterator instead, the
iterator calls the BPF program fewer times, which can save significant resources.

----------------------------
Parameterizing VMA Iterators
----------------------------

By default, a BPF VMA iterator includes every VMA in every process. However,
you can still specify a process or a thread to include only its VMAs. Unlike
files, a thread cannot have a separate address space (since Linux 2.6.0-test6).
Here, using *tid* makes no difference from using *pid*.

-----------------------------
Parameterizing Task Iterators
-----------------------------

A BPF task iterator with *pid* includes all tasks (threads) of a process. The
BPF program receives these tasks one after another. You can specify a BPF task
iterator with the *tid* parameter to include only the tasks that match the given
*tid*.
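
The *pid* and *tid* semantics described above can be summarized with a small
user space model. This is an illustrative sketch only, not kernel code:
``struct task_ids`` and ``task_iter_visits()`` are hypothetical names, and the
model assumes that at most one of the two filters is non-zero.

```c
#include <stdbool.h>

/* Simplified stand-in for the two task_struct fields used for filtering. */
struct task_ids {
        int tgid;       /* process ID as seen from user space */
        int pid;        /* thread ID as seen from user space */
};

/*
 * Return true if a task iterator configured with filter_pid or filter_tid
 * would visit task t. A zero filter means "not set".
 */
static bool task_iter_visits(const struct task_ids *t,
                             int filter_pid, int filter_tid)
{
        if (filter_tid)
                return t->pid == filter_tid;    /* one specific thread */
        if (filter_pid)
                return t->tgid == filter_pid;   /* every thread of a process */
        return true;                            /* no filter: every task */
}
```

With a *pid* filter of 1859, both the main thread (``tgid`` 1859, ``pid``
1859) and a worker thread (``tgid`` 1859, ``pid`` 1860) are visited. With a
*tid* filter of 1860, only the worker thread is visited.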