.. SPDX-License-Identifier: GPL-2.0-only
.. Copyright (C) 2022 Red Hat, Inc.

=================================================
BPF_MAP_TYPE_DEVMAP and BPF_MAP_TYPE_DEVMAP_HASH
=================================================

.. note::
   - ``BPF_MAP_TYPE_DEVMAP`` was introduced in kernel version 4.14
   - ``BPF_MAP_TYPE_DEVMAP_HASH`` was introduced in kernel version 5.4

``BPF_MAP_TYPE_DEVMAP`` and ``BPF_MAP_TYPE_DEVMAP_HASH`` are BPF maps primarily
used as backend maps for the XDP BPF helper call ``bpf_redirect_map()``.
``BPF_MAP_TYPE_DEVMAP`` is backed by an array that uses the key as
the index to look up a reference to a net device. ``BPF_MAP_TYPE_DEVMAP_HASH``
is instead backed by a hash table that uses the key to look up a reference to a
net device. The user provides either <``key``/ ``ifindex``> or
<``key``/ ``struct bpf_devmap_val``> pairs to update the maps with new net devices.

.. note::
    - The key to a hash map doesn't have to be an ``ifindex``.
    - While ``BPF_MAP_TYPE_DEVMAP_HASH`` allows for densely packing the net devices
      it comes at the cost of a hash of the key when performing a look up.

The setup and packet enqueue/send code is shared between the two types of
devmap; only the lookup and insertion are different.
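
The difference between the two lookups can be pictured with a toy user-space
model. This is only a sketch under stated assumptions: every ``toy_`` name and
the ``TOY_BITS`` capacity are hypothetical, a stored ``ifindex`` of 0 stands for
an empty slot, and the linear-probing table is chosen for brevity rather than
to mirror the kernel's actual data structures.

.. code-block:: c

    #include <stddef.h>
    #include <stdint.h>

    #define TOY_BITS  3                 /* hypothetical: 2^3 = 8 slots */
    #define TOY_SLOTS (1u << TOY_BITS)

    /* Array flavour: the key is the slot index, so keys must be < TOY_SLOTS. */
    struct toy_devmap {
        uint32_t ifindex[TOY_SLOTS];    /* 0 means "empty slot" */
    };

    static uint32_t toy_devmap_lookup(const struct toy_devmap *m, uint32_t key)
    {
        return key < TOY_SLOTS ? m->ifindex[key] : 0;
    }

    /* Hash flavour: any 32-bit key is hashed to a slot (linear probing). */
    struct toy_devmap_hash {
        uint32_t key[TOY_SLOTS];
        uint32_t ifindex[TOY_SLOTS];    /* 0 means "empty slot" */
    };

    /* Multiplicative hash; the top TOY_BITS bits select the slot. */
    static uint32_t toy_hash(uint32_t key)
    {
        return (key * 2654435761u) >> (32 - TOY_BITS);
    }

    /* Insert or overwrite; returns 0 on success, -1 if the table is full. */
    static int toy_devmap_hash_insert(struct toy_devmap_hash *m,
                                      uint32_t key, uint32_t ifindex)
    {
        uint32_t start = toy_hash(key);

        for (size_t i = 0; i < TOY_SLOTS; i++) {
            uint32_t s = (start + i) & (TOY_SLOTS - 1);

            if (m->ifindex[s] == 0 || m->key[s] == key) {
                m->key[s] = key;
                m->ifindex[s] = ifindex;
                return 0;
            }
        }
        return -1;
    }

    /* Returns the stored ifindex, or 0 if the key is absent. */
    static uint32_t toy_devmap_hash_lookup(const struct toy_devmap_hash *m,
                                           uint32_t key)
    {
        uint32_t start = toy_hash(key);

        for (size_t i = 0; i < TOY_SLOTS; i++) {
            uint32_t s = (start + i) & (TOY_SLOTS - 1);

            if (m->ifindex[s] == 0)
                return 0;               /* reached an empty slot */
            if (m->key[s] == key)
                return m->ifindex[s];
        }
        return 0;
    }

In this model a key such as 100000 cannot be stored in the array flavour
without a table of at least that many slots, whereas the hash flavour stores it
in one of its eight slots at the cost of computing ``toy_hash()`` on each
lookup.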

Usage
=====
Kernel BPF
----------
bpf_redirect_map()
^^^^^^^^^^^^^^^^^^
.. code-block:: c

    long bpf_redirect_map(struct bpf_map *map, u32 key, u64 flags)

Redirect the packet to the endpoint referenced by ``map`` at index ``key``.
For ``BPF_MAP_TYPE_DEVMAP`` and ``BPF_MAP_TYPE_DEVMAP_HASH`` this map contains
references to net devices (for forwarding packets through other ports).

The lower two bits of ``flags`` are used as the return code if the map lookup
fails. This is so that the return value can be one of the XDP program return
codes up to ``XDP_TX``, as chosen by the caller. The higher bits of ``flags``
can be set to ``BPF_F_BROADCAST`` or ``BPF_F_EXCLUDE_INGRESS`` as defined
below.

With ``BPF_F_BROADCAST`` the packet will be broadcast to all the interfaces
in the map; with ``BPF_F_EXCLUDE_INGRESS`` the ingress interface will be excluded
from the broadcast.

.. note::
    - The key is ignored if ``BPF_F_BROADCAST`` is set.
    - The broadcast feature can also be used to implement multicast forwarding:
      simply create multiple DEVMAPs, each one corresponding to a single multicast group.
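
The flag handling described above can be pictured with a small user-space
model. This is only a sketch: the ``MODEL_`` names and
``model_redirect_result()`` are hypothetical, the numeric values are assumed to
mirror ``include/uapi/linux/bpf.h``, and only the choice of return value is
modelled, not the redirect itself.

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    /* Assumed to mirror enum xdp_action in include/uapi/linux/bpf.h. */
    enum model_xdp_action {
        MODEL_XDP_ABORTED  = 0,
        MODEL_XDP_DROP     = 1,
        MODEL_XDP_PASS     = 2,
        MODEL_XDP_TX       = 3,
        MODEL_XDP_REDIRECT = 4,
    };

    /* Assumed to mirror the bpf_redirect_map() flag bits. */
    #define MODEL_F_BROADCAST       (1ULL << 3)
    #define MODEL_F_EXCLUDE_INGRESS (1ULL << 4)

    /* The two lowest bits carry the caller's fallback return code. */
    #define MODEL_ACTION_MASK       0x3ULL

    /*
     * Hypothetical model of the return value only; lookup_ok says whether
     * the key was found in the map.
     */
    static long model_redirect_result(bool lookup_ok, uint64_t flags)
    {
        /* With broadcast the key is ignored, so a failed lookup is irrelevant. */
        if (flags & MODEL_F_BROADCAST)
            return MODEL_XDP_REDIRECT;

        /* Lookup miss: hand back whatever action the caller encoded. */
        if (!lookup_ok)
            return (long)(flags & MODEL_ACTION_MASK);

        return MODEL_XDP_REDIRECT;
    }

Under this model, passing ``XDP_PASS`` as ``flags`` lets packets without a
matching entry continue to the normal network stack, while passing ``0``
yields ``XDP_ABORTED`` for them.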

This helper will return ``XDP_REDIRECT`` on success, or the value of the two
lower bits of the ``flags`` argument if the map lookup fails.

More information about redirection can be found in :doc:`redirect`.

bpf_map_lookup_elem()
^^^^^^^^^^^^^^^^^^^^^
.. code-block:: c

    void *bpf_map_lookup_elem(struct bpf_map *map, const void *key)

Net device entries can be retrieved using the ``bpf_map_lookup_elem()``
helper.

User space
----------
.. note::
    DEVMAP entries can only be updated/deleted from user space and not
    from an eBPF program. Trying to call these functions from a kernel eBPF
    program will result in the program failing to load and a verifier warning.

bpf_map_update_elem()
^^^^^^^^^^^^^^^^^^^^^
.. code-block:: c

    int bpf_map_update_elem(int fd, const void *key, const void *value, __u64 flags);

Net device entries can be added or updated using the ``bpf_map_update_elem()``
helper. This helper replaces existing elements atomically. The ``value`` parameter
can be ``struct bpf_devmap_val`` or a simple ``int ifindex`` for backwards
compatibility.

.. code-block:: c

    struct bpf_devmap_val {
        __u32 ifindex;   /* device index */
        union {
            int   fd;  /* prog fd on map write */
            __u32 id;  /* prog id on map read */
        } bpf_prog;
    };

The ``flags`` argument can be one of the following:

- ``BPF_ANY``: Create a new element or update an existing element.
- ``BPF_NOEXIST``: Create a new element only if it did not exist.
- ``BPF_EXIST``: Update an existing element.

DEVMAPs can associate a program with a device entry by adding a ``bpf_prog.fd``
to ``struct bpf_devmap_val``. Programs are run after ``XDP_REDIRECT`` and have
access to both Rx device and Tx device. The program associated with the ``fd``
must have type XDP with expected attach type ``xdp_devmap``.
When a program is associated with a device index, the program is run on an
``XDP_REDIRECT`` and before the buffer is added to the per-cpu queue. Examples
of how to attach/use xdp_devmap progs can be found in the kernel selftests:

- ``tools/testing/selftests/bpf/prog_tests/xdp_devmap_attach.c``
- ``tools/testing/selftests/bpf/progs/test_xdp_with_devmap_helpers.c``

bpf_map_lookup_elem()
^^^^^^^^^^^^^^^^^^^^^
.. code-block:: c

    int bpf_map_lookup_elem(int fd, const void *key, void *value);

Net device entries can be retrieved using the ``bpf_map_lookup_elem()``
helper.

bpf_map_delete_elem()
^^^^^^^^^^^^^^^^^^^^^
.. code-block:: c

    int bpf_map_delete_elem(int fd, const void *key);

Net device entries can be deleted using the ``bpf_map_delete_elem()``
helper. This helper will return 0 on success, or negative error in case of
failure.

Examples
========

Kernel BPF
----------

The following code snippet shows how to declare a ``BPF_MAP_TYPE_DEVMAP``
called tx_port.

.. code-block:: c

    struct {
        __uint(type, BPF_MAP_TYPE_DEVMAP);
        __type(key, __u32);
        __type(value, __u32);
        __uint(max_entries, 256);
    } tx_port SEC(".maps");

The following code snippet shows how to declare a ``BPF_MAP_TYPE_DEVMAP_HASH``
called forward_map.

.. code-block:: c

    struct {
        __uint(type, BPF_MAP_TYPE_DEVMAP_HASH);
        __type(key, __u32);
        __type(value, struct bpf_devmap_val);
        __uint(max_entries, 32);
    } forward_map SEC(".maps");

.. note::

    The value type in the DEVMAP above is a ``struct bpf_devmap_val``.

The following code snippet shows a simple xdp_redirect_map program. This program
would work with a user space program that populates the devmap ``forward_map`` based
on ingress ifindexes. The BPF program (below) is redirecting packets using the
ingress ``ifindex`` as the ``key``.

.. code-block:: c

    SEC("xdp")
    int xdp_redirect_map_func(struct xdp_md *ctx)
    {
        /* Use the ingress ifindex as the lookup key. */
        int index = ctx->ingress_ifindex;

        /* flags == 0: a failed lookup returns XDP_ABORTED. */
        return bpf_redirect_map(&forward_map, index, 0);
    }

The following code snippet shows a BPF program that is broadcasting packets to
all the interfaces in the ``tx_port`` devmap.

.. code-block:: c

    SEC("xdp")
    int xdp_redirect_map_func(struct xdp_md *ctx)
    {
        /* The key (0) is ignored because BPF_F_BROADCAST is set. */
        return bpf_redirect_map(&tx_port, 0, BPF_F_BROADCAST | BPF_F_EXCLUDE_INGRESS);
    }

User space
----------

The following code snippet shows how to update a devmap called ``tx_port``.

.. code-block:: c

    int update_devmap(int ifindex, int redirect_ifindex)
    {
        int ret;

        /* A flags value of 0 is BPF_ANY: create or update the entry. */
        ret = bpf_map_update_elem(bpf_map__fd(tx_port), &ifindex, &redirect_ifindex, 0);
        if (ret < 0) {
            fprintf(stderr, "Failed to update devmap value: %s\n",
                strerror(errno));
        }

        return ret;
    }

The following code snippet shows how to update a hash_devmap called ``forward_map``.

.. code-block:: c

    int update_devmap(int ifindex, int redirect_ifindex)
    {
        struct bpf_devmap_val devmap_val = { .ifindex = redirect_ifindex };
        int ret;

        /* A flags value of 0 is BPF_ANY: create or update the entry. */
        ret = bpf_map_update_elem(bpf_map__fd(forward_map), &ifindex, &devmap_val, 0);
        if (ret < 0) {
            fprintf(stderr, "Failed to update devmap value: %s\n",
                strerror(errno));
        }
        return ret;
    }

References
==========

- https://lwn.net/Articles/728146/
- https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/commit/?id=6f9d451ab1a33728adb72d7ff66a7b374d665176
- https://elixir.bootlin.com/linux/latest/source/net/core/filter.c#L4106