===============
RDMA Controller
===============

.. Contents

   1. Overview
     1-1. What is RDMA controller?
     1-2. Why RDMA controller needed?
     1-3. How is RDMA controller implemented?
   2. Usage Examples

1. Overview
===========

1-1. What is RDMA controller?
-----------------------------

The RDMA controller allows the user to limit the RDMA/IB-specific resources
that a given set of processes can use. These processes are grouped using the
RDMA controller.

The RDMA controller defines two resources which can be limited for the
processes of a cgroup.

1-2. Why RDMA controller needed?
--------------------------------

Currently, user space applications can easily take away all the RDMA
verb-specific resources such as AH, CQ, QP and MR. As a result, applications
in other cgroups or kernel space ULPs may not even get a chance to allocate
any RDMA resources. This can lead to service unavailability.

Therefore, an RDMA controller is needed through which the resource
consumption of processes can be limited and through which the different RDMA
resources can be accounted.

1-3. How is RDMA controller implemented?
----------------------------------------

The RDMA cgroup allows limits to be configured for resources. It maintains
resource accounting per cgroup, per device, using a resource pool structure.
Each such resource pool is limited to 64 resources by the RDMA cgroup; this
limit can be extended later if required.

This resource pool object is linked to the cgroup css. In most use cases
there are 0 to 4 resource pool instances per cgroup, per device, but nothing
prevents having more. At present, hundreds of RDMA devices per single cgroup
may not be handled optimally; however, there is no known use case or
requirement for such a configuration either.

Since RDMA resources can be allocated by any process and can be freed by any
of the child processes that share the address space, RDMA resources are
always owned by the creator cgroup css. This allows process migration from
one cgroup to another without the major complexity of transferring resource
ownership, because such ownership is not really present due to the shared
nature of RDMA resources. Linking resources to the css also ensures that
cgroups can be deleted after their processes have migrated. This allows
process migration with active resources as well, even though that is not a
primary use case.

Whenever RDMA resource charging occurs, the owner RDMA cgroup is returned to
the caller. The same RDMA cgroup should be passed when uncharging the
resource. This allows a process that has migrated with active RDMA resources
to charge any new resource to its new owner cgroup. It also allows a resource
of a process that has migrated to a new cgroup to be uncharged from the
previously charged cgroup, even though that is not a primary use case.

A resource pool object is created in the following situations:

(a) The user sets a limit and no resource pool exists yet for the device
    of interest in the cgroup.
(b) No resource limits were configured, but the IB/RDMA stack tries to
    charge the resource. The pool is created so that resources charged while
    applications run without limits can be correctly uncharged later, when
    limits are being enforced; otherwise the usage count would drop below
    zero.

A resource pool is destroyed when all of its resource limits are set to max
and its last resource is deallocated.

The user should set all the limits to max if they intend to
remove/unconfigure the resource pool for a particular device.

The IB stack honors the limits enforced by the RDMA controller. When an
application queries the maximum resource limits of an IB device, the stack
returns the minimum of what the user has configured for the given cgroup and
what the IB device supports.
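
As a rough illustration of that rule, the value an application sees can be
thought of as the smaller of the cgroup limit and the device limit, with
``max`` meaning that no cgroup limit is configured. The helper
``effective_limit`` and the device maximum of 65536 below are hypothetical
and only mirror the rule in shell; they are not kernel interfaces::

	# effective_limit CGROUP_LIMIT DEVICE_MAX
	# Print the limit that applies: the smaller of the two values.
	# A cgroup limit of "max" means only the device maximum applies.
	effective_limit() {
		cg_limit=$1
		dev_max=$2
		if [ "$cg_limit" = "max" ] || [ "$cg_limit" -gt "$dev_max" ]; then
			echo "$dev_max"
		else
			echo "$cg_limit"
		fi
	}

	effective_limit 2000 65536	# cgroup limit is the smaller value
	effective_limit max 65536	# no cgroup limit, device maximum applies

With these assumed numbers the two calls print 2000 and 65536.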

The following resources can be accounted by the RDMA controller.

  ==========    =============================
  hca_handle    Maximum number of HCA Handles
  hca_object    Maximum number of HCA Objects
  ==========    =============================

2. Usage Examples
=================
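
The examples in this section assume that the rdma controller is mounted as a
cgroup v1 hierarchy at /sys/fs/cgroup/rdma and that the child cgroups ``1``
and ``2`` exist. A minimal sketch of creating such a cgroup and moving the
current shell into it follows; the helper name ``rdma_cg_enter`` and the
``CG_ROOT`` variable are illustrative assumptions, and writing to
cgroup.procs on a real hierarchy typically requires root::

	# Root of the rdma hierarchy; override CG_ROOT if it is mounted elsewhere.
	CG_ROOT=${CG_ROOT:-/sys/fs/cgroup/rdma}

	# rdma_cg_enter GROUP
	# Create the child cgroup GROUP if needed and move this shell into it.
	rdma_cg_enter() {
		mkdir -p "$CG_ROOT/$1" &&
		echo $$ > "$CG_ROOT/$1/cgroup.procs"
	}

For example, ``rdma_cg_enter 1`` would place the calling shell, and any RDMA
applications it starts afterwards, in cgroup ``1``.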

(a) Configure resource limit::

	echo mlx4_0 hca_handle=2 hca_object=2000 > /sys/fs/cgroup/rdma/1/rdma.max
	echo ocrdma1 hca_handle=3 > /sys/fs/cgroup/rdma/2/rdma.max

(b) Query resource limit::

	cat /sys/fs/cgroup/rdma/2/rdma.max
	#Output:
	mlx4_0 hca_handle=2 hca_object=2000
	ocrdma1 hca_handle=3 hca_object=max

(c) Query current usage::

	cat /sys/fs/cgroup/rdma/2/rdma.current
	#Output:
	mlx4_0 hca_handle=1 hca_object=20
	ocrdma1 hca_handle=1 hca_object=23

(d) Delete resource limit::

	echo mlx4_0 hca_handle=max hca_object=max > /sys/fs/cgroup/rdma/1/rdma.max
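
Both rdma.max and rdma.current print one line per device followed by
``resource=value`` pairs, as in the outputs of (b) and (c). A minimal sketch
of extracting a single value from such output follows; the helper name
``rdma_value`` is hypothetical and not part of any kernel or distribution
tooling, and the sample input simply repeats the output shown in (c)::

	# rdma_value DEVICE RESOURCE
	# Read rdma.max or rdma.current formatted lines on stdin and print
	# the value of RESOURCE for DEVICE (prints nothing if not listed).
	rdma_value() {
		awk -v dev="$1" -v res="$2" '
			$1 == dev {
				for (i = 2; i <= NF; i++) {
					split($i, kv, "=")
					if (kv[1] == res)
						print kv[2]
				}
			}'
	}

	# Sample input copied from example (c); prints 23.
	printf 'mlx4_0 hca_handle=1 hca_object=20\nocrdma1 hca_handle=1 hca_object=23\n' |
		rdma_value ocrdma1 hca_object

On a live system the same helper could read a cgroup file directly, for
instance by redirecting /sys/fs/cgroup/rdma/2/rdma.current to its standard
input.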