======================================================
Net DIM - Generic Network Dynamic Interrupt Moderation
======================================================

:Author: Tal Gilboa <talgi@mellanox.com>

.. contents:: :depth: 2

Assumptions
===========

This document assumes the reader has basic knowledge of network drivers
and of interrupt moderation in general.


Introduction
============

Dynamic Interrupt Moderation (DIM) (in networking) refers to changing the
interrupt moderation configuration of a channel in order to optimize packet
processing. The mechanism includes an algorithm which decides if and how to
change moderation parameters for a channel, usually by performing an analysis on
runtime data sampled from the system. Net DIM is such a mechanism. In each
iteration of the algorithm, it analyses a given sample of the data, compares it
to the previous sample and, if required, decides to change some of the
interrupt moderation configuration fields. The data sample is composed of data
bandwidth, the number of packets and the number of events. The time between
samples is also measured. Net DIM compares the current and the previous data and
returns an adjusted interrupt moderation configuration object. In some cases,
the algorithm might decide not to change anything. The configuration fields are
the minimum duration (microseconds) allowed between events and the maximum
number of wanted packets per event. The Net DIM algorithm prioritizes
increasing bandwidth over reducing the interrupt rate.


Net DIM Algorithm
=================

Each iteration of the Net DIM algorithm follows these steps:

#. Calculates a new data sample.
#. Compares it to the previous sample.
#. Makes a decision - suggests interrupt moderation configuration fields.
#. Schedules a work function, which applies the suggested configuration.

The first two steps are straightforward: both the new and the previous data are
supplied by the driver registered to Net DIM. The previous data is the new data
supplied to the previous iteration. The comparison step checks the difference
between the new and previous data and rates the outcome of the last step taken.
A step is rated "better" if bandwidth increases and "worse" if bandwidth
decreases. If there is no change in bandwidth, the packet rate is
compared in a similar fashion - increase == "better" and decrease == "worse".
If there is no change in the packet rate either, the interrupt rate is
compared. Here the algorithm tries to optimize for a lower interrupt rate, so an
increase in the interrupt rate is considered "worse" and a decrease is
considered "better". Step #2 has an optimization for avoiding false results: it
only considers a difference between samples as valid if it is greater than a
certain percentage. Also, since Net DIM does not measure anything by itself, it
assumes the data provided by the driver is valid.

Step #3 decides on the suggested configuration based on the result from step #2
and the internal state of the algorithm. The states reflect the "direction" of
the algorithm: is it going left (reducing moderation), right (increasing
moderation) or standing still. Another optimization is that if a decision
to stay still is made multiple times, the interval between iterations of the
algorithm is increased in order to reduce calculation overhead. Also, after
"parking" on one of the leftmost or rightmost decisions, the algorithm may
decide to verify this decision by taking a step in the other direction. This is
done in order to avoid getting stuck in a "deep sleep" scenario. Once a
decision is made, an interrupt moderation configuration is selected from
the predefined profiles.

The last step is to notify the registered driver that it should apply the
suggested configuration. This is done by scheduling a work function, defined by
the Net DIM API and provided by the registered driver.
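
The comparison order described for step #2 can be illustrated with a small
user-space sketch. This is not the kernel implementation: every name below
(``struct rates``, ``compare_rates()``, ``SIGNIFICANT_PCT``) is hypothetical,
the 10% threshold is an assumed value standing in for "a certain percentage",
and the handling of a zero previous value is a choice made for this sketch
only.

.. code-block:: c

  #include <stdint.h>

  /* Hypothetical names, for illustration only. */
  enum cmp_result { CMP_SAME, CMP_BETTER, CMP_WORSE };

  struct rates {
          uint64_t bytes_per_ms;   /* bandwidth */
          uint64_t pkts_per_ms;    /* packet rate */
          uint64_t events_per_ms;  /* interrupt rate */
  };

  /* Assumed threshold: a difference of 10% or less is treated as noise. */
  #define SIGNIFICANT_PCT 10

  static int is_significant(uint64_t cur, uint64_t prev)
  {
          uint64_t diff = cur > prev ? cur - prev : prev - cur;

          /* Sketch-only choice: any change from zero counts as significant. */
          if (prev == 0)
                  return cur != 0;
          return (100 * diff) / prev > SIGNIFICANT_PCT;
  }

  /* Bandwidth first, then packet rate, then interrupt rate. */
  static enum cmp_result compare_rates(const struct rates *cur,
                                       const struct rates *prev)
  {
          if (is_significant(cur->bytes_per_ms, prev->bytes_per_ms))
                  return cur->bytes_per_ms > prev->bytes_per_ms ?
                         CMP_BETTER : CMP_WORSE;

          if (is_significant(cur->pkts_per_ms, prev->pkts_per_ms))
                  return cur->pkts_per_ms > prev->pkts_per_ms ?
                         CMP_BETTER : CMP_WORSE;

          /* For interrupts the sense is inverted: fewer events is better. */
          if (is_significant(cur->events_per_ms, prev->events_per_ms))
                  return cur->events_per_ms < prev->events_per_ms ?
                         CMP_BETTER : CMP_WORSE;

          return CMP_SAME;
  }

Checking the metrics in this fixed order is what gives bandwidth priority
over the interrupt rate: the interrupt rate only matters when neither
bandwidth nor packet rate changed significantly.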

As you can see, Net DIM itself does not actively interact with the system. It
would have trouble making the correct decisions if the wrong data is supplied to
it, and it would be useless if the work function did not apply the suggested
configuration. This does, however, allow the registered driver some room for
manoeuvre, as it may provide partial data or ignore the algorithm's suggestion
under some conditions.


Registering a Network Device to DIM
===================================

The Net DIM API exposes the main function net_dim().
This function is the entry point to the Net
DIM algorithm and has to be called every time the driver would like to check if
it should change interrupt moderation parameters. The driver should provide two
data structures: :c:type:`struct dim <dim>` and
:c:type:`struct dim_sample <dim_sample>`. :c:type:`struct dim <dim>`
describes the state of DIM for a specific object (RX queue, TX queue,
other queues, etc.). This includes the currently selected profile, previous data
samples, the callback function provided by the driver and more.
:c:type:`struct dim_sample <dim_sample>` describes a data sample,
which will be compared to the data sample stored in :c:type:`struct dim <dim>`
in order to decide on the algorithm's next
step. The sample should include bytes, packets and interrupts, measured by
the driver.
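
To illustrate why raw counters are sufficient, the following user-space sketch
shows one way two consecutive samples could be turned into the per-millisecond
rates that are then compared. It is only a sketch under stated assumptions:
``struct raw_sample``, ``struct sample_rates`` and ``calc_rates()`` are
hypothetical names that stand in for the kernel structures, and the rounding
used here (truncating division) is not claimed to match the kernel's.

.. code-block:: c

  #include <stdint.h>

  /* Hypothetical stand-in for a raw data sample supplied by a driver. */
  struct raw_sample {
          uint64_t time_us;   /* timestamp in microseconds */
          uint64_t bytes;     /* running byte counter */
          uint64_t packets;   /* running packet counter */
          uint64_t events;    /* running interrupt counter */
  };

  /* Hypothetical container for the derived rates. */
  struct sample_rates {
          uint64_t bytes_per_ms;
          uint64_t pkts_per_ms;
          uint64_t events_per_ms;
  };

  /*
   * Derive rates from the difference between two raw samples.
   * Returns 0 on success, or -1 if no time elapsed between the samples.
   */
  static int calc_rates(const struct raw_sample *start,
                        const struct raw_sample *end,
                        struct sample_rates *out)
  {
          uint64_t delta_us = end->time_us - start->time_us;

          if (delta_us == 0)
                  return -1;

          /* Counter deltas scaled from "per delta_us" to "per millisecond". */
          out->bytes_per_ms = (end->bytes - start->bytes) * 1000 / delta_us;
          out->pkts_per_ms = (end->packets - start->packets) * 1000 / delta_us;
          out->events_per_ms = (end->events - start->events) * 1000 / delta_us;
          return 0;
  }

With samples taken 2000 microseconds apart and deltas of 4000 bytes,
40 packets and 4 events, this yields 2000 bytes, 20 packets and 2 events per
millisecond.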

In order to use Net DIM from a networking driver, the driver needs to call the
main net_dim() function. The recommended method is to call net_dim() on each
interrupt. Since Net DIM has built-in moderation and might decide to skip
iterations under certain conditions, there is no need to moderate the net_dim()
calls as well. As mentioned above, the driver needs to provide an object of type
:c:type:`struct dim <dim>` to the net_dim() function call. It is advised for
each entity using Net DIM to hold a :c:type:`struct dim <dim>` as part of its
data structure and use it as the main Net DIM API object.
The :c:type:`struct dim_sample <dim_sample>` should hold the latest
bytes, packets and interrupts count. There is no need to perform any
calculations; just include the raw data.

The net_dim() call itself does not return anything. Instead, Net DIM relies on
the driver to provide a callback function, which is called when the algorithm
decides to make a change in the interrupt moderation parameters. This callback
will be scheduled and run in a separate thread in order not to add overhead to
the data flow. After the work is done, the Net DIM algorithm needs to be set to
the proper state in order to move to the next iteration.


Example
=======

The following code demonstrates how to register a driver to Net DIM. The
example is not complete, but it should make the outline of the usage clear.

.. code-block:: c

  #include <linux/dim.h>

  /* Callback for net DIM to schedule on a decision to change moderation */
  void my_driver_do_dim_work(struct work_struct *work)
  {
          /* Get struct dim from struct work_struct */
          struct dim *dim = container_of(work, struct dim,
                                         work);
          /* Do interrupt moderation related stuff */
          ...

          /* Signal net DIM work is done and it should move to next iteration */
          dim->state = DIM_START_MEASURE;
  }

  /* My driver's interrupt handler */
  int my_driver_handle_interrupt(struct my_driver_entity *my_entity, ...)
  {
          ...
          /* A struct to hold current measured data */
          struct dim_sample dim_sample;
          ...
          /* Initialize data sample struct with current data */
          dim_update_sample(my_entity->events,
                            my_entity->packets,
                            my_entity->bytes,
                            &dim_sample);
          /* Call net DIM */
          net_dim(&my_entity->dim, dim_sample);
          ...
  }

  /* My entity's initialization function (my_entity was already allocated) */
  int my_driver_init_my_entity(struct my_driver_entity *my_entity, ...)
  {
          ...
          /* Initialize struct work_struct with my driver's callback function */
          INIT_WORK(&my_entity->dim.work, my_driver_do_dim_work);
          ...
  }

Dynamic Interrupt Moderation (DIM) library API
==============================================

.. kernel-doc:: include/linux/dim.h
    :internal: