======================================================
Net DIM - Generic Network Dynamic Interrupt Moderation
======================================================

:Author: Tal Gilboa <talgi@mellanox.com>

.. contents:: :depth: 2

Assumptions
===========

This document assumes the reader has basic knowledge of network drivers
and of interrupt moderation in general.


Introduction
============

Dynamic Interrupt Moderation (DIM), in networking, refers to changing the
interrupt moderation configuration of a channel in order to optimize packet
processing. The mechanism includes an algorithm which decides if and how to
change moderation parameters for a channel, usually by analysing runtime data
sampled from the system. Net DIM is such a mechanism. In each iteration of the
algorithm, it analyses a given sample of the data, compares it to the previous
sample and, if required, changes some of the interrupt moderation
configuration fields. The data sample is composed of data bandwidth, the
number of packets and the number of events. The time between samples is also
measured. Net DIM compares the current and the previous data and returns an
adjusted interrupt moderation configuration object. In some cases, the
algorithm might decide not to change anything. The configuration fields are
the minimum duration (in microseconds) allowed between events and the maximum
number of wanted packets per event. The Net DIM algorithm prioritizes
increasing bandwidth over reducing the interrupt rate.
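
As a rough illustration of how two raw samples and the time between them can
be turned into comparable rates, here is a minimal userspace sketch. It is not
the kernel implementation: ``struct sample``, ``struct rates`` and
``calc_rates()`` are hypothetical names chosen for this example, and the
per-millisecond unit is an assumption.

.. code-block:: c

  #include <stdint.h>

  /* Hypothetical raw counters captured by a driver at one point in time */
  struct sample {
	uint64_t time_us;	/* timestamp in microseconds */
	uint64_t bytes;		/* total bytes seen so far */
	uint64_t packets;	/* total packets seen so far */
	uint64_t events;	/* total interrupts/events seen so far */
  };

  /* Hypothetical per-millisecond rates derived from two samples */
  struct rates {
	uint64_t bytes_per_ms;
	uint64_t packets_per_ms;
	uint64_t events_per_ms;
  };

  /* Returns 0 on success, -1 if no time elapsed between the two samples */
  static int calc_rates(const struct sample *start, const struct sample *end,
			struct rates *out)
  {
	uint64_t delta_us = end->time_us - start->time_us;

	if (delta_us == 0)
		return -1;

	/* Counter deltas scaled from "per delta_us" to "per millisecond" */
	out->bytes_per_ms = (end->bytes - start->bytes) * 1000 / delta_us;
	out->packets_per_ms = (end->packets - start->packets) * 1000 / delta_us;
	out->events_per_ms = (end->events - start->events) * 1000 / delta_us;
	return 0;
  }

With samples taken 2000 microseconds apart that differ by 3000000 bytes, 2000
packets and 64 events, this sketch yields 1500000 bytes, 1000 packets and 32
events per millisecond.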


Net DIM Algorithm
=================

Each iteration of the Net DIM algorithm follows these steps:

#. Calculates a new data sample.
#. Compares it to the previous sample.
#. Makes a decision - suggests interrupt moderation configuration fields.
#. Schedules a work function, which applies the suggested configuration.

The first two steps are straightforward: both the new and the previous data are
supplied by the driver registered to Net DIM. The previous data is the new data
supplied to the previous iteration. The comparison step checks the difference
between the new and previous data and rates the outcome of the previous step.
A step is rated "better" if bandwidth increases and "worse" if
bandwidth decreases. If there is no change in bandwidth, the packet rate is
compared in a similar fashion - increase == "better" and decrease == "worse".
If there is no change in the packet rate either, the interrupt rate is
compared. Here the algorithm tries to optimize for a lower interrupt rate, so an
increase in the interrupt rate is considered "worse" and a decrease is
considered "better". Step #2 has an optimization for avoiding false results: it
only considers a difference between samples as valid if it is greater than a
certain percentage. Also, since Net DIM does not measure anything by itself, it
assumes the data provided by the driver is valid.
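
The comparison order described above (bandwidth first, then packet rate, then
interrupt rate) can be sketched in userspace C as follows. This is a hedged
illustration rather than the kernel code: all names are hypothetical, and the
10 percent threshold is an assumption standing in for the "certain
percentage" mentioned in the text.

.. code-block:: c

  #include <stdint.h>

  enum cmp_result { CMP_SAME, CMP_BETTER, CMP_WORSE };

  /* Hypothetical per-millisecond rates for one measurement interval */
  struct rates {
	uint64_t bytes_per_ms;
	uint64_t packets_per_ms;
	uint64_t events_per_ms;
  };

  /* Assumed threshold: smaller differences are treated as noise */
  #define SIGNIFICANT_PCT 10

  static int is_significant(uint64_t val, uint64_t ref)
  {
	uint64_t diff = val > ref ? val - ref : ref - val;

	if (ref == 0)
		return val != 0;
	return 100 * diff / ref > SIGNIFICANT_PCT;
  }

  static enum cmp_result compare_rates(const struct rates *curr,
					const struct rates *prev)
  {
	/* Bandwidth has the highest priority: more is better */
	if (is_significant(curr->bytes_per_ms, prev->bytes_per_ms))
		return curr->bytes_per_ms > prev->bytes_per_ms ?
		       CMP_BETTER : CMP_WORSE;

	/* Then packet rate: more is better */
	if (is_significant(curr->packets_per_ms, prev->packets_per_ms))
		return curr->packets_per_ms > prev->packets_per_ms ?
		       CMP_BETTER : CMP_WORSE;

	/* Finally interrupt rate: fewer is better */
	if (is_significant(curr->events_per_ms, prev->events_per_ms))
		return curr->events_per_ms < prev->events_per_ms ?
		       CMP_BETTER : CMP_WORSE;

	return CMP_SAME;
  }

In this sketch a 5 percent change in bandwidth is ignored, so the result then
depends on the packet rate and, failing that, on the interrupt rate.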

Step #3 decides on the suggested configuration based on the result from step #2
and the internal state of the algorithm. The states reflect the "direction" of
the algorithm: is it going left (reducing moderation), right (increasing
moderation) or standing still. Another optimization is that if a decision
to stand still is made multiple times, the interval between iterations of the
algorithm increases in order to reduce calculation overhead. Also, after
"parking" on one of the leftmost or rightmost decisions, the algorithm may
decide to verify this decision by taking a step in the other direction. This is
done in order to avoid getting stuck in a "deep sleep" scenario. Once a
decision is made, an interrupt moderation configuration is selected from
the predefined profiles.
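
The idea of walking left or right over a table of predefined profiles can be
modelled with a small hill-climbing sketch. This is a deliberately simplified
model and not the kernel's decision logic: it omits the growing interval
between iterations and the verification step after parking. The names and the
profile values below are made up for illustration.

.. code-block:: c

  #include <stddef.h>

  enum cmp_result { CMP_SAME, CMP_BETTER, CMP_WORSE };
  enum direction { DIR_LEFT, DIR_PARKED, DIR_RIGHT };

  /* One moderation setting: minimum usecs between events, packets per event */
  struct moder_profile {
	unsigned int usec;
	unsigned int pkts;
  };

  /* Hypothetical table, ordered from least (left) to most (right) moderation */
  static const struct moder_profile profiles[] = {
	{ 1, 64 }, { 8, 64 }, { 64, 64 }, { 128, 64 }, { 256, 64 },
  };

  #define NUM_PROFILES (sizeof(profiles) / sizeof(profiles[0]))

  struct tuner {
	enum direction dir;
	size_t profile_ix;
  };

  /* Takes one decision; returns 1 if the selected profile changed */
  static int tuner_step(struct tuner *t, enum cmp_result res)
  {
	size_t old_ix = t->profile_ix;

	if (t->dir == DIR_PARKED) {
		if (res == CMP_SAME)
			return 0;
		/* Traffic changed: resume searching */
		t->dir = t->profile_ix ? DIR_LEFT : DIR_RIGHT;
	} else if (res == CMP_SAME) {
		t->dir = DIR_PARKED;
		return 0;
	} else if (res == CMP_WORSE) {
		/* The last step made things worse: turn around */
		t->dir = (t->dir == DIR_LEFT) ? DIR_RIGHT : DIR_LEFT;
	}

	if (t->dir == DIR_LEFT && t->profile_ix > 0)
		t->profile_ix--;
	else if (t->dir == DIR_RIGHT && t->profile_ix < NUM_PROFILES - 1)
		t->profile_ix++;
	else
		t->dir = DIR_PARKED;	/* reached an edge of the table */

	return t->profile_ix != old_ix;
  }

A caller would feed each comparison result into ``tuner_step()`` and, when it
returns 1, apply ``profiles[t.profile_ix]`` to the hardware.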

The last step is to notify the registered driver that it should apply the
suggested configuration. This is done by scheduling a work function, defined by
the Net DIM API and provided by the registered driver.

As you can see, Net DIM itself does not actively interact with the system. It
would have trouble making the correct decisions if the wrong data were supplied
to it, and it would be useless if the work function did not apply the suggested
configuration. This does, however, allow the registered driver some room for
manoeuvre, as it may provide partial data or ignore the algorithm's suggestion
under some conditions.


Registering a Network Device to DIM
===================================

The Net DIM API exposes the main function net_dim(). This function is the entry
point to the Net DIM algorithm and has to be called every time the driver would
like to check if it should change interrupt moderation parameters. The driver
should provide two data structures: :c:type:`struct dim <dim>` and
:c:type:`struct dim_sample <dim_sample>`. :c:type:`struct dim <dim>`
describes the state of DIM for a specific object (RX queue, TX queue,
other queues, etc.). This includes the currently selected profile, previous data
samples, the callback function provided by the driver and more.
:c:type:`struct dim_sample <dim_sample>` describes a data sample,
which will be compared to the data sample stored in :c:type:`struct dim <dim>`
in order to decide on the algorithm's next step. The sample should include
bytes, packets and interrupts, measured by the driver.

In order to use Net DIM from a networking driver, the driver needs to call the
main net_dim() function. The recommended method is to call net_dim() on each
interrupt. Since Net DIM has built-in moderation and might decide to skip
iterations under certain conditions, there is no need to moderate the net_dim()
calls as well. As mentioned above, the driver needs to provide an object of type
:c:type:`struct dim <dim>` to the net_dim() function call. It is advised for
each entity using Net DIM to hold a :c:type:`struct dim <dim>` as part of its
data structure and use it as the main Net DIM API object.
The :c:type:`struct dim_sample <dim_sample>` should hold the latest
byte, packet and interrupt counts. There is no need to perform any calculations;
just include the raw data.

The net_dim() call itself does not return anything. Instead, Net DIM relies on
the driver to provide a callback function, which is called when the algorithm
decides to make a change in the interrupt moderation parameters. This callback
is scheduled and run in a separate thread in order not to add overhead to
the data flow. After the work is done, the Net DIM algorithm needs to be set to
the proper state in order to move to the next iteration.
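
To show why the callback has to reset the state, the following userspace model
walks through a measure, apply and restart cycle. It is only a sketch under
stated assumptions: the state names, ``model_net_dim()``,
``model_work_done()`` and the threshold of 64 events per iteration are
hypothetical, and the algorithm's decision is passed in as a plain flag
instead of being computed.

.. code-block:: c

  #include <stdint.h>

  enum meas_state { START_MEASURE, MEASURE_IN_PROGRESS, APPLY_NEW_PROFILE };

  /* Assumed number of events that must pass before a decision is taken */
  #define EVENTS_PER_ITERATION 64

  struct model_dim {
	enum meas_state state;
	uint64_t start_events;	/* event counter when measuring started */
	int work_pending;	/* stand-in for a scheduled work item */
  };

  /* Called on every interrupt; "change" is the (stubbed) decision result */
  static void model_net_dim(struct model_dim *dim, uint64_t events, int change)
  {
	switch (dim->state) {
	case MEASURE_IN_PROGRESS:
		/* Built-in moderation: skip until enough events were seen */
		if (events - dim->start_events < EVENTS_PER_ITERATION)
			break;
		if (change) {
			dim->state = APPLY_NEW_PROFILE;
			dim->work_pending = 1;
			break;
		}
		/* fall through */
	case START_MEASURE:
		dim->start_events = events;
		dim->state = MEASURE_IN_PROGRESS;
		break;
	case APPLY_NEW_PROFILE:
		/* Waiting for the callback; further calls do nothing */
		break;
	}
  }

  /* What the driver's callback does once the new settings are applied */
  static void model_work_done(struct model_dim *dim)
  {
	dim->work_pending = 0;
	dim->state = START_MEASURE;
  }

In this model, every call made while the state is ``APPLY_NEW_PROFILE`` is a
no-op, so measurement only resumes after ``model_work_done()`` has run.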


Example
=======

The following code demonstrates how to register a driver to Net DIM. The
example is not complete, but it should make the outline of the usage clear.

.. code-block:: c

  #include <linux/dim.h>

  /* Callback for net DIM to schedule on a decision to change moderation */
  void my_driver_do_dim_work(struct work_struct *work)
  {
	/* Get struct dim from struct work_struct */
	struct dim *dim = container_of(work, struct dim,
				       work);
	/* Do interrupt moderation related stuff */
	...

	/* Signal net DIM work is done and it should move to next iteration */
	dim->state = DIM_START_MEASURE;
  }

  /* My driver's interrupt handler */
  int my_driver_handle_interrupt(struct my_driver_entity *my_entity, ...)
  {
	...
	/* A struct to hold current measured data */
	struct dim_sample dim_sample;
	...
	/* Initialize data sample struct with current data */
	dim_update_sample(my_entity->events,
		          my_entity->packets,
		          my_entity->bytes,
		          &dim_sample);
	/* Call net DIM */
	net_dim(&my_entity->dim, dim_sample);
	...
  }

  /* My entity's initialization function (my_entity was already allocated) */
  int my_driver_init_my_entity(struct my_driver_entity *my_entity, ...)
  {
	...
	/* Initialize struct work_struct with my driver's callback function */
	INIT_WORK(&my_entity->dim.work, my_driver_do_dim_work);
	...
  }

Dynamic Interrupt Moderation (DIM) library API
==============================================

.. kernel-doc:: include/linux/dim.h
    :internal: