.. SPDX-License-Identifier: GPL-2.0

====================
mlx5 devlink support
====================

This document describes the devlink features implemented by the ``mlx5``
device driver.

Parameters
==========

.. list-table:: Generic parameters implemented

   * - Name
     - Mode
     - Validation
   * - ``enable_roce``
     - driverinit
     - Type: Boolean

       If the device supports RoCE disablement, RoCE enablement state controls
       device support for RoCE capability. Otherwise, the control occurs in the
       driver stack. When RoCE is disabled at the driver level, only raw
       ethernet QPs are supported.
   * - ``io_eq_size``
     - driverinit
     - The range is between 64 and 4096.
   * - ``event_eq_size``
     - driverinit
     - The range is between 64 and 4096.
   * - ``max_macs``
     - driverinit
     - The range is between 1 and 2^31. Only power of 2 values are supported.

The ``mlx5`` driver also implements the following driver-specific
parameters.

.. list-table:: Driver-specific parameters implemented
   :widths: 5 5 5 85

   * - Name
     - Type
     - Mode
     - Description
   * - ``flow_steering_mode``
     - string
     - runtime
     - Controls the flow steering mode of the driver.

       * ``dmfs`` Device managed flow steering. In DMFS mode, the HW
         steering entities are created and managed through firmware.
       * ``smfs`` Software managed flow steering. In SMFS mode, the HW
         steering entities are created and managed through the driver without
         firmware intervention.

       SMFS mode is faster and provides a better rule insertion rate compared
       to the default DMFS mode.
   * - ``fdb_large_groups``
     - u32
     - driverinit
     - Control the number of large groups (size > 1) in the FDB table.

       * The default value is 15, and the range is between 1 and 1024.
   * - ``esw_multiport``
     - Boolean
     - runtime
     - Control MultiPort E-Switch shared fdb mode.

       An experimental mode where a single E-Switch is used and all the vports
       and physical ports on the NIC are connected to it.

       An example is to send traffic from a VF that is created on PF0 to an
       uplink that is natively associated with the uplink of PF1.

       Note: Future devices, ConnectX-8 and onward, will eventually have this
       as the default to allow forwarding between all NIC ports in a single
       E-switch environment and the dual E-switch mode will likely get
       deprecated.

       Default: disabled
   * - ``esw_port_metadata``
     - Boolean
     - runtime
     - When applicable, disabling eswitch metadata can increase packet rate up
       to 20% depending on the use case and packet sizes.

       Eswitch port metadata state controls whether to internally tag packets
       with metadata. Metadata tagging must be enabled for multi-port RoCE,
       failover between representors and stacked devices. By default metadata
       is enabled on the supported devices in E-switch. Metadata is applicable
       only for E-switch in switchdev mode and users may disable it when NONE
       of the below use cases will be in use:

       1. HCA is in Dual/multi-port RoCE mode.
       2. VF/SF representor bonding (usually used for live migration).
       3. Stacked devices.

       When metadata is disabled, the above use cases will fail to initialize
       if users try to enable them.
   * - ``hairpin_num_queues``
     - u32
     - driverinit
     - We refer to a TC NIC rule that involves forwarding as "hairpin".
       Hairpin queues are the mlx5 hardware-specific implementation of
       hardware forwarding for such packets.

       Control the number of hairpin queues.
   * - ``hairpin_queue_size``
     - u32
     - driverinit
     - Control the size (in packets) of the hairpin queues.

The ``mlx5`` driver supports reloading via ``DEVLINK_CMD_RELOAD``.

Info versions
=============

The ``mlx5`` driver reports the following versions:

.. list-table:: devlink info versions implemented
   :widths: 5 5 90

   * - Name
     - Type
     - Description
   * - ``fw.psid``
     - fixed
     - Used to represent the board id of the device.
   * - ``fw.version``
     - stored, running
     - Three digit major.minor.subminor firmware version number.
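
As a rough sketch of how a ``driverinit`` parameter such as ``max_macs`` might
be applied, the shell helper below checks a candidate value against the
documented constraints (a power of 2 between 1 and 2^31) and prints, without
executing them, the ``devlink`` commands that would set the value and reload
the device. The helper names ``is_valid_max_macs`` and ``print_max_macs_cmds``
and the device address are illustrative assumptions, not part of the driver::

    #!/bin/sh
    # Hypothetical dry-run helper; nothing here talks to the device.
    DEV="pci/0000:82:00.0"   # example device address, adjust as needed

    is_valid_max_macs() {
        v=$1
        # Reject empty input, leading zeros and anything non-numeric.
        case $v in
            ''|0*|*[!0-9]*) return 1 ;;
        esac
        # Documented range: 1 to 2^31 (2147483648).
        [ "$v" -ge 1 ] && [ "$v" -le 2147483648 ] || return 1
        # A power of 2 has exactly one bit set, so v & (v - 1) is 0.
        [ $(( v & (v - 1) )) -eq 0 ]
    }

    print_max_macs_cmds() {
        if is_valid_max_macs "$1"; then
            # driverinit values take effect only after a reload.
            echo "devlink dev param set $DEV name max_macs value $1 cmode driverinit"
            echo "devlink dev reload $DEV"
        else
            echo "invalid max_macs value: $1" >&2
            return 1
        fi
    }

    print_max_macs_cmds 256

The printed commands can then be reviewed and run with sufficient privileges.
A ``runtime`` parameter such as ``flow_steering_mode`` would use
``cmode runtime`` instead and would not need the reload step.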

Health reporters
================

tx reporter
-----------
The tx reporter is responsible for reporting and recovering from the following
three error scenarios:

- tx timeout
    Report on kernel tx timeout detection.
    Recover by searching for lost interrupts.
- tx error completion
    Report on error tx completion.
    Recover by flushing the tx queue and resetting it.
- tx PTP port timestamping CQ unhealthy
    Report too many CQEs never delivered on port ts CQ.
    Recover by flushing and re-creating all PTP channels.

The tx reporter also supports an on-demand diagnose callback, which provides
real-time information on the status of its send queues.

User command examples:

- Diagnose send queues status::

    $ devlink health diagnose pci/0000:82:00.0 reporter tx

.. note::
   This command has valid output only when the interface is up. Otherwise, the
   command has empty output.

- Show the number of tx errors indicated, the number of recover flows that
  ended successfully, whether autorecover is enabled, and the grace period
  since the last recovery::

    $ devlink health show pci/0000:82:00.0 reporter tx

rx reporter
-----------
The rx reporter is responsible for reporting and recovering from the following
two error scenarios:

- rx queues' initialization (population) timeout
    Population of rx queues' descriptors on ring initialization is done
    in napi context via triggering an irq. In case of a failure to get
    the minimum amount of descriptors, a timeout would occur, and
    descriptors could be recovered by polling the EQ (Event Queue).
- rx completions with errors (reported by HW on interrupt context)
    Report on rx completion error.
    Recover (if needed) by flushing the related queue and resetting it.

The rx reporter also supports an on-demand diagnose callback, which provides
real-time information on the status of its receive queues.

- Diagnose rx queues' status and corresponding completion queue::

    $ devlink health diagnose pci/0000:82:00.0 reporter rx

.. note::
   This command has valid output only when the interface is up. Otherwise, the
   command has empty output.

- Show the number of rx errors indicated, the number of recover flows that
  ended successfully, whether autorecover is enabled, and the grace period
  since the last recovery::

    $ devlink health show pci/0000:82:00.0 reporter rx

fw reporter
-----------
The fw reporter implements `diagnose` and `dump` callbacks.
It follows symptoms of fw error such as fw syndrome by triggering
fw core dump and storing it into the dump buffer.
The fw reporter diagnose command can be triggered any time by the user to check
current fw status.

User command examples:

- Check fw health status::

    $ devlink health diagnose pci/0000:82:00.0 reporter fw

- Read FW core dump if already stored or trigger a new one::

    $ devlink health dump show pci/0000:82:00.0 reporter fw

.. note::
   This command can run only on the PF which has fw tracer ownership;
   running it on another PF or on any VF will return "Operation not permitted".

fw fatal reporter
-----------------
The fw fatal reporter implements `dump` and `recover` callbacks.
It responds to fatal error indications with a CR-space dump and a recover flow.
The CR-space dump uses the vsc interface, which is valid even if the FW command
interface is not functional, which is the case in most FW fatal errors.
The recover function runs a recover flow which reloads the driver and triggers
fw reset if needed.
On firmware error, the health buffer is dumped to dmesg. The log
level is derived from the error's severity (given in the health buffer).

User command examples:

- Run fw recover flow manually::

    $ devlink health recover pci/0000:82:00.0 reporter fw_fatal

- Read FW CR-space dump if already stored or trigger a new one::

    $ devlink health dump show pci/0000:82:00.1 reporter fw_fatal

.. note::
   This command can run only on PF.

vnic reporter
-------------
The vnic reporter implements only the `diagnose` callback.
It is responsible for querying the vnic diagnostic counters from fw and
displaying them in real time.

Description of the vnic counters:

- total_q_under_processor_handle
    number of queues in an error state due to
    an async error or errored command.
- send_queue_priority_update_flow
    number of QP/SQ priority/SL update events.
- cq_overrun
    number of times CQ entered an error state due to an overflow.
- async_eq_overrun
    number of times an EQ mapped to async events was overrun.
- comp_eq_overrun
    number of times an EQ mapped to completion events was
    overrun.
- quota_exceeded_command
    number of commands issued and failed due to quota exceeded.
- invalid_command
    number of commands issued and failed due to any reason other than quota
    exceeded.
- nic_receive_steering_discard
    number of packets that completed RX flow
    steering but were discarded due to a mismatch in flow table.
- generated_pkt_steering_fail
    number of packets generated by the VNIC experiencing unexpected steering
    failure (at any point in steering flow).
- handled_pkt_steering_fail
    number of packets handled by the VNIC experiencing unexpected steering
    failure (at any point in steering flow owned by the VNIC, including the FDB
    for the eswitch owner).

User commands examples:

- Diagnose PF/VF vnic counters::

    $ devlink health diagnose pci/0000:82:00.1 reporter vnic

- Diagnose representor vnic counters (performed by supplying devlink port of the
  representor, which can be obtained via devlink port command)::

    $ devlink health diagnose pci/0000:82:00.1/65537 reporter vnic

.. note::
   This command can run over all interfaces such as PF/VF and representor ports.
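
The devlink port handle used for a representor can be looked up before running
the diagnose command. A minimal sketch, assuming the representor belongs to
the example device used above; the port index reported on a given system will
generally differ from the ``65537`` shown in the example::

    $ devlink port show

Each reported port is identified by a handle of the form
``pci/<address>/<index>``, which is the value to pass to
``devlink health diagnose``.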