.. SPDX-License-Identifier: GPL-2.0
.. _xfrm_device:

===============================================
XFRM device - offloading the IPsec computations
===============================================

Shannon Nelson <shannon.nelson@oracle.com>
Leon Romanovsky <leonro@nvidia.com>


Overview
========

IPsec is a useful feature for securing network traffic, but the
computational cost is high: a 10Gbps link can easily be brought down
to under 1Gbps, depending on the traffic and link configuration.
Luckily, there are NICs that offer a hardware-based IPsec offload which
can radically increase throughput and decrease CPU utilization.  The XFRM
Device interface allows NIC drivers to offer to the stack access to the
hardware offload.

Right now, there are two types of hardware offload that the kernel supports:


 * IPsec crypto offload:

   * NIC performs encrypt/decrypt
   * Kernel does everything else

 * IPsec packet offload:

   * NIC performs encrypt/decrypt
   * NIC does encapsulation
   * Kernel and NIC have SA and policy in-sync
   * NIC handles the SA and policy states
   * The kernel talks to the key manager

Userland access to the offload is typically through a system such as
libreswan or KAME/raccoon, but the iproute2 'ip xfrm' command set can
be handy when experimenting.  An example command might look something
like this for crypto offload::

  ip x s add proto esp dst 14.0.0.70 src 14.0.0.52 spi 0x07 mode transport \
     reqid 0x07 replay-window 32 \
     aead 'rfc4106(gcm(aes))' 0x44434241343332312423222114131211f4f3f2f1 128 \
     sel src 14.0.0.52/24 dst 14.0.0.70/24 proto tcp \
     offload dev eth4 dir in

and for packet offload::

  ip x s add proto esp dst 14.0.0.70 src 14.0.0.52 spi 0x07 mode transport \
     reqid 0x07 replay-window 32 \
     aead 'rfc4106(gcm(aes))' 0x44434241343332312423222114131211f4f3f2f1 128 \
     sel src 14.0.0.52/24 dst 14.0.0.70/24 proto tcp \
     offload packet dev eth4 dir in

  ip x p add src 14.0.0.70 dst 14.0.0.52 offload packet dev eth4 dir in \
     tmpl src 14.0.0.70 dst 14.0.0.52 proto esp reqid 10000 mode transport

Yes, that's ugly, but that's what shell scripts and/or libreswan are for.



Callbacks to implement
======================

::

  /* from include/linux/netdevice.h */
  struct xfrmdev_ops {
        /* Crypto and Packet offload callbacks */
        int     (*xdo_dev_state_add) (struct xfrm_state *x, struct netlink_ext_ack *extack);
        void    (*xdo_dev_state_delete) (struct xfrm_state *x);
        void    (*xdo_dev_state_free) (struct xfrm_state *x);
        bool    (*xdo_dev_offload_ok) (struct sk_buff *skb,
                                       struct xfrm_state *x);
        void    (*xdo_dev_state_advance_esn) (struct xfrm_state *x);

        /* Solely packet offload callbacks */
        void    (*xdo_dev_state_update_curlft) (struct xfrm_state *x);
        int     (*xdo_dev_policy_add) (struct xfrm_policy *x, struct netlink_ext_ack *extack);
        void    (*xdo_dev_policy_delete) (struct xfrm_policy *x);
        void    (*xdo_dev_policy_free) (struct xfrm_policy *x);
  };

The NIC driver offering IPsec offload will need to implement the callbacks
relevant to the supported offload type to make the offload available to the
network stack's XFRM subsystem.  Additionally, the feature bits NETIF_F_HW_ESP
and NETIF_F_HW_ESP_TX_CSUM will signal the availability of the offload.
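
As a rough illustration of the paragraph above, the following userspace
sketch fills in a callback table for a driver that supports crypto offload
only.  The stand-in types, the "foo" driver name and the
foo_has_packet_offload() helper are hypothetical and exist only so the
example builds outside the kernel; a real driver uses the kernel definitions
from include/linux/netdevice.h.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins so the sketch builds in userspace. */
struct xfrm_state { int unused; };
struct xfrm_policy { int unused; };
struct sk_buff { int unused; };
struct netlink_ext_ack { const char *msg; };

/* Same members as the callback table quoted above. */
struct xfrmdev_ops {
	int  (*xdo_dev_state_add)(struct xfrm_state *x,
				  struct netlink_ext_ack *extack);
	void (*xdo_dev_state_delete)(struct xfrm_state *x);
	void (*xdo_dev_state_free)(struct xfrm_state *x);
	bool (*xdo_dev_offload_ok)(struct sk_buff *skb, struct xfrm_state *x);
	void (*xdo_dev_state_advance_esn)(struct xfrm_state *x);
	void (*xdo_dev_state_update_curlft)(struct xfrm_state *x);
	int  (*xdo_dev_policy_add)(struct xfrm_policy *x,
				   struct netlink_ext_ack *extack);
	void (*xdo_dev_policy_delete)(struct xfrm_policy *x);
	void (*xdo_dev_policy_free)(struct xfrm_policy *x);
};

/* Callbacks of a made-up "foo" driver; the bodies are placeholders. */
int foo_state_add(struct xfrm_state *x, struct netlink_ext_ack *extack)
{
	(void)x;
	(void)extack;
	return 0;	/* pretend the SA was programmed into the HW */
}

void foo_state_delete(struct xfrm_state *x) { (void)x; }
void foo_state_free(struct xfrm_state *x) { (void)x; }

bool foo_offload_ok(struct sk_buff *skb, struct xfrm_state *x)
{
	(void)skb;
	(void)x;
	return true;
}

/* Crypto offload only: the packet offload callbacks are left NULL. */
const struct xfrmdev_ops foo_xfrmdev_ops = {
	.xdo_dev_state_add	= foo_state_add,
	.xdo_dev_state_delete	= foo_state_delete,
	.xdo_dev_state_free	= foo_state_free,
	.xdo_dev_offload_ok	= foo_offload_ok,
};

/* Illustrative check, not a kernel API: are the policy hooks present? */
bool foo_has_packet_offload(const struct xfrmdev_ops *ops)
{
	return ops->xdo_dev_policy_add && ops->xdo_dev_policy_delete &&
	       ops->xdo_dev_policy_free;
}
```

Leaving the packet offload members unset keeps the table small for hardware
that only performs encrypt/decrypt; a driver that also handles encapsulation
and policies would fill in the remaining callbacks.
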



Flow
====

At probe time and before the call to register_netdev(), the driver should
set up local data structures and XFRM callbacks, and set the feature bits.
The XFRM code's listener will finish the setup on NETDEV_REGISTER.

::

        adapter->netdev->xfrmdev_ops = &ixgbe_xfrmdev_ops;
        adapter->netdev->features |= NETIF_F_HW_ESP;
        adapter->netdev->hw_enc_features |= NETIF_F_HW_ESP;

When new SAs are set up with a request for the "offload" feature, the
driver's xdo_dev_state_add() will be given the new SA to be offloaded
and an indication of whether it is for Rx or Tx.
The driver should

        - verify the algorithm is supported for offloads
        - store the SA information (key, salt, target-ip, protocol, etc)
        - enable the HW offload of the SA
        - return status value:

                ===========   ===================================
                0             success
                -EOPNOTSUPP   offload not supported, try SW IPsec,
                              not applicable for packet offload mode
                other         fail the request
                ===========   ===================================

The driver can also set an offload_handle in the SA, an opaque void pointer
that can be used to convey context into the fast-path offload requests::

                xs->xso.offload_handle = context;


When the network stack is preparing an IPsec packet for an SA that has
been set up for offload, it first calls into xdo_dev_offload_ok() with
the skb and the intended offload state to ask the driver if the offload
will be serviceable.  This can check the packet information to be sure the
offload can be supported (e.g. IPv4 or IPv6, no IPv4 options, etc) and
return true or false to signify its support.
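
The xdo_dev_state_add() steps and the xdo_dev_offload_ok() check described
above can be sketched in userspace roughly as follows.  Everything prefixed
with sketch_ or SKETCH_ is a hypothetical stand-in (including the tiny SA
table that plays the role of the hardware), and the choice of -ENOSPC for a
full table is only an assumption standing in for the "other" row of the
status table.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
enum sketch_alg { SKETCH_ALG_AES_GCM, SKETCH_ALG_OTHER };

struct sketch_state {
	enum sketch_alg alg;
	void *offload_handle;	/* models xs->xso.offload_handle */
};

struct sketch_skb {
	int ip_version;		/* 4 or 6 */
	bool has_ipv4_options;
};

#define SKETCH_MAX_SA 2

struct sketch_hw_sa {
	bool in_use;
	enum sketch_alg alg;
};

/* Plays the role of the NIC's SA table. */
struct sketch_hw_sa sketch_sa_table[SKETCH_MAX_SA];

/* Models xdo_dev_state_add(): verify, store, enable, return status. */
int sketch_state_add(struct sketch_state *x)
{
	size_t i;

	if (x->alg != SKETCH_ALG_AES_GCM)
		return -EOPNOTSUPP;	/* crypto offload: try SW IPsec */

	for (i = 0; i < SKETCH_MAX_SA; i++) {
		if (!sketch_sa_table[i].in_use) {
			sketch_sa_table[i].in_use = true;
			sketch_sa_table[i].alg = x->alg;
			/* opaque context for the fast path */
			x->offload_handle = &sketch_sa_table[i];
			return 0;
		}
	}
	return -ENOSPC;		/* any other error fails the request */
}

/* Models xdo_dev_offload_ok(): can this packet use the offload? */
bool sketch_offload_ok(const struct sketch_skb *skb,
		       const struct sketch_state *x)
{
	if (!x->offload_handle)
		return false;
	if (skb->ip_version == 4)
		return !skb->has_ipv4_options;
	return skb->ip_version == 6;
}
```

In this model a state that was never accepted by sketch_state_add() has no
offload_handle, so sketch_offload_ok() rejects its packets regardless of
their headers.
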

Crypto offload mode:

When ready to send, the driver needs to inspect the Tx packet for the
offload information, including the opaque context, and set up the packet
send accordingly::

                xs = xfrm_input_state(skb);
                context = xs->xso.offload_handle;
                set up HW for send

The stack has already inserted the appropriate IPsec headers in the
packet data; the offload just needs to do the encryption and fix up the
header values.


When a packet is received and the HW has indicated that it offloaded a
decryption, the driver needs to add a reference to the decoded SA into
the packet's skb.  At this point the data should be decrypted but the
IPsec headers are still in the packet data; they are removed later up
the stack in xfrm_input().

        find and hold the SA that was used for the Rx skb::

                get spi, protocol, and destination IP from packet headers
                xs = find xs from (spi, protocol, dest_IP)
                xfrm_state_hold(xs);

        store the state information into the skb::

                sp = secpath_set(skb);
                if (!sp) return;
                sp->xvec[sp->len++] = xs;
                sp->olen++;

        indicate the success and/or error status of the offload::

                xo = xfrm_offload(skb);
                xo->flags = CRYPTO_DONE;
                xo->status = crypto_status;

        hand the packet to napi_gro_receive() as usual

In ESN mode, xdo_dev_state_advance_esn() is called from xfrm_replay_advance_esn().
The driver checks the packet sequence number and updates the HW ESN state
machine if needed.

Packet offload mode:

The HW adds and removes the XFRM headers, so in the Rx path the XFRM stack is
bypassed if the HW reported success.  In the Tx path, the packet leaves the
kernel unencrypted and without the extra headers; the HW is responsible for
adding the headers and encrypting the packet.

When the SA is removed by the user, the driver's xdo_dev_state_delete()
and xdo_dev_policy_delete() are asked to disable the offload.
Later,
xdo_dev_state_free() and xdo_dev_policy_free() are called from a garbage
collection routine after all reference counts to the state and policy
have been removed and any remaining resources can be cleared for the
offload state.  How these are used by the driver will depend on specific
hardware needs.

When a netdev is set to DOWN, the XFRM stack's netdev listener will call
xdo_dev_state_delete(), xdo_dev_policy_delete(), xdo_dev_state_free() and
xdo_dev_policy_free() on any remaining offloaded states.

Because the HW handles the packets, the XFRM core can't count them against
the hard and soft limits.  The HW/driver is responsible for doing so and for
providing accurate data when xdo_dev_state_update_curlft() is called.  If one
of these limits is reached, the driver needs to call
xfrm_state_check_expire() to make sure that XFRM performs the rekeying
sequence.
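
The limit handling described above can be modelled in userspace along these
lines.  This is a sketch under stated assumptions, not kernel code: every
lft_ and LFT_ name is hypothetical, only packet counts are modelled, and the
order of operations (refresh the counters, then evaluate the limits) is an
assumption about how xdo_dev_state_update_curlft() and
xfrm_state_check_expire() fit together.

```c
#include <stdint.h>

/* Hypothetical stand-ins; only packet counts are modelled. */
enum lft_verdict { LFT_OK, LFT_SOFT_EXPIRED, LFT_HARD_EXPIRED };

struct lft_state {
	uint64_t soft_packet_limit;
	uint64_t hard_packet_limit;
	uint64_t cur_packets;	/* what the XFRM core sees */
	uint64_t hw_packets;	/* counter kept by the HW */
};

/* Models xdo_dev_state_update_curlft(): copy HW counters to the state. */
void lft_update_curlft(struct lft_state *x)
{
	x->cur_packets = x->hw_packets;
}

/* Models the limit evaluation done on the refreshed counters. */
enum lft_verdict lft_check_expire(const struct lft_state *x)
{
	if (x->cur_packets >= x->hard_packet_limit)
		return LFT_HARD_EXPIRED;
	if (x->cur_packets >= x->soft_packet_limit)
		return LFT_SOFT_EXPIRED;
	return LFT_OK;
}

/* Driver-side handler for a HW "limit reached" event. */
enum lft_verdict lft_on_hw_limit_event(struct lft_state *x)
{
	lft_update_curlft(x);
	return lft_check_expire(x);
}
```

In this model a soft expiry is the point at which rekeying would be
requested, while a hard expiry means the SA must no longer be used.
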