.. SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)

==================
Kernel TLS offload
==================

Kernel TLS operation
====================

The Linux kernel provides TLS connection offload infrastructure. Once a TCP
connection is in ``ESTABLISHED`` state user space can enable the TLS Upper
Layer Protocol (ULP) and install the cryptographic connection state.
For details regarding the user-facing interface refer to the TLS
documentation in :ref:`Documentation/networking/tls.rst <kernel_tls>`.

``ktls`` can operate in three modes:

 * Software crypto mode (``TLS_SW``) - CPU handles the cryptography.
   In most basic cases only crypto operations synchronous with the CPU
   can be used, but depending on calling context CPU may utilize
   asynchronous crypto accelerators. The use of accelerators introduces extra
   latency on socket reads (decryption only starts when a read syscall
   is made) and additional I/O load on the system.
 * Packet-based NIC offload mode (``TLS_HW``) - the NIC handles crypto
   on a packet by packet basis, provided the packets arrive in order.
   This mode integrates best with the kernel stack and is described in detail
   in the remaining part of this document
   (``ethtool`` flags ``tls-hw-tx-offload`` and ``tls-hw-rx-offload``).
 * Full TCP NIC offload mode (``TLS_HW_RECORD``) - mode of operation where
   NIC driver and firmware replace the kernel networking stack
   with their own TCP handling. It is not usable in production environments
   which rely on the Linux networking stack, for example on any firewalling
   abilities or QoS and packet scheduling (``ethtool`` flag ``tls-hw-record``).

The operation mode is selected automatically based on device configuration;
offload opt-in or opt-out on a per-connection basis is not currently supported.

TX
--

At a high level user write requests are turned into a scatter list, the TLS ULP
intercepts them, inserts record framing, performs encryption (in ``TLS_SW``
mode) and then hands the modified scatter list to the TCP layer. From this
point on the TCP stack proceeds as normal.

In ``TLS_HW`` mode the encryption is not performed in the TLS ULP.
Instead packets reach a device driver, the driver will mark the packets
for crypto offload based on the socket the packet is attached to,
and send them to the device for encryption and transmission.

RX
--

On the receive side if the device handled decryption and authentication
successfully, the driver will set the decrypted bit in the associated
:c:type:`struct sk_buff <sk_buff>`.
The packets reach the TCP stack and
are handled normally. ``ktls`` is informed when data is queued to the socket
and the ``strparser`` mechanism is used to delineate the records. Upon read
request, records are retrieved from the socket and passed to the decryption
routine. If the device decrypted all the segments of the record the decryption
is skipped, otherwise the software path handles decryption.

.. kernel-figure::  tls-offload-layers.svg
   :alt:	TLS offload layers
   :align:	center
   :figwidth:	28em

   Layers of Kernel TLS stack

Device configuration
====================

During driver initialization the device sets the ``NETIF_F_HW_TLS_RX`` and
``NETIF_F_HW_TLS_TX`` features and installs its
:c:type:`struct tlsdev_ops <tlsdev_ops>`
pointer in the :c:member:`tlsdev_ops` member of the
:c:type:`struct net_device <net_device>`.

When TLS cryptographic connection state is installed on a ``ktls`` socket
(note that it is done twice, once for RX and once for TX direction,
and the two are completely independent), the kernel checks if the underlying
network device is offload-capable and attempts the offload. In case offload
fails the connection is handled entirely in software using the same mechanism
as if the offload was never tried.

Offload request is performed via the :c:member:`tls_dev_add` callback of
:c:type:`struct tlsdev_ops <tlsdev_ops>`:

.. code-block:: c

  int (*tls_dev_add)(struct net_device *netdev, struct sock *sk,
                     enum tls_offload_ctx_dir direction,
                     struct tls_crypto_info *crypto_info,
                     u32 start_offload_tcp_sn);

``direction`` indicates whether the cryptographic information is for
the received or transmitted packets. Driver uses the ``sk`` parameter
to retrieve the connection 5-tuple and socket family (IPv4 vs IPv6).
Cryptographic information in ``crypto_info`` includes the key, iv, salt
as well as TLS record sequence number. ``start_offload_tcp_sn`` indicates
which TCP sequence number corresponds to the beginning of the record with
sequence number from ``crypto_info``. The driver can add its state
at the end of kernel structures (see :c:member:`driver_state` members
in ``include/net/tls.h``) to avoid additional allocations and pointer
dereferences.

TX
--

After TX state is installed, the stack guarantees that the first segment
of the stream will start exactly at the ``start_offload_tcp_sn`` sequence
number, simplifying TCP sequence number matching.
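
The sequence number matching mentioned above could be sketched roughly as
follows. This is only an illustration written as plain user-space C; the
``tx_seg_class`` enum and the ``tx_classify()`` helper are hypothetical names
and are not part of the kernel API. The one point it tries to show is that
the comparison has to stay correct when the 32-bit TCP sequence space wraps.

.. code-block:: c

  #include <stdint.h>

  /* Hypothetical classification of a TX segment against the TCP
   * sequence number the driver expects next (illustrative names only).
   */
  enum tx_seg_class {
      TX_SEG_IN_ORDER,  /* starts exactly at the expected number */
      TX_SEG_PAST,      /* starts before it, e.g. a retransmission */
      TX_SEG_FUTURE,    /* starts after it, e.g. after a local drop */
  };

  static enum tx_seg_class tx_classify(uint32_t seg_seq, uint32_t expected_seq)
  {
      /* The signed difference keeps the result correct across wrap. */
      int32_t delta = (int32_t)(seg_seq - expected_seq);

      if (delta == 0)
          return TX_SEG_IN_ORDER;
      return delta < 0 ? TX_SEG_PAST : TX_SEG_FUTURE;
  }

With this helper the first segment after installation would be expected to
classify as ``TX_SEG_IN_ORDER`` when compared against
``start_offload_tcp_sn``.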

TX offload being fully initialized does not imply that all segments which
pass through the driver and belong to the offloaded socket will be after
the expected sequence number and will have kernel record information.
In particular, already encrypted data may have been queued to the socket
before installing the connection state in the kernel.

RX
--

In the RX direction the local networking stack has little control over
the segmentation, so the initial record's TCP sequence number may be anywhere
inside the segment.

Normal operation
================

At the minimum the device maintains the following state for each connection, in
each direction:

 * crypto secrets (key, iv, salt)
 * crypto processing state (partial blocks, partial authentication tag, etc.)
 * record metadata (sequence number, processing offset and length)
 * expected TCP sequence number

There are no guarantees on record length or record segmentation. In particular
segments may start at any point of a record and contain any number of records.
Assuming segments are received in order, the device should be able to perform
crypto operations and authentication regardless of segmentation.
For this
to be possible the device has to keep a small amount of segment-to-segment
state. This includes at least:

 * partial headers (if a segment carried only a part of the TLS header)
 * partial data block
 * partial authentication tag (all data had been seen but part of the
   authentication tag has to be written or read from the subsequent segment)

Record reassembly is not necessary for TLS offload. If the packets arrive
in order the device should be able to handle them separately and make
forward progress.

TX
--

The kernel stack performs record framing reserving space for the authentication
tag and populating all other TLS header and trailer fields.

Both the device and the driver maintain expected TCP sequence numbers
due to the possibility of retransmissions and the lack of software fallback
once the packet reaches the device.
For segments passed in order, the driver marks the packets with
a connection identifier (note that a 5-tuple lookup is insufficient to identify
packets requiring HW offload, see the :ref:`5tuple_problems` section)
and hands them to the device. The device identifies the packet as requiring
TLS handling and confirms the sequence number matches its expectation.
The device performs encryption and authentication of the record data.
It replaces the authentication tag and TCP checksum with correct values.

RX
--

Before a packet is DMAed to the host (but after NIC's embedded switching
and packet transformation functions) the device validates the Layer 4
checksum and performs a 5-tuple lookup to find any TLS connection the packet
may belong to (technically a 4-tuple
lookup is sufficient - IP addresses and TCP port numbers, as the protocol
is always TCP). If connection is matched device confirms if the TCP sequence
number is the expected one and proceeds to TLS handling (record delineation,
decryption, authentication for each record in the packet). The device leaves
the record framing unmodified, the stack takes care of record decapsulation.
Device indicates successful handling of TLS offload in the per-packet context
(descriptor) passed to the host.

Upon reception of a TLS offloaded packet, the driver sets
the :c:member:`decrypted` mark in :c:type:`struct sk_buff <sk_buff>`
corresponding to the segment. Networking stack makes sure decrypted
and non-decrypted segments do not get coalesced (e.g. by GRO or socket layer)
and takes care of partial decryption.
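
The two rules the stack applies to the decrypted mark can be modelled in
a few lines of user-space C. The sketch below is only a model under stated
assumptions: ``struct rx_seg`` is a hypothetical stand-in which keeps nothing
but the mark and is not the kernel's ``struct sk_buff``, and both helper names
are invented for illustration.

.. code-block:: c

  #include <stdbool.h>
  #include <stddef.h>

  /* Hypothetical per-segment state; only the decrypted mark matters here. */
  struct rx_seg {
      bool decrypted;
  };

  /* Segments may only be merged when their decrypted marks agree. */
  static bool rx_can_coalesce(const struct rx_seg *a, const struct rx_seg *b)
  {
      return a->decrypted == b->decrypted;
  }

  /* Software decryption of a record can be skipped only if every
   * segment carrying the record was decrypted by the device.
   */
  static bool rx_record_fully_decrypted(const struct rx_seg *segs, size_t n)
  {
      size_t i;

      for (i = 0; i < n; i++)
          if (!segs[i].decrypted)
              return false;
      return n > 0;
  }

A record which spans a mix of decrypted and non-decrypted segments is the
partial decryption case, which is left to the stack.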

Resync handling
===============

In presence of packet drops or network packet reordering, the device may lose
synchronization with the TLS stream, and require a resync with the kernel's
TCP stack.

Note that resync is only attempted for connections which were successfully
added to the device table and are in ``TLS_HW`` mode. For example,
if the table was full when cryptographic state was installed in the kernel,
such connection will never get offloaded. Therefore the resync request
does not carry any cryptographic connection state.

TX
--

Segments transmitted from an offloaded socket can get out of sync
in similar ways to the receive side: retransmissions and local drops
are possible, though network reorders are not. There are currently
two mechanisms for dealing with out of order segments.

Crypto state rebuilding
~~~~~~~~~~~~~~~~~~~~~~~

Whenever an out of order segment is transmitted the driver provides
the device with enough information to perform cryptographic operations.
This means most likely that the part of the record preceding the current
segment has to be passed to the device as part of the packet context,
together with its TCP sequence number and TLS record number.
The device
can then initialize its crypto state, process and discard the preceding
data (to be able to insert the authentication tag) and move on to handling
the actual packet.

In this mode depending on the implementation the driver can either ask
for a continuation with the crypto state and the new sequence number
(next expected segment is the one after the out of order one), or continue
with the previous stream state - assuming that the out of order segment
was just a retransmission. The former is simpler, and does not require
retransmission detection therefore it is the recommended method until
such time it is proven inefficient.

Next record sync
~~~~~~~~~~~~~~~~

Whenever an out of order segment is detected the driver requests
that the ``ktls`` software fallback code encrypt it. If the segment's
sequence number is lower than expected the driver assumes retransmission
and doesn't change device state. If the segment is in the future, it
may imply a local drop, the driver asks the stack to sync the device
to the next record state and falls back to software.

Resync request is indicated with:

.. code-block:: c

  void tls_offload_tx_resync_request(struct sock *sk, u32 got_seq, u32 exp_seq)

Until resync is complete driver should not access its expected TCP
sequence number (as it will be updated from a different context).
Following helper should be used to test if resync is complete:

.. code-block:: c

  bool tls_offload_tx_resync_pending(struct sock *sk)

Next time ``ktls`` pushes a record it will first send its TCP sequence number
and TLS record number to the driver. Stack will also make sure that
the new record will start on a segment boundary (like it does when
the connection is initially added).

RX
--

A small amount of RX reorder events may not require a full resynchronization.
In particular the device should not lose synchronization
when record boundary can be recovered:

.. kernel-figure::  tls-offload-reorder-good.svg
   :alt:	reorder of non-header segment
   :align:	center

   Reorder of non-header segment

Green segments are successfully decrypted, blue ones are passed
as received on wire, red stripes mark start of new records.

In the above case segment 1 is received and decrypted successfully.
Segment 2 was dropped so 3 arrives out of order. The device knows
the next record starts inside 3, based on record length in segment 1.
Segment 3 is passed untouched because, without the data from segment 2,
the remainder of the previous record inside segment 3 cannot be handled.
The device can, however, collect the authentication algorithm's state
and partial block from the new record in segment 3 and when 4 and 5
arrive continue decryption. Finally when 2 arrives it's completely outside
of expected window of the device so it's passed as is without special
handling. ``ktls`` software fallback handles the decryption of record
spanning segments 1, 2 and 3. The device did not get out of sync,
even though two segments did not get decrypted.

Kernel synchronization may be necessary if the lost segment contained
a record header and arrived after the next record header has already passed:

.. kernel-figure::  tls-offload-reorder-bad.svg
   :alt:	reorder of header segment
   :align:	center

   Reorder of segment with a TLS header

In this example segment 2 gets dropped, and it contains a record header.
Device can only detect that segment 4 also contains a TLS header
if it knows the length of the previous record from segment 2. In this case
the device will lose synchronization with the stream.

Stream scan resynchronization
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

When the device gets out of sync and the stream reaches TCP sequence
numbers more than a max size record past the expected TCP sequence number,
the device starts scanning for a known header pattern. For example
for TLS 1.2 and TLS 1.3 subsequent bytes of value ``0x03 0x03`` occur
in the SSL/TLS version field of the header. Once the pattern is matched
the device continues attempting to parse headers at expected locations
(based on the length fields at guessed locations).
Whenever the expected location does not contain a valid header the scan
is restarted.

When the header is matched the device sends a confirmation request
to the kernel, asking if the guessed location is correct (if a TLS record
really starts there), and which record sequence number the given header had.
The kernel confirms the guessed location was correct and tells the device
the record sequence number. Meanwhile, the device had been parsing
and counting all records since the just-confirmed one, it adds the number
of records it had seen to the record number provided by the kernel.
At this point the device is in sync and can resume decryption at next
segment boundary.
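
The scan described above happens inside the device, but its logic can be
illustrated with a short user-space C sketch. Everything below is
an assumption made for illustration: the function names are hypothetical,
the input is a contiguous buffer rather than a packet stream, the length
limit used is the TLS 1.2 ciphertext maximum, and zero-length records are
treated as implausible because an AEAD record always carries an authentication
tag. The header layout (one byte of content type, two bytes of version,
two bytes of big-endian length) is the standard TLS record header.

.. code-block:: c

  #include <stddef.h>
  #include <stdint.h>

  #define TLS_HDR_LEN      5
  #define TLS_MAX_REC_LEN  (16384 + 2048)  /* TLS 1.2 ciphertext limit */

  /* Check whether the bytes at buf[off] look like a TLS record header. */
  static int tls_hdr_plausible(const uint8_t *buf, size_t len, size_t off,
                               size_t *rec_len)
  {
      size_t l;

      if (len < TLS_HDR_LEN || off > len - TLS_HDR_LEN)
          return 0;
      if (buf[off] < 20 || buf[off] > 23)  /* known content types */
          return 0;
      if (buf[off + 1] != 0x03 || buf[off + 2] != 0x03)
          return 0;
      l = ((size_t)buf[off + 3] << 8) | buf[off + 4];
      if (l == 0 || l > TLS_MAX_REC_LEN)
          return 0;
      *rec_len = l;
      return 1;
  }

  /* Return the offset of the first candidate header which is followed by
   * `chain` further plausible headers at the locations implied by the
   * length fields, or -1 if no such candidate exists. A failed chain
   * restarts the scan at the next byte.
   */
  static long tls_scan(const uint8_t *buf, size_t len, unsigned int chain)
  {
      size_t start;

      for (start = 0; start + TLS_HDR_LEN <= len; start++) {
          size_t off = start, rec_len;
          unsigned int seen = 0;

          while (tls_hdr_plausible(buf, len, off, &rec_len)) {
              if (++seen > chain)
                  return (long)start;
              off += TLS_HDR_LEN + rec_len;
          }
      }
      return -1;
  }

A match found this way is still only a guess. In the scheme described above
it is the confirmation from the kernel, not the pattern match, which puts
the device back in sync.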

In a pathological case the device may latch onto a sequence of matching
headers and never hear back from the kernel (there is no negative
confirmation from the kernel). The implementation may choose to periodically
restart the scan. Given how unlikely a falsely-matching stream is, however,
periodic restart is not deemed necessary.

Special care has to be taken if the confirmation request is passed
asynchronously to the packet stream and a record may get processed
by the kernel before the confirmation request.

Stack-driven resynchronization
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The driver may also request the stack to perform resynchronization
whenever it sees the records are no longer getting decrypted.
If the connection is configured in this mode the stack automatically
schedules resynchronization after it has received two completely encrypted
records.

The stack waits for the socket to drain and informs the device about
the next expected record number and its TCP sequence number. If the
records continue to be received fully encrypted the stack retries the
synchronization with an exponential back off (first after 2 encrypted
records, then after 4 records, after 8, after 16... up until every
128 records).
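
The back off schedule can be pictured with a small counter. The sketch below
is one reading of the paragraph above expressed as user-space C, not
the kernel's implementation; the structure, the function names and the choice
to reset the schedule once a record is decrypted by the device again are all
assumptions made for illustration.

.. code-block:: c

  #define RESYNC_FIRST_THRESHOLD  2
  #define RESYNC_MAX_THRESHOLD    128

  /* Hypothetical per-connection back off state. */
  struct resync_backoff {
      unsigned int threshold;  /* encrypted records to wait for */
      unsigned int encrypted;  /* encrypted records since last attempt */
  };

  static void resync_backoff_init(struct resync_backoff *b)
  {
      b->threshold = RESYNC_FIRST_THRESHOLD;
      b->encrypted = 0;
  }

  /* Account one fully encrypted record. Returns 1 when a resync
   * attempt is due, 0 otherwise.
   */
  static int resync_backoff_encrypted(struct resync_backoff *b)
  {
      if (++b->encrypted < b->threshold)
          return 0;
      b->encrypted = 0;
      if (b->threshold < RESYNC_MAX_THRESHOLD)
          b->threshold *= 2;
      return 1;
  }

  /* Assumption: a record decrypted by the device restarts the schedule. */
  static void resync_backoff_decrypted(struct resync_backoff *b)
  {
      resync_backoff_init(b);
  }

Under this reading attempts become due after 2, then 4, 8, 16, 32, 64 and
finally every 128 further fully encrypted records.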

Error handling
==============

TX
--

Packets may be redirected or rerouted by the stack to a different
device than the selected TLS offload device. The stack will handle
such condition using the :c:func:`sk_validate_xmit_skb` helper
(TLS offload code installs :c:func:`tls_validate_xmit_skb` at this hook).
Offload maintains information about all records until the data is
fully acknowledged, so if skbs reach the wrong device they can be handled
by software fallback.

Any device TLS offload handling error on the transmission side must result
in the packet being dropped. For example if a packet got out of order
due to a bug in the stack or the device, reached the device and can't
be encrypted, such packet must be dropped.

RX
--

If the device encounters any problems with TLS offload on the receive
side it should pass the packet to the host's networking stack as it was
received on the wire.

For example authentication failure for any record in the segment should
result in passing the unmodified packet to the software fallback. This means
packets should not be modified "in place". Splitting segments to handle partial
decryption is not advised.
In other words either all records in the packet
had been handled successfully and authenticated or the packet has to be passed
to the host's stack as it was on the wire (recovering the original packet in
the driver is sufficient if the device provides a precise error).

The Linux networking stack does not provide a way of reporting per-packet
decryption and authentication errors, packets with errors must simply not
have the :c:member:`decrypted` mark set.

A packet should also not be handled by the TLS offload if it contains
incorrect checksums.

Performance metrics
===================

TLS offload can be characterized by the following basic metrics:

 * max connection count
 * connection installation rate
 * connection installation latency
 * total cryptographic performance

Note that each TCP connection requires a TLS session in both directions,
the performance may be reported treating each direction separately.

Max connection count
--------------------

The number of connections a device can support can be exposed via
the ``devlink resource`` API.

Total cryptographic performance
-------------------------------

Offload performance may depend on segment and record size.

Overload of the cryptographic subsystem of the device should not have
significant performance impact on non-offloaded streams.

Statistics
==========

Following minimum set of TLS-related statistics should be reported
by the driver:

 * ``rx_tls_decrypted_packets`` - number of successfully decrypted RX packets
   which were part of a TLS stream.
 * ``rx_tls_decrypted_bytes`` - number of TLS payload bytes in RX packets
   which were successfully decrypted.
 * ``rx_tls_ctx`` - number of TLS RX HW offload contexts added to device for
   decryption.
 * ``rx_tls_del`` - number of TLS RX HW offload contexts deleted from device
   (connection has finished).
 * ``rx_tls_resync_req_pkt`` - number of received TLS packets with a resync
   request.
 * ``rx_tls_resync_req_start`` - number of times the TLS async resync request
   was started.
 * ``rx_tls_resync_req_end`` - number of times the TLS async resync request
   properly ended with providing the HW tracked tcp-seq.
 * ``rx_tls_resync_req_skip`` - number of times the TLS async resync request
   procedure was started but not properly ended.
 * ``rx_tls_resync_res_ok`` - number of times the TLS resync response call to
   the driver was successfully handled.
 * ``rx_tls_resync_res_skip`` - number of times the TLS resync response call to
   the driver was terminated unsuccessfully.
 * ``rx_tls_err`` - number of RX packets which were part of a TLS stream
   but were not decrypted due to unexpected error in the state machine.
 * ``tx_tls_encrypted_packets`` - number of TX packets passed to the device
   for encryption of their TLS payload.
 * ``tx_tls_encrypted_bytes`` - number of TLS payload bytes in TX packets
   passed to the device for encryption.
 * ``tx_tls_ctx`` - number of TLS TX HW offload contexts added to device for
   encryption.
 * ``tx_tls_ooo`` - number of TX packets which were part of a TLS stream
   but did not arrive in the expected order.
 * ``tx_tls_skip_no_sync_data`` - number of TX packets which were part of
   a TLS stream and arrived out-of-order, but skipped the HW offload routine
   and went to the regular transmit flow as they were retransmissions of the
   connection handshake.
 * ``tx_tls_drop_no_sync_data`` - number of TX packets which were part of
   a TLS stream dropped, because they arrived out of order and associated
   record could not be found.
 * ``tx_tls_drop_bypass_req`` - number of TX packets which were part of a TLS
   stream and were dropped, because they contain both data that has been
   encrypted by software and data that expects hardware crypto offload.

Notable corner cases, exceptions and additional requirements
============================================================

.. _5tuple_problems:

5-tuple matching limitations
----------------------------

The device can only recognize received packets based on the 5-tuple
of the socket. The current ``ktls`` implementation will not offload sockets
routed through software interfaces such as those used for tunneling
or virtual networking. However, many packet transformations performed
by the networking stack (most notably any BPF logic) do not require
any intermediate software device, therefore a 5-tuple match may
consistently miss at the device level. In such cases the device
should still be able to perform TX offload (encryption) and should
fall back cleanly to software decryption (RX).

Out of order
------------

Introducing extra processing in NICs should not cause packets to be
transmitted or received out of order; for example, pure ACK packets
should not be reordered with respect to data segments.

Ingress reorder
---------------

A device is permitted to perform packet reordering for consecutive
TCP segments (i.e. placing packets in the correct order) but any form
of additional buffering is disallowed.

Coexistence with standard networking offload features
-----------------------------------------------------

Offloaded ``ktls`` sockets should support standard TCP stack features
transparently. Enabling device TLS offload should not cause any difference
in packets as seen on the wire.

Transport layer transparency
----------------------------

The device should not modify any packet headers for the purpose
of simplifying TLS offload.

The device should not depend on any packet headers beyond what is strictly
necessary for TLS offload.

Segment drops
-------------

Dropping packets is acceptable only in the event of catastrophic
system errors and should never be used as an error handling mechanism
in cases arising from normal operation. In other words, reliance
on TCP retransmissions to handle corner cases is not acceptable.

TLS device features
-------------------

Drivers should ignore the changes to the TLS device feature flags.
These flags will be acted upon accordingly by the core ``ktls`` code.
TLS device feature flags only control adding of new TLS connection
offloads; old connections will remain active after flags are cleared.

TLS encryption cannot be offloaded to devices without checksum calculation
offload. Hence, the TLS TX device feature flag requires TX checksum offload
to be set. Disabling the latter implies clearing the former. Disabling TX
checksum offload should not affect old connections, and drivers should make
sure checksum calculation does not break for them.
Similarly, device-offloaded TLS decryption implies doing RXCSUM. If the user
does not want to enable RX checksum offload, the TLS RX device feature is
disabled as well.
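
The dependency between the TLS and checksum feature flags described above can
be modelled with a small user-space sketch. This is an illustration only, not
kernel code: the ``Feature`` enum, its member names and ``fix_features()`` are
hypothetical stand-ins invented for this example, and the real flag handling
lives in the core networking code.

```python
from enum import Flag, auto


class Feature(Flag):
    """Hypothetical stand-ins for the device feature flags discussed above."""
    TX_CSUM = auto()   # TX checksum offload
    RX_CSUM = auto()   # RX checksum offload (RXCSUM)
    TLS_TX = auto()    # TLS TX device offload
    TLS_RX = auto()    # TLS RX device offload


def fix_features(requested: Feature) -> Feature:
    """Clear any TLS offload flag whose checksum prerequisite is not set.

    Only the set of flags is adjusted here. Connections that were already
    offloaded are not modelled, since clearing a flag is only meant to stop
    new offloads from being added.
    """
    fixed = requested
    # TLS TX offload requires TX checksum offload.
    if Feature.TLS_TX in fixed and Feature.TX_CSUM not in fixed:
        fixed &= ~Feature.TLS_TX
    # TLS RX offload requires RX checksum offload.
    if Feature.TLS_RX in fixed and Feature.RX_CSUM not in fixed:
        fixed &= ~Feature.TLS_RX
    return fixed
```

In this model, requesting ``Feature.TLS_TX`` without ``Feature.TX_CSUM``
yields a feature set with the TLS TX flag cleared, while requesting both
leaves the set unchanged.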