.. SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)

==================
Kernel TLS offload
==================

Kernel TLS operation
====================

The Linux kernel provides TLS connection offload infrastructure. Once a TCP
connection is in ``ESTABLISHED`` state user space can enable the TLS Upper
Layer Protocol (ULP) and install the cryptographic connection state.
For details regarding the user-facing interface refer to the TLS
documentation in :ref:`Documentation/networking/tls.rst <kernel_tls>`.

``ktls`` can operate in three modes:

 * Software crypto mode (``TLS_SW``) - CPU handles the cryptography.
   In most basic cases only crypto operations synchronous with the CPU
   can be used, but depending on calling context CPU may utilize
   asynchronous crypto accelerators. The use of accelerators introduces extra
   latency on socket reads (decryption only starts when a read syscall
   is made) and additional I/O load on the system.
 * Packet-based NIC offload mode (``TLS_HW``) - the NIC handles crypto
   on a packet by packet basis, provided the packets arrive in order.
   This mode integrates best with the kernel stack and is described in detail
   in the remaining part of this document
   (``ethtool`` flags ``tls-hw-tx-offload`` and ``tls-hw-rx-offload``).
 * Full TCP NIC offload mode (``TLS_HW_RECORD``) - mode of operation where
   NIC driver and firmware replace the kernel networking stack
   with their own TCP handling. It is not usable in production environments
   that make use of the Linux networking stack, for example any firewalling
   abilities or QoS and packet scheduling (``ethtool`` flag ``tls-hw-record``).

The operation mode is selected automatically based on device configuration;
offload opt-in or opt-out on a per-connection basis is not currently supported.

TX
--

At a high level, user write requests are turned into a scatter list. The TLS ULP
intercepts them, inserts record framing, performs encryption (in ``TLS_SW``
mode) and then hands the modified scatter list to the TCP layer. From this
point on the TCP stack proceeds as normal.

In ``TLS_HW`` mode the encryption is not performed in the TLS ULP.
Instead, packets reach the device driver, which marks the packets
for crypto offload based on the socket the packet is attached to,
and sends them to the device for encryption and transmission.

RX
--

On the receive side, if the device handled decryption and authentication
successfully, the driver will set the decrypted bit in the associated
:c:type:`struct sk_buff <sk_buff>`. The packets reach the TCP stack and
are handled normally. ``ktls`` is informed when data is queued to the socket
and the ``strparser`` mechanism is used to delineate the records. Upon read
request, records are retrieved from the socket and passed to the decryption
routine. If the device decrypted all the segments of the record the decryption
is skipped, otherwise the software path handles decryption.

.. kernel-figure::  tls-offload-layers.svg
   :alt:	TLS offload layers
   :align:	center
   :figwidth:	28em

   Layers of Kernel TLS stack

Device configuration
====================

During driver initialization the device sets the ``NETIF_F_HW_TLS_RX`` and
``NETIF_F_HW_TLS_TX`` features and installs its
:c:type:`struct tlsdev_ops <tlsdev_ops>`
pointer in the :c:member:`tlsdev_ops` member of the
:c:type:`struct net_device <net_device>`.

When TLS cryptographic connection state is installed on a ``ktls`` socket
(note that it is done twice, once for RX and once for TX direction,
and the two are completely independent), the kernel checks if the underlying
network device is offload-capable and attempts the offload. In case offload
fails the connection is handled entirely in software using the same mechanism
as if the offload was never tried.

The offload request is performed via the :c:member:`tls_dev_add` callback of
:c:type:`struct tlsdev_ops <tlsdev_ops>`:

.. code-block:: c

	int (*tls_dev_add)(struct net_device *netdev, struct sock *sk,
			   enum tls_offload_ctx_dir direction,
			   struct tls_crypto_info *crypto_info,
			   u32 start_offload_tcp_sn);

``direction`` indicates whether the cryptographic information is for
the received or transmitted packets. The driver uses the ``sk`` parameter
to retrieve the connection 5-tuple and socket family (IPv4 vs IPv6).
Cryptographic information in ``crypto_info`` includes the key, iv, salt
as well as TLS record sequence number. ``start_offload_tcp_sn`` indicates
which TCP sequence number corresponds to the beginning of the record with
sequence number from ``crypto_info``. The driver can add its state
at the end of kernel structures (see :c:member:`driver_state` members
in ``include/net/tls.h``) to avoid additional allocations and pointer
dereferences.

TX
--

After TX state is installed, the stack guarantees that the first segment
of the stream will start exactly at the ``start_offload_tcp_sn`` sequence
number, simplifying TCP sequence number matching.

TX offload being fully initialized does not imply that all segments passing
through the driver and which belong to the offloaded socket will be after
the expected sequence number and will have kernel record information.
In particular, already encrypted data may have been queued to the socket
before installing the connection state in the kernel.
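
Because TCP sequence numbers wrap around, a driver cannot compare a segment
against ``start_offload_tcp_sn`` with a plain ``<``. The following is a minimal
user-space sketch of a wraparound-safe check, not kernel code. The helper names
``seq_before()`` and ``segment_predates_offload()`` are hypothetical (the
kernel's own ``before()`` helper performs the same comparison):

.. code-block:: c

	#include <stdbool.h>
	#include <stdint.h>

	/* Wraparound-safe "a is earlier than b" for 32-bit TCP sequence numbers. */
	static bool seq_before(uint32_t a, uint32_t b)
	{
		return (int32_t)(a - b) < 0;
	}

	/*
	 * Hypothetical helper: true if a segment starting at seq lies before
	 * the point at which TX offload was installed, i.e. it may carry data
	 * that was queued (and possibly already encrypted) earlier.
	 */
	static bool segment_predates_offload(uint32_t seq,
					     uint32_t start_offload_tcp_sn)
	{
		return seq_before(seq, start_offload_tcp_sn);
	}

A segment for which ``segment_predates_offload()`` returns true has no kernel
record information, so in this sketch it would be sent down the regular
transmit path rather than handed to the device for encryption.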

RX
--

In the RX direction the local networking stack has little control over the
segmentation, so the initial record's TCP sequence number may be anywhere
inside the segment.

Normal operation
================

At the minimum the device maintains the following state for each connection, in
each direction:

 * crypto secrets (key, iv, salt)
 * crypto processing state (partial blocks, partial authentication tag, etc.)
 * record metadata (sequence number, processing offset and length)
 * expected TCP sequence number

There are no guarantees on record length or record segmentation. In particular
segments may start at any point of a record and contain any number of records.
Assuming segments are received in order, the device should be able to perform
crypto operations and authentication regardless of segmentation. For this
to be possible the device has to keep a small amount of segment-to-segment
state. This includes at least:

 * partial headers (if a segment carried only a part of the TLS header)
 * partial data block
 * partial authentication tag (all data had been seen but part of the
   authentication tag has to be written or read from the subsequent segment)

Record reassembly is not necessary for TLS offload. If the packets arrive
in order the device should be able to handle them separately and make
forward progress.
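
The segment-to-segment state described above can be illustrated with a small
user-space model of record delineation. This is only a sketch under stated
assumptions: ``struct rec_tracker`` and ``rec_tracker_feed()`` are hypothetical
names, crypto processing is reduced to skipping over the record body, and the
only protocol fact relied upon is the 5-byte TLS record header (1 byte type,
2 bytes version, 2 bytes big-endian length):

.. code-block:: c

	#include <stddef.h>
	#include <stdint.h>

	#define TLS_HEADER_SIZE	5	/* type (1) + version (2) + length (2) */

	/* Hypothetical per-connection state carried from segment to segment. */
	struct rec_tracker {
		uint8_t hdr[TLS_HEADER_SIZE];	/* partial header seen so far */
		size_t hdr_fill;		/* valid bytes in hdr */
		size_t body_left;		/* record body bytes still expected */
		uint64_t rec_seq;		/* number of the current record */
	};

	/* Consume one in-order TCP segment; return how many records ended in it. */
	static unsigned int rec_tracker_feed(struct rec_tracker *t,
					     const uint8_t *seg, size_t len)
	{
		unsigned int done = 0;
		size_t i = 0;

		while (i < len) {
			if (t->hdr_fill < TLS_HEADER_SIZE) {
				/* Collect the header, possibly across segments. */
				t->hdr[t->hdr_fill++] = seg[i++];
				if (t->hdr_fill < TLS_HEADER_SIZE)
					continue;
				t->body_left = ((size_t)t->hdr[3] << 8) | t->hdr[4];
			} else {
				/* A real device would run crypto on these bytes. */
				size_t n = len - i;

				if (n > t->body_left)
					n = t->body_left;
				i += n;
				t->body_left -= n;
			}
			if (t->body_left == 0) {
				/* Record complete, expect a new header next. */
				t->hdr_fill = 0;
				t->rec_seq++;
				done++;
			}
		}
		return done;
	}

Note that the tracker never buffers record data. It keeps only the partial
header and a byte counter, which is why each segment can be handled on its own
as long as segments arrive in order.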

TX
--

The kernel stack performs record framing, reserving space for the authentication
tag and populating all other TLS header and trailer fields.

Both the device and the driver maintain expected TCP sequence numbers
due to the possibility of retransmissions and the lack of software fallback
once the packet reaches the device.
For segments passed in order, the driver marks the packets with
a connection identifier (note that a 5-tuple lookup is insufficient to identify
packets requiring HW offload, see the :ref:`5tuple_problems` section)
and hands them to the device. The device identifies the packet as requiring
TLS handling and confirms the sequence number matches its expectation.
The device performs encryption and authentication of the record data.
It replaces the authentication tag and TCP checksum with correct values.

RX
--

Before a packet is DMAed to the host (but after NIC's embedded switching
and packet transformation functions) the device validates the Layer 4
checksum and performs a 5-tuple lookup to find any TLS connection the packet
may belong to (technically a 4-tuple
lookup is sufficient - IP addresses and TCP port numbers, as the protocol
is always TCP). If a connection is matched the device confirms that the TCP
sequence number is the expected one and proceeds to TLS handling (record
delineation, decryption, authentication for each record in the packet). The
device leaves the record framing unmodified; the stack takes care of record
decapsulation. The device indicates successful handling of TLS offload in the
per-packet context (descriptor) passed to the host.

Upon reception of a TLS offloaded packet, the driver sets
the :c:member:`decrypted` mark in :c:type:`struct sk_buff <sk_buff>`
corresponding to the segment. The networking stack makes sure decrypted
and non-decrypted segments do not get coalesced (e.g. by GRO or socket layer)
and takes care of partial decryption.

Resync handling
===============

In the presence of packet drops or network packet reordering, the device may
lose synchronization with the TLS stream, and require a resync with the kernel's
TCP stack.

Note that resync is only attempted for connections which were successfully
added to the device table and are in ``TLS_HW`` mode. For example,
if the table was full when cryptographic state was installed in the kernel,
such connection will never get offloaded. Therefore the resync request
does not carry any cryptographic connection state.

TX
--

Segments transmitted from an offloaded socket can get out of sync
in similar ways to the receive side: retransmissions and local drops
are possible, though network reorders are not. There are currently
two mechanisms for dealing with out of order segments.

Crypto state rebuilding
~~~~~~~~~~~~~~~~~~~~~~~

Whenever an out of order segment is transmitted the driver provides
the device with enough information to perform cryptographic operations.
This means most likely that the part of the record preceding the current
segment has to be passed to the device as part of the packet context,
together with its TCP sequence number and TLS record number. The device
can then initialize its crypto state, process and discard the preceding
data (to be able to insert the authentication tag) and move on to handling
the actual packet.

In this mode depending on the implementation the driver can either ask
for a continuation with the crypto state and the new sequence number
(next expected segment is the one after the out of order one), or continue
with the previous stream state - assuming that the out of order segment
was just a retransmission. The former is simpler, and does not require
retransmission detection, therefore it is the recommended method until
such time as it is proven inefficient.

Next record sync
~~~~~~~~~~~~~~~~

Whenever an out of order segment is detected the driver requests
that the ``ktls`` software fallback code encrypt it. If the segment's
sequence number is lower than expected the driver assumes retransmission
and doesn't change device state. If the segment is in the future, it
may imply a local drop; the driver asks the stack to sync the device
to the next record state and falls back to software.

The resync request is indicated with:

.. code-block:: c

  void tls_offload_tx_resync_request(struct sock *sk, u32 got_seq, u32 exp_seq)

Until resync is complete the driver should not access its expected TCP
sequence number (as it will be updated from a different context).
The following helper should be used to test if resync is complete:

.. code-block:: c

  bool tls_offload_tx_resync_pending(struct sock *sk)

Next time ``ktls`` pushes a record it will first send its TCP sequence number
and TLS record number to the driver. The stack will also make sure that
the new record will start on a segment boundary (like it does when
the connection is initially added).
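
The per-segment decision a driver makes in the next record sync scheme can be
modelled in user space as follows. This is a sketch, not driver code:
``struct tx_conn``, ``enum tx_action`` and ``tx_classify()`` are hypothetical,
and the ``resync_pending`` flag merely stands in for the real
``tls_offload_tx_resync_request()`` and ``tls_offload_tx_resync_pending()``
calls. Falling back to software while a resync is pending is an assumption of
this model:

.. code-block:: c

	#include <stdbool.h>
	#include <stdint.h>

	/* Hypothetical outcomes of the driver's per-segment check. */
	enum tx_action {
		TX_OFFLOAD,		/* in order: the device encrypts */
		TX_FALLBACK,		/* software encrypts, state untouched */
		TX_FALLBACK_RESYNC,	/* software encrypts, resync requested */
	};

	/* Hypothetical driver-side view of one offloaded TX connection. */
	struct tx_conn {
		uint32_t expected_seq;	/* next TCP sequence number expected */
		bool resync_pending;	/* models tls_offload_tx_resync_pending() */
	};

	/* Wraparound-safe "a is earlier than b" for TCP sequence numbers. */
	static bool seq_before(uint32_t a, uint32_t b)
	{
		return (int32_t)(a - b) < 0;
	}

	static enum tx_action tx_classify(struct tx_conn *c, uint32_t seq,
					  uint32_t len)
	{
		/* expected_seq must not be read until the resync completes. */
		if (c->resync_pending)
			return TX_FALLBACK;

		if (seq == c->expected_seq) {
			c->expected_seq += len;
			return TX_OFFLOAD;
		}

		/* Lower than expected: assume a retransmission. */
		if (seq_before(seq, c->expected_seq))
			return TX_FALLBACK;

		/* In the future: a local drop is likely, ask for a resync. */
		c->resync_pending = true; /* models tls_offload_tx_resync_request() */
		return TX_FALLBACK_RESYNC;
	}

In a real driver the pending state is cleared by the stack, from a different
context, when ``ktls`` pushes the next record and provides the new TCP sequence
number and TLS record number.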

RX
--

A small amount of RX reorder events may not require a full resynchronization.
In particular the device should not lose synchronization
when the record boundary can be recovered:

.. kernel-figure::  tls-offload-reorder-good.svg
   :alt:	reorder of non-header segment
   :align:	center

   Reorder of non-header segment

Green segments are successfully decrypted, blue ones are passed
as received on wire, red stripes mark start of new records.

In the above case segment 1 is received and decrypted successfully.
Segment 2 was dropped so 3 arrives out of order. The device knows
the next record starts inside 3, based on record length in segment 1.
Segment 3 is passed untouched, because due to lack of data from segment 2
the remainder of the previous record inside segment 3 cannot be handled.
The device can, however, collect the authentication algorithm's state
and partial block from the new record in segment 3 and when 4 and 5
arrive continue decryption. Finally when 2 arrives it's completely outside
of the expected window of the device so it's passed as is without special
handling. ``ktls`` software fallback handles the decryption of the record
spanning segments 1, 2 and 3. The device did not get out of sync,
even though two segments did not get decrypted.

Kernel synchronization may be necessary if the lost segment contained
a record header and arrived after the next record header has already passed:

.. kernel-figure::  tls-offload-reorder-bad.svg
   :alt:	reorder of header segment
   :align:	center

   Reorder of segment with a TLS header

In this example segment 2 gets dropped, and it contains a record header.
The device can only detect that segment 4 also contains a TLS header
if it knows the length of the previous record from segment 2. In this case
the device will lose synchronization with the stream.

Stream scan resynchronization
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

When the device gets out of sync and the stream reaches TCP sequence
numbers more than a max size record past the expected TCP sequence number,
the device starts scanning for a known header pattern. For example
for TLS 1.2 and TLS 1.3 subsequent bytes of value ``0x03 0x03`` occur
in the SSL/TLS version field of the header. Once the pattern is matched
the device continues attempting to parse headers at expected locations
(based on the length fields at guessed locations).
Whenever the expected location does not contain a valid header the scan
is restarted.

When the header is matched the device sends a confirmation request
to the kernel, asking if the guessed location is correct (if a TLS record
really starts there), and which record sequence number the given header had.
The kernel confirms the guessed location was correct and tells the device
the record sequence number. Meanwhile, the device had been parsing
and counting all records since the just-confirmed one; it adds the number
of records it had seen to the record number provided by the kernel.
At this point the device is in sync and can resume decryption at the next
segment boundary.
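
The scan can be sketched in user space as below. The function names are
hypothetical and the model is simplified: it works on a contiguous buffer
rather than on individual segments, it treats the ``0x03 0x03`` version bytes
as the only header pattern, and it accepts a candidate as soon as one
following header is found at the location implied by the length field:

.. code-block:: c

	#include <stdbool.h>
	#include <stddef.h>
	#include <stdint.h>

	#define TLS_HEADER_SIZE	5	/* type (1) + version (2) + length (2) */

	/* True if a plausible TLS 1.2/1.3 record header starts at buf[off]. */
	static bool looks_like_header(const uint8_t *buf, size_t len, size_t off)
	{
		return off + TLS_HEADER_SIZE <= len &&
		       buf[off + 1] == 0x03 && buf[off + 2] == 0x03;
	}

	/*
	 * Scan for a header candidate whose length field leads to another
	 * plausible header. Returns the candidate's offset, or -1 if none.
	 */
	static long scan_for_record(const uint8_t *buf, size_t len)
	{
		size_t off;

		for (off = 0; off + TLS_HEADER_SIZE <= len; off++) {
			size_t rec_len, next;

			if (!looks_like_header(buf, len, off))
				continue;
			rec_len = ((size_t)buf[off + 3] << 8) | buf[off + 4];
			next = off + TLS_HEADER_SIZE + rec_len;
			if (looks_like_header(buf, len, next))
				return (long)off;
			/* Guess not confirmed by the next header: keep scanning. */
		}
		return -1;
	}

	/* Record number to resume at once the kernel confirms the guess. */
	static uint64_t resume_rec_seq(uint64_t confirmed_seq, uint64_t seen_since)
	{
		return confirmed_seq + seen_since;
	}

A match found this way is still only a guess. As described above, the device
resumes decryption only after the kernel has confirmed the location and
supplied the record sequence number.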

In a pathological case the device may latch onto a sequence of matching
headers and never hear back from the kernel (there is no negative
confirmation from the kernel). The implementation may choose to periodically
restart the scan. Given how unlikely a falsely-matching stream is, however,
periodic restart is not deemed necessary.

Special care has to be taken if the confirmation request is passed
asynchronously to the packet stream and the record may get processed
by the kernel before the confirmation request.

Stack-driven resynchronization
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The driver may also request the stack to perform resynchronization
whenever it sees the records are no longer getting decrypted.
If the connection is configured in this mode the stack automatically
schedules resynchronization after it has received two completely encrypted
records.

The stack waits for the socket to drain and informs the device about
the next expected record number and its TCP sequence number. If the
records continue to be received fully encrypted the stack retries the
synchronization with an exponential back off (first after 2 encrypted
records, then after 4 records, after 8, after 16... up until every
128 records).
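
The back off schedule can be modelled with a simple counter. This is a
user-space sketch with hypothetical names, not the stack's implementation. In
particular, resetting the schedule once a record arrives decrypted is an
assumption of the model:

.. code-block:: c

	#include <stdbool.h>

	#define RESYNC_BACKOFF_START	2
	#define RESYNC_BACKOFF_MAX	128

	/* Hypothetical per-connection back off state. */
	struct resync_backoff {
		unsigned int threshold;	/* encrypted records before next attempt */
		unsigned int count;	/* encrypted records since last attempt */
	};

	static void resync_backoff_init(struct resync_backoff *b)
	{
		b->threshold = RESYNC_BACKOFF_START;
		b->count = 0;
	}

	/* Account one fully encrypted record; return true if a resync is due. */
	static bool resync_backoff_encrypted(struct resync_backoff *b)
	{
		if (++b->count < b->threshold)
			return false;
		b->count = 0;
		if (b->threshold < RESYNC_BACKOFF_MAX)
			b->threshold *= 2;
		return true;
	}

	/* Assumption: a decrypted record means sync was regained, start over. */
	static void resync_backoff_decrypted(struct resync_backoff *b)
	{
		resync_backoff_init(b);
	}

With this model a connection that never regains sync attempts a resync after
2 encrypted records, then after 4 more, 8 more and so on, settling at one
attempt per 128 records.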

Error handling
==============

TX
--

Packets may be redirected or rerouted by the stack to a different
device than the selected TLS offload device. The stack will handle
such condition using the :c:func:`sk_validate_xmit_skb` helper
(TLS offload code installs :c:func:`tls_validate_xmit_skb` at this hook).
Offload maintains information about all records until the data is
fully acknowledged, so if skbs reach the wrong device they can be handled
by software fallback.

Any device TLS offload handling error on the transmission side must result
in the packet being dropped. For example if a packet got out of order
due to a bug in the stack or the device, reached the device and can't
be encrypted, such packet must be dropped.

RX
--

If the device encounters any problems with TLS offload on the receive
side it should pass the packet to the host's networking stack as it was
received on the wire.

For example authentication failure for any record in the segment should
result in passing the unmodified packet to the software fallback. This means
packets should not be modified "in place". Splitting segments to handle partial
decryption is not advised. In other words either all records in the packet
had been handled successfully and authenticated or the packet has to be passed
to the host's stack as it was on the wire (recovering original packet in the
driver if device provides precise error is sufficient).

The Linux networking stack does not provide a way of reporting per-packet
decryption and authentication errors; packets with errors must simply not
have the :c:member:`decrypted` mark set.

A packet should also not be handled by the TLS offload if it contains
incorrect checksums.
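
The all-or-nothing rule for the receive side can be summarized by the
following user-space sketch. ``struct rx_status`` is a hypothetical stand-in
for whatever a device reports in its RX descriptor, and ``rx_mark_decrypted()``
models the driver's decision whether to set the ``decrypted`` mark:

.. code-block:: c

	#include <stdbool.h>

	/* Hypothetical per-packet status reported by the device. */
	struct rx_status {
		bool csum_ok;			/* checksums were validated */
		unsigned int records_seen;	/* records (or parts) in the packet */
		unsigned int records_ok;	/* of those, handled successfully */
	};

	/* Set the decrypted mark only if every record in the packet was handled. */
	static bool rx_mark_decrypted(const struct rx_status *st)
	{
		if (!st->csum_ok)
			return false;
		return st->records_seen > 0 &&
		       st->records_ok == st->records_seen;
	}

Whenever this returns false the packet has to reach the stack exactly as it
was received on the wire, so that the software fallback can process it.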
39362306a36Sopenharmony_ci
39462306a36Sopenharmony_ciPerformance metrics
39562306a36Sopenharmony_ci===================
39662306a36Sopenharmony_ci
39762306a36Sopenharmony_ciTLS offload can be characterized by the following basic metrics:
39862306a36Sopenharmony_ci
39962306a36Sopenharmony_ci * max connection count
40062306a36Sopenharmony_ci * connection installation rate
40162306a36Sopenharmony_ci * connection installation latency
40262306a36Sopenharmony_ci * total cryptographic performance
40362306a36Sopenharmony_ci
40462306a36Sopenharmony_ciNote that each TCP connection requires a TLS session in both directions,
40562306a36Sopenharmony_cithe performance may be reported treating each direction separately.
40662306a36Sopenharmony_ci
40762306a36Sopenharmony_ciMax connection count
40862306a36Sopenharmony_ci--------------------
40962306a36Sopenharmony_ci
41062306a36Sopenharmony_ciThe number of connections device can support can be exposed via
41162306a36Sopenharmony_ci``devlink resource`` API.
41262306a36Sopenharmony_ci
41362306a36Sopenharmony_ciTotal cryptographic performance
41462306a36Sopenharmony_ci-------------------------------
41562306a36Sopenharmony_ci
41662306a36Sopenharmony_ciOffload performance may depend on segment and record size.
41762306a36Sopenharmony_ci
41862306a36Sopenharmony_ciOverload of the cryptographic subsystem of the device should not have
41962306a36Sopenharmony_cisignificant performance impact on non-offloaded streams.
42062306a36Sopenharmony_ci
42162306a36Sopenharmony_ciStatistics
42262306a36Sopenharmony_ci==========
42362306a36Sopenharmony_ci
42462306a36Sopenharmony_ciFollowing minimum set of TLS-related statistics should be reported
42562306a36Sopenharmony_ciby the driver:
42662306a36Sopenharmony_ci
42762306a36Sopenharmony_ci * ``rx_tls_decrypted_packets`` - number of successfully decrypted RX packets
42862306a36Sopenharmony_ci   which were part of a TLS stream.
42962306a36Sopenharmony_ci * ``rx_tls_decrypted_bytes`` - number of TLS payload bytes in RX packets
43062306a36Sopenharmony_ci   which were successfully decrypted.
43162306a36Sopenharmony_ci * ``rx_tls_ctx`` - number of TLS RX HW offload contexts added to device for
43262306a36Sopenharmony_ci   decryption.
43362306a36Sopenharmony_ci * ``rx_tls_del`` - number of TLS RX HW offload contexts deleted from device
43462306a36Sopenharmony_ci   (connection has finished).
43562306a36Sopenharmony_ci * ``rx_tls_resync_req_pkt`` - number of received TLS packets with a resync
43662306a36Sopenharmony_ci    request.
 * ``rx_tls_resync_req_start`` - number of times the TLS async resync request
   was started.
 * ``rx_tls_resync_req_end`` - number of times the TLS async resync request
   ended properly by providing the HW-tracked TCP sequence number.
 * ``rx_tls_resync_req_skip`` - number of times the TLS async resync request
   procedure was started but not properly ended.
 * ``rx_tls_resync_res_ok`` - number of times the TLS resync response call to
   the driver was successfully handled.
 * ``rx_tls_resync_res_skip`` - number of times the TLS resync response call to
   the driver was terminated unsuccessfully.
 * ``rx_tls_err`` - number of RX packets which were part of a TLS stream
   but were not decrypted due to an unexpected error in the state machine.
 * ``tx_tls_encrypted_packets`` - number of TX packets passed to the device
   for encryption of their TLS payload.
 * ``tx_tls_encrypted_bytes`` - number of TLS payload bytes in TX packets
   passed to the device for encryption.
 * ``tx_tls_ctx`` - number of TLS TX HW offload contexts added to the device
   for encryption.
 * ``tx_tls_ooo`` - number of TX packets which were part of a TLS stream
   but did not arrive in the expected order.
 * ``tx_tls_skip_no_sync_data`` - number of TX packets which were part of
   a TLS stream and arrived out of order, but skipped the HW offload routine
   and went to the regular transmit flow as they were retransmissions of the
   connection handshake.
 * ``tx_tls_drop_no_sync_data`` - number of TX packets which were part of
   a TLS stream and were dropped because they arrived out of order and the
   associated record could not be found.
 * ``tx_tls_drop_bypass_req`` - number of TX packets which were part of a TLS
   stream and were dropped because they contained both data that had been
   encrypted by software and data that expects hardware crypto offload.
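
The counters listed above can be picked out of a driver statistics dump
with a small filter. The following is a minimal sketch, assuming the
statistics are available as ``name: value`` text lines (the layout printed
by ``ethtool -S``); the helper name ``parse_tls_counter`` is hypothetical
and not part of any kernel or ``ethtool`` interface.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse one "name: value" statistics line (assumed layout) and report
 * whether it is one of the rx_tls_* / tx_tls_* counters.
 * Returns 1 and fills name/value for a TLS counter, 0 otherwise.
 */
static int parse_tls_counter(const char *line, char *name, size_t name_len,
                             unsigned long long *value)
{
    char buf[64];

    assert(line && name && value && name_len > 0);

    /* Skip leading blanks, read the name up to ':', then the number. */
    if (sscanf(line, " %63[^:]: %llu", buf, value) != 2)
        return 0;

    /* Keep only counters that carry the TLS prefixes used in the list. */
    if (strncmp(buf, "rx_tls_", 7) != 0 && strncmp(buf, "tx_tls_", 7) != 0)
        return 0;

    snprintf(name, name_len, "%s", buf);
    return 1;
}
```

Feeding each line of the dump through such a filter leaves only the TLS
counters, which can then be sampled periodically and compared.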

Notable corner cases, exceptions and additional requirements
============================================================

.. _5tuple_problems:

5-tuple matching limitations
----------------------------

The device can only recognize received packets based on the 5-tuple
of the socket. The current ``ktls`` implementation will not offload sockets
routed through software interfaces such as those used for tunneling
or virtual networking. However, many packet transformations performed
by the networking stack (most notably any BPF logic) do not require
any intermediate software device, therefore a 5-tuple match may
consistently miss at the device level. In such cases the device
should still be able to perform TX offload (encryption) and should
fall back cleanly to software decryption (RX).
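
To illustrate why a match can consistently miss, the sketch below compares
the 5-tuple installed for a socket with the 5-tuple seen on the wire. It is
a simplified user-space model, not driver code: the structure, the enum and
both function names are assumptions made for this example, and only IPv4
addresses are modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flow key; field names are illustrative only. */
struct flow_5tuple {
    uint32_t saddr;  /* IPv4 source address */
    uint32_t daddr;  /* IPv4 destination address */
    uint16_t sport;  /* TCP source port */
    uint16_t dport;  /* TCP destination port */
    uint8_t  proto;  /* IP protocol number, 6 for TCP */
};

enum rx_path { RX_DEVICE_DECRYPT, RX_SOFTWARE_DECRYPT };

/* All five fields must be equal for the device to recognize the packet. */
static bool flow_match(const struct flow_5tuple *a, const struct flow_5tuple *b)
{
    assert(a && b);
    return a->saddr == b->saddr && a->daddr == b->daddr &&
           a->sport == b->sport && a->dport == b->dport &&
           a->proto == b->proto;
}

/* A miss is not an error: the packet is left for software decryption. */
static enum rx_path rx_classify(const struct flow_5tuple *installed,
                                const struct flow_5tuple *on_wire)
{
    return flow_match(installed, on_wire) ? RX_DEVICE_DECRYPT
                                          : RX_SOFTWARE_DECRYPT;
}
```

If, for example, some logic in the stack rewrites the destination port
after the device has seen the packet, the key installed from the socket no
longer equals the key on the wire, and every packet of that flow takes the
software path.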

Out of order
------------

Introducing extra processing in NICs should not cause packets to be
transmitted or received out of order. For example, pure ACK packets
should not be reordered with respect to data segments.

Ingress reorder
---------------

A device is permitted to perform packet reordering for consecutive
TCP segments (i.e. placing packets in the correct order) but any form
of additional buffering is disallowed.

Coexistence with standard networking offload features
-----------------------------------------------------

Offloaded ``ktls`` sockets should support standard TCP stack features
transparently. Enabling device TLS offload should not cause any difference
in packets as seen on the wire.

Transport layer transparency
----------------------------

The device should not modify any packet headers for the purpose
of simplifying TLS offload.

The device should not depend on any packet headers beyond what is strictly
necessary for TLS offload.

Segment drops
-------------

Dropping packets is acceptable only in the event of catastrophic
system errors and should never be used as an error handling mechanism
in cases arising from normal operation. In other words, reliance
on TCP retransmissions to handle corner cases is not acceptable.

TLS device features
-------------------

Drivers should ignore changes to the TLS device feature flags.
These flags will be acted upon accordingly by the core ``ktls`` code.
TLS device feature flags only control the adding of new TLS connection
offloads; old connections will remain active after the flags are cleared.

TLS encryption cannot be offloaded to devices without checksum calculation
offload. Hence, the TLS TX device feature flag requires TX checksum offload
to be set. Disabling the latter implies clearing the former. Disabling TX
checksum offload should not affect old connections, and drivers should make
sure checksum calculation does not break for them.
Similarly, device-offloaded TLS decryption implies doing RXCSUM. If the user
does not want to enable RX checksum offload, the TLS RX device feature is
disabled as well.
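
The dependency between the TLS and checksum feature flags can be sketched
as a small fix-up step applied to a requested feature set. This is a
user-space illustration only: the ``F_*`` bit values and the function name
``fix_tls_features`` are hypothetical and do not correspond to the kernel's
feature bit definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical feature bits for this sketch; not the kernel's values. */
#define F_TX_CSUM (1u << 0)
#define F_RX_CSUM (1u << 1)
#define F_TLS_TX  (1u << 2)
#define F_TLS_RX  (1u << 3)
#define F_ALL     (F_TX_CSUM | F_RX_CSUM | F_TLS_TX | F_TLS_RX)

/*
 * Return the feature set that can actually be enabled: each TLS flag
 * is cleared when the checksum offload it depends on is not requested.
 */
static uint32_t fix_tls_features(uint32_t wanted)
{
    assert((wanted & ~F_ALL) == 0);

    /* TLS TX offload requires TX checksum offload. */
    if (!(wanted & F_TX_CSUM))
        wanted &= ~F_TLS_TX;

    /* TLS RX offload requires RX checksum offload. */
    if (!(wanted & F_RX_CSUM))
        wanted &= ~F_TLS_RX;

    return wanted;
}
```

Note that such a fix-up only decides whether new connections may be
offloaded. Connections that were offloaded before a flag was cleared stay
active, so the driver still has to compute checksums correctly for them.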