.. SPDX-License-Identifier: GPL-2.0

============
Timestamping
============


1. Control Interfaces
=====================

The interfaces for receiving network packet timestamps are:

SO_TIMESTAMP
  Generates a timestamp for each incoming packet in (not necessarily
  monotonic) system time. Reports the timestamp via recvmsg() in a
  control message in usec resolution.
  SO_TIMESTAMP is defined as SO_TIMESTAMP_NEW or SO_TIMESTAMP_OLD
  based on the architecture type and time_t representation of libc.
  Control message format is in struct __kernel_old_timeval for
  SO_TIMESTAMP_OLD and in struct __kernel_sock_timeval for
  SO_TIMESTAMP_NEW options respectively.

SO_TIMESTAMPNS
  Same timestamping mechanism as SO_TIMESTAMP, but reports the
  timestamp as struct timespec in nsec resolution.
  SO_TIMESTAMPNS is defined as SO_TIMESTAMPNS_NEW or SO_TIMESTAMPNS_OLD
  based on the architecture type and time_t representation of libc.
  Control message format is in struct timespec for SO_TIMESTAMPNS_OLD
  and in struct __kernel_timespec for SO_TIMESTAMPNS_NEW options
  respectively.

IP_MULTICAST_LOOP + SO_TIMESTAMP[NS]
  Only for multicast: approximate transmit timestamp obtained by
  reading the looped packet receive timestamp.

SO_TIMESTAMPING
  Generates timestamps on reception, transmission or both. Supports
  multiple timestamp sources, including hardware. Supports generating
  timestamps for stream sockets.


1.1 SO_TIMESTAMP (also SO_TIMESTAMP_OLD and SO_TIMESTAMP_NEW)
-------------------------------------------------------------

This socket option enables timestamping of datagrams on the reception
path. Because the destination socket, if any, is not known early in
the network stack, the feature has to be enabled for all packets. The
same is true for all early receive timestamp options.

For interface details, see `man 7 socket`.

Always use SO_TIMESTAMP_NEW to get the timestamp in
struct __kernel_sock_timeval format.

SO_TIMESTAMP_OLD returns incorrect timestamps after the year 2038
on 32 bit machines.

1.2 SO_TIMESTAMPNS (also SO_TIMESTAMPNS_OLD and SO_TIMESTAMPNS_NEW)
-------------------------------------------------------------------

This option is identical to SO_TIMESTAMP except for the returned data type.
Its struct timespec allows for higher resolution (ns) timestamps than the
timeval of SO_TIMESTAMP (us).

Always use SO_TIMESTAMPNS_NEW to get the timestamp in
struct __kernel_timespec format.

SO_TIMESTAMPNS_OLD returns incorrect timestamps after the year 2038
on 32 bit machines.

1.3 SO_TIMESTAMPING (also SO_TIMESTAMPING_OLD and SO_TIMESTAMPING_NEW)
----------------------------------------------------------------------

Supports multiple types of timestamp requests. As a result, this
socket option takes a bitmap of flags, not a boolean. In::

  err = setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, &val, sizeof(val));

val is an integer with any of the following bits set. Setting any other
bit returns EINVAL and does not change the current state.

The socket option configures timestamp generation for individual
sk_buffs (1.3.1), timestamp reporting to the socket's error
queue (1.3.2) and options (1.3.3). Timestamp generation can also
be enabled for individual sendmsg calls using cmsg (1.3.4).


1.3.1 Timestamp Generation
^^^^^^^^^^^^^^^^^^^^^^^^^^

Some bits are requests to the stack to try to generate timestamps. Any
combination of them is valid. Changes to these bits apply to newly
created packets, not to packets already in the stack. As a result, it
is possible to selectively request timestamps for a subset of packets
(e.g., for sampling) by embedding a send() call within two setsockopt
calls, one to enable timestamp generation and one to disable it.
Timestamps may also be generated for reasons other than being
requested by a particular socket, such as when receive timestamping is
enabled system wide, as explained earlier.

SOF_TIMESTAMPING_RX_HARDWARE:
  Request rx timestamps generated by the network adapter.

SOF_TIMESTAMPING_RX_SOFTWARE:
  Request rx timestamps when data enters the kernel. These timestamps
  are generated just after a device driver hands a packet to the
  kernel receive stack.

SOF_TIMESTAMPING_TX_HARDWARE:
  Request tx timestamps generated by the network adapter. This flag
  can be enabled via both socket options and control messages.

SOF_TIMESTAMPING_TX_SOFTWARE:
  Request tx timestamps when data leaves the kernel. These timestamps
  are generated in the device driver as close as possible, but always
  prior to, passing the packet to the network interface. Hence, they
  require driver support and may not be available for all devices.
  This flag can be enabled via both socket options and control messages.

SOF_TIMESTAMPING_TX_SCHED:
  Request tx timestamps prior to entering the packet scheduler. Kernel
  transmit latency is, if long, often dominated by queuing delay. The
  difference between this timestamp and one taken at
  SOF_TIMESTAMPING_TX_SOFTWARE will expose this latency independent
  of protocol processing. The latency incurred in protocol
  processing, if any, can be computed by subtracting a userspace
  timestamp taken immediately before send() from this timestamp. On
  machines with virtual devices where a transmitted packet travels
  through multiple devices and, hence, multiple packet schedulers,
  a timestamp is generated at each layer. This allows for fine
  grained measurement of queuing delay. This flag can be enabled
  via both socket options and control messages.

SOF_TIMESTAMPING_TX_ACK:
  Request tx timestamps when all data in the send buffer has been
  acknowledged. This only makes sense for reliable protocols. It is
  currently only implemented for TCP. For that protocol, it may
  over-report measurement, because the timestamp is generated when all
  data up to and including the buffer at send() was acknowledged: the
  cumulative acknowledgment. The mechanism ignores SACK and FACK.
  This flag can be enabled via both socket options and control messages.
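
The latency arithmetic mentioned for SOF_TIMESTAMPING_TX_SCHED can be
sketched as follows. This is only an illustration: the helper names are
invented here, and it assumes both timestamps belong to the same packet
and were taken from the same clock.

```c
#include <stdint.h>
#include <time.h>

/* Convert a timespec to a single nanosecond count. */
int64_t timespec_to_ns(const struct timespec *ts)
{
	return (int64_t)ts->tv_sec * 1000000000LL + ts->tv_nsec;
}

/*
 * Queuing delay seen by one packet: the time between entering the
 * packet scheduler (TX_SCHED timestamp) and being handed to the
 * device (TX_SOFTWARE timestamp).
 */
int64_t queuing_delay_ns(const struct timespec *sched,
			 const struct timespec *sw)
{
	return timespec_to_ns(sw) - timespec_to_ns(sched);
}
```

The same subtraction, with a userspace timestamp taken just before
send() as the first operand and the TX_SCHED timestamp as the second,
gives an estimate of the protocol processing time.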


1.3.2 Timestamp Reporting
^^^^^^^^^^^^^^^^^^^^^^^^^

The other three bits control which timestamps will be reported in a
generated control message. Changes to the bits take immediate
effect at the timestamp reporting locations in the stack. Timestamps
are only reported for packets that also have the relevant timestamp
generation request set.

SOF_TIMESTAMPING_SOFTWARE:
  Report any software timestamps when available.

SOF_TIMESTAMPING_SYS_HARDWARE:
  This option is deprecated and ignored.

SOF_TIMESTAMPING_RAW_HARDWARE:
  Report hardware timestamps as generated by
  SOF_TIMESTAMPING_TX_HARDWARE or SOF_TIMESTAMPING_RX_HARDWARE
  when available.


1.3.3 Timestamp Options
^^^^^^^^^^^^^^^^^^^^^^^

The interface supports the following options:

SOF_TIMESTAMPING_OPT_ID:
  Generate a unique identifier along with each packet. A process can
  have multiple concurrent timestamping requests outstanding. Packets
  can be reordered in the transmit path, for instance in the packet
  scheduler. In that case timestamps will be queued onto the error
  queue out of order from the original send() calls. It is then not
  always possible to uniquely match timestamps to the original send()
  calls based on timestamp order or payload inspection alone.

  This option associates each packet at send() with a unique
  identifier and returns that along with the timestamp. The identifier
  is derived from a per-socket u32 counter (that wraps). For datagram
  sockets, the counter increments with each sent packet. For stream
  sockets, it increments with every byte.

  The counter starts at zero. It is initialized the first time that
  the socket option is enabled. It is reset each time the option is
  enabled after having been disabled. Resetting the counter does not
  change the identifiers of existing packets in the system.

  This option is implemented only for transmit timestamps. There, the
  timestamp is always looped along with a struct sock_extended_err.
  The option modifies field ee_data to pass an id that is unique
  among all possibly concurrently outstanding timestamp requests for
  that socket.
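
One way for a sender to predict the identifier that will come back in
ee_data is to mirror the per-socket counter in userspace. The sketch
below does that under two assumptions that are not spelled out above: a
datagram is tagged with the counter value at send time, and a stream
write is tagged with the offset of its last byte. ``struct
tskey_tracker`` and ``tskey_for_send`` are hypothetical names.

```c
#include <stdint.h>

/* Sender-side mirror of the per-socket counter (hypothetical helper). */
struct tskey_tracker {
	int is_stream;	/* non-zero for stream sockets, zero for datagram */
	uint32_t next;	/* packets (datagram) or bytes (stream) sent so far */
};

/*
 * Record one send() of len bytes and return the identifier expected in
 * ee_data for its timestamps. Assumption: datagrams get the counter
 * value at send time, stream writes get the offset of their last byte.
 */
uint32_t tskey_for_send(struct tskey_tracker *t, uint32_t len)
{
	uint32_t key;

	if (t->is_stream) {
		t->next += len;		/* wraps like a u32 counter */
		key = t->next - 1;
	} else {
		key = t->next++;
	}
	return key;
}
```

The tracker has to be zeroed whenever the option is re-enabled after
having been disabled, since the counter is reset at that point.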


SOF_TIMESTAMPING_OPT_CMSG:
  Support recv() cmsg for all timestamped packets. Control messages
  are already supported unconditionally on all packets with receive
  timestamps and on IPv6 packets with transmit timestamp. This option
  extends them to IPv4 packets with transmit timestamp. One use case
  is to correlate packets with their egress device, by enabling socket
  option IP_PKTINFO simultaneously.


SOF_TIMESTAMPING_OPT_TSONLY:
  Applies to transmit timestamps only. Makes the kernel return the
  timestamp as a cmsg alongside an empty packet, as opposed to
  alongside the original packet. This reduces the amount of memory
  charged to the socket's receive budget (SO_RCVBUF) and delivers
  the timestamp even if sysctl net.core.tstamp_allow_data is 0.
  This option disables SOF_TIMESTAMPING_OPT_CMSG.

SOF_TIMESTAMPING_OPT_STATS:
  Optional stats that are obtained along with the transmit timestamps.
  It must be used together with SOF_TIMESTAMPING_OPT_TSONLY. When the
  transmit timestamp is available, the stats are available in a
  separate control message of type SCM_TIMESTAMPING_OPT_STATS, as a
  list of typed TLVs (struct nlattr). These stats allow the
  application to associate various transport layer stats with
  the transmit timestamps, such as how long a certain block of
  data was limited by peer's receiver window.

SOF_TIMESTAMPING_OPT_PKTINFO:
  Enable the SCM_TIMESTAMPING_PKTINFO control message for incoming
  packets with hardware timestamps. The message contains struct
  scm_ts_pktinfo, which supplies the index of the real interface which
  received the packet and its length at layer 2. A valid (non-zero)
  interface index will be returned only if CONFIG_NET_RX_BUSY_POLL is
  enabled and the driver is using NAPI. The struct also contains two
  other fields, but they are reserved and undefined.

SOF_TIMESTAMPING_OPT_TX_SWHW:
  Request both hardware and software timestamps for outgoing packets
  when SOF_TIMESTAMPING_TX_HARDWARE and SOF_TIMESTAMPING_TX_SOFTWARE
  are enabled at the same time. If both timestamps are generated,
  two separate messages will be looped to the socket's error queue,
  each containing just one timestamp.

New applications are encouraged to pass SOF_TIMESTAMPING_OPT_ID to
disambiguate timestamps and SOF_TIMESTAMPING_OPT_TSONLY to operate
regardless of the setting of sysctl net.core.tstamp_allow_data.

An exception is when a process needs additional cmsg data, for
instance SOL_IP/IP_PKTINFO to detect the egress network interface.
Then pass option SOF_TIMESTAMPING_OPT_CMSG. This option depends on
having access to the contents of the original packet, so cannot be
combined with SOF_TIMESTAMPING_OPT_TSONLY.
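
The recommendation above could translate into a flag set like the one
in this sketch. It picks software transmit timestamps purely as an
example; ``tx_tstamp_flags`` and ``enable_tx_tstamps`` are illustrative
names, not an existing API.

```c
#include <linux/net_tstamp.h>
#include <sys/socket.h>

/*
 * Build a SO_TIMESTAMPING bitmap for software transmit timestamps.
 * need_pkt_cmsg selects OPT_CMSG (which needs the original packet)
 * instead of OPT_TSONLY, because the two cannot be combined.
 */
unsigned int tx_tstamp_flags(int need_pkt_cmsg)
{
	unsigned int val = SOF_TIMESTAMPING_TX_SOFTWARE |	/* generation */
			   SOF_TIMESTAMPING_SOFTWARE |		/* reporting */
			   SOF_TIMESTAMPING_OPT_ID;		/* options */

	val |= need_pkt_cmsg ? SOF_TIMESTAMPING_OPT_CMSG
			     : SOF_TIMESTAMPING_OPT_TSONLY;
	return val;
}

/* Apply the bitmap to fd; returns 0 on success, -1 with errno set. */
int enable_tx_tstamps(int fd, int need_pkt_cmsg)
{
	unsigned int val = tx_tstamp_flags(need_pkt_cmsg);

	return setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, &val, sizeof(val));
}
```

Keeping the bitmap construction separate from the setsockopt call makes
it easy to check that OPT_CMSG and OPT_TSONLY are never requested
together.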


1.3.4 Enabling timestamps via control messages
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In addition to socket options, timestamp generation can be requested
per write via cmsg, only for SOF_TIMESTAMPING_TX_* (see Section 1.3.1).
Using this feature, applications can sample timestamps per sendmsg()
without paying the overhead of enabling and disabling timestamps via
setsockopt::

  struct msghdr *msg;
  struct cmsghdr *cmsg;
  ...
  cmsg			       = CMSG_FIRSTHDR(msg);
  cmsg->cmsg_level	       = SOL_SOCKET;
  cmsg->cmsg_type	       = SO_TIMESTAMPING;
  cmsg->cmsg_len	       = CMSG_LEN(sizeof(__u32));
  *((__u32 *) CMSG_DATA(cmsg)) = SOF_TIMESTAMPING_TX_SCHED |
				 SOF_TIMESTAMPING_TX_SOFTWARE |
				 SOF_TIMESTAMPING_TX_ACK;
  err = sendmsg(fd, msg, 0);

The SOF_TIMESTAMPING_TX_* flags set via cmsg will override
the SOF_TIMESTAMPING_TX_* flags set via setsockopt.

Moreover, applications must still enable timestamp reporting via
setsockopt to receive timestamps::

  __u32 val = SOF_TIMESTAMPING_SOFTWARE |
	      SOF_TIMESTAMPING_OPT_ID /* or any other flag */;
  err = setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, &val, sizeof(val));
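
The fragment above leaves out the buffer setup. A more complete sketch
of a per-call request might look like this; ``set_tx_tstamp_cmsg`` and
``send_with_tx_flags`` are hypothetical helpers, and the socket is
assumed to be connected so that no destination address is needed.

```c
#include <linux/net_tstamp.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

/*
 * Attach a SO_TIMESTAMPING request to msg. msg->msg_control must point
 * at a buffer of at least CMSG_SPACE(sizeof(uint32_t)) bytes.
 * Returns 0 on success, -1 if the control buffer is too small.
 */
int set_tx_tstamp_cmsg(struct msghdr *msg, uint32_t tx_flags)
{
	struct cmsghdr *cmsg;

	if (msg->msg_controllen < CMSG_SPACE(sizeof(uint32_t)))
		return -1;
	cmsg = CMSG_FIRSTHDR(msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SO_TIMESTAMPING;
	cmsg->cmsg_len = CMSG_LEN(sizeof(uint32_t));
	memcpy(CMSG_DATA(cmsg), &tx_flags, sizeof(tx_flags));
	return 0;
}

/* Send buf on a connected fd, requesting tx_flags for this call only. */
ssize_t send_with_tx_flags(int fd, const void *buf, size_t len,
			   uint32_t tx_flags)
{
	/* The union keeps the control buffer aligned for struct cmsghdr. */
	union {
		char buf[CMSG_SPACE(sizeof(uint32_t))];
		struct cmsghdr align;
	} control;
	struct iovec iov = { .iov_base = (void *)buf, .iov_len = len };
	struct msghdr msg;

	memset(&msg, 0, sizeof(msg));
	memset(&control, 0, sizeof(control));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = control.buf;
	msg.msg_controllen = sizeof(control.buf);
	if (set_tx_tstamp_cmsg(&msg, tx_flags))
		return -1;
	return sendmsg(fd, &msg, 0);
}
```

A sampling sender would call the second helper for the writes it wants
timestamped and plain send() for the rest, with reporting enabled once
via setsockopt.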


1.4 Bytestream Timestamps
-------------------------

The SO_TIMESTAMPING interface supports timestamping of bytes in a
bytestream. Each request is interpreted as a request for when the
entire contents of the buffer have passed a timestamping point. That
is, for streams option SOF_TIMESTAMPING_TX_SOFTWARE will record
when all bytes have reached the device driver, regardless of how
many packets the data has been converted into.

In general, bytestreams have no natural delimiters and therefore
correlating a timestamp with data is non-trivial. A range of bytes
may be split across segments, any segments may be merged (possibly
coalescing sections of previously segmented buffers associated with
independent send() calls). Segments can be reordered and the same
byte range can coexist in multiple segments for protocols that
implement retransmissions.

It is essential that all timestamps implement the same semantics,
regardless of these possible transformations, as otherwise they are
incomparable. Handling "rare" corner cases differently from the
simple case (a 1:1 mapping from buffer to skb) is insufficient
because performance debugging often needs to focus on such outliers.

In practice, timestamps can be correlated with segments of a
bytestream consistently, if both semantics of the timestamp and the
timing of measurement are chosen correctly. This challenge is no
different from deciding on a strategy for IP fragmentation. There, the
definition is that only the first fragment is timestamped. For
bytestreams, we chose that a timestamp is generated only when all
bytes have passed a point. SOF_TIMESTAMPING_TX_ACK as defined is easy to
implement and reason about. An implementation that has to take into
account SACK would be more complex due to possible transmission holes
and out of order arrival.

On the host, TCP can also break the simple 1:1 mapping from buffer to
skbuff as a result of Nagle, cork, autocork, segmentation and GSO. The
implementation ensures correctness in all cases by tracking the
individual last byte passed to send(), even if it is no longer the
last byte after an skbuff extend or merge operation. It stores the
relevant sequence number in skb_shinfo(skb)->tskey. Because an skbuff
has only one such field, only one timestamp can be generated.

In rare cases, a timestamp request can be missed if two requests are
collapsed onto the same skb. A process can detect this situation by
enabling SOF_TIMESTAMPING_OPT_ID and comparing the byte offset at
send time with the value returned for each timestamp. It can prevent
the situation by always flushing the TCP stack in between requests,
for instance by enabling TCP_NODELAY and disabling TCP_CORK and
autocork.

These precautions ensure that the timestamp is generated only when all
bytes have passed a timestamp point, assuming that the network stack
itself does not reorder the segments. The stack indeed tries to avoid
reordering. The one exception is under administrator control: it is
possible to construct a packet scheduler configuration that delays
segments from the same stream differently. Such a setup would be
unusual.
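
The per-socket part of the precautions described in this section can be
sketched as below. ``tcp_avoid_coalescing`` is a made-up helper name.
Autocork is not covered because it is controlled system wide (sysctl
net.ipv4.tcp_autocorking) rather than per socket.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/*
 * Ask TCP to push each write out promptly: Nagle off, cork off.
 * Returns 0 on success, -1 with errno set.
 */
int tcp_avoid_coalescing(int fd)
{
	int on = 1, off = 0;

	if (setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on)))
		return -1;
	return setsockopt(fd, IPPROTO_TCP, TCP_CORK, &off, sizeof(off));
}
```

This only reduces the chance that two requests share an skb. Comparing
the identifiers returned with SOF_TIMESTAMPING_OPT_ID remains the way
to detect a missed request.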


2. Data Interfaces
==================

Timestamps are read using the ancillary data feature of recvmsg().
See `man 3 cmsg` for details of this interface. The socket manual
page (`man 7 socket`) describes how timestamps generated with
SO_TIMESTAMP and SO_TIMESTAMPNS records can be retrieved.


2.1 SCM_TIMESTAMPING records
----------------------------

These timestamps are returned in a control message with cmsg_level
SOL_SOCKET, cmsg_type SCM_TIMESTAMPING, and a payload of one of the
following types.

For SO_TIMESTAMPING_OLD::

	struct scm_timestamping {
		struct timespec ts[3];
	};

For SO_TIMESTAMPING_NEW::

	struct scm_timestamping64 {
		struct __kernel_timespec ts[3];
	};

Always use SO_TIMESTAMPING_NEW to get the timestamp in
struct scm_timestamping64 format.

SO_TIMESTAMPING_OLD returns incorrect timestamps after the year 2038
on 32 bit machines.

The structure can return up to three timestamps. This is a legacy
feature. At least one field is non-zero at any time. Most timestamps
are passed in ts[0]. Hardware timestamps are passed in ts[2].

ts[1] used to hold hardware timestamps converted to system time.
Instead, expose the hardware clock device on the NIC directly as
a HW PTP clock source, to allow time conversion in userspace and
optionally synchronize system time with a userspace PTP stack such
as linuxptp. For the PTP clock API, see Documentation/driver-api/ptp.rst.

Note that if the SO_TIMESTAMP or SO_TIMESTAMPNS option is enabled
together with SO_TIMESTAMPING using SOF_TIMESTAMPING_SOFTWARE, a false
software timestamp will be generated in the recvmsg() call and passed
in ts[0] when a real software timestamp is missing. This also happens
for hardware transmit timestamps.

2.1.1 Transmit timestamps with MSG_ERRQUEUE
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For transmit timestamps the outgoing packet is looped back to the
socket's error queue with the send timestamp(s) attached. A process
receives the timestamps by calling recvmsg() with flag MSG_ERRQUEUE
set and with a msg_control buffer sufficiently large to receive the
relevant metadata structures. The recvmsg call returns the original
outgoing data packet with two ancillary messages attached.

A message of cmsg_level SOL_IP(V6) and cmsg_type IP(V6)_RECVERR
embeds a struct sock_extended_err. This defines the error type. For
timestamps, the ee_errno field is ENOMSG. The other ancillary message
will have cmsg_level SOL_SOCKET and cmsg_type SCM_TIMESTAMPING. This
embeds the struct scm_timestamping.


2.1.1.2 Timestamp types
~~~~~~~~~~~~~~~~~~~~~~~

The semantics of the three struct timespec are defined by field
ee_info in the extended error structure. It contains a value of
type SCM_TSTAMP_* to define the actual timestamp passed in
scm_timestamping.

The SCM_TSTAMP_* types are 1:1 matches to the SOF_TIMESTAMPING_*
control fields discussed previously, with one exception. For legacy
reasons, SCM_TSTAMP_SND is equal to zero and can be set for both
SOF_TIMESTAMPING_TX_HARDWARE and SOF_TIMESTAMPING_TX_SOFTWARE. It
is the first if ts[2] is non-zero, the second otherwise, in which
case the timestamp is stored in ts[0].
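
The rule for telling the timestamp types apart can be written down as a
small classifier. This is a sketch: ``enum tx_kind`` and
``classify_tx_tstamp`` are hypothetical names, and the SO_TIMESTAMPING_OLD
payload layout is assumed.

```c
#include <time.h>		/* struct timespec, used by errqueue.h */
#include <linux/errqueue.h>

enum tx_kind {
	TX_KIND_SCHED,		/* taken before the packet scheduler */
	TX_KIND_SND_SOFTWARE,	/* software timestamp, stored in ts[0] */
	TX_KIND_SND_HARDWARE,	/* hardware timestamp, stored in ts[2] */
	TX_KIND_ACK,		/* data acknowledged by the peer */
	TX_KIND_UNKNOWN,
};

/* Classify a transmit timestamp from ee_info and the returned timespecs. */
enum tx_kind classify_tx_tstamp(unsigned int ee_info,
				const struct scm_timestamping *tss)
{
	switch (ee_info) {
	case SCM_TSTAMP_SCHED:
		return TX_KIND_SCHED;
	case SCM_TSTAMP_ACK:
		return TX_KIND_ACK;
	case SCM_TSTAMP_SND:
		/* SND covers both sources; a non-zero ts[2] means hardware. */
		if (tss->ts[2].tv_sec || tss->ts[2].tv_nsec)
			return TX_KIND_SND_HARDWARE;
		return TX_KIND_SND_SOFTWARE;
	default:
		return TX_KIND_UNKNOWN;
	}
}
```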


2.1.1.3 Fragmentation
~~~~~~~~~~~~~~~~~~~~~

Fragmentation of outgoing datagrams is rare, but is possible, e.g., by
explicitly disabling PMTU discovery. If an outgoing packet is fragmented,
then only the first fragment is timestamped and returned to the sending
socket.


2.1.1.4 Packet Payload
~~~~~~~~~~~~~~~~~~~~~~

The calling application is often not interested in receiving the whole
packet payload that it passed to the stack originally: the socket
error queue mechanism is just a method to piggyback the timestamp on.
In this case, the application can choose to read datagrams with a
smaller buffer, possibly even of length 0. The payload is truncated
accordingly. Until the process calls recvmsg() on the error queue,
however, the full packet is queued, taking up budget from SO_RCVBUF.


2.1.1.5 Blocking Read
~~~~~~~~~~~~~~~~~~~~~

Reading from the error queue is always a non-blocking operation. To
block waiting on a timestamp, use poll or select. poll() will return
POLLERR in pollfd.revents if any data is ready on the error queue.
There is no need to pass this flag in pollfd.events. This flag is
ignored on request. See also `man 2 poll`.
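
A blocking wait built on poll() might look like the sketch below.
``wait_for_errqueue`` is an illustrative name. The events field is left
at zero on purpose, since POLLERR is reported regardless.

```c
#include <poll.h>

/*
 * Wait up to timeout_ms for the error queue of fd to become readable.
 * Returns 1 if POLLERR was reported, 0 on timeout or when only other
 * events fired, and -1 on error.
 */
int wait_for_errqueue(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = 0 };
	int ret = poll(&pfd, 1, timeout_ms);

	if (ret <= 0)
		return ret;
	return (pfd.revents & POLLERR) ? 1 : 0;
}
```

After a return value of 1, the caller reads the notification with
recvmsg() and MSG_ERRQUEUE, which itself never blocks.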


2.1.2 Receive timestamps
^^^^^^^^^^^^^^^^^^^^^^^^

On reception, there is no reason to read from the socket error queue.
The SCM_TIMESTAMPING ancillary data is sent along with the packet data
on a normal recvmsg(). Since this is not a socket error, it is not
accompanied by a message SOL_IP(V6)/IP(V6)_RECVERR. In this case,
the meaning of the three fields in struct scm_timestamping is
implicitly defined. ts[0] holds a software timestamp if set, ts[1]
is again deprecated and ts[2] holds a hardware timestamp if set.
4628c2ecf20Sopenharmony_ci

3. Hardware Timestamping configuration: SIOCSHWTSTAMP and SIOCGHWTSTAMP
=======================================================================

Hardware time stamping must also be initialized for each device driver
that is expected to do hardware time stamping. The parameter is defined in
include/uapi/linux/net_tstamp.h as::

    struct hwtstamp_config {
        int flags;      /* no flags defined right now, must be zero */
        int tx_type;    /* HWTSTAMP_TX_* */
        int rx_filter;  /* HWTSTAMP_FILTER_* */
    };

Desired behavior is passed into the kernel and to a specific device by
calling ioctl(SIOCSHWTSTAMP) with a pointer to a struct ifreq whose
ifr_data points to a struct hwtstamp_config. The tx_type and
rx_filter are hints to the driver about what it is expected to do. If
the requested fine-grained filtering for incoming packets is not
supported, the driver may time stamp more than just the requested types
of packets.

Drivers are free to use a more permissive configuration than the requested
configuration. It is expected that drivers should only implement directly the
most generic mode that can be supported. For example if the hardware can
support HWTSTAMP_FILTER_PTP_V2_EVENT, then it should generally always upscale
HWTSTAMP_FILTER_PTP_V2_L2_SYNC, and so forth, as HWTSTAMP_FILTER_PTP_V2_EVENT
is more generic (and more useful to applications).

A driver which supports hardware time stamping shall update the struct
with the actual, possibly more permissive configuration. If the
requested packets cannot be time stamped, then nothing should be
changed and ERANGE shall be returned (in contrast to EINVAL, which
indicates that SIOCSHWTSTAMP is not supported at all).

Only a process with admin rights may change the configuration. User
space is responsible for ensuring that multiple processes don't interfere
with each other and that the settings are reset.

Any process can read the actual configuration by passing this
structure to ioctl(SIOCGHWTSTAMP) in the same way.  However, this has
not been implemented in all drivers.

::

    /* possible values for hwtstamp_config->tx_type */
    enum {
        /*
         * no outgoing packet will need hardware time stamping;
         * should a packet arrive which asks for it, no hardware
         * time stamping will be done
         */
        HWTSTAMP_TX_OFF,

        /*
         * enables hardware time stamping for outgoing packets;
         * the sender of the packet decides which are to be
         * time stamped by setting SOF_TIMESTAMPING_TX_SOFTWARE
         * before sending the packet
         */
        HWTSTAMP_TX_ON,
    };

    /* possible values for hwtstamp_config->rx_filter */
    enum {
        /* time stamp no incoming packet at all */
        HWTSTAMP_FILTER_NONE,

        /* time stamp any incoming packet */
        HWTSTAMP_FILTER_ALL,

        /* return value: time stamp all packets requested plus some others */
        HWTSTAMP_FILTER_SOME,

        /* PTP v1, UDP, any kind of event packet */
        HWTSTAMP_FILTER_PTP_V1_L4_EVENT,

        /*
         * for the complete list of values, please check
         * the include file include/uapi/linux/net_tstamp.h
         */
    };

3.1 Hardware Timestamping Implementation: Device Drivers
--------------------------------------------------------

A driver which supports hardware time stamping must support the
SIOCSHWTSTAMP ioctl and update the supplied struct hwtstamp_config with
the actual values as described in the section on SIOCSHWTSTAMP.  It
should also support SIOCGHWTSTAMP.

Time stamps for received packets must be stored in the skb. To get a pointer
to the shared time stamp structure of the skb, call skb_hwtstamps(). Then
set the time stamps in the structure::

    struct skb_shared_hwtstamps {
        /*
         * hardware time stamp transformed into duration
         * since arbitrary point in time
         */
        ktime_t hwtstamp;
    };

Time stamps for outgoing packets are to be generated as follows:

- In hard_start_xmit(), check if (skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP)
  is non-zero. If yes, then the driver is expected to do hardware time
  stamping.
- If this is possible for the skb and requested, then declare
  that the driver is doing the time stamping by setting the flag
  SKBTX_IN_PROGRESS in skb_shinfo(skb)->tx_flags, e.g. with::

      skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS;

  You might want to keep a pointer to the associated skb for the next step
  and not free the skb. A driver not supporting hardware time stamping doesn't
  do that. A driver must never touch sk_buff::tstamp! It is used to store
  software generated time stamps by the network subsystem.
- Driver should call skb_tx_timestamp() as close to passing sk_buff to hardware
  as possible. skb_tx_timestamp() provides a software time stamp if requested
  and hardware timestamping is not possible (SKBTX_IN_PROGRESS not set).
- As soon as the driver has sent the packet and/or obtained a
  hardware time stamp for it, it passes the time stamp back by
  calling skb_tstamp_tx() with the original skb and the raw
  hardware time stamp. skb_tstamp_tx() clones the original skb and
  adds the timestamps, therefore the original skb has to be freed now.
  If obtaining the hardware time stamp somehow fails, then the driver
  should not fall back to software time stamping. The rationale is that
  this would occur at a later time in the processing pipeline than other
  software time stamping and therefore could lead to unexpected deltas
  between time stamps.

3.2 Special considerations for stacked PTP Hardware Clocks
----------------------------------------------------------

There are situations when there may be more than one PHC (PTP Hardware Clock)
in the data path of a packet. The kernel has no explicit mechanism to allow the
user to select which PHC to use for timestamping Ethernet frames. Instead, the
assumption is that the outermost PHC is always the most preferable, and that
kernel drivers collaborate towards achieving that goal. Currently there are 3
cases of stacked PHCs, detailed below:

3.2.1 DSA (Distributed Switch Architecture) switches
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

These are Ethernet switches which have one of their ports connected to an
(otherwise completely unaware) host Ethernet interface, and perform the role of
a port multiplier with optional forwarding acceleration features.  Each DSA
switch port is visible to the user as a standalone (virtual) network interface,
and its network I/O is performed, under the hood, indirectly through the host
interface (redirecting to the host port on TX, and intercepting frames on RX).

When a DSA switch is attached to a host port, PTP synchronization has to
suffer, since the switch's variable queuing delay introduces a path delay
jitter between the host port and its PTP partner. For this reason, some DSA
switches include a timestamping clock of their own, and have the ability to
perform network timestamping on their own MAC, such that path delays only
measure wire and PHY propagation latencies. Timestamping DSA switches are
supported in Linux and expose the same ABI as any other network interface (save
for the fact that the DSA interfaces are in fact virtual in terms of network
I/O, they do have their own PHC).  It is typical, but not mandatory, for all
interfaces of a DSA switch to share the same PHC.

By design, PTP timestamping with a DSA switch does not need any special
handling in the driver for the host port it is attached to.  However, when the
host port also supports PTP timestamping, DSA will take care of intercepting
the ``.ndo_do_ioctl`` calls towards the host port, and block attempts to enable
hardware timestamping on it. This is because the SO_TIMESTAMPING API does not
allow the delivery of multiple hardware timestamps for the same packet, so
anybody else except for the DSA switch port must be prevented from doing so.

In code, DSA provides for most of the infrastructure for timestamping already,
in generic code: a BPF classifier (``ptp_classify_raw``) is used to identify
PTP event messages (any other packets, including PTP general messages, are not
timestamped), and provides two hooks to drivers:

- ``.port_txtstamp()``: The driver is passed a clone of the timestampable skb
  to be transmitted, before actually transmitting it. Typically, a switch will
  have a PTP TX timestamp register (or sometimes a FIFO) where the timestamp
  becomes available. There may be an IRQ that is raised upon this timestamp's
  availability, or the driver might have to poll after invoking
  ``dev_queue_xmit()`` towards the host interface. Either way, in the
  ``.port_txtstamp()`` method, the driver only needs to save the clone for
  later use (when the timestamp becomes available). Each skb is annotated with
  a pointer to its clone, in ``DSA_SKB_CB(skb)->clone``, to ease the driver's
  job of keeping track of which clone belongs to which skb.

- ``.port_rxtstamp()``: The original (and only) timestampable skb is provided
  to the driver, for it to annotate it with a timestamp, if that is immediately
  available, or defer to later. On reception, timestamps might either be
  available in-band (through metadata in the DSA header, or attached in other
  ways to the packet), or out-of-band (through another RX timestamping FIFO).
  Deferral on RX is typically necessary when retrieving the timestamp needs a
  sleepable context. In that case, it is the responsibility of the DSA driver
  to call ``netif_rx_ni()`` on the freshly timestamped skb.

3.2.2 Ethernet PHYs
^^^^^^^^^^^^^^^^^^^

These are devices that typically fulfill a Layer 1 role in the network stack,
hence they do not have a representation in terms of a network interface as DSA
switches do. However, PHYs may be able to detect and timestamp PTP packets, for
performance reasons: timestamps taken as close as possible to the wire have the
potential to yield a more stable and precise synchronization.

A PHY driver that supports PTP timestamping must create a ``struct
mii_timestamper`` and add a pointer to it in ``phydev->mii_ts``. The presence
of this pointer will be checked by the networking stack.

Since PHYs do not have network interface representations, the timestamping and
ethtool ioctl operations for them need to be mediated by their respective MAC
driver.  Therefore, as opposed to DSA switches, modifications need to be done
to each individual MAC driver for PHY timestamping support. This entails:

- Checking, in ``.ndo_do_ioctl``, whether ``phy_has_hwtstamp(netdev->phydev)``
  is true or not. If it is, then the MAC driver should not process this request
  but instead pass it on to the PHY using ``phy_mii_ioctl()``.

- On RX, special intervention may or may not be needed, depending on the
  function used to deliver skb's up the network stack. In the case of plain
  ``netif_rx()`` and similar, MAC drivers must check whether
  ``skb_defer_rx_timestamp(skb)`` is necessary or not - and if it is, don't
  call ``netif_rx()`` at all.  If ``CONFIG_NETWORK_PHY_TIMESTAMPING`` is
  enabled, and ``skb->dev->phydev->mii_ts`` exists, its ``.rxtstamp()`` hook
  will be called now, to determine, using logic very similar to DSA, whether
  deferral for RX timestamping is necessary.  Again like DSA, it becomes the
  responsibility of the PHY driver to send the packet up the stack when the
  timestamp is available.

  For other skb receive functions, such as ``napi_gro_receive`` and
  ``netif_receive_skb``, the stack automatically checks whether
  ``skb_defer_rx_timestamp()`` is necessary, so this check is not needed inside
  the driver.

- On TX, again, special intervention might or might not be needed.  The
  function that calls the ``mii_ts->txtstamp()`` hook is named
  ``skb_clone_tx_timestamp()``. This function can be called directly
  (in which case explicit MAC driver support is indeed needed), but it
  also piggybacks on the ``skb_tx_timestamp()`` call, which many MAC
  drivers already perform for software timestamping purposes. Therefore, if a
  MAC supports software timestamping, it does not need to do anything further
  at this stage.

3.2.3 MII bus snooping devices
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

These perform the same role as timestamping Ethernet PHYs, save for the fact
that they are discrete devices and can therefore be used in conjunction with
any PHY even if it doesn't support timestamping. In Linux, they are
discoverable and attachable to a ``struct phy_device`` through Device Tree, and
for the rest, they use the same mii_ts infrastructure as those. See
Documentation/devicetree/bindings/ptp/timestamper.txt for more details.

3.2.4 Other caveats for MAC drivers
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

A caveat with stacked PHCs, especially (but not only) DSA, is that they uncover
bugs which were impossible to trigger before the existence of stacked PTP
clocks. DSA is particularly exposed because it does not require any
modification to MAC drivers, which makes it more difficult to ensure
correctness of all possible code paths.  One example has to do with
this line of code, already presented earlier::

      skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS;

Any TX timestamping logic, be it a plain MAC driver, a DSA switch driver, a PHY
driver or an MII bus snooping device driver, should set this flag.
But a MAC driver that is unaware of PHC stacking might get tripped up by
somebody other than itself setting this flag, and deliver a duplicate
timestamp.
For example, a typical driver design for TX timestamping might be to split the
transmission part into 2 portions:

1. "TX": checks whether PTP timestamping has been previously enabled through
   the ``.ndo_do_ioctl`` ("``priv->hwtstamp_tx_enabled == true``") and the
   current skb requires a TX timestamp ("``skb_shinfo(skb)->tx_flags &
   SKBTX_HW_TSTAMP``"). If this is true, it sets the
   "``skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS``" flag. Note: as
   described above, in the case of a stacked PHC system, this condition should
   never trigger, as this MAC is certainly not the outermost PHC. But this is
   not where the typical issue is.  Transmission proceeds with this packet.

2. "TX confirmation": Transmission has finished. The driver checks whether it
   is necessary to collect any TX timestamp for it. Here is where the typical
   issues are: the MAC driver takes a shortcut and only checks whether
   "``skb_shinfo(skb)->tx_flags & SKBTX_IN_PROGRESS``" was set. With a stacked
   PHC system, this is incorrect because this MAC driver is not the only entity
   in the TX data path that could have enabled SKBTX_IN_PROGRESS in the first
   place.

The correct solution for this problem is for MAC drivers to have a compound
check in their "TX confirmation" portion, not only for
"``skb_shinfo(skb)->tx_flags & SKBTX_IN_PROGRESS``", but also for
"``priv->hwtstamp_tx_enabled == true``". Because the rest of the system ensures
that PTP timestamping is not enabled for anything other than the outermost PHC,
this enhanced check will avoid delivering a duplicated TX timestamp to user
space.
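
The difference between the shortcut and the compound check can be modelled
in user space. The sketch below is not driver code: struct model_skb,
struct model_priv, the model_*() functions and the numeric flag values are
stand-ins invented for illustration, mirroring only the names used in the
text above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for the kernel's tx_flags bits; the values are illustrative. */
#define SKBTX_HW_TSTAMP   (1u << 0)
#define SKBTX_IN_PROGRESS (1u << 2)

/* Hypothetical, pared-down models of skb_shinfo(skb) and driver state. */
struct model_skb  { uint8_t tx_flags; };
struct model_priv { bool hwtstamp_tx_enabled; };

/* "TX" portion: claim the timestamp only if this MAC was configured to. */
void model_tx(const struct model_priv *priv, struct model_skb *skb)
{
    if (priv->hwtstamp_tx_enabled && (skb->tx_flags & SKBTX_HW_TSTAMP))
        skb->tx_flags |= SKBTX_IN_PROGRESS;
}

/* "TX confirmation", shortcut: also fires on a flag set by another PHC. */
bool model_txconf_shortcut(const struct model_priv *priv,
                           const struct model_skb *skb)
{
    (void)priv;
    return (skb->tx_flags & SKBTX_IN_PROGRESS) != 0;
}

/* "TX confirmation", compound check: requires this MAC's own enable bit. */
bool model_txconf_compound(const struct model_priv *priv,
                           const struct model_skb *skb)
{
    return priv->hwtstamp_tx_enabled &&
           (skb->tx_flags & SKBTX_IN_PROGRESS) != 0;
}
```

With timestamping disabled on the MAC and SKBTX_IN_PROGRESS already set by,
say, a DSA switch driver, the shortcut reports that a timestamp should be
collected while the compound check does not, which is the duplicate
delivery the text warns about.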