.. SPDX-License-Identifier: GPL-2.0

=============
DCCP protocol
=============


.. Contents
   - Introduction
   - Missing features
   - Socket options
   - Sysctl variables
   - IOCTLs
   - Other tunables
   - Notes


Introduction
============
Datagram Congestion Control Protocol (DCCP) is an unreliable,
connection-oriented protocol designed to solve issues present in UDP and TCP,
particularly for real-time and multimedia (streaming) traffic.
It divides into a base protocol (RFC 4340) and pluggable congestion control
modules called CCIDs. Like pluggable TCP congestion control, at least one CCID
needs to be enabled in order for the protocol to function properly. In the Linux
implementation, this is the TCP-like CCID2 (RFC 4341). Additional CCIDs, such as
the TCP-friendly CCID3 (RFC 4342), are optional.
For a brief introduction to CCIDs and suggestions for choosing a CCID to match
given applications, see section 10 of RFC 4340.

DCCP is a Proposed Standard (RFC 2026), and the homepage for DCCP as a protocol
is at http://www.ietf.org/html.charters/dccp-charter.html


Missing features
================
The Linux DCCP implementation does not currently support all the features that
are specified in RFCs 4340 to 4342.

The known bugs are listed at:

	http://www.linuxfoundation.org/collaborate/workgroups/networking/todo#DCCP

For more up-to-date versions of the DCCP implementation, please consider using
the experimental DCCP test tree; instructions for checking this out are on:
http://www.linuxfoundation.org/collaborate/workgroups/networking/dccp_testing#Experimental_DCCP_source_tree


Socket options
==============
DCCP_SOCKOPT_QPOLICY_ID sets the dequeuing policy for outgoing packets. It takes
a policy ID as argument and can only be set before the connection is
established (i.e. changes during an established connection are not supported).
Currently, two policies are defined: the "simple" policy (DCCPQ_POLICY_SIMPLE),
which does nothing special, and a priority-based variant (DCCPQ_POLICY_PRIO).
The latter allows passing a u32 priority value as ancillary data to sendmsg(),
where higher numbers indicate a higher packet priority (similar to SO_PRIORITY).
This ancillary data needs to be formatted using a cmsg(3) message header filled
in as follows::

	cmsg->cmsg_level = SOL_DCCP;
	cmsg->cmsg_type	 = DCCP_SCM_PRIORITY;
	cmsg->cmsg_len	 = CMSG_LEN(sizeof(uint32_t));	/* or CMSG_LEN(4) */
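The header fields above can be placed in context with a small sketch that
prepares a complete ``struct msghdr`` for sendmsg(). The helper name
``dccp_prepare_prio_msg`` is hypothetical, and the fallback values for
``SOL_DCCP`` and ``DCCP_SCM_PRIORITY`` are assumptions that mirror the kernel
headers as far as known; prefer the system headers where they are available.

```c
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Assumed fallback values; normally provided by the system headers. */
#ifndef SOL_DCCP
#define SOL_DCCP 269
#endif
#ifndef DCCP_SCM_PRIORITY
#define DCCP_SCM_PRIORITY 1
#endif

/*
 * Hypothetical helper: fill msg so that one datagram carries a priority.
 * ctrl must be suitably aligned for struct cmsghdr and provide at least
 * CMSG_SPACE(sizeof(uint32_t)) bytes. Returns 0 on success, -1 otherwise.
 */
static int dccp_prepare_prio_msg(struct msghdr *msg, struct iovec *iov,
				 void *ctrl, size_t ctrl_len,
				 const void *data, size_t data_len,
				 uint32_t prio)
{
	struct cmsghdr *cmsg;

	if (ctrl_len < CMSG_SPACE(sizeof(prio)))
		return -1;

	memset(msg, 0, sizeof(*msg));
	memset(ctrl, 0, ctrl_len);

	/* Application payload */
	iov->iov_base = (void *)data;
	iov->iov_len = data_len;
	msg->msg_iov = iov;
	msg->msg_iovlen = 1;

	/* Ancillary data: a single u32 priority value */
	msg->msg_control = ctrl;
	msg->msg_controllen = CMSG_SPACE(sizeof(prio));

	cmsg = CMSG_FIRSTHDR(msg);
	cmsg->cmsg_level = SOL_DCCP;
	cmsg->cmsg_type = DCCP_SCM_PRIORITY;
	cmsg->cmsg_len = CMSG_LEN(sizeof(prio));
	memcpy(CMSG_DATA(cmsg), &prio, sizeof(prio));
	return 0;
}
```

The prepared ``msg`` would then be passed to sendmsg() on a socket whose policy
was set to DCCPQ_POLICY_PRIO before connecting. A union of a char array and a
``struct cmsghdr`` is one common way to obtain an aligned control buffer.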

DCCP_SOCKOPT_QPOLICY_TXQLEN sets the maximum length of the output queue. A zero
value is always interpreted as an unbounded queue length. If different from
zero, the interpretation of this parameter depends on the current dequeuing
policy (see above): the "simple" policy will enforce a fixed queue size by
returning EAGAIN, whereas the "prio" policy enforces a fixed queue length by
dropping the lowest-priority packet first. The default value for this parameter
is initialised from /proc/sys/net/dccp/default/tx_qlen.

DCCP_SOCKOPT_SERVICE sets the service code. The specification mandates use of
service codes (RFC 4340, sec. 8.1.2); if this socket option is not set,
the socket will fall back to 0 (which means that no meaningful service code
is present). On active sockets this is set before connect(); specifying more
than one code has no effect (all subsequent service codes are ignored). The
case is different for passive sockets, where multiple service codes (up to 32)
can be set before calling bind().

DCCP_SOCKOPT_GET_CUR_MPS is read-only and retrieves the current maximum packet
size (application payload size) in bytes, see RFC 4340, section 14.

DCCP_SOCKOPT_AVAILABLE_CCIDS is also read-only and returns the list of CCIDs
supported by the endpoint. The option value is an array of type uint8_t whose
size is passed as option length. The minimum array size is 4 elements; the
value returned in the optlen argument always reflects the true number of
built-in CCIDs.

DCCP_SOCKOPT_CCID is write-only and sets both the TX and RX CCIDs at the same
time, combining the operation of the next two socket options. This option is
preferable over the latter two, since applications will often use the same
type of CCID for both directions, and mixed use of CCIDs is not currently well
understood. This socket option takes as argument at least one uint8_t value, or
an array of uint8_t values, which must match available CCIDs (see above). CCIDs
must be registered on the socket before calling connect() or listen().

DCCP_SOCKOPT_TX_CCID is read/write. It returns the current CCID (if set) or sets
the preference list for the TX CCID, using the same format as DCCP_SOCKOPT_CCID.
Please note that the getsockopt argument type here is ``int``, not uint8_t.

DCCP_SOCKOPT_RX_CCID is analogous to DCCP_SOCKOPT_TX_CCID, but for the RX CCID.
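The CCID options described above might be combined as in the following sketch,
which checks the list of available CCIDs before requesting one for both
directions. The helper name ``dccp_use_ccid`` is hypothetical, and the numeric
fallbacks are assumptions mirroring <linux/dccp.h>.

```c
#include <stdint.h>
#include <sys/socket.h>

/* Assumed fallback values; prefer <linux/dccp.h> where it is available. */
#ifndef SOL_DCCP
#define SOL_DCCP 269
#endif
#ifndef DCCP_SOCKOPT_AVAILABLE_CCIDS
#define DCCP_SOCKOPT_AVAILABLE_CCIDS 12
#endif
#ifndef DCCP_SOCKOPT_CCID
#define DCCP_SOCKOPT_CCID 13
#endif

/*
 * Hypothetical helper: use CCID `want` for TX and RX if the endpoint
 * lists it as available. Call before connect() or listen().
 * Returns 0 on success, -1 on error or if the CCID is not available.
 */
static int dccp_use_ccid(int fd, uint8_t want)
{
	uint8_t avail[16];		/* at least 4 elements are required */
	socklen_t len = sizeof(avail);
	socklen_t i;

	if (getsockopt(fd, SOL_DCCP, DCCP_SOCKOPT_AVAILABLE_CCIDS,
		       avail, &len) < 0)
		return -1;

	/* On return, len holds the number of built-in CCIDs. */
	for (i = 0; i < len && i < sizeof(avail); i++) {
		if (avail[i] == want)
			return setsockopt(fd, SOL_DCCP, DCCP_SOCKOPT_CCID,
					  &want, sizeof(want));
	}
	return -1;
}
```

For example, ``dccp_use_ccid(fd, 3)`` would request CCID3 in both directions
where it is built in; passing an array instead of a single value would express
a preference list.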

DCCP_SOCKOPT_SERVER_TIMEWAIT enables the server (listening socket) to hold
timewait state when closing the connection (RFC 4340, 8.3). The usual case is
that the closing server sends a CloseReq, whereupon the client holds timewait
state. When this boolean socket option is on, the server sends a Close instead
and will enter TIMEWAIT. This option must be set after accept() returns.

DCCP_SOCKOPT_SEND_CSCOV and DCCP_SOCKOPT_RECV_CSCOV are used for setting the
partial checksum coverage (RFC 4340, sec. 9.2). The default is that checksums
always cover the entire packet and that only fully covered application data is
accepted by the receiver. Hence, when using this feature on the sender, it must
be enabled at the receiver too, with a suitable choice of CsCov.

DCCP_SOCKOPT_SEND_CSCOV sets the sender checksum coverage. Values in the
	range 0..15 are acceptable. The default setting is 0 (full coverage);
	values in the range 1..15 indicate partial coverage.

DCCP_SOCKOPT_RECV_CSCOV is for the receiver and has a different meaning: it
	sets a threshold, where again values 0..15 are acceptable. The default
	of 0 means that all packets with a partial coverage will be discarded.
	Values in the range 1..15 indicate that packets with at least that
	coverage value are also acceptable. The higher the number, the more
	restrictive this setting (see [RFC 4340, sec. 9.2.1]). Partial coverage
	settings are inherited by the child socket after accept().
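The meaning of the two coverage values can be illustrated with a small sketch.
It assumes the RFC 4340, sec. 9.2 rule that a CsCov of n in 1..15 covers the
header plus the first (n - 1) * 4 bytes of application data, and models the
receiver threshold as described above. Both function names are hypothetical and
not part of any kernel or library API.

```c
#include <stddef.h>

/*
 * Number of application-data bytes protected by the checksum for a given
 * sender CsCov. Returns -1 for values that are out of range or that claim
 * to cover more data than the packet carries.
 */
static long cscov_payload_covered(unsigned int cscov, size_t payload_len)
{
	size_t covered;

	if (cscov > 15)
		return -1;
	if (cscov == 0)
		return (long)payload_len;	/* full coverage */

	covered = (size_t)(cscov - 1) * 4;
	if (covered > payload_len)
		return -1;
	return (long)covered;
}

/*
 * Receiver-side check: would a packet with coverage pkt_cscov be accepted
 * under the threshold min_cscov (the DCCP_SOCKOPT_RECV_CSCOV setting)?
 * Returns 1 if acceptable, 0 if the packet would be discarded.
 */
static int cscov_accepted(unsigned int min_cscov, unsigned int pkt_cscov)
{
	if (pkt_cscov == 0)
		return 1;	/* fully covered packets are always fine */
	if (min_cscov == 0)
		return 0;	/* default: no partial coverage accepted */
	return pkt_cscov >= min_cscov;
}
```

With a sender CsCov of 3, for instance, only the first 8 bytes of payload are
protected, and a receiver threshold of 4 or more would discard such packets.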

The following two options apply to CCID 3 exclusively and are getsockopt()-only.
In either case, a TFRC info struct (defined in <linux/tfrc.h>) is returned.

DCCP_SOCKOPT_CCID_RX_INFO
	Returns a ``struct tfrc_rx_info`` in optval; the optval buffer must be
	at least sizeof(struct tfrc_rx_info) bytes, with optlen set accordingly.

DCCP_SOCKOPT_CCID_TX_INFO
	Returns a ``struct tfrc_tx_info`` in optval; the optval buffer must be
	at least sizeof(struct tfrc_tx_info) bytes, with optlen set accordingly.

On unidirectional connections it is useful to close the unused half-connection
via shutdown (SHUT_WR or SHUT_RD): this will reduce per-packet processing costs.


Sysctl variables
================
Several DCCP default parameters can be managed by the following sysctls
(sysctl net.dccp.default or /proc/sys/net/dccp/default):

request_retries
	The number of active connection initiation retries (the number of
	Requests minus one) before timing out. It also governs the behaviour
	of the passive side: it sets the number of times DCCP repeats sending
	a Response when the initial handshake does not progress from RESPOND
	to OPEN (i.e. when no Ack is received after the initial Request).
	This value should be greater than 0; a value below 10 is suggested.
	Analogue of tcp_syn_retries.

retries1
	How often a DCCP Response is retransmitted until the listening DCCP
	side considers its connecting peer dead. Analogue of tcp_retries1.

retries2
	The number of times a general DCCP packet is retransmitted. This has
	importance for retransmitted acknowledgments and feature negotiation;
	data packets are never retransmitted. Analogue of tcp_retries2.

tx_ccid = 2
	Default CCID for the sender-receiver half-connection. Depending on the
	choice of CCID, the Send Ack Vector feature is enabled automatically.

rx_ccid = 2
	Default CCID for the receiver-sender half-connection; see tx_ccid.

seq_window = 100
	The initial sequence window (sec. 7.5.2) of the sender. This influences
	the local ackno validity and the remote seqno validity windows (7.5.1).
	Values in the range Wmin = 32 (RFC 4340, 7.5.2) up to 2^32-1 can be set.

tx_qlen = 5
	The size of the transmit buffer in packets. A value of 0 corresponds
	to an unbounded transmit buffer.

sync_ratelimit = 125 ms
	The timeout between subsequent DCCP-Sync packets sent in response to
	sequence-invalid packets on the same socket (RFC 4340, 7.5.4). The unit
	of this parameter is milliseconds; a value of 0 disables rate-limiting.
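An application that wants to inspect one of these defaults could read the
corresponding file under /proc, as in the following sketch. The helper name
``read_sysctl_long`` is hypothetical, and the files are assumed to hold a single
decimal integer, as is usual for integer sysctls.

```c
#include <stdio.h>

/*
 * Hypothetical helper: read one integer sysctl value from its /proc file,
 * e.g. "/proc/sys/net/dccp/default/tx_qlen".
 * Returns 0 and stores the value on success, -1 on any failure.
 */
static int read_sysctl_long(const char *path, long *value)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (f == NULL)
		return -1;	/* e.g. DCCP support not present */
	ok = (fscanf(f, "%ld", value) == 1);
	fclose(f);
	return ok ? 0 : -1;
}
```

A failure to open the file is treated as an ordinary error here, since the
net.dccp.default directory only exists when DCCP support is available.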


IOCTLs
======
FIONREAD
	Works as in udp(7): returns in the ``int`` argument pointer the size of
	the next pending datagram in bytes, or 0 when no datagram is pending.

SIOCOUTQ
	Returns the number of unsent data bytes in the socket send queue as ``int``
	into the buffer specified by the argument pointer.
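Both ioctls take a pointer to ``int``, which the following sketch wraps in two
small helpers. The helper names are hypothetical; ``FIONREAD`` and ``SIOCOUTQ``
are taken from <sys/ioctl.h> and <linux/sockios.h> respectively.

```c
#include <sys/ioctl.h>
#include <linux/sockios.h>	/* SIOCOUTQ */

/*
 * Hypothetical helper: size in bytes of the next pending datagram,
 * 0 if none is pending, or -1 on error.
 */
static int pending_datagram_size(int fd)
{
	int n = 0;

	if (ioctl(fd, FIONREAD, &n) < 0)
		return -1;
	return n;
}

/*
 * Hypothetical helper: number of unsent data bytes in the send queue,
 * or -1 on error.
 */
static int unsent_bytes(int fd)
{
	int n = 0;

	if (ioctl(fd, SIOCOUTQ, &n) < 0)
		return -1;
	return n;
}
```

A receiver could call ``pending_datagram_size()`` to size its buffer before
recv(), and a sender could poll ``unsent_bytes()`` to estimate queue backlog.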

Other tunables
==============
Per-route rto_min support
	CCID-2 supports the RTAX_RTO_MIN per-route setting for the minimum value
	of the RTO timer. This setting can be modified via the 'rto_min' option
	of iproute2; for example::

		> ip route change 10.0.0.0/24   rto_min 250j dev wlan0
		> ip route add    10.0.0.254/32 rto_min 800j dev wlan0
		> ip route show dev wlan0

	CCID-3 also supports the rto_min setting: it is used to define the lower
	bound for the expiry of the nofeedback timer. This can be useful on LANs
	with very low RTTs (e.g., loopback, Gbit ethernet).


Notes
=====
DCCP does not currently traverse NAT successfully on many boxes. This is
because the checksum covers the pseudo-header, as in TCP and UDP. Linux NAT
support for DCCP has been added.