.. SPDX-License-Identifier: GPL-2.0

====
XFRM
====

The sync patches work is based on initial patches from
Krisztian <hidden@balabit.hu> and others and additional patches
from Jamal <hadi@cyberus.ca>.

The end goal for syncing is to be able to insert attributes + generate
events so that the SA can be safely moved from one machine to another
for HA purposes.
The idea is to synchronize the SA so that the takeover machine can do
the processing of the SA as accurately as possible if it has access to it.

We already have the ability to generate SA add/del/upd events.
These patches add the ability to sync and have accurate lifetime byte (to
ensure proper decay of SAs) and replay counters to avoid replay attacks
with minimal loss at failover time.
This way a backup stays as closely up-to-date as an active member.

Because the above items change for every packet the SA receives,
it is possible for a lot of events to be generated.
For this reason, we also add a nagle-like algorithm to restrict
the events, i.e. we are going to set thresholds to say "let me
know if the replay sequence threshold is reached or 10 secs have passed".
These thresholds are set system-wide via sysctls or can be updated
per SA.
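
The throttling idea above can be sketched as a small decision function.
This is only an illustration of "threshold reached or time elapsed", not
the kernel's implementation; ``struct aevent_limiter`` and ``aevent_due()``
are hypothetical names, and the millisecond clock is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-SA bookkeeping for this sketch; not a kernel structure. */
struct aevent_limiter {
    uint32_t last_seq;      /* replay sequence reported in the last event */
    uint64_t last_event_ms; /* time of the last event, in milliseconds */
    uint32_t seq_thresh;    /* replay sequence threshold, in packets */
    uint64_t max_age_ms;    /* timer threshold, in milliseconds */
};

/* Return true when an event should be emitted for the current state. */
static bool aevent_due(const struct aevent_limiter *l,
                       uint32_t seq, uint64_t now_ms)
{
    assert(l != NULL);

    /* Enough packets have been seen since the last event. */
    if (seq - l->last_seq >= l->seq_thresh)
        return true;

    /* The timer elapsed and there is unreported traffic. */
    if (seq != l->last_seq && now_ms - l->last_event_ms >= l->max_age_ms)
        return true;

    return false;
}
```

With a threshold of 2 packets and a 10 second timer, one new packet alone
does not produce an event, but two packets do, and so does one packet once
10 seconds have passed.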

The identified items that need to be synchronized are:

- the lifetime byte counter.
  Note that the lifetime time limit is not important if you assume the
  failover machine is known ahead of time, since the decay of the time
  countdown is not driven by packet arrival.
- the replay sequence for both inbound and outbound

1) Message Structure
--------------------

nlmsghdr:aevent_id:optional-TLVs.

The netlink message types are:

XFRM_MSG_NEWAE and XFRM_MSG_GETAE.

An XFRM_MSG_GETAE does not have TLVs.

An XFRM_MSG_NEWAE will have at least two TLVs (as is
discussed further below).

The aevent_id structure looks like::

   struct xfrm_aevent_id {
        struct xfrm_usersa_id           sa_id;
        xfrm_address_t                  saddr;
        __u32                           flags;
        __u32                           reqid;
   };

The unique SA is identified by the combination of xfrm_usersa_id,
reqid and saddr.

The flags field selects which values and thresholds a message refers to
and reports the cause of an event. The possible flags are::

        XFRM_AE_RTHR=1, /* replay threshold */
        XFRM_AE_RVAL=2, /* replay value */
        XFRM_AE_LVAL=4, /* lifetime value */
        XFRM_AE_ETHR=8, /* expiry timer threshold */
        XFRM_AE_CR=16, /* Event cause is replay update */
        XFRM_AE_CE=32, /* Event cause is timer expiry */
        XFRM_AE_CU=64, /* Event cause is policy update */

How these flags are used depends on the direction of the
message (kernel<->user) as well as the cause (config, query or event).
This is described below in the different messages.
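
As a rough illustration of how a listener might split the flags word into
its "cause" part and its "which values" part, consider the sketch below.
It assumes the ``XFRM_AE_*`` constants and ``struct xfrm_aevent_id`` come
from ``<linux/xfrm.h>``; the helper names ``aevent_cause()`` and
``aevent_tlv_bits()`` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <linux/xfrm.h>   /* XFRM_AE_* flags, struct xfrm_aevent_id */

/* Describe why an event was generated, based on the cause bits. */
static const char *aevent_cause(const struct xfrm_aevent_id *id)
{
    assert(id != NULL);

    if (id->flags & XFRM_AE_CR)
        return "replay update";
    if (id->flags & XFRM_AE_CE)
        return "timer expiry";
    if (id->flags & XFRM_AE_CU)
        return "policy update";
    return "none";
}

/* Keep only the bits that name values and thresholds. */
static unsigned int aevent_tlv_bits(const struct xfrm_aevent_id *id)
{
    assert(id != NULL);
    return id->flags & (XFRM_AE_RTHR | XFRM_AE_RVAL |
                        XFRM_AE_LVAL | XFRM_AE_ETHR);
}
```

A message with no cause bit set (for example a query or its response) is
reported as "none" by this helper.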

The pid will be set appropriately in netlink to recognize direction
(0 to the kernel and pid = processid that created the event
when going from kernel to user space).

A program needs to subscribe to multicast group XFRMNLGRP_AEVENTS
to get notified of these events.
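
A minimal sketch of such a subscription from user space follows. It
assumes a raw ``NETLINK_XFRM`` socket joined to the group with
``NETLINK_ADD_MEMBERSHIP``; the function name ``xfrm_aevent_subscribe()``
is hypothetical, and joining the group may require elevated privileges
on a given system.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>
#include <linux/netlink.h>  /* NETLINK_XFRM, NETLINK_ADD_MEMBERSHIP */
#include <linux/xfrm.h>     /* XFRMNLGRP_AEVENTS */

#ifndef SOL_NETLINK
#define SOL_NETLINK 270     /* fallback for older C library headers */
#endif

/* Open a NETLINK_XFRM socket and join XFRMNLGRP_AEVENTS.
 * Returns the socket descriptor, or -1 with errno set on failure. */
static int xfrm_aevent_subscribe(void)
{
    struct sockaddr_nl addr = { .nl_family = AF_NETLINK };
    int group = XFRMNLGRP_AEVENTS;
    int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_XFRM);

    if (fd < 0)
        return -1;

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        setsockopt(fd, SOL_NETLINK, NETLINK_ADD_MEMBERSHIP,
                   &group, sizeof(group)) < 0) {
        int saved = errno;  /* close() may overwrite errno */

        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}
```

The returned descriptor can then be read with ``recv()``; each datagram
holds one or more netlink messages of type XFRM_MSG_NEWAE.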

2) TLVs reflect the different parameters:
-----------------------------------------

a) byte value (XFRMA_LTIME_VAL)

This TLV carries the running/current counter for byte lifetime since
last event.

b) replay value (XFRMA_REPLAY_VAL)

This TLV carries the running/current counter for replay sequence since
last event.

c) replay threshold (XFRMA_REPLAY_THRESH)

This TLV carries the threshold being used by the kernel to trigger events
when the replay sequence threshold is exceeded.

d) expiry timer (XFRMA_ETIMER_THRESH)

This is a timer value in milliseconds which is used as the nagle
value to rate limit the events.
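
The sketch below shows one way a listener could walk these TLVs in a
received XFRM_MSG_NEWAE. The text above does not spell out the payload
layouts, so the code assumes that XFRMA_LTIME_VAL carries a
``struct xfrm_lifetime_cur``, XFRMA_REPLAY_VAL a
``struct xfrm_replay_state``, and both thresholds a 32-bit value; verify
this against ``<linux/xfrm.h>`` before relying on it. ``struct aevent_tlvs``,
``copy_tlv()`` and ``aevent_parse_tlvs()`` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>   /* struct rtattr and the RTA_* helpers */
#include <linux/xfrm.h>

/* Hypothetical container for this sketch: one slot per TLV plus a
 * "present" marker, since only some TLVs appear in a given message. */
struct aevent_tlvs {
    struct xfrm_lifetime_cur ltime;
    struct xfrm_replay_state replay;
    __u32 replay_thresh;
    __u32 etimer_thresh;
    int has_ltime, has_replay, has_replay_thresh, has_etimer_thresh;
};

/* Copy a TLV payload into dst if it is at least size bytes long. */
static int copy_tlv(void *dst, size_t size, const struct rtattr *rta)
{
    if ((size_t)RTA_PAYLOAD(rta) < size)
        return 0;
    memcpy(dst, RTA_DATA(rta), size);
    return 1;
}

/* Walk the TLVs that follow struct xfrm_aevent_id in a NEWAE message. */
static void aevent_parse_tlvs(const struct nlmsghdr *nlh,
                              struct aevent_tlvs *out)
{
    struct rtattr *rta;
    int len;

    assert(nlh != NULL && out != NULL);
    memset(out, 0, sizeof(*out));

    /* TLVs start after the aligned xfrm_aevent_id behind the header. */
    len = (int)nlh->nlmsg_len -
          (int)NLMSG_SPACE(sizeof(struct xfrm_aevent_id));
    rta = (struct rtattr *)((char *)NLMSG_DATA(nlh) +
                            NLMSG_ALIGN(sizeof(struct xfrm_aevent_id)));

    for (; RTA_OK(rta, len); rta = RTA_NEXT(rta, len)) {
        switch (rta->rta_type) {
        case XFRMA_LTIME_VAL:
            out->has_ltime =
                copy_tlv(&out->ltime, sizeof(out->ltime), rta);
            break;
        case XFRMA_REPLAY_VAL:
            out->has_replay =
                copy_tlv(&out->replay, sizeof(out->replay), rta);
            break;
        case XFRMA_REPLAY_THRESH:
            out->has_replay_thresh =
                copy_tlv(&out->replay_thresh, sizeof(__u32), rta);
            break;
        case XFRMA_ETIMER_THRESH:
            out->has_etimer_thresh =
                copy_tlv(&out->etimer_thresh, sizeof(__u32), rta);
            break;
        default:
            break;  /* ignore attributes this sketch does not know */
        }
    }
}
```

Unknown attribute types are skipped rather than treated as errors, so the
parser keeps working if a message carries additional TLVs.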

3) Default configurations for the parameters:
---------------------------------------------

By default these events should be turned off unless there is
at least one listener registered to listen to the multicast
group XFRMNLGRP_AEVENTS.

Programs installing SAs will need to specify the two thresholds. However,
in order to not change existing applications such as racoon,
we also provide default threshold values for these different parameters
in case they are not specified.

The two sysctls/proc entries are:

a) /proc/sys/net/core/xfrm_aevent_etime
   used to provide default values for the XFRMA_ETIMER_THRESH in incremental
   units of time of 100ms. The default is 10 (1 second).

b) /proc/sys/net/core/xfrm_aevent_rseqth
   used to provide default values for XFRMA_REPLAY_THRESH parameter
   in incremental packet count. The default is two packets.
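
A small sketch of reading these defaults from user space and converting
the timer value follows. The helper names ``read_sysctl_uint()`` and
``aevent_etime_to_ms()`` are hypothetical; the path is passed in by the
caller, and the fallback is whatever default the caller wants to assume
when the entry cannot be read.

```c
#include <assert.h>
#include <stdio.h>

/* Read an unsigned integer from a sysctl file.
 * Returns fallback if the file is missing or cannot be parsed. */
static unsigned int read_sysctl_uint(const char *path, unsigned int fallback)
{
    unsigned int value;
    FILE *f;

    assert(path != NULL);

    f = fopen(path, "r");
    if (!f)
        return fallback;
    if (fscanf(f, "%u", &value) != 1)
        value = fallback;
    fclose(f);
    return value;
}

/* The etime entry counts in units of 100 ms; convert to milliseconds. */
static unsigned int aevent_etime_to_ms(unsigned int units)
{
    return units * 100u;
}
```

For example, passing the path of the etime entry with a fallback of 10 and
feeding the result to ``aevent_etime_to_ms()`` yields 1000 ms when the
documented default is in effect.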

4) Message types
----------------

a) XFRM_MSG_GETAE issued by user-->kernel.
   XFRM_MSG_GETAE does not carry any TLVs.

The response is an XFRM_MSG_NEWAE which is formatted based on what
XFRM_MSG_GETAE queried for.

The response will always have XFRMA_LTIME_VAL and XFRMA_REPLAY_VAL TLVs.

* if XFRM_AE_RTHR flag is set, then XFRMA_REPLAY_THRESH is also retrieved
* if XFRM_AE_ETHR flag is set, then XFRMA_ETIMER_THRESH is also retrieved
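
As an illustration, a query could be assembled in a caller-supplied buffer
as sketched below: a netlink header followed by ``struct xfrm_aevent_id``
and no TLVs. The function name ``build_getae()`` and its parameters are
hypothetical; sending the buffer over a NETLINK_XFRM socket is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <linux/netlink.h>
#include <linux/xfrm.h>

/* Fill buf with an XFRM_MSG_GETAE request for the SA described by id.
 * ask_flags may contain XFRM_AE_RTHR and/or XFRM_AE_ETHR to request the
 * threshold TLVs in the reply. Returns the message length, or 0 if buf
 * is too small. buf must be suitably aligned for struct nlmsghdr. */
static size_t build_getae(void *buf, size_t buflen,
                          const struct xfrm_aevent_id *id,
                          __u32 ask_flags, __u32 seq)
{
    struct nlmsghdr *nlh = buf;
    struct xfrm_aevent_id *body;
    size_t len = NLMSG_LENGTH(sizeof(*id));

    assert(buf != NULL && id != NULL);

    if (buflen < NLMSG_ALIGN(len))
        return 0;

    memset(buf, 0, NLMSG_ALIGN(len));
    nlh->nlmsg_len = len;
    nlh->nlmsg_type = XFRM_MSG_GETAE;
    nlh->nlmsg_flags = NLM_F_REQUEST;
    nlh->nlmsg_seq = seq;
    nlh->nlmsg_pid = 0;             /* 0: addressed to the kernel */

    body = NLMSG_DATA(nlh);
    *body = *id;                    /* sa_id, saddr and reqid pick the SA */
    body->flags = ask_flags;        /* which thresholds to include */
    return len;
}
```

The caller fills in ``sa_id``, ``saddr`` and ``reqid`` to select the SA;
only the flags are overridden by the helper.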

b) XFRM_MSG_NEWAE is issued by either user space to configure
   or kernel to announce events or respond to an XFRM_MSG_GETAE.

i) user --> kernel to configure a specific SA.

Any of the values or threshold parameters can be updated by passing the
appropriate TLV.

A response is issued back to the sender in user space to indicate success
or failure.

In the case of success, an event with
XFRM_MSG_NEWAE is also issued to any listeners as described in iii).

ii) kernel->user direction as a response to XFRM_MSG_GETAE

The response will always have XFRMA_LTIME_VAL and XFRMA_REPLAY_VAL TLVs.

The threshold TLVs will be included if explicitly requested in
the XFRM_MSG_GETAE message.

iii) kernel->user to report as event if someone sets any values or
     thresholds for an SA using XFRM_MSG_NEWAE (as described in i) above).
     In such a case the XFRM_AE_CU flag is set to inform the user that
     the change happened as a result of an update.
     The message will always have XFRMA_LTIME_VAL and XFRMA_REPLAY_VAL TLVs.

iv) kernel->user to report event when replay threshold or a timeout
    is exceeded.

In such a case either XFRM_AE_CR (replay exceeded) or XFRM_AE_CE (timeout
happened) is set to inform the user what happened.
Note that the two flags are mutually exclusive.
The message will always have XFRMA_LTIME_VAL and XFRMA_REPLAY_VAL TLVs.

Exceptions to threshold settings
--------------------------------

If you have an SA that is getting hit by traffic in bursts such that
there is a period where the timer threshold expires with no packets
seen, then an odd behavior is seen as follows:
The first packet arrival after a timer expiry will trigger a timeout
event; i.e. we don't wait for a timeout period or a packet threshold
to be reached. This is done for simplicity and efficiency reasons.
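
The behavior can be pictured with a toy state machine. This is a
simplified model of the description above, not the kernel's code; the
type and function names (``struct sa_model``, ``on_timer()``,
``on_packet()``) and the "deferred" marker are assumptions made for the
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum aevent_cause { AE_NONE, AE_REPLAY, AE_TIMEOUT };  /* hypothetical */

struct sa_model {
    uint32_t seq;          /* current replay sequence */
    uint32_t reported_seq; /* sequence at the time of the last event */
    uint32_t seq_thresh;   /* replay threshold, in packets */
    bool deferred;         /* timer expired while nothing was pending */
};

/* Timer expiry: report pending traffic, otherwise remember the expiry. */
static enum aevent_cause on_timer(struct sa_model *sa)
{
    assert(sa != NULL);

    if (sa->seq == sa->reported_seq) {
        sa->deferred = true;        /* idle period: nothing to report */
        return AE_NONE;
    }
    sa->reported_seq = sa->seq;
    return AE_TIMEOUT;
}

/* Packet arrival: a remembered expiry fires at once as a timeout event,
 * without waiting for the packet threshold or a new timer period. */
static enum aevent_cause on_packet(struct sa_model *sa)
{
    assert(sa != NULL);

    sa->seq++;
    if (sa->deferred) {
        sa->deferred = false;
        sa->reported_seq = sa->seq;
        return AE_TIMEOUT;
    }
    if (sa->seq - sa->reported_seq >= sa->seq_thresh) {
        sa->reported_seq = sa->seq;
        return AE_REPLAY;
    }
    return AE_NONE;
}
```

In this model, a timer expiry during an idle period produces no event, and
the very next packet yields a timeout event even though the packet
threshold has not been reached.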

-JHS