.. SPDX-License-Identifier: GPL-2.0

====
XFRM
====

The sync patches work is based on initial patches from
Krisztian <hidden@balabit.hu> and others and additional patches
from Jamal <hadi@cyberus.ca>.

The end goal for syncing is to be able to insert attributes and generate
events so that the SA can be safely moved from one machine to another
for HA purposes.
The idea is to synchronize the SA so that the takeover machine can
process the SA as accurately as possible if it has access to it.

We already have the ability to generate SA add/del/upd events.
These patches add the ability to sync accurate lifetime byte counters (to
ensure proper decay of SAs) and replay counters (to avoid replay attacks)
with as little loss as possible at failover time.
This way a backup stays as closely up-to-date as an active member.

Because the above items change for every packet the SA receives,
it is possible for a lot of events to be generated.
For this reason, we also add a Nagle-like algorithm to restrict
the events, i.e. we set thresholds to say "let me
know if the replay sequence threshold is reached or 10 secs have passed".
These thresholds are set system-wide via sysctls or can be updated
per SA.
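
The Nagle-like restriction described above can be pictured as a small
decision function. The following is a minimal sketch under stated
assumptions, not the kernel's implementation: ``struct ae_limiter``, its
fields and ``ae_should_notify()`` are hypothetical names chosen for
illustration, and time is passed in as plain milliseconds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-SA rate-limit state; names are illustrative only. */
struct ae_limiter {
	uint32_t replay_thresh; /* notify after this many sequence steps */
	uint64_t timer_ms;      /* ... or after this much time has passed */
	uint32_t last_seq;      /* replay sequence at the last notification */
	uint64_t last_ms;       /* timestamp of the last notification */
};

/*
 * Return true if an event should be sent now: either the replay sequence
 * advanced by at least replay_thresh since the last event, or at least
 * timer_ms elapsed. On true, the state is reset so that the next window
 * starts at this event.
 */
static bool ae_should_notify(struct ae_limiter *l, uint32_t seq,
			     uint64_t now_ms)
{
	assert(l != NULL);

	bool seq_hit = (uint32_t)(seq - l->last_seq) >= l->replay_thresh;
	bool time_hit = (now_ms - l->last_ms) >= l->timer_ms;

	if (!seq_hit && !time_hit)
		return false;

	l->last_seq = seq;
	l->last_ms = now_ms;
	return true;
}
```

With ``replay_thresh = 2`` and ``timer_ms = 10000``, a caller would see an
event on every second packet under steady traffic, and one event per ten
seconds when the sequence advances more slowly than that.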

The identified items that need to be synchronized are:

- the lifetime byte counter.
  Note that the lifetime time limit is not important if you assume the failover
  machine is known ahead of time, since the decay of the time countdown
  is not driven by packet arrival.
- the replay sequence for both inbound and outbound

1) Message Structure
----------------------

nlmsghdr:aevent_id:optional-TLVs.

The netlink message types are:

XFRM_MSG_NEWAE and XFRM_MSG_GETAE.

A XFRM_MSG_GETAE does not have TLVs.

A XFRM_MSG_NEWAE will have at least two TLVs (as is
discussed further below).

aevent_id structure looks like::

   struct xfrm_aevent_id {
	     struct xfrm_usersa_id           sa_id;
	     xfrm_address_t                  saddr;
	     __u32                           flags;
	     __u32                           reqid;
   };

The unique SA is identified by the combination of xfrm_usersa_id,
reqid and saddr.

flags are used to indicate different things.
The possible
flags are::

	XFRM_AE_RTHR=1, /* replay threshold */
	XFRM_AE_RVAL=2, /* replay value */
	XFRM_AE_LVAL=4, /* lifetime value */
	XFRM_AE_ETHR=8, /* expiry timer threshold */
	XFRM_AE_CR=16, /* Event cause is replay update */
	XFRM_AE_CE=32, /* Event cause is timer expiry */
	XFRM_AE_CU=64, /* Event cause is policy update */

How these flags are used depends on the direction of the
message (kernel<->user) as well as the cause (config, query or event).
This is described below in the different messages.

The pid will be set appropriately in netlink to recognize direction
(0 to the kernel and pid = processid that created the event
when going from kernel to user space).

A program needs to subscribe to multicast group XFRMNLGRP_AEVENTS
to get notified of these events.

2) TLVS reflect the different parameters:
-----------------------------------------

a) byte value (XFRMA_LTIME_VAL)

This TLV carries the running/current counter for byte lifetime since
last event.

b) replay value (XFRMA_REPLAY_VAL)

This TLV carries the running/current counter for replay sequence since
last event.

c) replay threshold (XFRMA_REPLAY_THRESH)

This TLV carries the replay sequence threshold; the kernel triggers an
event when this threshold is exceeded.

d) expiry timer (XFRMA_ETIMER_THRESH)

This is a timer value in milliseconds which is used as the Nagle
value to rate limit the events.

3) Default configurations for the parameters:
---------------------------------------------

By default these events should be turned off unless there is
at least one listener registered to listen to the multicast
group XFRMNLGRP_AEVENTS.

Programs installing SAs will need to specify the two thresholds. However,
so that existing applications such as racoon do not have to change,
we also provide default threshold values for these parameters
in case they are not specified.

The two sysctl/proc entries are:

a) /proc/sys/net/core/sysctl_xfrm_aevent_etime
   used to provide default values for the XFRMA_ETIMER_THRESH in incremental
   units of time of 100ms. The default is 10 (1 second).

b) /proc/sys/net/core/sysctl_xfrm_aevent_rseqth
   used to provide default values for XFRMA_REPLAY_THRESH parameter
   in incremental packet count. The default is two packets.
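
To make the units of the two defaults concrete, here is a small sketch in
C. It is illustrative only: the macro and function names are hypothetical,
and the convention that a per-SA value of 0 means "not specified" is an
assumption made for this example, not something stated above.

```c
#include <assert.h>
#include <stdint.h>

/* Defaults quoted in the text above. */
#define AE_DEFAULT_ETIME_UNITS 10u /* units of 100 ms, i.e. 1 second */
#define AE_DEFAULT_RSEQTH       2u /* packets */

/* Convert a timer value expressed in 100 ms units to milliseconds. */
static uint32_t ae_etime_units_to_ms(uint32_t units)
{
	assert(units <= UINT32_MAX / 100u); /* guard against overflow */
	return units * 100u;
}

/*
 * Pick the value to use for an SA.
 * ASSUMPTION: 0 stands for "not specified by the installing program",
 * in which case the system-wide default applies.
 */
static uint32_t ae_pick(uint32_t per_sa, uint32_t system_default)
{
	return per_sa != 0 ? per_sa : system_default;
}
```

For instance, ``ae_etime_units_to_ms(ae_pick(0, AE_DEFAULT_ETIME_UNITS))``
evaluates to 1000, matching the "10 (1 second)" default given for the
timer.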

4) Message types
----------------

a) XFRM_MSG_GETAE issued by user-->kernel.
   XFRM_MSG_GETAE does not carry any TLVs.

The response is a XFRM_MSG_NEWAE which is formatted based on what
XFRM_MSG_GETAE queried for.

The response will always have XFRMA_LTIME_VAL and XFRMA_REPLAY_VAL TLVs.

* if XFRM_AE_RTHR flag is set, then XFRMA_REPLAY_THRESH is also retrieved
* if XFRM_AE_ETHR flag is set, then XFRMA_ETIMER_THRESH is also retrieved

b) XFRM_MSG_NEWAE is issued by either user space to configure
   or kernel to announce events or respond to a XFRM_MSG_GETAE.

i) user --> kernel to configure a specific SA.

Any of the values or threshold parameters can be updated by passing the
appropriate TLV.

A response is issued back to the sender in user space to indicate success
or failure.

In the case of success, additionally an event with
XFRM_MSG_NEWAE is also issued to any listeners as described in iii).

ii) kernel->user direction as a response to XFRM_MSG_GETAE

The response will always have XFRMA_LTIME_VAL and XFRMA_REPLAY_VAL TLVs.

The threshold TLVs will be included if explicitly requested in
the XFRM_MSG_GETAE message.

iii) kernel->user to report as event if someone sets any values or
     thresholds for an SA using XFRM_MSG_NEWAE (as described in #i above).
     In such a case XFRM_AE_CU flag is set to inform the user that
     the change happened as a result of an update.
     The message will always have XFRMA_LTIME_VAL and XFRMA_REPLAY_VAL TLVs.

iv) kernel->user to report event when replay threshold or a timeout
    is exceeded.

In such a case either XFRM_AE_CR (replay exceeded) or XFRM_AE_CE (timeout
happened) is set to inform the user what happened.
Note the two flags are mutually exclusive.
The message will always have XFRMA_LTIME_VAL and XFRMA_REPLAY_VAL TLVs.

Exceptions to threshold settings
--------------------------------

If you have an SA that is getting hit by traffic in bursts such that
there is a period where the timer threshold expires with no packets
seen, then an odd behavior is seen as follows:
The first packet arrival after a timer expiry will trigger a timeout
event; i.e. we don't wait for a timeout period or a packet threshold
to be reached. This is done for simplicity and efficiency reasons.

-JHS
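
As an illustration of the cause flags used in the XFRM_MSG_NEWAE events of
section 4, a listener might classify an incoming message as sketched
below. This is a hedged example, not kernel code: the flag values are
copied from the list in section 1 (a real program would normally take them
from ``<linux/xfrm.h>``), while ``enum ae_cause`` and ``ae_event_cause()``
are hypothetical names. Treating any combination of two cause flags as
invalid is an assumption; the text only states that XFRM_AE_CR and
XFRM_AE_CE are mutually exclusive.

```c
#include <assert.h>
#include <stdint.h>

/* Flag values as listed in section 1; redefined here for portability. */
enum {
	XFRM_AE_RTHR = 1,  /* replay threshold */
	XFRM_AE_RVAL = 2,  /* replay value */
	XFRM_AE_LVAL = 4,  /* lifetime value */
	XFRM_AE_ETHR = 8,  /* expiry timer threshold */
	XFRM_AE_CR   = 16, /* event cause is replay update */
	XFRM_AE_CE   = 32, /* event cause is timer expiry */
	XFRM_AE_CU   = 64, /* event cause is policy update */
};

/* Hypothetical classification of a message by its cause flags. */
enum ae_cause {
	AE_CAUSE_NONE,    /* no cause flag, e.g. a reply to XFRM_MSG_GETAE */
	AE_CAUSE_REPLAY,  /* replay threshold exceeded */
	AE_CAUSE_TIMEOUT, /* timer expired */
	AE_CAUSE_UPDATE,  /* someone updated values or thresholds */
	AE_CAUSE_INVALID, /* more than one cause flag set (assumption) */
};

static enum ae_cause ae_event_cause(uint32_t flags)
{
	switch (flags & (XFRM_AE_CR | XFRM_AE_CE | XFRM_AE_CU)) {
	case 0:
		return AE_CAUSE_NONE;
	case XFRM_AE_CR:
		return AE_CAUSE_REPLAY;
	case XFRM_AE_CE:
		return AE_CAUSE_TIMEOUT;
	case XFRM_AE_CU:
		return AE_CAUSE_UPDATE;
	default:
		return AE_CAUSE_INVALID;
	}
}

/* Small usage example: non-cause flags do not affect the result. */
static void ae_cause_example(void)
{
	uint32_t flags = XFRM_AE_CR | XFRM_AE_RVAL | XFRM_AE_LVAL;

	assert(ae_event_cause(flags) == AE_CAUSE_REPLAY);
}
```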