==============================
General notification mechanism
==============================

The general notification mechanism is built on top of the standard pipe driver
whereby it effectively splices notification messages from the kernel into pipes
opened by userspace.  This can be used in conjunction with::

  * Key/keyring notifications


The notification buffers can be enabled by:

	"General setup"/"General notification queue"
	(CONFIG_WATCH_QUEUE)

This document has the following sections:

.. contents:: :local:


Overview
========

This facility appears as a pipe that is opened in a special mode.  The pipe's
internal ring buffer is used to hold messages that are generated by the kernel.
These messages are then read out by read().  Splice and similar are disabled on
such pipes because, under some circumstances, they need to revert their
additions to the ring, and those additions might by then be interleaved with
notification messages.

The owner of the pipe has to tell the kernel which sources it would like to
watch through that pipe.  Only sources that have been connected to a pipe will
insert messages into it.  Note that a source may be bound to multiple pipes and
insert messages into all of them simultaneously.

Filters may also be placed on a pipe so that certain source types and
subevents can be ignored if they're not of interest.

A message will be discarded if there isn't a slot available in the ring or if
no preallocated message buffer is available.  In both of these cases, read()
will insert a WATCH_META_LOSS_NOTIFICATION message into the output buffer after
the last message currently in the buffer has been read.

Note that when producing a notification, the kernel does not wait for the
consumers to collect it, but rather just continues on.  This means that
notifications can be generated whilst spinlocks are held and also protects the
kernel from being held up indefinitely by a userspace malfunction.


Message Structure
=================

Notification messages begin with a short header::

	struct watch_notification {
		__u32	type:24;
		__u32	subtype:8;
		__u32	info;
	};

"type" indicates the source of the notification record and "subtype" indicates
the type of record from that source (see the Watch Sources section below).  The
type may also be "WATCH_TYPE_META".  This is a special record type generated
internally by the watch queue itself.  There are two subtypes:

  * WATCH_META_REMOVAL_NOTIFICATION
  * WATCH_META_LOSS_NOTIFICATION

The first indicates that an object on which a watch was installed was removed
or destroyed and the second indicates that some messages have been lost.

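As a minimal userspace sketch of telling the two meta subtypes apart, assuming
the definitions come from the ``<linux/watch_queue.h>`` UAPI header, something
like the following could be used.  The helper names ``is_loss_record()`` and
``is_removal_record()`` are made up for illustration and are not part of any
API:

```c
#include <stdbool.h>
#include <linux/watch_queue.h>

/* Hypothetical helper: true if the record reports that messages were lost. */
static bool is_loss_record(const struct watch_notification *n)
{
	return n->type == WATCH_TYPE_META &&
	       n->subtype == WATCH_META_LOSS_NOTIFICATION;
}

/* Hypothetical helper: true if the record reports that a watch went away. */
static bool is_removal_record(const struct watch_notification *n)
{
	return n->type == WATCH_TYPE_META &&
	       n->subtype == WATCH_META_REMOVAL_NOTIFICATION;
}
```

Both checks compare "type" first, since subtype values are only meaningful
within their own type.
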
"info" indicates several things, including:

  * The length of the message in bytes, including the header (mask with
    WATCH_INFO_LENGTH and shift by WATCH_INFO_LENGTH__SHIFT).  This indicates
    the size of the record, which may be between 8 and 127 bytes.

  * The watch ID (mask with WATCH_INFO_ID and shift by WATCH_INFO_ID__SHIFT).
    This indicates the caller's ID of the watch, which may be between 0
    and 255.  Multiple watches may share a queue, and this provides a means to
    distinguish them.

  * A type-specific field (WATCH_INFO_TYPE_INFO).  This is set by the
    notification producer to indicate some meaning specific to the type and
    subtype.

Everything in info apart from the length can be used for filtering.

The header can be followed by supplementary information.  The format of this is
defined by the type and subtype.


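The masking and shifting described for the "info" field can be sketched in
userspace roughly as follows.  This assumes the masks are taken from the
``<linux/watch_queue.h>`` UAPI header; ``struct decoded_info`` and
``decode_info()`` are illustrative names only:

```c
#include <linux/watch_queue.h>

/* Illustrative container for the fields packed into watch_notification::info. */
struct decoded_info {
	unsigned int len;	/* Record length in bytes, header included */
	unsigned int watch_id;	/* ID the caller gave when setting the watch */
	unsigned int type_info;	/* Producer-defined bits, left unshifted */
};

/* Unpack an info word using the documented masks and shifts. */
static struct decoded_info decode_info(__u32 info)
{
	struct decoded_info d;

	d.len = (info & WATCH_INFO_LENGTH) >> WATCH_INFO_LENGTH__SHIFT;
	d.watch_id = (info & WATCH_INFO_ID) >> WATCH_INFO_ID__SHIFT;
	d.type_info = info & WATCH_INFO_TYPE_INFO;
	return d;
}
```

The type-specific bits are returned unshifted here because their layout is up
to the notification producer.

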
Watch List (Notification Source) API
====================================

A "watch list" is a list of watchers that are subscribed to a source of
notifications.  A list may be attached to an object (say a key or a superblock)
or may be global (say for device events).  From a userspace perspective, a
non-global watch list is typically referred to by reference to the object it
belongs to (such as using KEYCTL_WATCH_KEY and giving it a key serial number to
watch that specific key).

To manage a watch list, the following functions are provided:

  * ::

	void init_watch_list(struct watch_list *wlist,
			     void (*release_watch)(struct watch *watch));

    Initialise a watch list.  If ``release_watch`` is not NULL, then this
    indicates a function that should be called when the watch_list object is
    destroyed to discard any references the watch list holds on the watched
    object.

  * ``void remove_watch_list(struct watch_list *wlist);``

    This removes all of the watches subscribed to a watch_list and frees them
    and then destroys the watch_list object itself.


Watch Queue (Notification Output) API
=====================================

A "watch queue" is the buffer allocated by an application that notification
records will be written into.  The workings of this are hidden entirely inside
of the pipe device driver, but it is necessary to gain a reference to it to set
a watch.  These can be managed with:

  * ``struct watch_queue *get_watch_queue(int fd);``

    Since watch queues are indicated to the kernel by the fd of the pipe that
    implements the buffer, userspace must hand that fd through a system call.
    This can be used to look up an opaque pointer to the watch queue from the
    system call.

  * ``void put_watch_queue(struct watch_queue *wqueue);``

    This discards the reference obtained from ``get_watch_queue()``.


Watch Subscription API
======================

A "watch" is a subscription on a watch list, indicating the watch queue, and
thus the buffer, into which notification records should be written.  The watch
queue object may also carry filtering rules for that object, as set by
userspace.  Some parts of the watch struct can be set by the driver::

	struct watch {
		union {
			u32		info_id;	/* ID to be OR'd in to info field */
			...
		};
		void			*private;	/* Private data for the watched object */
		u64			id;		/* Internal identifier */
		...
	};

The ``info_id`` value should be an 8-bit number obtained from userspace and
shifted by WATCH_INFO_ID__SHIFT.  This is OR'd into the WATCH_INFO_ID field of
struct watch_notification::info when and if the notification is written into
the associated watch queue buffer.

The ``private`` field is the driver's data associated with the watch_list and
is cleaned up by the ``watch_list::release_watch()`` method.

The ``id`` field is the source's ID.  Notifications that are posted with a
different ID are ignored.

The following functions are provided to manage watches:

  * ``void init_watch(struct watch *watch, struct watch_queue *wqueue);``

    Initialise a watch object, setting its pointer to the watch queue, using
    appropriate barriering to avoid lockdep complaints.

  * ``int add_watch_to_object(struct watch *watch, struct watch_list *wlist);``

    Subscribe a watch to a watch list (notification source).  The
    driver-settable fields in the watch struct must have been set before this
    is called.

  * ::

	int remove_watch_from_object(struct watch_list *wlist,
				     struct watch_queue *wqueue,
				     u64 id, false);

    Remove a watch from a watch list, where the watch must match the specified
    watch queue (``wqueue``) and object identifier (``id``).  A notification
    (``WATCH_META_REMOVAL_NOTIFICATION``) is sent to the watch queue to
    indicate that the watch got removed.

  * ``int remove_watch_from_object(struct watch_list *wlist, NULL, 0, true);``

    Remove all the watches from a watch list.  It is expected that this will be
    called preparatory to destruction and that the watch list will be
    inaccessible to new watches by this point.  A notification
    (``WATCH_META_REMOVAL_NOTIFICATION``) is sent to the watch queue of each
    subscribed watch to indicate that the watch got removed.


Notification Posting API
========================

To post a notification to a watch list so that the subscribed watches can see
it, the following function should be used::

	void post_watch_notification(struct watch_list *wlist,
				     struct watch_notification *n,
				     const struct cred *cred,
				     u64 id);

The notification should be preformatted and a pointer to the header (``n``)
should be passed in.  The notification may be larger than the header, and its
total size in bytes, including the header, is noted in
``n->info & WATCH_INFO_LENGTH``.

The ``cred`` struct indicates the credentials of the source (subject) and is
passed to the LSMs, such as SELinux, to allow or suppress the recording of the
notification in each individual queue according to the credentials of that
queue (object).

The ``id`` is the ID of the source object (such as the serial number on a key).
Only watches that have the same ID set in them will see this notification.


Watch Sources
=============

Any particular buffer can be fed from multiple sources.  Sources include:

  * WATCH_TYPE_KEY_NOTIFY

    Notifications of this type indicate changes to keys and keyrings, including
    the changes of keyring contents or the attributes of keys.

    See Documentation/security/keys/core.rst for more information.


Event Filtering
===============

Once a watch queue has been created, a set of filters can be applied to limit
the events that are received using::

	struct watch_notification_filter filter = {
		...
	};
	ioctl(fd, IOC_WATCH_QUEUE_SET_FILTER, &filter);

The filter description is a variable of type::

	struct watch_notification_filter {
		__u32	nr_filters;
		__u32	__reserved;
		struct watch_notification_type_filter filters[];
	};

Where "nr_filters" is the number of filters in filters[] and "__reserved"
should be 0.  The "filters" array has elements of the following type::

	struct watch_notification_type_filter {
		__u32	type;
		__u32	info_filter;
		__u32	info_mask;
		__u32	subtype_filter[8];
	};

Where:

  * ``type`` is the event type to filter for and should be something like
    "WATCH_TYPE_KEY_NOTIFY"

  * ``info_filter`` and ``info_mask`` act as a filter on the info field of the
    notification record.  The notification is only written into the buffer if::

	(watch.info & info_mask) == info_filter

    This could be used, for example, to ignore events that are not exactly on
    the watched point in a mount tree.

  * ``subtype_filter`` is a bitmask indicating the subtypes that are of
    interest.  Bit 0 of subtype_filter[0] corresponds to subtype 0, bit 1 to
    subtype 1, and so on.

If the argument to the ioctl() is NULL, then the filters will be removed and
all events from the watched sources will come through.


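Because ``filters[]`` is a flexible array, the filter has to be allocated with
room for its entries.  The following is a rough userspace sketch of building a
one-entry filter that passes only key link and unlink events, assuming the
structures and the ``NOTIFY_KEY_*`` subtype values from
``<linux/watch_queue.h>``.  ``filter_want_subtype()`` and
``make_key_link_filter()`` are hypothetical helper names:

```c
#include <stdlib.h>
#include <linux/watch_queue.h>

/* Hypothetical helper: mark one subtype (0-255) as being of interest. */
static void filter_want_subtype(struct watch_notification_type_filter *tf,
				unsigned int subtype)
{
	tf->subtype_filter[subtype / 32] |= 1U << (subtype % 32);
}

/*
 * Hypothetical helper: allocate a filter with a single entry that passes
 * only key link and unlink events.  The caller releases it with free().
 */
static struct watch_notification_filter *make_key_link_filter(void)
{
	struct watch_notification_filter *f;

	/* calloc() zeroes __reserved, info_filter and info_mask. */
	f = calloc(1, sizeof(*f) + sizeof(f->filters[0]));
	if (!f)
		return NULL;

	f->nr_filters = 1;
	f->filters[0].type = WATCH_TYPE_KEY_NOTIFY;
	/* A zero mask and zero filter match every value of the info field. */
	filter_want_subtype(&f->filters[0], NOTIFY_KEY_LINKED);
	filter_want_subtype(&f->filters[0], NOTIFY_KEY_UNLINKED);
	return f;
}
```

The result would then be handed to ``ioctl(fd, IOC_WATCH_QUEUE_SET_FILTER, f)``
and freed afterwards.

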
Userspace Code Example
======================

A buffer is created with something like the following::

	pipe2(fds, O_NOTIFICATION_PIPE);
	ioctl(fds[1], IOC_WATCH_QUEUE_SET_SIZE, 256);

It can then be set to receive keyring change notifications::

	keyctl(KEYCTL_WATCH_KEY, KEY_SPEC_SESSION_KEYRING, fds[1], 0x01);

The notifications can then be consumed by something like the following::

	#include <string.h>
	#include <unistd.h>
	#include <linux/watch_queue.h>

	/* got_meta() and saw_key_change() are application-defined handlers. */
	static void consumer(int rfd)
	{
		unsigned char buffer[128];
		ssize_t buf_len;

		while (buf_len = read(rfd, buffer, sizeof(buffer)),
		       buf_len > 0
		       ) {
			unsigned char *p = buffer;
			unsigned char *end = buffer + buf_len;

			while (p < end) {
				union {
					struct watch_notification n;
					unsigned char buf1[128];
				} n;
				size_t largest, len;

				largest = end - p;
				if (largest > 128)
					largest = 128;
				/* Copy out so that the header is suitably aligned. */
				memcpy(&n, p, largest);

				len = (n.n.info & WATCH_INFO_LENGTH) >>
					WATCH_INFO_LENGTH__SHIFT;
				/* Stop on a zero-length or truncated record. */
				if (len == 0 || len > largest)
					return;

				switch (n.n.type) {
				case WATCH_TYPE_META:
					got_meta(&n.n);
					break;
				case WATCH_TYPE_KEY_NOTIFY:
					saw_key_change(&n.n);
					break;
				}

				p += len;
			}
		}
	}