.. SPDX-License-Identifier: GPL-2.0

==================================
relay interface (formerly relayfs)
==================================

The relay interface provides a means for kernel applications to
efficiently log and transfer large quantities of data from the kernel
to userspace via user-defined 'relay channels'.

A 'relay channel' is a kernel->user data relay mechanism implemented
as a set of per-cpu kernel buffers ('channel buffers'), each
represented as a regular file ('relay file') in user space.  Kernel
clients write into the channel buffers using efficient write
functions; these automatically log into the current cpu's channel
buffer.  User space applications mmap() or read() from the relay files
and retrieve the data as it becomes available.  The relay files
themselves are files created in a host filesystem, e.g. debugfs, and
are associated with the channel buffers using the API described below.

The format of the data logged into the channel buffers is completely
up to the kernel client; the relay interface does however provide
hooks which allow kernel clients to impose some structure on the
buffer data.  The relay interface doesn't implement any form of data
filtering - this also is left to the kernel client.  The purpose is to
keep things as simple as possible.

This document provides an overview of the relay interface API.  The
details of the function parameters are documented along with the
functions in the relay interface code - please see that for details.

Semantics
=========

Each relay channel has one buffer per CPU, each buffer has one or more
sub-buffers.  Messages are written to the first sub-buffer until it is
too full to contain a new message, in which case it is written to
the next (if available).  Messages are never split across sub-buffers.
At this point, userspace can be notified so it empties the first
sub-buffer, while the kernel continues writing to the next.

When notified that a sub-buffer is full, the kernel knows how many
bytes of it are padding i.e. unused space occurring because a complete
message couldn't fit into a sub-buffer.  Userspace can use this
knowledge to copy only valid data.

After copying it, userspace can notify the kernel that a sub-buffer
has been consumed.

A relay channel can operate in a mode where it will overwrite data not
yet collected by userspace, and not wait for it to be consumed.

The relay channel itself does not provide for communication of such
data between userspace and kernel, allowing the kernel side to remain
simple and not impose a single interface on userspace.  It does
provide a set of examples and a separate helper though, described
below.

The read() interface both removes padding and internally consumes the
read sub-buffers; thus in cases where read(2) is being used to drain
the channel buffers, special-purpose communication between kernel and
user isn't necessary for basic operation.

One of the major goals of the relay interface is to provide a low
overhead mechanism for conveying kernel data to userspace.  While the
read() interface is easy to use, it's not as efficient as the mmap()
approach; the example code attempts to make the tradeoff between the
two approaches as small as possible.

klog and relay-apps example code
================================

The relay interface itself is ready to use, but to make things easier,
a couple of simple utility functions and a set of examples are provided.

The relay-apps example tarball, available on the relay sourceforge
site, contains a set of self-contained examples, each consisting of a
pair of .c files containing boilerplate code for each of the user and
kernel sides of a relay application.  When combined these two sets of
boilerplate code provide glue to easily stream data to disk, without
having to bother with mundane housekeeping chores.

The 'klog debugging functions' patch (klog.patch in the relay-apps
tarball) provides a couple of high-level logging functions to the
kernel which allow writing formatted text or raw data to a channel,
regardless of whether a channel to write into exists or not, or even
whether the relay interface is compiled into the kernel or not.  These
functions allow you to put unconditional 'trace' statements anywhere
in the kernel or kernel modules; only when there is a 'klog handler'
registered will data actually be logged (see the klog and kleak
examples for details).

It is of course possible to use the relay interface from scratch,
i.e. without using any of the relay-apps example code or klog, but
you'll have to implement communication between userspace and kernel,
allowing both to convey the state of buffers (full, empty, amount of
padding).  The read() interface both removes padding and internally
consumes the read sub-buffers; thus in cases where read(2) is being
used to drain the channel buffers, special-purpose communication
between kernel and user isn't necessary for basic operation.  Things
such as buffer-full conditions would still need to be communicated via
some channel though.

klog and the relay-apps examples can be found in the relay-apps
tarball on http://relayfs.sourceforge.net

The relay interface user space API
==================================

The relay interface implements basic file operations for user space
access to relay channel buffer data.  Here are the file operations
that are available and some comments regarding their behavior:

=========== ============================================================
open()      enables user to open an _existing_ channel buffer.

mmap()      results in channel buffer being mapped into the caller's
            memory space. Note that you can't do a partial mmap - you
            must map the entire file, which is NRBUF * SUBBUFSIZE.

read()      read the contents of a channel buffer.  The bytes read are
            'consumed' by the reader, i.e. they won't be available
            again to subsequent reads.  If the channel is being used
            in no-overwrite mode (the default), it can be read at any
            time even if there's an active kernel writer.  If the
            channel is being used in overwrite mode and there are
            active channel writers, results may be unpredictable -
            users should make sure that all logging to the channel has
            ended before using read() with overwrite mode.  Sub-buffer
            padding is automatically removed and will not be seen by
            the reader.

sendfile()  transfer data from a channel buffer to an output file
            descriptor. Sub-buffer padding is automatically removed
            and will not be seen by the reader.

poll()      POLLIN/POLLRDNORM/POLLERR supported.  User applications are
            notified when sub-buffer boundaries are crossed.

close()     decrements the channel buffer's refcount.  When the refcount
            reaches 0, i.e. when no process or kernel client has the
            buffer open, the channel buffer is freed.
=========== ============================================================

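As a rough illustration of the read() path described in the table, a
user space consumer might drain a relay file as sketched below.  The
function name drain_relay_file() and the 4096-byte chunk size are
choices made for this sketch, not part of the relay interface, and the
loop assumes that read() returns 0 once no more data is currently
available:

```c
#include <errno.h>
#include <unistd.h>

/*
 * Copy everything currently readable from relay_fd to out_fd.
 * Returns the number of bytes copied, or -1 on error.
 * (Hypothetical helper; only read(2)/write(2) are real interfaces.)
 */
static long drain_relay_file(int relay_fd, int out_fd)
{
	char chunk[4096];
	long total = 0;

	for (;;) {
		ssize_t n = read(relay_fd, chunk, sizeof(chunk));

		if (n == 0)
			break;		/* nothing more to read right now */
		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}

		/* write() may be partial; loop until the chunk is flushed */
		for (ssize_t off = 0; off < n; ) {
			ssize_t w = write(out_fd, chunk + off,
					  (size_t)(n - off));

			if (w < 0) {
				if (errno == EINTR)
					continue;
				return -1;
			}
			off += w;
		}
		total += n;
	}
	return total;
}
```

Because read() strips padding and consumes the sub-buffers it returns,
a loop like this needs no additional kernel/user bookkeeping; a caller
would typically wait in poll() between calls.
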
In order for a user application to make use of relay files, the
host filesystem must be mounted.  For example::

        mount -t debugfs debugfs /sys/kernel/debug

.. Note::

        the host filesystem doesn't need to be mounted for kernel
        clients to create or use channels - it only needs to be
        mounted when user space applications need access to the buffer
        data.

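Once the host filesystem is mounted, the mmap() rule from the table
above (the whole file must be mapped) could be followed as in this
sketch.  The helper names map_relay_buffer() and relay_subbuf_at() are
invented for illustration, and the sketch assumes the caller already
knows the subbuf_size and n_subbufs values the kernel client passed to
relay_open(), and that a read-only shared mapping is acceptable:

```c
#include <stddef.h>
#include <sys/mman.h>

/*
 * Map an entire per-cpu relay file read-only.  The length is
 * n_subbufs * subbuf_size, since partial maps are not allowed.
 * Returns the mapping, or NULL on failure.  (Hypothetical helper.)
 */
static void *map_relay_buffer(int relay_fd, size_t subbuf_size,
			      size_t n_subbufs)
{
	size_t len;
	void *p;

	if (subbuf_size == 0 || n_subbufs == 0)
		return NULL;
	if (n_subbufs > (size_t)-1 / subbuf_size)
		return NULL;	/* length would overflow size_t */
	len = subbuf_size * n_subbufs;

	p = mmap(NULL, len, PROT_READ, MAP_SHARED, relay_fd, 0);
	return p == MAP_FAILED ? NULL : p;
}

/* Address of sub-buffer idx inside a mapping returned above. */
static const char *relay_subbuf_at(const void *map, size_t subbuf_size,
				   size_t idx)
{
	return (const char *)map + idx * subbuf_size;
}
```

With mmap(), unlike read(), padding is not removed and sub-buffers are
not consumed automatically, so the application still needs its own way
to learn which sub-buffers are ready.
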

The relay interface kernel API
==============================

Here's a summary of the API the relay interface provides to in-kernel clients:

channel management functions::

    relay_open(base_filename, parent, subbuf_size, n_subbufs,
               callbacks, private_data)
    relay_close(chan)
    relay_flush(chan)
    relay_reset(chan)

channel management typically called on instigation of userspace::

    relay_subbufs_consumed(chan, cpu, subbufs_consumed)

write functions::

    relay_write(chan, data, length)
    __relay_write(chan, data, length)
    relay_reserve(chan, length)

callbacks::

    subbuf_start(buf, subbuf, prev_subbuf, prev_padding)
    buf_mapped(buf, filp)
    buf_unmapped(buf, filp)
    create_buf_file(filename, parent, mode, buf, is_global)
    remove_buf_file(dentry)

helper functions::

    relay_buf_full(buf)
    subbuf_start_reserve(buf, length)


Creating a channel
------------------

relay_open() is used to create a channel, along with its per-cpu
channel buffers.  Each channel buffer will have an associated file
created for it in the host filesystem, which can be mmapped or
read from in user space.  The files are named basename0...basenameN-1
where N is the number of online cpus, and by default will be created
in the root of the filesystem (if the parent param is NULL).  If you
want a directory structure to contain your relay files, you should
create it using the host filesystem's directory creation function,
e.g. debugfs_create_dir(), and pass the parent directory to
relay_open().  Users are responsible for cleaning up any directory
structure they create, when the channel is closed - again the host
filesystem's directory removal functions should be used for that,
e.g. debugfs_remove().

In order for a channel to be created and the host filesystem's files
associated with its channel buffers, the user must provide definitions
for two callback functions, create_buf_file() and remove_buf_file().
create_buf_file() is called once for each per-cpu buffer from
relay_open() and allows the user to create the file which will be used
to represent the corresponding channel buffer.  The callback should
return the dentry of the file created to represent the channel buffer.
remove_buf_file() must also be defined; it's responsible for deleting
the file(s) created in create_buf_file() and is called during
relay_close().

Here are some typical definitions for these callbacks, in this case
using debugfs::

    /*
     * create_buf_file() callback.  Creates relay file in debugfs.
     */
    static struct dentry *create_buf_file_handler(const char *filename,
                                                  struct dentry *parent,
                                                  umode_t mode,
                                                  struct rchan_buf *buf,
                                                  int *is_global)
    {
            return debugfs_create_file(filename, mode, parent, buf,
                                       &relay_file_operations);
    }

    /*
     * remove_buf_file() callback.  Removes relay file from debugfs.
     */
    static int remove_buf_file_handler(struct dentry *dentry)
    {
            debugfs_remove(dentry);

            return 0;
    }

    /*
     * relay interface callbacks
     */
    static struct rchan_callbacks relay_callbacks =
    {
            .create_buf_file = create_buf_file_handler,
            .remove_buf_file = remove_buf_file_handler,
    };

And an example relay_open() invocation using them::

    chan = relay_open("cpu", NULL, SUBBUF_SIZE, N_SUBBUFS,
                      &relay_callbacks, NULL);

If the create_buf_file() callback fails, or isn't defined, channel
creation and thus relay_open() will fail.

The total size of each per-cpu buffer is calculated by multiplying the
number of sub-buffers by the sub-buffer size passed into relay_open().
The idea behind sub-buffers is that they're basically an extension of
double-buffering to N buffers, and they also allow applications to
easily implement random-access-on-buffer-boundary schemes, which can
be important for some high-volume applications.  The number and size
of sub-buffers is completely dependent on the application and even for
the same application, different conditions will warrant different
values for these parameters at different times.  Typically, the right
values to use are best decided after some experimentation; in general,
though, it's safe to assume that having only 1 sub-buffer is a bad
idea - you're guaranteed to either overwrite data or lose events
depending on the channel mode being used.

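When experimenting with these parameters it can help to keep the
resulting memory footprint in view.  The arithmetic above can be
sketched as follows; relay_channel_bytes() is a hypothetical helper
written for this document, and it assumes one buffer per CPU (i.e. not
a global buffer):

```c
#include <stddef.h>

/*
 * Bytes of buffer memory used by one channel: each per-cpu buffer
 * holds n_subbufs * subbuf_size bytes and there is one buffer per
 * CPU.  Returns 0 for zero-sized input or if the product would
 * overflow size_t.  (Illustrative helper, not a relay API.)
 */
static size_t relay_channel_bytes(size_t subbuf_size, size_t n_subbufs,
				  size_t n_cpus)
{
	size_t per_cpu;

	if (subbuf_size == 0 || n_subbufs == 0 || n_cpus == 0)
		return 0;
	if (n_subbufs > (size_t)-1 / subbuf_size)
		return 0;
	per_cpu = subbuf_size * n_subbufs;
	if (n_cpus > (size_t)-1 / per_cpu)
		return 0;
	return per_cpu * n_cpus;
}
```

For example, 4 sub-buffers of 256 KiB each on an 8-cpu machine come to
1 MiB per cpu and 8 MiB for the whole channel.
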
The create_buf_file() implementation can also be defined in such a way
as to allow the creation of a single 'global' buffer instead of the
default per-cpu set.  This can be useful for applications interested
mainly in seeing the relative ordering of system-wide events without
the need to bother with saving explicit timestamps for the purpose of
merging/sorting per-cpu files in a postprocessing step.

To have relay_open() create a global buffer, the create_buf_file()
implementation should set the value of the is_global outparam to a
non-zero value in addition to creating the file that will be used to
represent the single buffer.  In the case of a global buffer,
create_buf_file() and remove_buf_file() will be called only once.  The
normal channel-writing functions, e.g. relay_write(), can still be
used - writes from any cpu will transparently end up in the global
buffer - but since it is a global buffer, callers should make sure
they use the proper locking for such a buffer, either by wrapping
writes in a spinlock, or by copying a write function from relay.h and
creating a local version that internally does the proper locking.

The private_data passed into relay_open() allows clients to associate
user-defined data with a channel, and is immediately available
(including in create_buf_file()) via chan->private_data or
buf->chan->private_data.

Buffer-only channels
--------------------

These channels have no files associated and can be created with
relay_open(NULL, NULL, ...). Such channels are useful in scenarios such
as when doing early tracing in the kernel, before the VFS is up. In these
cases, one may open a buffer-only channel and then call
relay_late_setup_files() when the kernel is ready to handle files,
to expose the buffered data to userspace.

Channel 'modes'
---------------

relay channels can be used in either of two modes - 'overwrite' or
'no-overwrite'.  The mode is entirely determined by the implementation
of the subbuf_start() callback, as described below.  The default if no
subbuf_start() callback is defined is 'no-overwrite' mode.  If the
default mode suits your needs, and you plan to use the read()
interface to retrieve channel data, you can ignore the details of this
section, as it pertains mainly to mmap() implementations.

In 'overwrite' mode, also known as 'flight recorder' mode, writes
continuously cycle around the buffer and will never fail, but will
unconditionally overwrite old data regardless of whether it's actually
been consumed.  In no-overwrite mode, writes will fail, i.e. data will
be lost, if the number of unconsumed sub-buffers equals the total
number of sub-buffers in the channel.  It should be clear that if
there is no consumer or if the consumer can't consume sub-buffers fast
enough, data will be lost in either case; the only difference is
whether data is lost from the beginning or the end of a buffer.

As explained above, a relay channel is made up of one or more
per-cpu channel buffers, each implemented as a circular buffer
subdivided into one or more sub-buffers.  Messages are written into
the current sub-buffer of the channel's current per-cpu buffer via the
write functions described below.  Whenever a message can't fit into
the current sub-buffer, because there's no room left for it, the
client is notified via the subbuf_start() callback that a switch to a
new sub-buffer is about to occur.  The client uses this callback to 1)
initialize the next sub-buffer if appropriate 2) finalize the previous
sub-buffer if appropriate and 3) return a boolean value indicating
whether or not to actually move on to the next sub-buffer.

To implement 'no-overwrite' mode, the kernel client would provide
an implementation of the subbuf_start() callback something like the
following::

    static int subbuf_start(struct rchan_buf *buf,
                            void *subbuf,
                            void *prev_subbuf,
                            size_t prev_padding)
    {
            if (prev_subbuf)
                    *((unsigned *)prev_subbuf) = prev_padding;

            if (relay_buf_full(buf))
                    return 0;

            subbuf_start_reserve(buf, sizeof(unsigned int));

            return 1;
    }

If the current buffer is full, i.e. all sub-buffers remain unconsumed,
the callback returns 0 to indicate that the buffer switch should not
occur yet, i.e. until the consumer has had a chance to read the
current set of ready sub-buffers.  For the relay_buf_full() function
to make sense, the consumer is responsible for notifying the relay
interface when sub-buffers have been consumed via
relay_subbufs_consumed().  Any subsequent attempts to write into the
buffer will again invoke the subbuf_start() callback with the same
parameters; only when the consumer has consumed one or more of the
ready sub-buffers will relay_buf_full() return 0, in which case the
buffer switch can continue.

The implementation of the subbuf_start() callback for 'overwrite' mode
would be very similar::

    static int subbuf_start(struct rchan_buf *buf,
                            void *subbuf,
                            void *prev_subbuf,
                            size_t prev_padding)
    {
            if (prev_subbuf)
                    *((unsigned *)prev_subbuf) = prev_padding;

            subbuf_start_reserve(buf, sizeof(unsigned int));

            return 1;
    }

In this case, the relay_buf_full() check is meaningless and the
callback always returns 1, causing the buffer switch to occur
unconditionally.  It's also meaningless for the client to use the
relay_subbufs_consumed() function in this mode, as it's never
consulted.

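The difference between the two modes can be seen in a small user space
model of the bookkeeping.  This is a toy written for illustration, not
kernel code: struct subbuf_model and the model_*() functions are
invented names, and the model only assumes that "full" means every
sub-buffer has been produced but not yet consumed:

```c
#include <stddef.h>

/* Toy model of one channel buffer; not the kernel's struct rchan_buf. */
struct subbuf_model {
	size_t n_subbufs;	/* sub-buffers in the buffer */
	size_t produced;	/* sub-buffers filled by the writer */
	size_t consumed;	/* sub-buffers released by the reader */
	size_t refused;		/* switches refused because buffer was full */
};

/* Full when every sub-buffer is still waiting to be consumed. */
static int model_buf_full(const struct subbuf_model *m)
{
	return m->produced - m->consumed >= m->n_subbufs;
}

/*
 * Mirrors the decision a subbuf_start() callback makes: return 1 to
 * switch to the next sub-buffer, 0 to refuse the switch.
 */
static int model_subbuf_switch(struct subbuf_model *m, int overwrite)
{
	if (!overwrite && model_buf_full(m)) {
		m->refused++;
		return 0;
	}
	m->produced++;
	return 1;
}

/* Stands in for the consumer reporting n consumed sub-buffers. */
static void model_consume(struct subbuf_model *m, size_t n)
{
	size_t ready = m->produced - m->consumed;

	if (n > ready)
		n = ready;	/* cannot consume more than was produced */
	m->consumed += n;
}
```

With two sub-buffers and no consumer, the no-overwrite variant accepts
two switches and refuses the third until model_consume() is called,
while the overwrite variant never refuses and simply keeps cycling.
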
The default subbuf_start() implementation, used if the client doesn't
define any callbacks, or doesn't define the subbuf_start() callback,
implements the simplest possible 'no-overwrite' mode, i.e. it does
nothing but return 0 when relay_buf_full() reports that the buffer is
full, and 1 otherwise.

Header information can be reserved at the beginning of each sub-buffer
by calling the subbuf_start_reserve() helper function from within the
subbuf_start() callback.  This reserved area can be used to store
whatever information the client wants.  In the example above, room is
reserved in each sub-buffer to store the padding count for that
sub-buffer.  This is filled in for the previous sub-buffer in the
subbuf_start() implementation; the padding value for the previous
sub-buffer is passed into the subbuf_start() callback along with a
pointer to the previous sub-buffer, since the padding value isn't
known until a sub-buffer is filled.  The subbuf_start() callback is
also called for the first sub-buffer when the channel is opened, to
give the client a chance to reserve space in it.  In this case the
previous sub-buffer pointer passed into the callback will be NULL, so
the client should check the value of the prev_subbuf pointer before
writing into the previous sub-buffer.

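On the user space side, an mmap() consumer could use such a header to
skip the padding.  The sketch below assumes the layout produced by the
example subbuf_start() callbacks above, i.e. an unsigned int padding
count stored in the first bytes of each finished sub-buffer;
relay_subbuf_payload() is a hypothetical helper name:

```c
#include <stddef.h>
#include <string.h>

/*
 * Locate the valid payload of one finished sub-buffer, assuming the
 * kernel client reserved sizeof(unsigned int) bytes at the start and
 * stored the padding count there.  Returns the payload length and
 * sets *payload, or returns 0 with *payload = NULL if the header is
 * inconsistent.  (Illustrative helper, not a relay API.)
 */
static size_t relay_subbuf_payload(const void *subbuf, size_t subbuf_size,
				   const char **payload)
{
	unsigned int padding;

	*payload = NULL;
	if (subbuf_size < sizeof(padding))
		return 0;

	/* memcpy avoids alignment assumptions about the mapping */
	memcpy(&padding, subbuf, sizeof(padding));
	if (padding > subbuf_size - sizeof(padding))
		return 0;

	*payload = (const char *)subbuf + sizeof(padding);
	return subbuf_size - sizeof(padding) - padding;
}
```

Note that the header is only filled in when the kernel switches away
from a sub-buffer, so this applies to finished sub-buffers, not to the
one currently being written.
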
Writing to a channel
--------------------

Kernel clients write data into the current cpu's channel buffer using
relay_write() or __relay_write().  relay_write() is the main logging
function - it uses local_irq_save() to protect the buffer and should be
used if you might be logging from interrupt context.  If you know
you'll never be logging from interrupt context, you can use
__relay_write(), which only disables preemption.  These functions
don't return a value, so you can't determine whether or not they
failed - the assumption is that you wouldn't want to check a return
value in the fast logging path anyway, and that they'll always succeed
unless the buffer is full and no-overwrite mode is being used, in
which case you can detect a failed write in the subbuf_start()
callback by calling the relay_buf_full() helper function.

relay_reserve() is used to reserve a slot in a channel buffer which
can be written to later.  This would typically be used in applications
that need to write directly into a channel buffer without having to
stage data in a temporary buffer beforehand.  Because the actual write
may not happen immediately after the slot is reserved, applications
using relay_reserve() can keep a count of the number of bytes actually
written, either in space reserved in the sub-buffers themselves or as
a separate array.  See the 'reserve' example in the relay-apps tarball
at http://relayfs.sourceforge.net for an example of how this can be
done.  Because the write is under control of the client and is
separated from the reserve, relay_reserve() doesn't protect the buffer
at all - it's up to the client to provide the appropriate
synchronization when using relay_reserve().

Closing a channel
-----------------

The client calls relay_close() when it's finished using the channel.
The channel and its associated buffers are destroyed when there are no
longer any references to any of the channel buffers.  relay_flush()
forces a sub-buffer switch on all the channel buffers, and can be used
to finalize and process the last sub-buffers before the channel is
closed.

Misc
----

Some applications may want to keep a channel around and re-use it
rather than open and close a new channel for each use.  relay_reset()
can be used for this purpose - it resets a channel to its initial
state without reallocating channel buffer memory or destroying
existing mappings.  It should however only be called when it's safe to
do so, i.e. when the channel isn't currently being written to.

Finally, there are a couple of utility callbacks that can be used for
different purposes.  buf_mapped() is called whenever a channel buffer
is mmapped from user space and buf_unmapped() is called when it's
unmapped.  The client can use this notification to trigger actions
within the kernel application, such as enabling/disabling logging to
the channel.


Resources
=========

For news, example code, mailing list, etc. see the relay interface homepage:

    http://relayfs.sourceforge.net


Credits
=======

The ideas and specs for the relay interface came about as a result of
discussions on tracing involving the following:

Michel Dagenais         <michel.dagenais@polymtl.ca>
Richard Moore           <richardj_moore@uk.ibm.com>
Bob Wisniewski          <bob@watson.ibm.com>
Karim Yaghmour          <karim@opersys.com>
Tom Zanussi             <zanussi@us.ibm.com>

Also thanks to Hubertus Franke for a lot of useful suggestions and bug
reports.