.. SPDX-License-Identifier: GPL-2.0

==================================
relay interface (formerly relayfs)
==================================

The relay interface provides a means for kernel applications to
efficiently log and transfer large quantities of data from the kernel
to userspace via user-defined 'relay channels'.

A 'relay channel' is a kernel->user data relay mechanism implemented
as a set of per-cpu kernel buffers ('channel buffers'), each
represented as a regular file ('relay file') in user space.  Kernel
clients write into the channel buffers using efficient write
functions; these automatically log into the current cpu's channel
buffer.  User space applications mmap() or read() from the relay files
and retrieve the data as it becomes available.  The relay files
themselves are files created in a host filesystem, e.g. debugfs, and
are associated with the channel buffers using the API described below.

The format of the data logged into the channel buffers is completely
up to the kernel client; the relay interface does however provide
hooks which allow kernel clients to impose some structure on the
buffer data.  The relay interface doesn't implement any form of data
filtering - this also is left to the kernel client.  The purpose is to
keep things as simple as possible.

This document provides an overview of the relay interface API.  The
details of the function parameters are documented along with the
functions in the relay interface code - please see that for details.

Semantics
=========

Each relay channel has one buffer per CPU, and each buffer has one or
more sub-buffers. Messages are written to the first sub-buffer until it
is too full to contain a new message, in which case the message is
written to the next sub-buffer (if available). Messages are never split
across sub-buffers. At this point, userspace can be notified so it
empties the first sub-buffer, while the kernel continues writing to the
next.

When notified that a sub-buffer is full, the kernel knows how many
bytes of it are padding, i.e. unused space occurring because a complete
message couldn't fit into a sub-buffer. Userspace can use this
knowledge to copy only valid data.

After copying it, userspace can notify the kernel that a sub-buffer
has been consumed.

A relay channel can operate in a mode where it will overwrite data not
yet collected by userspace, and not wait for it to be consumed.
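
The switching and padding rules above can be illustrated with a small
user-space model. This is only a sketch under stated assumptions:
``struct model_buf``, ``model_write()`` and the two size constants are
hypothetical names chosen for illustration, not part of the relay API,
and the model does not wrap around or handle overwrite mode.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SUBBUF_SIZE 16   /* hypothetical sub-buffer size in bytes */
#define N_SUBBUFS   4    /* hypothetical number of sub-buffers */

/* Toy model of one channel buffer; not the kernel's struct rchan_buf. */
struct model_buf {
    char   data[N_SUBBUFS][SUBBUF_SIZE];
    size_t padding[N_SUBBUFS]; /* unused tail bytes of each finished sub-buffer */
    size_t cur;                /* index of the sub-buffer being written */
    size_t offset;             /* write offset inside the current sub-buffer */
};

/*
 * Append one message.  A message that does not fit into the space left
 * in the current sub-buffer is never split: the leftover bytes are
 * recorded as padding and the whole message goes into the next one.
 * Returns 0 on success, -1 if the message cannot be stored.
 */
static int model_write(struct model_buf *b, const void *msg, size_t len)
{
    assert(b != NULL && msg != NULL);

    if (len > SUBBUF_SIZE)
        return -1;                      /* can never fit */
    if (b->offset + len > SUBBUF_SIZE) {
        if (b->cur + 1 == N_SUBBUFS)
            return -1;                  /* toy model: no wrap-around */
        b->padding[b->cur] = SUBBUF_SIZE - b->offset;
        b->cur++;
        b->offset = 0;
    }
    memcpy(&b->data[b->cur][b->offset], msg, len);
    b->offset += len;
    return 0;
}
```

With 16-byte sub-buffers, writing a 10-byte message followed by an
8-byte message leaves 6 bytes of padding in the first sub-buffer and
places the second message at the start of the next one.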

The relay channel itself does not provide a mechanism for communicating
this buffer state (full sub-buffers, padding counts, consumed
sub-buffers) between userspace and kernel, which allows the kernel side
to remain simple and avoids imposing a single interface on userspace.
It does provide a set of examples and a separate helper though,
described below.

The read() interface both removes padding and internally consumes the
read sub-buffers; thus in cases where read(2) is being used to drain
the channel buffers, special-purpose communication between kernel and
user isn't necessary for basic operation.

One of the major goals of the relay interface is to provide a low
overhead mechanism for conveying kernel data to userspace.  While the
read() interface is easy to use, it's not as efficient as the mmap()
approach; the example code attempts to make the tradeoff between the
two approaches as small as possible.

klog and relay-apps example code
================================

The relay interface itself is ready to use, but to make things easier,
a couple of simple utility functions and a set of examples are provided.

The relay-apps example tarball, available on the relay sourceforge
site, contains a set of self-contained examples, each consisting of a
pair of .c files containing boilerplate code for each of the user and
kernel sides of a relay application.
When combined, these two sets of
boilerplate code provide glue to easily stream data to disk, without
having to bother with mundane housekeeping chores.

The 'klog debugging functions' patch (klog.patch in the relay-apps
tarball) provides a couple of high-level logging functions to the
kernel which allow writing formatted text or raw data to a channel,
regardless of whether a channel to write into exists or not, or even
whether the relay interface is compiled into the kernel or not.  These
functions allow you to put unconditional 'trace' statements anywhere
in the kernel or kernel modules; only when there is a 'klog handler'
registered will data actually be logged (see the klog and kleak
examples for details).

It is of course possible to use the relay interface from scratch,
i.e. without using any of the relay-apps example code or klog, but
you'll have to implement communication between userspace and kernel,
allowing both to convey the state of buffers (full, empty, amount of
padding).  The read() interface both removes padding and internally
consumes the read sub-buffers; thus in cases where read(2) is being
used to drain the channel buffers, special-purpose communication
between kernel and user isn't necessary for basic operation.  Things
such as buffer-full conditions would still need to be communicated via
some channel though.
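
When read(2) is used, draining a relay file needs no relay-specific
code in user space. The sketch below copies everything currently
readable from one file descriptor to another. ``drain_fd()`` is a
hypothetical helper written for this document; in a real application
the input descriptor would come from open() on a relay file in the
mounted host filesystem, at whatever path the kernel client chose.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Copy data from in_fd to out_fd until read() returns 0.
 * Returns the number of bytes copied, or -1 on error.
 * For a relay file, read() itself removes sub-buffer padding and
 * consumes the sub-buffers it has read, so plain copying is enough.
 */
static ssize_t drain_fd(int in_fd, int out_fd)
{
    char chunk[4096];
    ssize_t total = 0;

    assert(in_fd >= 0 && out_fd >= 0);

    for (;;) {
        ssize_t n = read(in_fd, chunk, sizeof(chunk));

        if (n == 0)
            break;                  /* nothing more to read right now */
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal, retry */
            return -1;
        }
        /* write() may be partial, so loop until the chunk is out */
        for (ssize_t off = 0; off < n; ) {
            ssize_t w = write(out_fd, chunk + off, (size_t)(n - off));

            if (w < 0) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            off += w;
        }
        total += n;
    }
    return total;
}
```

A logging daemon could call such a helper once per per-cpu relay file
each time it wakes up, appending the output to one log file per cpu.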

klog and the relay-apps examples can be found in the relay-apps
tarball on http://relayfs.sourceforge.net

The relay interface user space API
==================================

The relay interface implements basic file operations for user space
access to relay channel buffer data.  Here are the file operations
that are available and some comments regarding their behavior:

=========== ============================================================
open()      enables user to open an _existing_ channel buffer.

mmap()      results in channel buffer being mapped into the caller's
            memory space. Note that you can't do a partial mmap - you
            must map the entire file, which is NRBUF * SUBBUFSIZE.

read()      read the contents of a channel buffer.  The bytes read are
            'consumed' by the reader, i.e. they won't be available
            again to subsequent reads.  If the channel is being used
            in no-overwrite mode (the default), it can be read at any
            time even if there's an active kernel writer.  If the
            channel is being used in overwrite mode and there are
            active channel writers, results may be unpredictable -
            users should make sure that all logging to the channel has
            ended before using read() with overwrite mode.  Sub-buffer
            padding is automatically removed and will not be seen by
            the reader.

sendfile()  transfer data from a channel buffer to an output file
            descriptor. Sub-buffer padding is automatically removed
            and will not be seen by the reader.

poll()      POLLIN/POLLRDNORM/POLLERR supported.  User applications are
            notified when sub-buffer boundaries are crossed.

close()     decrements the channel buffer's refcount.  When the refcount
            reaches 0, i.e. when no process or kernel client has the
            buffer open, the channel buffer is freed.
=========== ============================================================

In order for a user application to make use of relay files, the
host filesystem must be mounted.  For example::

    mount -t debugfs debugfs /sys/kernel/debug

.. Note::

    the host filesystem doesn't need to be mounted for kernel
    clients to create or use channels - it only needs to be
    mounted when user space applications need access to the buffer
    data.


The relay interface kernel API
==============================

Here's a summary of the API the relay interface provides to in-kernel clients:

TBD(curr. line MT:/API/)
  channel management functions::

    relay_open(base_filename, parent, subbuf_size, n_subbufs,
               callbacks, private_data)
    relay_close(chan)
    relay_flush(chan)
    relay_reset(chan)

  channel management typically called on instigation of userspace::

    relay_subbufs_consumed(chan, cpu, subbufs_consumed)

  write functions::

    relay_write(chan, data, length)
    __relay_write(chan, data, length)
    relay_reserve(chan, length)

  callbacks::

    subbuf_start(buf, subbuf, prev_subbuf, prev_padding)
    buf_mapped(buf, filp)
    buf_unmapped(buf, filp)
    create_buf_file(filename, parent, mode, buf, is_global)
    remove_buf_file(dentry)

  helper functions::

    relay_buf_full(buf)
    subbuf_start_reserve(buf, length)


Creating a channel
------------------

relay_open() is used to create a channel, along with its per-cpu
channel buffers.  Each channel buffer will have an associated file
created for it in the host filesystem, which can be mmapped or
read from in user space.
The files are named basename0...basenameN-1
where N is the number of online cpus, and by default will be created
in the root of the filesystem (if the parent param is NULL).  If you
want a directory structure to contain your relay files, you should
create it using the host filesystem's directory creation function,
e.g. debugfs_create_dir(), and pass the parent directory to
relay_open().  Users are responsible for cleaning up any directory
structure they create, when the channel is closed - again the host
filesystem's directory removal functions should be used for that,
e.g. debugfs_remove().

In order for a channel to be created and the host filesystem's files
associated with its channel buffers, the user must provide definitions
for two callback functions, create_buf_file() and remove_buf_file().
create_buf_file() is called once for each per-cpu buffer from
relay_open() and allows the user to create the file which will be used
to represent the corresponding channel buffer.  The callback should
return the dentry of the file created to represent the channel buffer.
remove_buf_file() must also be defined; it's responsible for deleting
the file(s) created in create_buf_file() and is called during
relay_close().

Here are some typical definitions for these callbacks, in this case
using debugfs::

    /*
     * create_buf_file() callback.  Creates relay file in debugfs.
     */
    static struct dentry *create_buf_file_handler(const char *filename,
                                                  struct dentry *parent,
                                                  umode_t mode,
                                                  struct rchan_buf *buf,
                                                  int *is_global)
    {
            return debugfs_create_file(filename, mode, parent, buf,
                                       &relay_file_operations);
    }

    /*
     * remove_buf_file() callback.  Removes relay file from debugfs.
     */
    static int remove_buf_file_handler(struct dentry *dentry)
    {
            debugfs_remove(dentry);

            return 0;
    }

    /*
     * relay interface callbacks
     */
    static struct rchan_callbacks relay_callbacks =
    {
            .create_buf_file = create_buf_file_handler,
            .remove_buf_file = remove_buf_file_handler,
    };

And an example relay_open() invocation using them::

  chan = relay_open("cpu", NULL, SUBBUF_SIZE, N_SUBBUFS, &relay_callbacks, NULL);

If the create_buf_file() callback fails, or
isn't defined, channel
creation and thus relay_open() will fail.

The total size of each per-cpu buffer is calculated by multiplying the
number of sub-buffers by the sub-buffer size passed into relay_open().
The idea behind sub-buffers is that they're basically an extension of
double-buffering to N buffers, and they also allow applications to
easily implement random-access-on-buffer-boundary schemes, which can
be important for some high-volume applications.  The number and size
of sub-buffers is completely dependent on the application and even for
the same application, different conditions will warrant different
values for these parameters at different times.  Typically, the right
values to use are best decided after some experimentation; in general,
though, it's safe to assume that having only 1 sub-buffer is a bad
idea - you're guaranteed to either overwrite data or lose events
depending on the channel mode being used.

The create_buf_file() implementation can also be defined in such a way
as to allow the creation of a single 'global' buffer instead of the
default per-cpu set.  This can be useful for applications interested
mainly in seeing the relative ordering of system-wide events without
the need to bother with saving explicit timestamps for the purpose of
merging/sorting per-cpu files in a postprocessing step.

To have relay_open() create a global buffer, the create_buf_file()
implementation should set the value of the is_global outparam to a
non-zero value in addition to creating the file that will be used to
represent the single buffer.  In the case of a global buffer,
create_buf_file() and remove_buf_file() will be called only once.  The
normal channel-writing functions, e.g. relay_write(), can still be
used - writes from any cpu will transparently end up in the global
buffer - but since it is a global buffer, callers should make sure
they use the proper locking for such a buffer, either by wrapping
writes in a spinlock, or by copying a write function from relay.h and
creating a local version that internally does the proper locking.

The private_data passed into relay_open() allows clients to associate
user-defined data with a channel, and is immediately available
(including in create_buf_file()) via chan->private_data or
buf->chan->private_data.

Buffer-only channels
--------------------

These channels have no files associated and can be created with
relay_open(NULL, NULL, ...).  Such channels are useful in scenarios such
as when doing early tracing in the kernel, before the VFS is up.
In these
cases, one may open a buffer-only channel and then call
relay_late_setup_files() when the kernel is ready to handle files,
to expose the buffered data to userspace.

Channel 'modes'
---------------

Relay channels can be used in either of two modes - 'overwrite' or
'no-overwrite'.  The mode is entirely determined by the implementation
of the subbuf_start() callback, as described below.  The default if no
subbuf_start() callback is defined is 'no-overwrite' mode.  If the
default mode suits your needs, and you plan to use the read()
interface to retrieve channel data, you can ignore the details of this
section, as it pertains mainly to mmap() implementations.

In 'overwrite' mode, also known as 'flight recorder' mode, writes
continuously cycle around the buffer and will never fail, but will
unconditionally overwrite old data regardless of whether it's actually
been consumed.  In no-overwrite mode, writes will fail, i.e. data will
be lost, if the number of unconsumed sub-buffers equals the total
number of sub-buffers in the channel.  It should be clear that if
there is no consumer or if the consumer can't consume sub-buffers fast
enough, data will be lost in either case; the only difference is
whether data is lost from the beginning or the end of a buffer.
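
The difference between the two modes can be seen in a small user-space
model. This is a deliberately simplified sketch, not the kernel
implementation: ``struct model_ring``, ``model_produce()`` and
``model_oldest()`` are hypothetical names, and each sub-buffer is
reduced to a single sequence number so that it is easy to see which
data survives when nothing is ever consumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define N_SUBBUFS 3  /* hypothetical number of sub-buffers */

/* Toy ring of sub-buffers; each slot holds only a sequence number. */
struct model_ring {
    int    slot[N_SUBBUFS];
    size_t head;  /* total sub-buffers accepted so far */
    size_t tail;  /* total sub-buffers consumed or overwritten so far */
};

/* Full means every slot holds data the reader has not seen yet. */
static bool model_full(const struct model_ring *r)
{
    return r->head - r->tail >= N_SUBBUFS;
}

/* Offer sub-buffer number seq to the ring; returns false if dropped. */
static bool model_produce(struct model_ring *r, int seq, bool overwrite)
{
    if (model_full(r)) {
        if (!overwrite)
            return false;   /* no-overwrite: the newest data is lost */
        r->tail++;          /* overwrite: the oldest unread data is lost */
    }
    r->slot[r->head % N_SUBBUFS] = seq;
    r->head++;
    return true;
}

/* Sequence number of the oldest sub-buffer still available to a reader. */
static int model_oldest(const struct model_ring *r)
{
    assert(r->head > r->tail);
    return r->slot[r->tail % N_SUBBUFS];
}
```

Offering sub-buffers 1 to 5 to a three-slot ring with no consumer
leaves 1, 2 and 3 in no-overwrite mode (4 and 5 are dropped) and 3, 4
and 5 in overwrite mode (1 and 2 are overwritten).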

As explained above, a relay channel is made up of one or more
per-cpu channel buffers, each implemented as a circular buffer
subdivided into one or more sub-buffers.  Messages are written into
the current sub-buffer of the channel's current per-cpu buffer via the
write functions described below.  Whenever a message can't fit into
the current sub-buffer, because there's no room left for it, the
client is notified via the subbuf_start() callback that a switch to a
new sub-buffer is about to occur.  The client uses this callback to 1)
initialize the next sub-buffer if appropriate, 2) finalize the previous
sub-buffer if appropriate, and 3) return a boolean value indicating
whether or not to actually move on to the next sub-buffer.

To implement 'no-overwrite' mode, the kernel client would provide
an implementation of the subbuf_start() callback something like the
following::

    static int subbuf_start(struct rchan_buf *buf,
                            void *subbuf,
                            void *prev_subbuf,
                            size_t prev_padding)
    {
            if (prev_subbuf)
                    *((unsigned *)prev_subbuf) = prev_padding;

            if (relay_buf_full(buf))
                    return 0;

            subbuf_start_reserve(buf, sizeof(unsigned int));

            return 1;
    }

If the current buffer is full, i.e. all sub-buffers remain unconsumed,
the callback returns 0 to indicate that the buffer switch should not
occur yet, i.e. until the consumer has had a chance to read the
current set of ready sub-buffers.  For the relay_buf_full() function
to make sense, the consumer is responsible for notifying the relay
interface when sub-buffers have been consumed via
relay_subbufs_consumed().  Any subsequent attempts to write into the
buffer will again invoke the subbuf_start() callback with the same
parameters; only when the consumer has consumed one or more of the
ready sub-buffers will relay_buf_full() return 0, in which case the
buffer switch can continue.
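
On the consumer side, a finished sub-buffer laid out by a callback like
the one above starts with an ``unsigned int`` holding its padding
count. The following user-space sketch shows how a reader of an
mmap()ed buffer might locate the valid payload. ``subbuf_payload()`` is
a hypothetical helper, and the sketch assumes that the caller already
knows the sub-buffer size and only passes sub-buffers the kernel has
finished writing, since the header is filled in only at the switch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Find the valid payload of one finished sub-buffer whose first
 * sizeof(unsigned int) bytes hold the padding count.
 * On success, stores the payload start in *payload and returns the
 * payload length; returns -1 if the header does not fit subbuf_size.
 */
static long subbuf_payload(const void *subbuf, size_t subbuf_size,
                           const void **payload)
{
    unsigned int padding;

    assert(subbuf != NULL && payload != NULL);

    if (subbuf_size < sizeof(padding))
        return -1;
    /* memcpy avoids assuming anything about the buffer's alignment */
    memcpy(&padding, subbuf, sizeof(padding));
    if (padding > subbuf_size - sizeof(padding))
        return -1;          /* corrupt or not yet finalized header */

    *payload = (const char *)subbuf + sizeof(padding);
    return (long)(subbuf_size - sizeof(padding) - padding);
}
```

After copying the payload of one or more sub-buffers, the reader would
tell the kernel client how many it has finished with, so that the
client can call relay_subbufs_consumed(); how that count travels from
user space to the kernel is up to the application.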

The implementation of the subbuf_start() callback for 'overwrite' mode
would be very similar::

    static int subbuf_start(struct rchan_buf *buf,
                            void *subbuf,
                            void *prev_subbuf,
                            size_t prev_padding)
    {
            if (prev_subbuf)
                    *((unsigned *)prev_subbuf) = prev_padding;

            subbuf_start_reserve(buf, sizeof(unsigned int));

            return 1;
    }

In this case, the relay_buf_full() check is meaningless and the
callback always returns 1, causing the buffer switch to occur
unconditionally.  It's also meaningless for the client to use the
relay_subbufs_consumed() function in this mode, as it's never
consulted.

The default subbuf_start() implementation, used if the client doesn't
define any callbacks, or doesn't define the subbuf_start() callback,
implements the simplest possible 'no-overwrite' mode, i.e. it does
nothing but return 0.

Header information can be reserved at the beginning of each sub-buffer
by calling the subbuf_start_reserve() helper function from within the
subbuf_start() callback.  This reserved area can be used to store
whatever information the client wants.
In the example above, room is
reserved in each sub-buffer to store the padding count for that
sub-buffer.  This is filled in for the previous sub-buffer in the
subbuf_start() implementation; the padding value for the previous
sub-buffer is passed into the subbuf_start() callback along with a
pointer to the previous sub-buffer, since the padding value isn't
known until a sub-buffer is filled.  The subbuf_start() callback is
also called for the first sub-buffer when the channel is opened, to
give the client a chance to reserve space in it.  In this case the
previous sub-buffer pointer passed into the callback will be NULL, so
the client should check the value of the prev_subbuf pointer before
writing into the previous sub-buffer.

Writing to a channel
--------------------

Kernel clients write data into the current cpu's channel buffer using
relay_write() or __relay_write().  relay_write() is the main logging
function - it uses local_irq_save() to protect the buffer and should be
used if you might be logging from interrupt context.  If you know
you'll never be logging from interrupt context, you can use
__relay_write(), which only disables preemption.
These functions
don't return a value, so you can't determine whether or not they
failed - the assumption is that you wouldn't want to check a return
value in the fast logging path anyway, and that they'll always succeed
unless the buffer is full and no-overwrite mode is being used, in
which case you can detect a failed write in the subbuf_start()
callback by calling the relay_buf_full() helper function.

relay_reserve() is used to reserve a slot in a channel buffer which
can be written to later.  This would typically be used in applications
that need to write directly into a channel buffer without having to
stage data in a temporary buffer beforehand.  Because the actual write
may not happen immediately after the slot is reserved, applications
using relay_reserve() can keep a count of the number of bytes actually
written, either in space reserved in the sub-buffers themselves or as
a separate array.  See the 'reserve' example in the relay-apps tarball
at http://relayfs.sourceforge.net for an example of how this can be
done.  Because the write is under control of the client and is
separated from the reserve, relay_reserve() doesn't protect the buffer
at all - it's up to the client to provide the appropriate
synchronization when using relay_reserve().

Closing a channel
-----------------

The client calls relay_close() when it's finished using the channel.
The channel and its associated buffers are destroyed when there are no
longer any references to any of the channel buffers.  relay_flush()
forces a sub-buffer switch on all the channel buffers, and can be used
to finalize and process the last sub-buffers before the channel is
closed.

Misc
----

Some applications may want to keep a channel around and re-use it
rather than open and close a new channel for each use.  relay_reset()
can be used for this purpose - it resets a channel to its initial
state without reallocating channel buffer memory or destroying
existing mappings.  It should however only be called when it's safe to
do so, i.e. when the channel isn't currently being written to.

Finally, there are a couple of utility callbacks that can be used for
different purposes.  buf_mapped() is called whenever a channel buffer
is mmapped from user space and buf_unmapped() is called when it's
unmapped.  The client can use this notification to trigger actions
within the kernel application, such as enabling/disabling logging to
the channel.


Resources
=========

For news, example code, mailing list, etc. see the relay interface homepage:

    http://relayfs.sourceforge.net


Credits
=======

The ideas and specs for the relay interface came about as a result of
discussions on tracing involving the following:

Michel Dagenais         <michel.dagenais@polymtl.ca>
Richard Moore           <richardj_moore@uk.ibm.com>
Bob Wisniewski          <bob@watson.ibm.com>
Karim Yaghmour          <karim@opersys.com>
Tom Zanussi             <zanussi@us.ibm.com>

Also thanks to Hubertus Franke for a lot of useful suggestions and bug
reports.