=========
RPC Cache
=========

This document gives a brief introduction to the caching
mechanisms in the sunrpc layer that is used, in particular,
for NFS authentication.

Caches
======

The caching replaces the old exports table and allows for
a wide variety of values to be cached.

There are a number of caches that are similar in structure though
quite possibly very different in content and use.  There is a corpus
of common code for managing these caches.

Examples of caches that are likely to be needed are:

  - mapping from IP address to client name
  - mapping from client name and filesystem to export options
  - mapping from UID to list of GIDs, to work around NFS's limitation
    of 16 GIDs.
  - mappings between local UID/GID and remote UID/GID for sites that
    do not have uniform UID assignment
  - mapping from network identity to public key for crypto authentication.

The common code handles such things as:

  - general cache lookup with correct locking
  - supporting 'NEGATIVE' as well as positive entries
  - allowing an EXPIRED time on cache items, and removing
    items after they expire and are no longer in use.
  - making requests to user-space to fill in cache entries
  - allowing user-space to directly set entries in the cache
  - delaying RPC requests that depend on as-yet incomplete
    cache entries, and replaying those requests when the cache entry
    is complete.
  - cleaning out old entries as they expire.

Creating a Cache
----------------

- A cache needs a datum to store.  This is in the form of a
  structure definition that must contain a struct cache_head
  as an element, usually the first.
  It will also contain a key and some content.
  Each cache element is reference counted and contains
  expiry and update times for use in cache management.
- A cache needs a "cache_detail" structure that
  describes the cache.  This stores the hash table, some
  parameters for cache management, and some operations detailing how
  to work with particular cache items.

  The operations are:

    struct cache_head \*alloc(void)
      This simply allocates appropriate memory and returns
      a pointer to the cache_head embedded within the
      structure.

    void cache_put(struct kref \*)
      This is called when the last reference to an item is
      dropped.  The pointer passed is to the 'ref' field
      in the cache_head.
      cache_put should release any
      references created by 'init' and, if CACHE_VALID
      is set, any references created by 'update'.
      It should then release the memory allocated by
      'alloc'.

    int match(struct cache_head \*orig, struct cache_head \*new)
      Test if the keys in the two structures match.  Return
      1 if they do, 0 if they don't.

    void init(struct cache_head \*orig, struct cache_head \*new)
      Set the 'key' fields in 'new' from 'orig'.  This may
      include taking references to shared objects.

    void update(struct cache_head \*orig, struct cache_head \*new)
      Set the 'content' fields in 'new' from 'orig'.

    int cache_show(struct seq_file \*m, struct cache_detail \*cd, struct cache_head \*h)
      Optional.  Used to provide a /proc file that lists the
      contents of a cache.  This should show one item,
      usually on just one line.

    int cache_request(struct cache_detail \*cd, struct cache_head \*h, char \*\*bpp, int \*blen)
      Format a request to be sent to user-space for an item
      to be instantiated.  \*bpp is a buffer of size \*blen.
      \*bpp should be moved forward over the encoded message,
      and \*blen should be reduced to show how much free
      space remains.  Return 0 on success or <0 if not
      enough room or other problem.


    int cache_parse(struct cache_detail \*cd, char \*buf, int len)
      A message from user space has arrived to fill out a
      cache entry.  It is in 'buf' of length 'len'.
      cache_parse should parse this, find the item in the
      cache with sunrpc_cache_lookup_rcu, and update the item
      with sunrpc_cache_update.


- A cache needs to be registered using cache_register().  This
  includes it on a list of caches that will be regularly
  cleaned to discard old data.

Using a cache
-------------

To find a value in a cache, call sunrpc_cache_lookup_rcu passing a pointer
to the cache_head in a sample item with the 'key' fields filled in.
This will be passed to ->match to identify the target entry.  If no
entry is found, a new entry will be created, added to the cache, and
marked as not containing valid data.

The item returned is typically passed to cache_check which will check
if the data is valid, and may initiate an upcall to get fresh data.
cache_check will return -ENOENT if the entry is negative or if an
upcall is needed but not possible, -EAGAIN if an upcall is pending,
or 0 if the data is valid.

cache_check can be passed a "struct cache_req\*".
This structure is
typically embedded in the actual request and can be used to create a
deferred copy of the request (struct cache_deferred_req).  This is
done when the found cache item is not up to date, but there is reason to
believe that userspace might provide information soon.  When the cache
item does become valid, the deferred copy of the request will be
revisited (->revisit).  It is expected that this method will
reschedule the request for processing.

The value returned by sunrpc_cache_lookup_rcu can also be passed to
sunrpc_cache_update to set the content for the item.  A second item is
passed which should hold the content.  If the item found by _lookup
has valid data, then it is discarded and a new item is created.  This
saves any user of an item from worrying about content changing while
it is being inspected.  If the item found by _lookup does not contain
valid data, then the content is copied across and CACHE_VALID is set.

Populating a cache
------------------

Each cache has a name, and when the cache is registered, a directory
with that name is created in /proc/net/rpc.

This directory contains a file called 'channel' which is a channel
for communicating between kernel and user for populating the cache.
This directory may later contain other files for interacting
with the cache.

The 'channel' works a bit like a datagram socket.  Each 'write' is
passed as a whole to the cache for parsing and interpretation.
Each cache can treat the write requests differently, but it is
expected that a message written will contain:

  - a key
  - an expiry time
  - some content

with the intention that an item in the cache with the given key
should be created or updated to have the given content, and the
expiry time should be set on that item.

Reading from a channel is a bit more interesting.  When a cache
lookup fails, or when it succeeds but finds an entry that may soon
expire, a request is lodged for that cache item to be updated by
user-space.  These requests appear in the channel file.

Successive reads will return successive requests.
If there are no more requests to return, read will return EOF, but a
select or poll for read will block waiting for another request to be
added.

Thus a user-space helper is likely to::

        open the channel.
        select for readable
        read a request
        write a response
        loop.

If the helper dies and needs to be restarted, any requests that have not
been answered will still appear in the file and will be read by the new
instance of the helper.

Each cache should define a "cache_parse" method which takes a message
written from user-space and processes it.  It should return an error
(which propagates back to the write syscall) or 0.

Each cache should also define a "cache_request" method which
takes a cache item and encodes a request into the buffer
provided.

.. note::
  If a cache has no active readers on the channel, and has had no
  active readers for more than 60 seconds, further requests will not be
  added to the channel but instead all lookups that do not find a valid
  entry will fail.  This is partly for backward compatibility: The
  previous nfs exports table was deemed to be authoritative and a
  failed lookup meant a definite 'no'.

request/response format
-----------------------

While each cache is free to use its own format for requests
and responses over the channel, the following is recommended as
appropriate and support routines are available to help:
Each request or response record should be printable ASCII
with precisely one newline character which should be at the end.
Fields within the record should be separated by spaces, normally one.
If spaces, newlines, or nul characters are needed in a field they
must be quoted.  Two mechanisms are available:

- If a field begins '\x' then it must contain an even number of
  hex digits, and pairs of these digits provide the bytes in the
  field.
- Otherwise a \ in the field must be followed by 3 octal digits
  which give the code for a byte.  Other characters are treated
  as themselves.  At the very least, space, newline, nul, and
  '\' must be quoted in this way.
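
The two quoting mechanisms can be illustrated with a small user-space
sketch.  The function names below (field_quote_octal, field_quote_hex,
field_unquote) are hypothetical and are not the kernel's support routines;
the sketch also assumes that a malformed escape is rejected outright and
that the first octal digit is limited to 0-3 so the value fits in a byte.

```c
#include <stddef.h>

/* Quote 'len' bytes with the octal mechanism: space, newline, nul and
 * backslash become \ooo.  Returns the encoded length (without the
 * terminating NUL), or -1 if 'dst' is too small. */
int field_quote_octal(char *dst, size_t dstlen,
		      const unsigned char *src, size_t len)
{
	size_t out = 0;

	if (dstlen == 0)
		return -1;
	for (size_t i = 0; i < len; i++) {
		unsigned char c = src[i];

		if (c == ' ' || c == '\n' || c == '\0' || c == '\\') {
			if (out + 4 >= dstlen)
				return -1;
			dst[out++] = '\\';
			dst[out++] = (char)('0' + (c >> 6));
			dst[out++] = (char)('0' + ((c >> 3) & 7));
			dst[out++] = (char)('0' + (c & 7));
		} else {
			if (out + 1 >= dstlen)
				return -1;
			dst[out++] = (char)c;
		}
	}
	dst[out] = '\0';
	return (int)out;
}

/* Quote 'len' bytes with the hex mechanism: "\x" then two digits per byte. */
int field_quote_hex(char *dst, size_t dstlen,
		    const unsigned char *src, size_t len)
{
	static const char digits[] = "0123456789abcdef";

	if (dstlen < 2 + 2 * len + 1)
		return -1;
	dst[0] = '\\';
	dst[1] = 'x';
	for (size_t i = 0; i < len; i++) {
		dst[2 + 2 * i] = digits[src[i] >> 4];
		dst[3 + 2 * i] = digits[src[i] & 15];
	}
	dst[2 + 2 * len] = '\0';
	return (int)(2 + 2 * len);
}

static int hexval(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/* Decode one NUL-terminated field into raw bytes.  Returns the number of
 * bytes produced, or -1 on malformed input or if 'dst' is too small. */
int field_unquote(unsigned char *dst, size_t dstlen, const char *field)
{
	size_t out = 0;
	const char *p = field;

	if (p[0] == '\\' && p[1] == 'x') {
		for (p += 2; p[0] != '\0'; p += 2) {
			int hi = hexval(p[0]);
			int lo = hexval(p[1]);	/* -1 for an odd digit count */

			if (hi < 0 || lo < 0 || out >= dstlen)
				return -1;
			dst[out++] = (unsigned char)(hi * 16 + lo);
		}
		return (int)out;
	}
	while (*p != '\0') {
		unsigned int byte;

		if (*p == '\\') {
			/* exactly three octal digits must follow */
			if (p[1] < '0' || p[1] > '3' || p[2] < '0' ||
			    p[2] > '7' || p[3] < '0' || p[3] > '7')
				return -1;
			byte = (unsigned int)((p[1] - '0') * 64 +
					      (p[2] - '0') * 8 + (p[3] - '0'));
			p += 4;
		} else {
			byte = (unsigned char)*p++;
		}
		if (out >= dstlen)
			return -1;
		dst[out++] = (unsigned char)byte;
	}
	return (int)out;
}
```

Because the octal form always quotes the backslash itself, a field encoded
that way never starts with a literal '\x', so a decoder can tell the two
mechanisms apart from the first two characters.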