=========
RPC Cache
=========

This document gives a brief introduction to the caching
mechanisms in the sunrpc layer that are used, in particular,
for NFS authentication.

Caches
======

The caching replaces the old exports table and allows for
a wide variety of values to be cached.

There are a number of caches that are similar in structure though
quite possibly very different in content and use.  There is a body
of common code for managing these caches.

Examples of caches that are likely to be needed are:

  - mapping from IP address to client name
  - mapping from client name and filesystem to export options
  - mapping from UID to list of GIDs, to work around NFS's limitation
    of 16 GIDs
  - mappings between local UID/GID and remote UID/GID for sites that
    do not have uniform UID assignment
  - mapping from network identity to public key for crypto authentication

The common code handles such things as:

   - general cache lookup with correct locking
   - supporting 'NEGATIVE' as well as positive entries
   - allowing an EXPIRED time on cache items, and removing
     items after they expire and are no longer in use
   - making requests to user-space to fill in cache entries
   - allowing user-space to directly set entries in the cache
   - delaying RPC requests that depend on as-yet incomplete
     cache entries, and replaying those requests when the cache entry
     is complete
   - cleaning out old entries as they expire

Creating a Cache
----------------

-  A cache needs a datum to store.  This is in the form of a
   structure definition that must contain a struct cache_head
   as an element, usually the first.
   It will also contain a key and some content.
   Each cache element is reference counted and contains
   expiry and update times for use in cache management.
-  A cache needs a "cache_detail" structure that
   describes the cache.  This stores the hash table, some
   parameters for cache management, and some operations detailing how
   to work with particular cache items.

   The operations are:

    struct cache_head \*alloc(void)
      This simply allocates appropriate memory and returns
      a pointer to the cache_head embedded within the
      structure.

    void cache_put(struct kref \*)
      This is called when the last reference to an item is
      dropped.  The pointer passed is to the 'ref' field
      in the cache_head.  cache_put should release any
      references created by 'init' and, if CACHE_VALID
      is set, any references created by 'update'.
      It should then release the memory allocated by
      'alloc'.

    int match(struct cache_head \*orig, struct cache_head \*new)
      Test whether the keys in the two structures match.  Return
      1 if they do, 0 if they don't.

    void init(struct cache_head \*orig, struct cache_head \*new)
      Set the 'key' fields in 'new' from 'orig'.  This may
      include taking references to shared objects.

    void update(struct cache_head \*orig, struct cache_head \*new)
      Set the 'content' fields in 'new' from 'orig'.

    int cache_show(struct seq_file \*m, struct cache_detail \*cd, struct cache_head \*h)
      Optional.  Used to provide a /proc file that lists the
      contents of a cache.  This should show one item,
      usually on just one line.

    int cache_request(struct cache_detail \*cd, struct cache_head \*h, char \*\*bpp, int \*blen)
      Format a request to be sent to user-space for an item
      to be instantiated.  \*bpp is a buffer of size \*blen.
      \*bpp should be moved forward over the encoded message,
      and \*blen should be reduced to show how much free
      space remains.  Return 0 on success or <0 if there is not
      enough room or some other problem occurs.

    int cache_parse(struct cache_detail \*cd, char \*buf, int len)
      A message from user space has arrived to fill out a
      cache entry.  It is in 'buf' of length 'len'.
      cache_parse should parse this, find the item in the
      cache with sunrpc_cache_lookup_rcu, and update the item
      with sunrpc_cache_update.

-  A cache needs to be registered using cache_register().  This
   includes it on a list of caches that will be regularly
   cleaned to discard old data.
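
The relationship between a cache item and its embedded cache_head can be
sketched in user space.  This is a stand-alone model, not kernel code:
``struct cache_head`` here is a simplified stand-in holding only a
reference count, and ``struct ip_map_item`` and the ``item_*`` functions
are hypothetical names chosen for illustration.  The hash table and the
cache_detail structure are not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for the kernel's struct cache_head (assumption). */
struct cache_head {
    int refcount;
};

/* Hypothetical cache item: embedded head first, then key, then content. */
struct ip_map_item {
    struct cache_head h;
    char addr[64];      /* key */
    char client[64];    /* content */
};

/* Recover the enclosing item from a pointer to its embedded head. */
static struct ip_map_item *item_of(struct cache_head *head)
{
    return (struct ip_map_item *)((char *)head -
                                  offsetof(struct ip_map_item, h));
}

/* alloc: allocate an item and return a pointer to the embedded head. */
struct cache_head *item_alloc(void)
{
    struct ip_map_item *it = calloc(1, sizeof(*it));

    if (!it)
        return NULL;
    it->h.refcount = 1;
    return &it->h;
}

/* match: return 1 if the keys are equal, 0 if they are not. */
int item_match(struct cache_head *orig, struct cache_head *newh)
{
    return strcmp(item_of(orig)->addr, item_of(newh)->addr) == 0;
}

/* init: set the key fields in 'newh' from 'orig'. */
void item_init(struct cache_head *orig, struct cache_head *newh)
{
    memcpy(item_of(newh)->addr, item_of(orig)->addr,
           sizeof(item_of(orig)->addr));
}

/* update: set the content fields in 'newh' from 'orig'. */
void item_update(struct cache_head *orig, struct cache_head *newh)
{
    memcpy(item_of(newh)->client, item_of(orig)->client,
           sizeof(item_of(orig)->client));
}

/* Drop one reference; free the item when the last one goes away. */
void item_put(struct cache_head *h)
{
    assert(h->refcount > 0);
    if (--h->refcount == 0)
        free(item_of(h));
}
```

Placing the head first keeps the pointer arithmetic trivial, but the
``offsetof`` form works wherever the head sits in the structure.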

Using a cache
-------------

To find a value in a cache, call sunrpc_cache_lookup_rcu passing a pointer
to the cache_head in a sample item with the 'key' fields filled in.
This will be passed to ->match to identify the target entry.  If no
entry is found, a new entry will be created, added to the cache, and
marked as not containing valid data.

The item returned is typically passed to cache_check which will check
if the data is valid, and may initiate an upcall to get fresh data.
cache_check will return -ENOENT if the entry is negative or if an
upcall is needed but not possible, -EAGAIN if an upcall is pending,
or 0 if the data is valid.

cache_check can be passed a "struct cache_req\*".  This structure is
typically embedded in the actual request and can be used to create a
deferred copy of the request (struct cache_deferred_req).  This is
done when the found cache item is not up to date, but there is reason to
believe that userspace might provide information soon.  When the cache
item does become valid, the deferred copy of the request will be
revisited (->revisit).  It is expected that this method will
reschedule the request for processing.

The value returned by sunrpc_cache_lookup_rcu can also be passed to
sunrpc_cache_update to set the content for the item.  A second item is
passed which should hold the content.  If the item found by _lookup
has valid data, then it is discarded and a new item is created.  This
saves any user of an item from worrying about content changing while
it is being inspected.  If the item found by _lookup does not contain
valid data, then the content is copied across and CACHE_VALID is set.
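
The lookup, check and update semantics described above can be modelled
with a toy user-space cache.  This is only a sketch: ``toy_lookup``,
``toy_check`` and ``toy_update`` are hypothetical names, not the sunrpc
functions, and the model makes simplifying assumptions.  It treats every
invalid entry as having an upcall pending, and it leaves a superseded
item on the list behind the newer one instead of discarding it.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Toy user-space model; none of these names are kernel APIs. */
struct toy_item {
    struct toy_item *next;
    int valid;          /* models CACHE_VALID */
    int negative;       /* models a NEGATIVE entry */
    char key[32];
    char content[32];
};

struct toy_cache {
    struct toy_item *head;
};

/* Find the newest item with this key, creating an invalid one if absent. */
struct toy_item *toy_lookup(struct toy_cache *c, const char *key)
{
    struct toy_item *it;

    for (it = c->head; it; it = it->next)
        if (strcmp(it->key, key) == 0)
            return it;

    it = calloc(1, sizeof(*it));
    if (!it)
        return NULL;
    snprintf(it->key, sizeof(it->key), "%s", key);
    it->next = c->head;
    c->head = it;
    return it;
}

/* Models the three documented cache_check results. */
int toy_check(const struct toy_item *it)
{
    if (!it->valid)
        return -EAGAIN;     /* assumption: an upcall is pending */
    return it->negative ? -ENOENT : 0;
}

/* Fill an invalid item in place; supersede a valid one with a new item. */
struct toy_item *toy_update(struct toy_cache *c, struct toy_item *old,
                            const char *content)
{
    struct toy_item *it = old;

    assert(old != NULL);
    if (old->valid) {
        it = calloc(1, sizeof(*it));
        if (!it)
            return NULL;
        memcpy(it->key, old->key, sizeof(it->key));
        it->next = c->head;     /* newer items are found first */
        c->head = it;
    }
    snprintf(it->content, sizeof(it->content), "%s", content);
    it->valid = 1;
    return it;
}
```

Because a valid item is never modified in place, a caller still holding
the old item keeps seeing the content it originally inspected.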

Populating a cache
------------------

Each cache has a name, and when the cache is registered, a directory
with that name is created in /proc/net/rpc.

This directory contains a file called 'channel' which is a channel
for communicating between kernel and user for populating the cache.
This directory may later contain other files for interacting
with the cache.

The 'channel' works a bit like a datagram socket. Each 'write' is
passed as a whole to the cache for parsing and interpretation.
Each cache can treat the write requests differently, but it is
expected that a message written will contain:

  - a key
  - an expiry time
  - some content

with the intention that an item in the cache with the given key
should be created or updated to have the given content, and the
expiry time should be set on that item.
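
As a rough illustration of such a message, the sketch below formats one
record in user space.  The function name ``build_update``, the field
order (key, expiry, content) and the use of a single field for each part
are assumptions made for this example; each cache defines its own
format.  Fields containing spaces or newlines are rejected here because
they would need quoting.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format "key expiry content\n" into buf.
 * Returns the record length, or -1 if a field would need quoting
 * or the record does not fit in the buffer.
 */
int build_update(char *buf, size_t size, const char *key,
                 long expiry, const char *content)
{
    int n;

    assert(buf != NULL && key != NULL && content != NULL);
    if (strpbrk(key, " \n") || strpbrk(content, " \n"))
        return -1;
    n = snprintf(buf, size, "%s %ld %s\n", key, expiry, content);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

The whole record would then be handed to the channel in a single write,
since each write is parsed as one message.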

Reading from a channel is a bit more interesting.  When a cache
lookup fails, or when it succeeds but finds an entry that may soon
expire, a request is lodged for that cache item to be updated by
user-space.  These requests appear in the channel file.

Successive reads will return successive requests.
If there are no more requests to return, read will return EOF, but a
select or poll for read will block waiting for another request to be
added.

Thus a user-space helper is likely to::

  open the channel.
    select for readable
    read a request
    write a response
  loop.

If it dies and needs to be restarted, any requests that have not been
answered will still appear in the file and will be read by the new
instance of the helper.
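
The helper loop outlined above might look roughly like this in C.  It is
a minimal sketch under stated assumptions: ``serve_one``, ``run_helper``
and the ``answer_fn`` callback are hypothetical names, and
``demo_answer`` produces a made-up reply (the request line followed by a
placeholder expiry and content) that does not correspond to any real
cache's format.

```c
#include <assert.h>
#include <fcntl.h>
#include <poll.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical callback: build one response record for one request.
 * Returns the response length, or -1 on failure. */
typedef int (*answer_fn)(const char *req, char *resp, size_t resp_size);

/* Placeholder answer: echo the request line, then an expiry and
 * content that are invented for this sketch. */
int demo_answer(const char *req, char *resp, size_t resp_size)
{
    int keylen = (int)strcspn(req, "\n");
    int n = snprintf(resp, resp_size, "%.*s 2147483647 demo\n", keylen, req);

    return (n < 0 || (size_t)n >= resp_size) ? -1 : n;
}

/* Wait for one request on fd, read it, and send the reply as one write.
 * Returns 1 if a request was answered, 0 if none was available,
 * -1 on error. */
int serve_one(int fd, answer_fn answer, int timeout_ms)
{
    char req[4096], resp[4096];
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    ssize_t got;
    int len, ready;

    assert(answer != NULL);
    ready = poll(&pfd, 1, timeout_ms);      /* select for readable */
    if (ready <= 0)
        return ready;                       /* 0: timeout, -1: error */
    got = read(fd, req, sizeof(req) - 1);   /* read a request */
    if (got <= 0)
        return (int)got;                    /* 0: EOF, nothing queued */
    req[got] = '\0';
    len = answer(req, resp, sizeof(resp));
    if (len < 0)
        return -1;
    /* write a response */
    return write(fd, resp, (size_t)len) == len ? 1 : -1;
}

/* Open a channel file and serve requests until an error occurs. */
int run_helper(const char *channel_path, answer_fn answer)
{
    int fd = open(channel_path, O_RDWR);
    int rc;

    if (fd < 0)
        return -1;
    while ((rc = serve_one(fd, answer, -1)) >= 0)
        ;
    close(fd);
    return rc;
}
```

A helper for a given cache would pass the path of that cache's 'channel'
file under /proc/net/rpc to ``run_helper`` together with a callback that
understands the cache's request format.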

Each cache should define a "cache_parse" method which takes a message
written from user-space and processes it.  It should return an error
(which propagates back to the write syscall) or 0.

Each cache should also define a "cache_request" method which
takes a cache item and encodes a request into the buffer
provided.

.. note::
  If a cache has no active readers on the channel, and has had no
  active readers for more than 60 seconds, further requests will not be
  added to the channel but instead all lookups that do not find a valid
  entry will fail.  This is partly for backward compatibility: The
  previous nfs exports table was deemed to be authoritative and a
  failed lookup meant a definite 'no'.

Request/response format
-----------------------

While each cache is free to use its own format for requests
and responses over the channel, the following is recommended as
appropriate and support routines are available to help:

Each request or response record should be printable ASCII
with precisely one newline character, which should be at the end.
Fields within the record should be separated by spaces, normally one.
If spaces, newlines, or nul characters are needed in a field they
must be quoted.  Two mechanisms are available:

-  If a field begins '\x' then it must contain an even number of
   hex digits, and pairs of these digits provide the bytes in the
   field.
-  Otherwise a \ in the field must be followed by 3 octal digits
   which give the code for a byte.  Other characters are treated
   as themselves.  At the very least, space, newline, nul, and
   '\' must be quoted in this way.
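
The two quoting mechanisms can be illustrated with a small user-space
encoder and decoder.  This is a sketch written for this document, not the
kernel's support routines: ``field_encode`` and ``field_decode`` are
hypothetical names.  The encoder always uses the octal form and, as an
extra assumption, also escapes any byte outside printable ASCII.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Encode len bytes of src as one field using \ooo escapes.
 * Returns the encoded length, or -1 if dst is too small. */
int field_encode(char *dst, size_t dst_size,
                 const unsigned char *src, size_t len)
{
    size_t out = 0, i;

    assert(dst != NULL && src != NULL);
    if (dst_size == 0)
        return -1;
    for (i = 0; i < len; i++) {
        unsigned char c = src[i];
        int plain = c > ' ' && c < 127 && c != '\\';
        size_t need = plain ? 1 : 4;

        if (out + need + 1 > dst_size)      /* keep room for the NUL */
            return -1;
        if (plain) {
            dst[out++] = (char)c;
        } else {
            dst[out++] = '\\';
            dst[out++] = (char)('0' + ((c >> 6) & 3));
            dst[out++] = (char)('0' + ((c >> 3) & 7));
            dst[out++] = (char)('0' + (c & 7));
        }
    }
    dst[out] = '\0';
    return (int)out;
}

static int hexval(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Decode one field (already split on spaces) into dst.
 * Returns the decoded length, or -1 on malformed input or overflow. */
int field_decode(unsigned char *dst, size_t dst_size, const char *src)
{
    size_t out = 0;
    int i;

    assert(dst != NULL && src != NULL);
    if (src[0] == '\\' && src[1] == 'x') {
        /* Hex form: an even number of hex digits after the prefix. */
        for (src += 2; *src; src += 2) {
            int hi = hexval(src[0]);
            int lo = hi < 0 ? -1 : hexval(src[1]);

            if (lo < 0 || out >= dst_size)
                return -1;
            dst[out++] = (unsigned char)(hi * 16 + lo);
        }
        return (int)out;
    }
    while (*src) {
        if (out >= dst_size)
            return -1;
        if (*src != '\\') {
            dst[out++] = (unsigned char)*src++;
            continue;
        }
        /* Octal form: backslash followed by exactly three digits. */
        for (i = 1; i <= 3; i++)
            if (src[i] < '0' || src[i] > '7')
                return -1;
        if (src[1] > '3')                   /* value must fit in a byte */
            return -1;
        dst[out++] = (unsigned char)((src[1] - '0') * 64 +
                                     (src[2] - '0') * 8 + (src[3] - '0'));
        src += 4;
    }
    return (int)out;
}
```

For instance, the six bytes ``a``, space, ``b``, backslash, ``c``,
newline encode to a 15-character field in which the space, backslash and
newline each become a backslash followed by three octal digits.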