===============
Persistent data
===============

Introduction
============

The more-sophisticated device-mapper targets require complex metadata
that is managed in the kernel.  In late 2010 we saw that various
targets were rolling their own data structures, for example:

- Mikulas Patocka's multisnap implementation
- Heinz Mauelshagen's thin provisioning target
- Another btree-based caching target posted to dm-devel
- Another multi-snapshot target based on a design of Daniel Phillips

Maintaining these data structures takes a lot of work, so if possible
we'd like to reduce the number.

The persistent-data library is an attempt to provide a re-usable
framework for people who want to store metadata in device-mapper
targets.  It's currently used by the thin-provisioning target and an
upcoming hierarchical storage target.

Overview
========

The main documentation is in the header files which can all be found
under drivers/md/persistent-data.

The block manager
-----------------

dm-block-manager.[hc]

This provides access to the data on disk in fixed-size blocks.  There
is a read/write locking interface to prevent concurrent accesses, and
to keep data that is being used in the cache.

Clients of persistent-data are unlikely to use this directly.
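
As a rough illustration of the locking idea only, the sketch below models
the lock state of a single cached block in userspace C.  Every name in it
(``struct toy_block_lock``, ``toy_read_trylock()`` and so on) is
hypothetical and is not part of dm-block-manager.h.  It is single-threaded
and uses try-lock semantics for simplicity, which may differ from how the
kernel code waits for a lock.

```c
#include <stdbool.h>

/* Hypothetical lock state for one cached block (not a kernel type). */
struct toy_block_lock {
    unsigned int readers;   /* number of read locks currently held */
    bool writer;            /* true while a write lock is held */
};

/* Many readers may share a block, but not while it is being written. */
static bool toy_read_trylock(struct toy_block_lock *l)
{
    if (l->writer)
        return false;
    l->readers++;
    return true;
}

/* A writer needs the block to itself. */
static bool toy_write_trylock(struct toy_block_lock *l)
{
    if (l->writer || l->readers)
        return false;
    l->writer = true;
    return true;
}

/* Drop one lock: the write lock if held, otherwise one read lock. */
static void toy_unlock(struct toy_block_lock *l)
{
    if (l->writer)
        l->writer = false;
    else if (l->readers)
        l->readers--;
}

/* A block that is still locked is in use and must stay in the cache. */
static bool toy_can_evict(const struct toy_block_lock *l)
{
    return !l->writer && !l->readers;
}
```

In this model a held lock doubles as a pin: ``toy_can_evict()`` only
returns true once every reader and writer has unlocked the block.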

The transaction manager
-----------------------

dm-transaction-manager.[hc]

This restricts access to blocks and enforces copy-on-write semantics.
The only way you can get hold of a writable block through the
transaction manager is by shadowing an existing block (i.e. doing
copy-on-write) or allocating a fresh one.  Shadowing is elided within
the same transaction so performance is reasonable.  The commit method
ensures that all data is flushed before it writes the superblock.
On power failure your metadata will be as it was when last committed.
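
The shadowing rule can be sketched with a small in-memory model.  This is
a toy written for this document, not the kernel implementation: the
``toy_tm`` structure, its fixed-size block array and all function names
are assumptions made for illustration.  Flushing, the superblock write
and releasing the reference on the original block are deliberately left
out.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_NR_BLOCKS  8
#define TOY_BLOCK_SIZE 16

/* Hypothetical in-memory stand-in for a transaction manager. */
struct toy_tm {
    uint8_t data[TOY_NR_BLOCKS][TOY_BLOCK_SIZE];
    bool in_use[TOY_NR_BLOCKS];   /* block has been allocated */
    bool fresh[TOY_NR_BLOCKS];    /* created in the open transaction */
};

/* Allocate a zeroed block; it is writable until the next commit. */
static int toy_tm_new_block(struct toy_tm *tm)
{
    for (int b = 0; b < TOY_NR_BLOCKS; b++) {
        if (!tm->in_use[b]) {
            tm->in_use[b] = true;
            tm->fresh[b] = true;
            memset(tm->data[b], 0, TOY_BLOCK_SIZE);
            return b;
        }
    }
    return -1;  /* out of space */
}

/*
 * Return a writable version of 'orig'.  Committed blocks are copied
 * (copy-on-write); blocks already created in this transaction are
 * returned as-is, which is the "elided" case.
 */
static int toy_tm_shadow_block(struct toy_tm *tm, int orig)
{
    int copy;

    if (tm->fresh[orig])
        return orig;

    copy = toy_tm_new_block(tm);
    if (copy < 0)
        return -1;
    memcpy(tm->data[copy], tm->data[orig], TOY_BLOCK_SIZE);
    return copy;
}

/* After a commit every block is read-only again until shadowed. */
static void toy_tm_commit(struct toy_tm *tm)
{
    memset(tm->fresh, 0, sizeof(tm->fresh));
}
```

Because the committed block is never modified in place, the old version
remains intact until a later commit makes the new one reachable, which is
what gives the "as it was when last committed" behaviour.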

The Space Maps
--------------

dm-space-map.h
dm-space-map-metadata.[hc]
dm-space-map-disk.[hc]

On-disk data structures that keep track of reference counts of blocks.
They also act as the allocator of new blocks.  There are currently two
implementations: a simpler one for managing blocks on a different
device (e.g. thinly-provisioned data blocks); and one for managing
the metadata space.  The latter is complicated by the need to store
its own data within the space it's managing.
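
The dual role of reference counter and allocator can be illustrated with
a minimal in-memory sketch.  The ``toy_sm`` type and its functions are
hypothetical names chosen for this example; the real interface is
described in dm-space-map.h and stores its counts on disk.  The toy also
hands a freed block straight back to the allocator, which an on-disk
implementation may not do.

```c
#include <stdint.h>

#define TOY_SM_NR_BLOCKS 4

/* Hypothetical space map: one reference count per block. */
struct toy_sm {
    uint32_t ref_count[TOY_SM_NR_BLOCKS];
};

/* Allocate the first block whose count is zero and give it count 1. */
static int toy_sm_new_block(struct toy_sm *sm, uint64_t *b)
{
    for (uint64_t i = 0; i < TOY_SM_NR_BLOCKS; i++) {
        if (sm->ref_count[i] == 0) {
            sm->ref_count[i] = 1;
            *b = i;
            return 0;
        }
    }
    return -1;  /* no free blocks */
}

/* Another structure now shares this block. */
static void toy_sm_inc_block(struct toy_sm *sm, uint64_t b)
{
    sm->ref_count[b]++;
}

/* Drop a reference; at zero the block becomes allocatable again. */
static void toy_sm_dec_block(struct toy_sm *sm, uint64_t b)
{
    if (sm->ref_count[b] > 0)
        sm->ref_count[b]--;
}

static uint32_t toy_sm_get_count(const struct toy_sm *sm, uint64_t b)
{
    return sm->ref_count[b];
}
```

A block with a count above one is shared, for example between a snapshot
and its origin, and only becomes free when the last reference is dropped.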

The data structures
-------------------

dm-btree.[hc]
dm-btree-remove.c
dm-btree-spine.c
dm-btree-internal.h

Currently there is only one data structure, a hierarchical btree.
There are plans to add more.  For example, something with an
array-like interface would see a lot of use.

The btree is 'hierarchical' in that you can define it to be composed
of nested btrees, and take multiple keys.  For example, the
thin-provisioning target uses a btree with two levels of nesting.
The first maps a device id to a mapping tree, and that in turn maps a
virtual block to a physical block.
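
The two-level lookup described above can be sketched as follows.  This is
a simplified userspace model, not the dm-btree code: the ``toy_*`` types
are hypothetical, and a linear scan over small arrays stands in for a
real btree traversal at each level.  The point is only that one 64-bit
key is consumed per level of nesting.

```c
#include <stddef.h>
#include <stdint.h>

/* Bottom level: virtual block -> physical block. */
struct toy_entry {
    uint64_t key;
    uint64_t value;
};

struct toy_tree {
    const struct toy_entry *entries;
    size_t nr;
};

/* Top level: device id -> mapping tree for that device. */
struct toy_dev {
    uint64_t dev_id;
    struct toy_tree mappings;
};

struct toy_top {
    const struct toy_dev *devs;
    size_t nr;
};

/*
 * keys[0] selects the device, keys[1] the virtual block.
 * Returns 0 and fills *result on success, -1 if either key is absent.
 */
static int toy_lookup(const struct toy_top *top, const uint64_t keys[2],
                      uint64_t *result)
{
    for (size_t d = 0; d < top->nr; d++) {
        const struct toy_tree *t;

        if (top->devs[d].dev_id != keys[0])
            continue;

        t = &top->devs[d].mappings;
        for (size_t i = 0; i < t->nr; i++) {
            if (t->entries[i].key == keys[1]) {
                *result = t->entries[i].value;
                return 0;
            }
        }
        return -1;  /* device exists, block is unmapped */
    }
    return -1;      /* no such device */
}
```

For instance, with a single device 7 whose tree maps virtual block 5 to
physical block 205, looking up the key pair {7, 5} yields 205, while
{7, 1} and {8, 5} both fail.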

Values stored in the btrees can have arbitrary size.  Keys are always
64 bits, although nesting allows you to use multiple keys.