====================
Changes since 2.5.0:
====================

---

**recommended**

New helpers: sb_bread(), sb_getblk(), sb_find_get_block(), set_bh(),
sb_set_blocksize() and sb_min_blocksize().

Use them.

(sb_find_get_block() replaces 2.4's get_hash_table())

---

**recommended**

New methods: ->alloc_inode() and ->destroy_inode().

Remove inode->u.foo_inode_i

Declare::

    struct foo_inode_info {
        /* fs-private stuff */
        struct inode vfs_inode;
    };
    static inline struct foo_inode_info *FOO_I(struct inode *inode)
    {
        return list_entry(inode, struct foo_inode_info, vfs_inode);
    }

Use FOO_I(inode) instead of &inode->u.foo_inode_i;

Add foo_alloc_inode() and foo_destroy_inode() - the former should allocate
foo_inode_info and return the address of ->vfs_inode, the latter should free
FOO_I(inode) (see in-tree filesystems for examples).

Make them ->alloc_inode and ->destroy_inode in your super_operations.
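
The embedding pattern above can be modelled in userspace roughly as follows.
This is a sketch under stated assumptions, not kernel code: ``struct inode`` is
a one-field stand-in, ``container_of()`` is spelled out with ``offsetof()``,
malloc()/free() stand in for whatever allocator a real filesystem uses,
``foo_private`` is a hypothetical field, and the real ->alloc_inode() takes a
``struct super_block *`` argument that the model omits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for the VFS inode; the real struct inode is far larger. */
struct inode {
	unsigned long i_ino;
};

struct foo_inode_info {
	int foo_private;		/* hypothetical fs-private field */
	struct inode vfs_inode;		/* embedded VFS inode */
};

/* Userspace spelling of the kernel's container_of()/list_entry(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static inline struct foo_inode_info *FOO_I(struct inode *inode)
{
	return container_of(inode, struct foo_inode_info, vfs_inode);
}

/* Model of ->alloc_inode(): allocate the container, hand back the VFS part. */
static struct inode *foo_alloc_inode(void)
{
	struct foo_inode_info *fi = malloc(sizeof(*fi));

	if (!fi)
		return NULL;
	fi->foo_private = 0;	/* private data needs explicit initialization */
	fi->vfs_inode.i_ino = 0;
	return &fi->vfs_inode;
}

/* Model of ->destroy_inode(): free the whole container, not just the inode. */
static void foo_destroy_inode(struct inode *inode)
{
	assert(inode != NULL);
	free(FOO_I(inode));
}
```

The point of the layout is that the VFS only ever sees ``&fi->vfs_inode``,
while FOO_I() recovers the enclosing structure by pointer arithmetic, so no
per-filesystem union inside struct inode is needed.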

Keep in mind that now you need explicit initialization of private data
typically between calling iget_locked() and unlocking the inode.

At some point that will become mandatory.

---

**mandatory**

Change of file_system_type method (->read_super to ->get_sb)

->read_super() is no more. Ditto for DECLARE_FSTYPE and DECLARE_FSTYPE_DEV.

Turn your foo_read_super() into a function that would return 0 in case of
success and negative number in case of error (-EINVAL unless you have more
informative error value to report). Call it foo_fill_super(). Now declare::

    int foo_get_sb(struct file_system_type *fs_type,
        int flags, const char *dev_name, void *data, struct vfsmount *mnt)
    {
        return get_sb_bdev(fs_type, flags, dev_name, data, foo_fill_super,
                           mnt);
    }

(or similar with s/bdev/nodev/ or s/bdev/single/, depending on the kind of
filesystem).

Replace DECLARE_FSTYPE... with explicit initializer and have ->get_sb set as
foo_get_sb.

---

**mandatory**

Locking change: ->s_vfs_rename_sem is taken only by cross-directory renames.
Most likely there is no need to change anything, but if you relied on
global exclusion between renames for some internal purpose - you need to
change your internal locking. Otherwise exclusion guarantees remain the
same (i.e. parents and victim are locked, etc.).

---

**informational**

Now we have the exclusion between ->lookup() and directory removal (by
->rmdir() and ->rename()). If you used to need that exclusion and do
it by internal locking (most filesystems couldn't care less) - you
can relax your locking.

---

**mandatory**

->lookup(), ->truncate(), ->create(), ->unlink(), ->mknod(), ->mkdir(),
->rmdir(), ->link(), ->lseek(), ->symlink(), ->rename()
and ->readdir() are called without BKL now. Grab it on entry, drop upon
return - that will guarantee the same locking you used to have. If your method
or its parts do not need BKL - better yet, now you can shift lock_kernel() and
unlock_kernel() so that they would protect exactly what needs to be
protected.

---

**mandatory**

BKL is also moved from around sb operations. BKL should have been shifted into
individual fs sb_op functions. If you don't need it, remove it.

---

**informational**

check for ->link() target not being a directory is done by callers. Feel
free to drop it...

---

**informational**

->link() callers hold ->i_mutex on the object we are linking to. Some of your
problems might be over...

---

**mandatory**

new file_system_type method - kill_sb(superblock). If you are converting
an existing filesystem, set it according to ->fs_flags::

    FS_REQUIRES_DEV     -   kill_block_super
    FS_LITTER           -   kill_litter_super
    neither             -   kill_anon_super

FS_LITTER is gone - just remove it from fs_flags.

---

**mandatory**

FS_SINGLE is gone (actually, that had happened back when ->get_sb()
went in - and hadn't been documented ;-/). Just remove it from fs_flags
(and see ->get_sb() entry for other actions).

---

**mandatory**

->setattr() is called without BKL now.
Caller _always_ holds ->i_mutex, so
watch for ->i_mutex-grabbing code that might be used by your ->setattr().
Callers of notify_change() need ->i_mutex now.

---

**recommended**

New super_block field ``struct export_operations *s_export_op`` for
explicit support for exporting, e.g. via NFS. The structure is fully
documented at its declaration in include/linux/fs.h, and in
Documentation/filesystems/nfs/exporting.rst.

Briefly it allows for the definition of decode_fh and encode_fh operations
to encode and decode filehandles, and allows the filesystem to use
a standard helper function for decode_fh, and provide file-system specific
support for this helper, particularly get_parent.

It is planned that this will be required for exporting once the code
settles down a bit.

**mandatory**

s_export_op is now required for exporting a filesystem.
isofs, ext2, ext3, reiserfs, fat
can be used as examples of very different filesystems.

---

**mandatory**

iget4() and the read_inode2 callback have been superseded by iget5_locked()
which has the following prototype::

    struct inode *iget5_locked(struct super_block *sb, unsigned long ino,
                               int (*test)(struct inode *, void *),
                               int (*set)(struct inode *, void *),
                               void *data);

'test' is an additional function that can be used when the inode
number is not sufficient to identify the actual file object. 'set'
should be a non-blocking function that initializes those parts of a
newly created inode to allow the test function to succeed. 'data' is
passed as an opaque value to both test and set functions.

When the inode has been created by iget5_locked(), it will be returned with the
I_NEW flag set and will still be locked. The filesystem then needs to finalize
the initialization. Once the inode is initialized it must be unlocked by
calling unlock_new_inode().

The filesystem is responsible for setting (and possibly testing) i_ino
when appropriate. There is also a simpler iget_locked function that
just takes the superblock and inode number as arguments and does the
test and set for you.

e.g.::

    inode = iget_locked(sb, ino);
    if (!inode)    /* iget_locked() returns NULL on allocation failure */
        return -ENOMEM;
    if (inode->i_state & I_NEW) {
        err = read_inode_from_disk(inode);
        if (err < 0) {
            iget_failed(inode);
            return err;
        }
        unlock_new_inode(inode);
    }

Note that if the process of setting up a new inode fails, then iget_failed()
should be called on the inode to render it dead, and an appropriate error
should be passed back to the caller.

---

**recommended**

->getattr() finally getting used. See instances in nfs, minix, etc.

---

**mandatory**

->revalidate() is gone. If your filesystem had it - provide ->getattr()
and let it call whatever you had as ->revalidate() + (for symlinks that
had ->revalidate()) add calls in ->follow_link()/->readlink().

---

**mandatory**

->d_parent changes are not protected by BKL anymore. Read access is safe
if at least one of the following is true:

  * filesystem has no cross-directory rename()
  * we know that parent had been locked (e.g.
    we are looking at
    ->d_parent of ->lookup() argument).
  * we are called from ->rename().
  * the child's ->d_lock is held

Audit your code and add locking if needed. Notice that any place that is
not protected by the conditions above is risky even in the old tree - you
had been relying on BKL and that's prone to screwups. Old tree had quite
a few holes of that kind - unprotected access to ->d_parent leading to
anything from oops to silent memory corruption.

---

**mandatory**

FS_NOMOUNT is gone. If you use it - just set SB_NOUSER in flags
(see rootfs for one kind of solution and bdev/socket/pipe for another).

---

**recommended**

Use bdev_read_only(bdev) instead of is_read_only(kdev). The latter
is still alive, but only because of the mess in drivers/s390/block/dasd.c.
As soon as it gets fixed is_read_only() will die.

---

**mandatory**

->permission() is called without BKL now. Grab it on entry, drop upon
return - that will guarantee the same locking you used to have.
If your method or its parts do not need BKL - better yet, now you can
shift lock_kernel() and unlock_kernel() so that they would protect
exactly what needs to be protected.

---

**mandatory**

->statfs() is now called without BKL held. BKL should have been
shifted into individual fs sb_op functions where it's not clear that
it's safe to remove it. If you don't need it, remove it.

---

**mandatory**

is_read_only() is gone; use bdev_read_only() instead.

---

**mandatory**

destroy_buffers() is gone; use invalidate_bdev().

---

**mandatory**

fsync_dev() is gone; use fsync_bdev(). NOTE: lvm breakage is
deliberate; as soon as struct block_device * is propagated in a reasonable
way by that code fixing will become trivial; until then nothing can be
done.

**mandatory**

block truncation on error exit from ->write_begin, and ->direct_IO
moved from generic methods (block_write_begin, cont_write_begin,
nobh_write_begin, blockdev_direct_IO*) to callers.
Take a look at
ext2_write_failed and callers for an example.

**mandatory**

->truncate is gone. The whole truncate sequence needs to be
implemented in ->setattr, which is now mandatory for filesystems
implementing on-disk size changes. Start with a copy of the old inode_setattr
and vmtruncate, and then reorder the vmtruncate + foofs_vmtruncate sequence to
be in order of zeroing blocks using block_truncate_page or similar helpers,
size update and finally on-disk truncation which should not fail.
setattr_prepare (which used to be inode_change_ok) now includes the size checks
for ATTR_SIZE and must be called in the beginning of ->setattr unconditionally.

**mandatory**

->clear_inode() and ->delete_inode() are gone; ->evict_inode() should
be used instead. It gets called whenever the inode is evicted, whether it has
remaining links or not. Caller does *not* evict the pagecache or inode-associated
metadata buffers; the method has to use truncate_inode_pages_final() to get rid
of those. Caller makes sure async writeback cannot be running for the inode while
(or after) ->evict_inode() is called.

->drop_inode() returns int now; it's called on final iput() with
inode->i_lock held and it returns true if the filesystem wants the inode to be
dropped.
As before, generic_drop_inode() is still the default and it's been
updated appropriately. generic_delete_inode() is also alive and it consists
simply of return 1. Note that all actual eviction work is done by caller after
->drop_inode() returns.

As before, clear_inode() must be called exactly once on each call of
->evict_inode() (as it used to be for each call of ->delete_inode()). Unlike
before, if you are using inode-associated metadata buffers (i.e.
mark_buffer_dirty_inode()), it's your responsibility to call
invalidate_inode_buffers() before clear_inode().

NOTE: checking i_nlink in the beginning of ->write_inode() and bailing out
if it's zero is not *and* *never* *had* *been* enough. Final unlink() and iput()
may happen while the inode is in the middle of ->write_inode(); e.g. if you blindly
free the on-disk inode, you may end up doing that while ->write_inode() is writing
to it.

---

**mandatory**

.d_delete() now only advises the dcache as to whether or not to cache
unreferenced dentries, and is now only called when the dentry refcount goes
to 0. Even on 0 refcount transition, it must be able to tolerate being called
0, 1, or more times (e.g. constant, idempotent).

---

**mandatory**

.d_compare() calling convention and locking rules are significantly
changed. Read updated documentation in Documentation/filesystems/vfs.rst (and
look at examples of other filesystems) for guidance.

---

**mandatory**

.d_hash() calling convention and locking rules are significantly
changed. Read updated documentation in Documentation/filesystems/vfs.rst (and
look at examples of other filesystems) for guidance.

---

**mandatory**

dcache_lock is gone, replaced by fine grained locks. See fs/dcache.c
for details of what locks to replace dcache_lock with in order to protect
particular things. Most of the time, a filesystem only needs ->d_lock, which
protects *all* the dcache state of a given dentry.

---

**mandatory**

Filesystems must RCU-free their inodes, if they can have been accessed
via rcu-walk path walk (basically, if the file can have had a path name in the
vfs namespace).

Even though i_dentry and i_rcu share storage in a union, we will
initialize the former in inode_init_always(), so just leave it alone in
the callback. It used to be necessary to clean it there, but not anymore
(starting at 3.2).

---

**recommended**

vfs now tries to do path walking in "rcu-walk mode", which avoids
atomic operations and scalability hazards on dentries and inodes (see
Documentation/filesystems/path-lookup.txt). d_hash and d_compare changes
(above) are examples of the changes required to support this. For more complex
filesystem callbacks, the vfs drops out of rcu-walk mode before the fs call, so
no changes are required to the filesystem. However, this is costly and loses
the benefits of rcu-walk mode. We will begin to add filesystem callbacks that
are rcu-walk aware, shown below. Filesystems should take advantage of this
where possible.

---

**mandatory**

d_revalidate is a callback that is made on every path element (if
the filesystem provides it), which requires dropping out of rcu-walk mode. This
may now be called in rcu-walk mode (nd->flags & LOOKUP_RCU). -ECHILD should be
returned if the filesystem cannot handle rcu-walk. See
Documentation/filesystems/vfs.rst for more details.
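
The -ECHILD convention for d_revalidate can be sketched in userspace as
follows. This is only a model: the function takes the lookup flags directly
instead of the kernel's dentry and nameidata arguments, the ``LOOKUP_RCU``
value is defined locally for the model (kernel code must take it from
``<linux/namei.h>``), and ``foo_dentry_still_valid()`` is a hypothetical
helper standing in for a check that may block.

```c
#include <errno.h>

/* Assumed flag value for this model; kernel code uses <linux/namei.h>. */
#define LOOKUP_RCU 0x0040

/* Hypothetical stand-in for a validity check that may block. */
static int foo_dentry_still_valid(void)
{
	return 1;
}

/*
 * Model of a ->d_revalidate() that cannot work in rcu-walk mode:
 * bail out with -ECHILD so that the walk is retried in ref-walk mode.
 */
static int foo_d_revalidate(unsigned int flags)
{
	if (flags & LOOKUP_RCU)
		return -ECHILD;
	return foo_dentry_still_valid();	/* 1: valid, 0: invalid */
}
```

Returning -ECHILD is the cheap way to stay correct; a filesystem that can
answer without blocking may instead handle the rcu-walk case directly.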

permission is an inode permission check that is called on many or all
directory inodes on the way down a path walk (to check for exec permission). It
must now be rcu-walk aware (mask & MAY_NOT_BLOCK). See
Documentation/filesystems/vfs.rst for more details.

---

**mandatory**

In ->fallocate() you must check the mode option passed in. If your
filesystem does not support hole punching (deallocating space in the middle of a
file) you must return -EOPNOTSUPP if FALLOC_FL_PUNCH_HOLE is set in mode.
Currently you can only have FALLOC_FL_PUNCH_HOLE with FALLOC_FL_KEEP_SIZE set,
so the i_size should not change when hole punching, even when punching the end of
a file off.

---

**mandatory**

->get_sb() is gone. Switch to use of ->mount(). Typically it's just
a matter of switching from calling ``get_sb_``... to ``mount_``... and changing
the function type. If you were doing it manually, just switch from setting
->mnt_root to some pointer to returning that pointer. On errors return
ERR_PTR(...).
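
For readers unfamiliar with the ERR_PTR() convention that ->mount() relies on,
here is a userspace model of it. The three helpers mirror the kernel's
ERR_PTR()/PTR_ERR()/IS_ERR(); everything else is an assumption made for the
sketch: ``struct dentry`` is a stand-in, ``foo_root`` is a hypothetical root
dentry, and ``foo_mount()`` takes only a device name rather than the real
->mount() argument list.

```c
#include <errno.h>
#include <stddef.h>

#define MAX_ERRNO 4095

/* Userspace model of the kernel's ERR_PTR()/PTR_ERR()/IS_ERR() helpers. */
static inline void *ERR_PTR(long error)
{
	return (void *)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	/* The top MAX_ERRNO addresses are reserved for encoded errors. */
	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

struct dentry { int dummy; };	/* stand-in for the real struct dentry */

static struct dentry foo_root;	/* hypothetical root dentry of the model fs */

/* Model of ->mount(): return the root dentry, or an error encoded as a pointer. */
static struct dentry *foo_mount(const char *dev_name)
{
	if (!dev_name)
		return ERR_PTR(-EINVAL);
	return &foo_root;
}
```

The caller tests the result with IS_ERR() and extracts the errno with
PTR_ERR(), so a single return value carries either the root dentry or the
failure reason.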

---

**mandatory**

->permission() and generic_permission() have lost flags
argument; instead of passing IPERM_FLAG_RCU we add MAY_NOT_BLOCK into mask.

generic_permission() has also lost the check_acl argument; ACL checking
has been taken to VFS and filesystems need to provide a non-NULL ->i_op->get_acl
to read an ACL from disk.

---

**mandatory**

If you implement your own ->llseek() you must handle SEEK_HOLE and
SEEK_DATA. You can handle this by returning -EINVAL, but it would be nicer to
support it in some way. The generic handler assumes that the entire file is
data and there is a virtual hole at the end of the file. So if the provided
offset is less than i_size and SEEK_DATA is specified, return the same offset.
If the above is true for the offset and you are given SEEK_HOLE, return the end
of the file. If the offset is i_size or greater return -ENXIO in either case.

**mandatory**

If you have your own ->fsync() you must make sure to call
filemap_write_and_wait_range() so that all dirty pages are synced out properly.
You must also keep in mind that ->fsync() is not called with i_mutex held
anymore, so if you require i_mutex locking you must make sure to take it and
release it yourself.

---

**mandatory**

d_alloc_root() is gone, along with a lot of bugs caused by code
misusing it. Replacement: d_make_root(inode). On success d_make_root(inode)
allocates and returns a new dentry instantiated with the passed in inode.
On failure NULL is returned and the passed in inode is dropped so the reference
to inode is consumed in all cases and failure handling need not do any cleanup
for the inode. If d_make_root(inode) is passed a NULL inode it returns NULL
and also requires no further error handling. Typical usage is::

    inode = foofs_new_inode(....);
    s->s_root = d_make_root(inode);
    if (!s->s_root)
        /* Nothing needed for the inode cleanup */
        return -ENOMEM;
    ...

---

**mandatory**

The witch is dead! Well, 2/3 of it, anyway. ->d_revalidate() and
->lookup() do *not* take struct nameidata anymore; just the flags.

---

**mandatory**

->create() doesn't take ``struct nameidata *``; unlike the previous
two, it gets "is it an O_EXCL or equivalent?" boolean argument. Note that
local filesystems can ignore that argument - they are guaranteed that the
object doesn't exist. It's remote/distributed ones that might care...

---

**mandatory**

FS_REVAL_DOT is gone; if you used to have it, add ->d_weak_revalidate()
in your dentry operations instead.

---

**mandatory**

vfs_readdir() is gone; switch to iterate_dir() instead

---

**mandatory**

->readdir() is gone now; switch to ->iterate()

**mandatory**

vfs_follow_link has been removed. Filesystems must use nd_set_link
from ->follow_link for normal symlinks, or nd_jump_link for magic
/proc/<pid> style links.

---

**mandatory**

iget5_locked()/ilookup5()/ilookup5_nowait() test() callback used to be
called with both ->i_lock and inode_hash_lock held; the former is *not*
taken anymore, so verify that your callbacks do not rely on it (none
of the in-tree instances did). inode_hash_lock is still held,
of course, so they are still serialized wrt removal from inode hash,
as well as wrt set() callback of iget5_locked().

---

**mandatory**

d_materialise_unique() is gone; d_splice_alias() does everything you
need now. Remember that they have opposite orders of arguments ;-/

---

**mandatory**

f_dentry is gone; use f_path.dentry, or, better yet, see if you can avoid
it entirely.

---

**mandatory**

never call ->read() and ->write() directly; use __vfs_{read,write} or
wrappers; instead of checking for ->write or ->read being NULL, look for
FMODE_CAN_{WRITE,READ} in file->f_mode.

---

**mandatory**

do _not_ use new_sync_{read,write} for ->read/->write; leave it NULL
instead.

---

**mandatory**

->aio_read/->aio_write are gone. Use ->read_iter/->write_iter.

---

**recommended**

for embedded ("fast") symlinks just set inode->i_link to wherever the
symlink body is and use simple_follow_link() as ->follow_link().

---

**mandatory**

calling conventions for ->follow_link() have changed. Instead of returning
cookie and using nd_set_link() to store the body to traverse, we return
the body to traverse and store the cookie using explicit void ** argument.
nameidata isn't passed at all - nd_jump_link() doesn't need it and
nd_[gs]et_link() is gone.

---

**mandatory**

calling conventions for ->put_link() have changed. It gets inode instead of
dentry, it does not get nameidata at all and it gets called only when cookie
is non-NULL. Note that link body isn't available anymore, so if you need it,
store it as cookie.

---

**mandatory**

any symlink that might use page_follow_link_light/page_put_link() must
have inode_nohighmem(inode) called before anything might start playing with
its pagecache. No highmem pages should end up in the pagecache of such
symlinks. That includes any preseeding that might be done during symlink
creation. __page_symlink() will honour the mapping gfp flags, so once
you've done inode_nohighmem() it's safe to use, but if you allocate and
insert the page manually, make sure to use the right gfp flags.

---

**mandatory**

->follow_link() is replaced with ->get_link(); same API, except that

	* ->get_link() gets inode as a separate argument
	* ->get_link() may be called in RCU mode - in that case NULL
	  dentry is passed

---

**mandatory**

->get_link() gets struct delayed_call ``*done`` now, and should do
set_delayed_call() where it used to set ``*cookie``.

->put_link() is gone - just give the destructor to set_delayed_call()
in ->get_link().
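
The delayed_call convention above can be modelled in userspace roughly like
this. It is a sketch under stated assumptions: the ``struct delayed_call``
helpers are stand-ins written for this example, malloc()/free() take the
place of the kernel allocator, and ``foo_get_link()``, ``foo_free_link()``
and ``foo_links_freed`` are hypothetical. A real ->get_link() also takes
dentry and inode arguments and returns ERR_PTR() on failure.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Userspace stand-ins modelled on the kernel's delayed_call helpers. */
struct delayed_call {
	void (*fn)(void *);
	void *arg;
};

static void set_delayed_call(struct delayed_call *call,
			     void (*fn)(void *), void *arg)
{
	call->fn = fn;
	call->arg = arg;
}

static void do_delayed_call(struct delayed_call *call)
{
	if (call->fn)
		call->fn(call->arg);
}

static int foo_links_freed;	/* hypothetical counter, for the demo only */

/* Destructor: the work ->put_link() used to do. */
static void foo_free_link(void *body)
{
	free(body);
	foo_links_freed++;
}

/*
 * Hypothetical ->get_link() body: build the link text, then hand the
 * destructor to set_delayed_call() instead of storing a cookie.
 */
static const char *foo_get_link(const char *target, struct delayed_call *done)
{
	size_t len = strlen(target) + 1;
	char *body = malloc(len);

	if (!body)
		return NULL;	/* a kernel method would return ERR_PTR(-ENOMEM) */
	memcpy(body, target, len);
	set_delayed_call(done, foo_free_link, body);
	return body;
}

/* Caller side: use the body, then run the delayed call exactly once. */
int foo_get_link_demo(void)
{
	struct delayed_call done = { NULL, NULL };
	const char *body = foo_get_link("/some/target", &done);
	int ok;

	assert(body != NULL);
	ok = strcmp(body, "/some/target") == 0;
	do_delayed_call(&done);
	return ok && foo_links_freed == 1;
}
```

If nothing needs freeing, the method simply leaves ``*done`` alone and the
caller's do_delayed_call() is a no-op.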

---

**mandatory**

->getxattr() and xattr_handler.get() get dentry and inode passed separately.
dentry might be yet to be attached to inode, so do _not_ use its ->d_inode
in the instances. Rationale: !@#!@# security_d_instantiate() needs to be
called before we attach dentry to inode.

---

**mandatory**

symlinks are no longer the only inodes that do *not* have i_bdev/i_cdev/
i_pipe/i_link union zeroed out at inode eviction. As a result, you can't
assume that non-NULL value in ->i_link at ->destroy_inode() implies that
it's a symlink. Checking ->i_mode is really needed now. In-tree we had
to fix shmem_destroy_callback() that used to take that kind of shortcut;
watch out, since that shortcut is no longer valid.

---

**mandatory**

->i_mutex is replaced with ->i_rwsem now. inode_lock() et al. work as
they used to - they just take it exclusive. However, ->lookup() may be
called with parent locked shared. Its instances must not

	* use d_instantiate() and d_rehash() separately - use d_add() or
	  d_splice_alias() instead.
	* use d_rehash() alone - call d_add(new_dentry, NULL) instead.
	* in the unlikely case when (read-only) access to filesystem
	  data structures needs exclusion for some reason, arrange it
	  yourself. None of the in-tree filesystems needed that.
	* rely on ->d_parent and ->d_name not changing after dentry has
	  been fed to d_add() or d_splice_alias(). Again, none of the
	  in-tree instances relied upon that.

We are guaranteed that lookups of the same name in the same directory
will not happen in parallel ("same" in the sense of your ->d_compare()).
Lookups on different names in the same directory can and do happen in
parallel now.

---

**recommended**

->iterate_shared() is added; it's a parallel variant of ->iterate().
Exclusion on struct file level is still provided (as well as that
between it and lseek on the same struct file), but if your directory
has been opened several times, you can get these called in parallel.
Exclusion between that method and all directory-modifying ones is
still provided, of course.

Often enough ->iterate() can serve as ->iterate_shared() without any
changes - it is a read-only operation, after all. If you have any
per-inode or per-dentry in-core data structures modified by ->iterate(),
you might need something to serialize the access to them.
If you do dcache pre-seeding, you'll need to switch to d_alloc_parallel()
for that; look for in-tree examples.

Old method is only used if the new one is absent; eventually it will
be removed. Switch while you still can; the old one won't stay.

---

**mandatory**

->atomic_open() calls without O_CREAT may happen in parallel.

---

**mandatory**

->setxattr() and xattr_handler.set() get dentry and inode passed separately.
dentry might be yet to be attached to inode, so do _not_ use its ->d_inode
in the instances. Rationale: !@#!@# security_d_instantiate() needs to be
called before we attach dentry to inode and !@#!@##!@$!$#!@#$!@$!@$ smack
->d_instantiate() uses not just ->getxattr() but ->setxattr() as well.

---

**mandatory**

->d_compare() doesn't get parent as a separate argument anymore. If you
used it for finding the struct super_block involved, dentry->d_sb will
work just as well; if it's something more complicated, use dentry->d_parent.
Just be careful not to assume that fetching it more than once will yield
the same value - in RCU mode it could change under you.

---

**mandatory**

->rename() has an added flags argument. Any flags not handled by the
filesystem should result in EINVAL being returned.

---

**recommended**

->readlink is optional for symlinks. Don't set, unless filesystem needs
to fake something for readlink(2).

---

**mandatory**

->getattr() is now passed a struct path rather than a vfsmount and
dentry separately, and it now has request_mask and query_flags arguments
to specify the fields and sync type requested by statx. Filesystems not
supporting any statx-specific features may ignore the new arguments.

---

**mandatory**

->atomic_open() calling conventions have changed. Gone is ``int *opened``,
along with FILE_OPENED/FILE_CREATED. In place of those we have
FMODE_OPENED/FMODE_CREATED, set in file->f_mode. Additionally, return
value for 'called finish_no_open(), open it yourself' case has become
0, not 1. Since finish_no_open() itself is returning 0 now, that part
does not need any changes in ->atomic_open() instances.

---

**mandatory**

alloc_file() has become static now; two wrappers are to be used instead.
alloc_file_pseudo(inode, vfsmount, name, flags, ops) is for the cases
when dentry needs to be created; that's the majority of old alloc_file()
users. Calling conventions: on success a reference to new struct file
is returned and caller's reference to inode is subsumed by that. On
failure, ERR_PTR() is returned and no caller's references are affected,
so the caller needs to drop the inode reference it held.
alloc_file_clone(file, flags, ops) does not affect any caller's references.
On success you get a new struct file sharing the mount/dentry with the
original, on failure - ERR_PTR().

---

**mandatory**

->clone_file_range() and ->dedupe_file_range() have been replaced with
->remap_file_range(). See Documentation/filesystems/vfs.rst for more
information.

---

**recommended**

->lookup() instances doing an equivalent of::

	if (IS_ERR(inode))
		return ERR_CAST(inode);
	return d_splice_alias(inode, dentry);

don't need to bother with the check - d_splice_alias() will do the
right thing when given ERR_PTR(...) as inode. Moreover, passing NULL
inode to d_splice_alias() will also do the right thing (equivalent of
d_add(dentry, NULL); return NULL;), so that kind of special case
also doesn't need a separate treatment.

---

**strongly recommended**

take the RCU-delayed parts of ->destroy_inode() into a new method -
->free_inode(). If ->destroy_inode() becomes empty - all the better,
just get rid of it. Synchronous work (e.g. the stuff that can't
be done from an RCU callback, or any WARN_ON() where we want the
stack trace) *might* be movable to ->evict_inode(); however,
that goes only for the things that are not needed to balance something
done by ->alloc_inode(). IOW, if it's cleaning up the stuff that
might have accumulated over the life of in-core inode, ->evict_inode()
might be a fit.

Rules for inode destruction:

	* if ->destroy_inode() is non-NULL, it gets called
	* if ->free_inode() is non-NULL, it gets scheduled by call_rcu()
	* combination of NULL ->destroy_inode and NULL ->free_inode is
	  treated as NULL/free_inode_nonrcu, to preserve the compatibility.

Note that the callback (be it via ->free_inode() or explicit call_rcu()
in ->destroy_inode()) is *NOT* ordered wrt superblock destruction;
as a matter of fact, the superblock and all associated structures
might be already gone. The filesystem driver is guaranteed to be still
there, but that's it. Freeing memory in the callback is fine; doing
more than that is possible, but requires a lot of care and is best
avoided.

---

**mandatory**

DCACHE_RCUACCESS is gone; having an RCU delay on dentry freeing is the
default. DCACHE_NORCU opts out, and only d_alloc_pseudo() has any
business doing so.

---

**mandatory**

d_alloc_pseudo() is internal-only; uses outside of alloc_file_pseudo() are
very suspect (and won't work in modules). Such uses are very likely to
be misspelled d_alloc_anon().

---

**mandatory**

[should've been added in 2016] stale comment in finish_open() notwithstanding,
failure exits in ->atomic_open() instances should *NOT* fput() the file,
no matter what. Everything is handled by the caller.

---

**mandatory**

clone_private_mount() returns a longterm mount now, so the proper destructor of
its result is kern_unmount() or kern_unmount_array().

---

**mandatory**

If ->rename() update of .. on cross-directory move needs an exclusion with
directory modifications, do *not* lock the subdirectory in question in your
->rename() - it's done by the caller now [that item should've been added in
28eceeda130f "fs: Lock moved directories"].

---

**mandatory**

On same-directory ->rename() the (tautological) update of .. is not protected
by any locks; just don't do it if the old parent is the same as the new one.
We really can't lock two subdirectories in same-directory rename - not without
deadlocks.
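
The same-directory rule above amounts to one comparison before touching "..".
A minimal userspace sketch, assuming a made-up in-core directory type:
``struct foo_dir``, ``foo_set_dotdot()`` and ``foo_rename_dir()`` are
hypothetical and stand in for whatever a filesystem's ->rename() does to
rewrite the ".." entry.

```c
#include <assert.h>

/* Hypothetical in-core directory: just enough state for the sketch. */
struct foo_dir {
	struct foo_dir *parent;	/* what ".." points at */
	int dotdot_writes;	/* counts updates of "..", demo only */
};

static void foo_set_dotdot(struct foo_dir *child, struct foo_dir *new_parent)
{
	child->parent = new_parent;
	child->dotdot_writes++;
}

/* Directory part of a hypothetical ->rename(): touch ".." only on a real move. */
static void foo_rename_dir(struct foo_dir *old_parent,
			   struct foo_dir *new_parent,
			   struct foo_dir *moved)
{
	if (old_parent != new_parent)
		foo_set_dotdot(moved, new_parent);
}

int foo_rename_demo(void)
{
	struct foo_dir a = { 0 }, b = { 0 };
	struct foo_dir child = { .parent = &a };

	foo_rename_dir(&a, &a, &child);	/* same directory: no update */
	assert(child.dotdot_writes == 0);
	foo_rename_dir(&a, &b, &child);	/* cross-directory: one update */
	assert(child.dotdot_writes == 1 && child.parent == &b);
	return 0;
}
```

Skipping the write in the same-directory case costs nothing, since the
update would store the value that is already there.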