==================================
Cache and TLB Flushing Under Linux
==================================

:Author: David S. Miller <davem@redhat.com>

This document describes the cache/tlb flushing interfaces called
by the Linux VM subsystem.  It enumerates each interface,
describes its intended purpose, and states what side effect is
expected after the interface is invoked.

The side effects described below are stated for a uniprocessor
implementation, and what is to happen on that single processor.  The
SMP cases are a simple extension, in that you just extend the
definition such that the side effect for a particular interface occurs
on all processors in the system.  Don't let this scare you into
thinking SMP cache/tlb flushing must be inefficient; this is in
fact an area where many optimizations are possible.  For example,
if it can be proven that a user address space has never executed
on a cpu (see mm_cpumask()), one need not perform a flush
for this address space on that cpu.

First, the TLB flushing interfaces, since they are the simplest.  The
"TLB" is abstracted under Linux as something the cpu uses to cache
virtual-->physical address translations obtained from the software
page tables.  This means that if the software page tables change, it is
possible for stale translations to exist in this "TLB" cache.
Therefore when software page table changes occur, the kernel will
invoke one of the following flush methods _after_ the page table
changes occur:

1) ``void flush_tlb_all(void)``

        The most severe flush of all.  After this interface runs,
        any previous page table modification whatsoever will be
        visible to the cpu.

        This is usually invoked when the kernel page tables are
        changed, since such translations are "global" in nature.

2) ``void flush_tlb_mm(struct mm_struct *mm)``

        This interface flushes an entire user address space from
        the TLB.  After running, this interface must make sure that
        any previous page table modifications for the address space
        'mm' will be visible to the cpu.  That is, after running,
        there will be no entries in the TLB for 'mm'.

        This interface is used to handle whole address space
        page table operations such as what happens during
        fork and exec.

3) ``void flush_tlb_range(struct vm_area_struct *vma,
   unsigned long start, unsigned long end)``

        Here we are flushing a specific range of (user) virtual
        address translations from the TLB.  After running, this
        interface must make sure that any previous page table
        modifications for the address space 'vma->vm_mm' in the range
        'start' to 'end-1' will be visible to the cpu.  That is, after
        running, there will be no entries in the TLB for 'vma->vm_mm'
        for virtual addresses in the range 'start' to 'end-1'.

        The "vma" is the backing store being used for the region.
        Primarily, this is used for munmap() type operations.

        The interface is provided in hopes that the port can find
        a suitably efficient method for removing multiple page
        sized translations from the TLB, instead of having the kernel
        call flush_tlb_page (see below) for each entry which may be
        modified.

4) ``void flush_tlb_page(struct vm_area_struct *vma, unsigned long addr)``

        This time we need to remove the PAGE_SIZE sized translation
        from the TLB.  The 'vma' is the backing structure used by
        Linux to keep track of mmap'd regions for a process; the
        address space is available via vma->vm_mm.  Also, one may
        test (vma->vm_flags & VM_EXEC) to see if this region is
        executable (and thus could be in the 'instruction TLB' in
        split-tlb type setups).

        After running, this interface must make sure that any previous
        page table modification for address space 'vma->vm_mm' for
        user virtual address 'addr' will be visible to the cpu.  That
        is, after running, there will be no entries in the TLB for
        'vma->vm_mm' for virtual address 'addr'.

        This is used primarily during fault processing.

5) ``void update_mmu_cache(struct vm_area_struct *vma,
   unsigned long address, pte_t *ptep)``

        At the end of every page fault, this routine is invoked to
        tell the architecture specific code that a translation
        now exists at virtual address "address" for address space
        "vma->vm_mm", in the software page tables.

        A port may use this information in any way it so chooses.
        For example, it could use this event to pre-load TLB
        translations for software managed TLB configurations.
        The sparc64 port currently does this.

Next, we have the cache flushing interfaces.
In general, when Linux
is changing an existing virtual-->physical mapping to a new value,
the sequence will be in one of the following forms::

        1) flush_cache_mm(mm);
           change_all_page_tables_of(mm);
           flush_tlb_mm(mm);

        2) flush_cache_range(vma, start, end);
           change_range_of_page_tables(mm, start, end);
           flush_tlb_range(vma, start, end);

        3) flush_cache_page(vma, addr, pfn);
           set_pte(pte_pointer, new_pte_val);
           flush_tlb_page(vma, addr);

The cache level flush will always be first, because this allows
us to properly handle systems whose caches are strict and require
a virtual-->physical translation to exist for a virtual address
when that virtual address is flushed from the cache.  The HyperSparc
cpu is one such cpu with this attribute.

The cache flushing routines below need only deal with cache flushing
to the extent that it is necessary for a particular cpu.  Mostly,
these routines must be implemented for cpus which have virtually
indexed caches which must be flushed when virtual-->physical
translations are changed or removed.  So, for example, the physically
indexed physically tagged caches of IA32 processors have no need to
implement these interfaces since the caches are fully synchronized
and have no dependency on translation information.

Here are the routines, one by one:

1) ``void flush_cache_mm(struct mm_struct *mm)``

        This interface flushes an entire user address space from
        the caches.  That is, after running, there will be no cache
        lines associated with 'mm'.

        This interface is used to handle whole address space
        page table operations such as what happens during exit and exec.

2) ``void flush_cache_dup_mm(struct mm_struct *mm)``

        This interface flushes an entire user address space from
        the caches.  That is, after running, there will be no cache
        lines associated with 'mm'.

        This interface is used to handle whole address space
        page table operations such as what happens during fork.

        This option is separate from flush_cache_mm to allow some
        optimizations for VIPT caches.

3) ``void flush_cache_range(struct vm_area_struct *vma,
   unsigned long start, unsigned long end)``

        Here we are flushing a specific range of (user) virtual
        addresses from the cache.  After running, there will be no
        entries in the cache for 'vma->vm_mm' for virtual addresses in
        the range 'start' to 'end-1'.

        The "vma" is the backing store being used for the region.
        Primarily, this is used for munmap() type operations.

        The interface is provided in hopes that the port can find
        a suitably efficient method for removing multiple page
        sized regions from the cache, instead of having the kernel
        call flush_cache_page (see below) for each entry which may be
        modified.

4) ``void flush_cache_page(struct vm_area_struct *vma, unsigned long addr, unsigned long pfn)``

        This time we need to remove a PAGE_SIZE sized range
        from the cache.  The 'vma' is the backing structure used by
        Linux to keep track of mmap'd regions for a process; the
        address space is available via vma->vm_mm.  Also, one may
        test (vma->vm_flags & VM_EXEC) to see if this region is
        executable (and thus could be in the 'instruction cache' in
        "Harvard" type cache layouts).

        The 'pfn' indicates the physical page frame (shift this value
        left by PAGE_SHIFT to get the physical address) that 'addr'
        translates to.  It is this mapping which should be removed from
        the cache.

        After running, there will be no entries in the cache for
        'vma->vm_mm' for virtual address 'addr' which translates
        to 'pfn'.

        This is used primarily during fault processing.

5) ``void flush_cache_kmaps(void)``

        This routine need only be implemented if the platform utilizes
        highmem.  It will be called right before all of the kmaps
        are invalidated.

        After running, there will be no entries in the cache for
        the kernel virtual address range PKMAP_ADDR(0) to
        PKMAP_ADDR(LAST_PKMAP).

        This routine should be implemented in asm/highmem.h

6) ``void flush_cache_vmap(unsigned long start, unsigned long end)``
   ``void flush_cache_vunmap(unsigned long start, unsigned long end)``

        Here in these two interfaces we are flushing a specific range
        of (kernel) virtual addresses from the cache.  After running,
        there will be no entries in the cache for the kernel address
        space for virtual addresses in the range 'start' to 'end-1'.

        The first of these two routines is invoked after map_kernel_range()
        has installed the page table entries.  The second is invoked
        before unmap_kernel_range() deletes the page table entries.
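
The ordering used when a range of mappings is changed (cache flush
first, then the page table update, then the TLB flush) can be sketched
in userspace C with stubs that only record the order in which they run.
Everything in this sketch is a hypothetical stand-in: the ``mock_``
functions, the ``trace`` log and ``change_range()`` are for illustration
only and are not kernel interfaces.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical trace log: each stub appends one letter. */
static char trace[16];

static void record(char step)
{
    size_t n = strlen(trace);

    if (n + 1 < sizeof(trace)) {
        trace[n] = step;
        trace[n + 1] = '\0';
    }
}

/*
 * Stand-ins for flush_cache_range(), the page table update and
 * flush_tlb_range().  The real interfaces take a vma and are
 * implemented by each architecture; these stubs only log the call.
 */
static void mock_flush_cache_range(unsigned long start, unsigned long end)
{
    (void)start; (void)end;
    record('C');
}

static void mock_change_range_of_page_tables(unsigned long start,
                                             unsigned long end)
{
    (void)start; (void)end;
    record('P');
}

static void mock_flush_tlb_range(unsigned long start, unsigned long end)
{
    (void)start; (void)end;
    record('T');
}

/* Change the mappings for [start, end) and return the call order. */
static const char *change_range(unsigned long start, unsigned long end)
{
    assert(start < end);
    memset(trace, 0, sizeof(trace));

    /* Cache first: the old translations still exist at this point. */
    mock_flush_cache_range(start, end);
    mock_change_range_of_page_tables(start, end);
    /* TLB last: stale translations are dropped after the update. */
    mock_flush_tlb_range(start, end);

    return trace;
}
```

Calling ``change_range(0x1000, 0x3000)`` in this model yields the order
cache, page tables, TLB, which is the property a strict cache such as
the one described above relies on.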

There exists another whole class of cpu cache issues which currently
require a whole different set of interfaces to handle properly.
The biggest problem is that of virtual aliasing in the data cache
of a processor.

Is your port susceptible to virtual aliasing in its D-cache?
Well, if your D-cache is virtually indexed, is larger in size than
PAGE_SIZE, and does not prevent multiple cache lines for the same
physical address from existing at once, you have this problem.

If your D-cache has this problem, first define SHMLBA in
asm/shmparam.h properly; it should essentially be the size of your
virtually addressed D-cache (or if the size is variable, the largest
possible size).  This setting will force the SysV IPC layer to only
allow user processes to mmap shared memory at addresses which are a
multiple of this value.

.. note::

  This does not fix shared mmaps, check out the sparc64 port for
  one way to solve this (in particular SPARC_FLAG_MMAPSHARED).

Next, you have to solve the D-cache aliasing issue for all
other cases.  Please keep in mind the fact that, for a given page
mapped into some user address space, there is always at least one more
mapping, that of the kernel in its linear mapping starting at
PAGE_OFFSET.  So immediately, once the first user maps a given
physical page into its address space, by implication the D-cache
aliasing problem has the potential to exist since the kernel already
maps this page at its virtual address.

  ``void copy_user_page(void *to, void *from, unsigned long addr, struct page *page)``
  ``void clear_user_page(void *to, unsigned long addr, struct page *page)``

        These two routines store data in user anonymous or COW
        pages.  They allow a port to efficiently avoid D-cache alias
        issues between userspace and the kernel.

        For example, a port may temporarily map 'from' and 'to' to
        kernel virtual addresses during the copy.  The virtual address
        for these two pages is chosen in such a way that the kernel
        load/store instructions happen to virtual addresses which are
        of the same "color" as the user mapping of the page.  Sparc64
        for example, uses this technique.

        The 'addr' parameter tells the virtual address where the
        user will ultimately have this page mapped, and the 'page'
        parameter gives a pointer to the struct page of the target.

        If D-cache aliasing is not an issue, these two routines may
        simply call memcpy/memset directly and do nothing more.
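
The notion of a cache "color" can be made concrete with a little
arithmetic.  The sketch below assumes 4 KiB pages and a virtually
indexed D-cache whose index covers 16 KiB; those sizes and the names
``EX_PAGE_SHIFT``, ``EX_DCACHE_INDEX_SIZE``, ``page_color()``,
``may_alias()`` and ``kernel_alias_for()`` are illustrative assumptions,
not values or helpers taken from any port.

```c
#include <assert.h>

#define EX_PAGE_SHIFT        12UL
#define EX_PAGE_SIZE         (1UL << EX_PAGE_SHIFT)
#define EX_DCACHE_INDEX_SIZE (16UL * 1024)      /* assumed, power of two */
#define EX_NUM_COLORS        (EX_DCACHE_INDEX_SIZE / EX_PAGE_SIZE)

/* The color is the part of the cache index above the page offset. */
static unsigned long page_color(unsigned long vaddr)
{
    return (vaddr >> EX_PAGE_SHIFT) & (EX_NUM_COLORS - 1);
}

/*
 * Two virtual mappings of one physical page can land in different
 * cache lines only when their colors differ.
 */
static int may_alias(unsigned long va1, unsigned long va2)
{
    return page_color(va1) != page_color(va2);
}

/*
 * Pick a temporary kernel virtual address with the same color as the
 * user address.  'window_base' is a hypothetical kernel mapping window
 * that must be aligned to EX_DCACHE_INDEX_SIZE.
 */
static unsigned long kernel_alias_for(unsigned long window_base,
                                      unsigned long user_vaddr)
{
    assert((window_base & (EX_DCACHE_INDEX_SIZE - 1)) == 0);
    return window_base + (page_color(user_vaddr) << EX_PAGE_SHIFT);
}
```

Under these assumptions SHMLBA would be 16 KiB, and any two addresses
that are equal modulo SHMLBA have the same color, which is why forcing
shared memory attachments to SHMLBA multiples avoids the alias.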

  ``void flush_dcache_page(struct page *page)``

        Any time the kernel writes to a page cache page, _OR_
        the kernel is about to read from a page cache page and
        user space shared/writable mappings of this page potentially
        exist, this routine is called.

        .. note::

              This routine need only be called for page cache pages
              which can potentially ever be mapped into the address
              space of a user process.  So for example, VFS layer code
              handling vfs symlinks in the page cache need not call
              this interface at all.

        The phrase "kernel writes to a page cache page" means,
        specifically, that the kernel executes store instructions
        that dirty data in that page at the page->virtual mapping
        of that page.  It is important to flush here to handle
        D-cache aliasing, to make sure these kernel stores are
        visible to user space mappings of that page.

        The corollary case is just as important, if there are users
        which have shared+writable mappings of this file, we must make
        sure that kernel reads of these pages will see the most recent
        stores done by the user.

        If D-cache aliasing is not an issue, this routine may
        simply be defined as a nop on that architecture.

        There is a bit set aside in page->flags (PG_arch_1) as
        "architecture private".  The kernel guarantees that,
        for pagecache pages, it will clear this bit when such
        a page first enters the pagecache.

        This allows these interfaces to be implemented much more
        efficiently.  It allows one to "defer" (perhaps indefinitely)
        the actual flush if there are currently no user processes
        mapping this page.  See sparc64's flush_dcache_page and
        update_mmu_cache implementations for an example of how to go
        about doing this.

        The idea is, first at flush_dcache_page() time, if
        page->mapping->i_mmap is an empty tree, just mark the architecture
        private page flag bit.  Later, in update_mmu_cache(), a check is
        made of this flag bit, and if set the flush is done and the flag
        bit is cleared.

        .. important::

                        It is often important, if you defer the flush,
                        that the actual flush occurs on the same CPU
                        as did the cpu stores into the page to make it
                        dirty.  Again, see sparc64 for examples of how
                        to deal with this.
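
The deferred flush scheme described for flush_dcache_page() and
update_mmu_cache() can be modelled in plain C.  This is a userspace
sketch and not the sparc64 code: ``struct ex_page``, ``EX_PG_ARCH_1``,
the ``user_mappings`` counter (standing in for the check of whether
page->mapping->i_mmap is empty) and ``flush_count`` are all hypothetical.

```c
#include <assert.h>

#define EX_PG_ARCH_1 (1UL << 0)     /* models the arch private flag bit */

struct ex_page {
    unsigned long flags;
    int user_mappings;              /* 0 models an empty i_mmap tree */
    int flush_count;                /* how many real flushes happened */
};

/* Stand-in for the architecture's actual D-cache flush. */
static void ex_do_flush(struct ex_page *page)
{
    page->flush_count++;
}

/* Model of flush_dcache_page(): defer when nobody maps the page. */
static void ex_flush_dcache_page(struct ex_page *page)
{
    assert(page);
    if (page->user_mappings == 0) {
        page->flags |= EX_PG_ARCH_1;    /* remember the pending flush */
        return;
    }
    ex_do_flush(page);
}

/* Model of update_mmu_cache(): perform a flush that was deferred. */
static void ex_update_mmu_cache(struct ex_page *page)
{
    assert(page);
    if (page->flags & EX_PG_ARCH_1) {
        ex_do_flush(page);
        page->flags &= ~EX_PG_ARCH_1;
    }
}
```

The model deliberately ignores the same-CPU requirement from the note
above; a real port has to track which cpu dirtied the page.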

  ``void copy_to_user_page(struct vm_area_struct *vma, struct page *page,
  unsigned long user_vaddr, void *dst, void *src, int len)``
  ``void copy_from_user_page(struct vm_area_struct *vma, struct page *page,
  unsigned long user_vaddr, void *dst, void *src, int len)``

        When the kernel needs to copy arbitrary data in and out
        of arbitrary user pages (e.g. for ptrace()) it will use
        these two routines.

        Any necessary cache flushing or other coherency operations
        that need to occur should happen here.  If the processor's
        instruction cache does not snoop cpu stores, it is very
        likely that you will need to flush the instruction cache
        for copy_to_user_page().

  ``void flush_anon_page(struct vm_area_struct *vma, struct page *page,
  unsigned long vmaddr)``

        When the kernel needs to access the contents of an anonymous
        page, it calls this function (currently only
        get_user_pages()).  Note: flush_dcache_page() deliberately
        doesn't work for an anonymous page.  The default
        implementation is a nop (and should remain so for all coherent
        architectures).  For incoherent architectures, it should flush
        the cache of the page at vmaddr.
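
A port whose instruction cache does not snoop stores might structure
copy_to_user_page() roughly as follows.  This is a userspace model
under stated assumptions: ``struct ex_vma``, ``EX_VM_EXEC``,
``ex_flush_icache()`` and the ``icache_flushes`` counter are
hypothetical, and a real implementation would use the architecture's
own cache maintenance operations and the full argument list.

```c
#include <assert.h>
#include <string.h>

#define EX_VM_EXEC 0x4UL            /* models the VM_EXEC flag */

struct ex_vma {
    unsigned long vm_flags;
};

static int icache_flushes;          /* counts stub invocations */

/* Stand-in for an architecture specific I-cache flush. */
static void ex_flush_icache(const void *addr, int len)
{
    (void)addr; (void)len;
    icache_flushes++;
}

/*
 * Model of copy_to_user_page(): copy the data, then flush the
 * instruction cache only when the region may be executed.
 */
static void ex_copy_to_user_page(const struct ex_vma *vma,
                                 void *dst, const void *src, int len)
{
    assert(vma && dst && src && len >= 0);
    memcpy(dst, src, (size_t)len);
    if (vma->vm_flags & EX_VM_EXEC)
        ex_flush_icache(dst, len);
}
```

Testing the executable flag keeps the common case (data pages touched
by ptrace()) cheap, at the cost of relying on the vma flags being
accurate.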

  ``void flush_kernel_dcache_page(struct page *page)``

        When the kernel needs to modify a user page it has obtained
        with kmap, it calls this function after all modifications are
        complete (but before kunmapping it) to bring the underlying
        page up to date.  It is assumed here that the user has no
        incoherent cached copies (i.e. the original page was obtained
        from a mechanism like get_user_pages()).  The default
        implementation is a nop and should remain so on all coherent
        architectures.  On incoherent architectures, this should flush
        the kernel cache for page (using page_address(page)).


  ``void flush_icache_range(unsigned long start, unsigned long end)``

        When the kernel stores into addresses that it will execute
        out of (e.g. when loading modules), this function is called.

        If the icache does not snoop stores then this routine will need
        to flush it.

  ``void flush_icache_page(struct vm_area_struct *vma, struct page *page)``

        All the functionality of flush_icache_page can be implemented in
        flush_dcache_page and update_mmu_cache.  In the future, the hope
        is to remove this interface completely.

The final category of APIs is for I/O to deliberately aliased address
ranges inside the kernel.  Such aliases are set up by use of the
vmap/vmalloc API.  Since kernel I/O goes via physical pages, the I/O
subsystem assumes that the user mapping and kernel offset mapping are
the only aliases.  This isn't true for vmap aliases, so anything in
the kernel trying to do I/O to vmap areas must manually manage
coherency.  It must do this by flushing the vmap range before doing
I/O and invalidating it after the I/O returns.

  ``void flush_kernel_vmap_range(void *vaddr, int size)``

       Flushes the kernel cache for a given virtual address range in
       the vmap area.  This is to make sure that any data the kernel
       modified in the vmap range is made visible to the physical
       page.  The design is to make this area safe to perform I/O on.
       Note that this API does *not* also flush the offset map alias
       of the area.

  ``void invalidate_kernel_vmap_range(void *vaddr, int size)``

       Invalidates the cache for a given virtual address range in the
       vmap area, which prevents the processor from making the cache
       stale by speculatively reading data while the I/O was occurring
       to the physical pages.  This is only necessary for data reads
       into the vmap area.
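
The flush-before, invalidate-after rule for I/O on vmap areas can be
sketched like this.  The code is a userspace model with stubs that only
record what was called; ``ex_flush_kernel_vmap_range()``,
``ex_invalidate_kernel_vmap_range()``, ``ex_do_io()``, ``ex_vmap_io()``
and ``enum ex_io_dir`` are hypothetical stand-ins, not the kernel
interfaces themselves.

```c
#include <assert.h>
#include <string.h>

enum ex_io_dir { EX_IO_WRITE_TO_DEVICE, EX_IO_READ_FROM_DEVICE };

static char ex_log[8];              /* order of calls, one letter each */

static void ex_note(char c)
{
    size_t n = strlen(ex_log);

    if (n + 1 < sizeof(ex_log)) {
        ex_log[n] = c;
        ex_log[n + 1] = '\0';
    }
}

/* Stubs modelling the two vmap range interfaces and the I/O itself. */
static void ex_flush_kernel_vmap_range(void *vaddr, int size)
{
    (void)vaddr; (void)size;
    ex_note('F');
}

static void ex_invalidate_kernel_vmap_range(void *vaddr, int size)
{
    (void)vaddr; (void)size;
    ex_note('V');
}

static void ex_do_io(void *vaddr, int size, enum ex_io_dir dir)
{
    (void)vaddr; (void)size; (void)dir;
    ex_note('I');
}

/* Perform I/O on a vmap range while managing coherency by hand. */
static const char *ex_vmap_io(void *vaddr, int size, enum ex_io_dir dir)
{
    assert(vaddr && size > 0);
    ex_log[0] = '\0';

    /* Push kernel modifications out to the physical pages first. */
    ex_flush_kernel_vmap_range(vaddr, size);
    ex_do_io(vaddr, size, dir);
    /* Only reads into the vmap area need the invalidate afterwards. */
    if (dir == EX_IO_READ_FROM_DEVICE)
        ex_invalidate_kernel_vmap_range(vaddr, size);

    return ex_log;
}
```

In this model a write to the device produces the sequence flush, I/O,
while a read from the device produces flush, I/O, invalidate.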