.. SPDX-License-Identifier: GPL-2.0
.. _VAS-API:

===================================================
Virtual Accelerator Switchboard (VAS) userspace API
===================================================

Introduction
============

The Power9 processor introduced the Virtual Accelerator Switchboard (VAS),
which allows both userspace and the kernel to communicate with a
co-processor (hardware accelerator) referred to as the Nest Accelerator
(NX). The NX unit comprises one or more hardware engines or co-processor
types such as 842 compression, GZIP compression and encryption. On Power9,
userspace applications have access only to the GZIP compression engine,
which supports the ZLIB and GZIP compression algorithms in hardware.

To communicate with NX, the kernel has to establish a channel or window;
requests can then be submitted directly without kernel involvement.
Requests to the GZIP engine must be formatted as a co-processor Request
Block (CRB), and these CRBs must be submitted to the NX using COPY/PASTE
instructions to paste the CRB to the hardware address that is associated
with the engine's request queue.

The GZIP engine provides two priority levels of requests: Normal and
High. Only Normal requests are supported from userspace right now.

This document explains the userspace API that is used to interact with the
kernel to set up a channel / window which can be used to send compression
requests directly to the NX accelerator.


Overview
========

Application access to the GZIP engine is provided through the
/dev/crypto/nx-gzip device node implemented by the VAS/NX device driver.
An application must open the /dev/crypto/nx-gzip device to obtain a file
descriptor (fd). It should then issue the VAS_TX_WIN_OPEN ioctl with this
fd to establish a connection to the engine, which means a send window is
opened on the GZIP engine for this process. Once a connection is
established, the application should use the mmap() system call to map the
hardware address of the engine's request queue into the application's
virtual address space.

The application can then submit one or more requests to the engine by
using copy/paste instructions and pasting the CRBs to the virtual address
(aka paste_address) returned by mmap(). Userspace can close the
established connection or send window by closing the file descriptor
(close(fd)); the window is also closed upon process exit.

Note that applications can send several requests with the same window or
can establish multiple windows, but only one window per file descriptor.

The following sections provide additional details and references about the
individual steps.

NX-GZIP Device Node
===================

There is one /dev/crypto/nx-gzip node in the system and it provides
access to all GZIP engines in the system. The only valid operations on
/dev/crypto/nx-gzip are:

    * open() the device for read and write.
    * issue the VAS_TX_WIN_OPEN ioctl.
    * mmap() the engine's request queue into the application's virtual
      address space (i.e. get a paste_address for the co-processor
      engine).
    * close the device node.

Other file operations on this device node are undefined.

Note that the copy and paste operations go directly to the hardware and
do not go through this device. Refer to the COPY/PASTE document for more
details.

Although a system may have several instances of the NX co-processor
engines (typically, one per P9 chip), there is just one
/dev/crypto/nx-gzip device node in the system. When the nx-gzip device
node is opened, the kernel opens a send window on a suitable instance of
the NX accelerator. It finds the CPU on which the user process is
executing and determines the NX instance for the chip to which this CPU
belongs.

Applications may choose a specific instance of the NX co-processor using
the vas_id field in the VAS_TX_WIN_OPEN ioctl as detailed below.

A userspace library, libnxz, is available here but is still in development:

    https://github.com/abalib/power-gzip

Applications that use inflate / deflate calls can link with libnxz
instead of libz and use NX GZIP compression without any modification.

Open /dev/crypto/nx-gzip
========================

The nx-gzip device should be opened for read and write. No special
privileges are needed to open the device. Each window corresponds to one
file descriptor, so if the userspace process needs multiple windows,
several open calls have to be issued.

See the open(2) man page for other details such as return values,
error codes and restrictions.

VAS_TX_WIN_OPEN ioctl
=====================

Applications should use the VAS_TX_WIN_OPEN ioctl as follows to establish
a connection with the NX co-processor engine:

    ::

        struct vas_tx_win_open_attr {
                __u32   version;
                __s16   vas_id; /* specific instance of vas or -1
                                        for default */
                __u16   reserved1;
                __u64   flags;  /* For future use */
                __u64   reserved2[6];
        };

    version:
        The version field must currently be set to 1.
    vas_id:
        If '-1' is passed, the kernel will make a best-effort attempt
        to assign an optimal instance of NX for the process. To
        select a specific VAS instance, refer to the
        "Discovery of available VAS engines" section below.

    The flags, reserved1 and reserved2[6] fields are for future extension
    and must be set to 0.
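The field rules above (version set to 1, vas_id set to -1 or a specific instance, everything else zero) can be captured in a small initialization helper. The following is a minimal sketch, not part of the kernel API: ``vas_attr_init`` is a hypothetical name, and the structure is a local mirror of the listing above written with ``<stdint.h>`` types so that the sketch builds on any machine. Real applications would presumably use the kernel's own definition instead.

```c
#include <assert.h>   /* static_assert */
#include <stdint.h>
#include <string.h>

/*
 * Local mirror of struct vas_tx_win_open_attr as listed in this document.
 * Assumption: written with <stdint.h> types for portability of the sketch;
 * real code should use the kernel's definition.
 */
struct vas_tx_win_open_attr {
	uint32_t version;
	int16_t  vas_id;        /* specific instance of vas or -1 for default */
	uint16_t reserved1;
	uint64_t flags;         /* for future use */
	uint64_t reserved2[6];
};

/* 4 + 2 + 2 + 8 + 6 * 8 bytes, with no padding between the fields. */
static_assert(sizeof(struct vas_tx_win_open_attr) == 64,
	      "unexpected attribute layout");

/*
 * Hypothetical helper: zero the whole structure first so that flags,
 * reserved1 and reserved2[] are 0, then set the two meaningful fields.
 */
void vas_attr_init(struct vas_tx_win_open_attr *attr, int16_t vas_id)
{
	memset(attr, 0, sizeof(*attr));
	attr->version = 1;
	attr->vas_id = vas_id;
}
```

Zeroing with memset() before assigning fields is the simplest way to guarantee that the reserved fields are 0, since the ioctl rejects non-zero reserved fields with EINVAL.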

    The VAS_TX_WIN_OPEN ioctl and its attributes attr are defined and
    used as follows::

        #define VAS_MAGIC 'v'
        #define VAS_TX_WIN_OPEN _IOW(VAS_MAGIC, 1, \
                                        struct vas_tx_win_open_attr)

        struct vas_tx_win_open_attr attr;
        rc = ioctl(fd, VAS_TX_WIN_OPEN, &attr);

    The VAS_TX_WIN_OPEN ioctl returns 0 on success. On errors, it
    returns -1 and sets the errno variable to indicate the error.

    Error conditions:

    ======  ================================================
    EINVAL  fd does not refer to a valid VAS device.
    EINVAL  Invalid vas ID
    EINVAL  version is not set with proper value
    EEXIST  Window is already opened for the given fd
    ENOMEM  Memory is not available to allocate window
    ENOSPC  System has too many active windows (connections)
            opened
    EINVAL  reserved fields are not set to 0.
    ======  ================================================

    See the ioctl(2) man page for more details, error codes and
    restrictions.
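Because EINVAL covers several distinct causes in the table above, an application may want to turn errno into a diagnostic that lists all of them. The sketch below is one possible way to do that. ``vas_tx_win_open_errstr`` is a hypothetical helper name, and the message texts are paraphrased from the error table, not taken from any library.

```c
#include <errno.h>

/*
 * Hypothetical helper: describe the errno value left behind by a failed
 * VAS_TX_WIN_OPEN ioctl, based on the error table in this document.
 */
const char *vas_tx_win_open_errstr(int err)
{
	switch (err) {
	case EINVAL:
		/* Several causes share EINVAL, so list all of them. */
		return "invalid fd, vas_id or version, or a reserved field is not 0";
	case EEXIST:
		return "a window is already open for this fd";
	case ENOMEM:
		return "no memory available to allocate the window";
	case ENOSPC:
		return "too many active windows (connections) in the system";
	default:
		return "unexpected error, see ioctl(2)";
	}
}
```

A caller would typically pass errno right after ioctl() returns -1, before any other call can overwrite it.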

mmap() NX-GZIP device
=====================

The mmap() system call for an NX-GZIP device fd returns a paste_address
that the application can use to copy/paste its CRB to the hardware engines.

    ::

        paste_addr = mmap(addr, size, prot, flags, fd, offset);

    The only restrictions on mmap() for an NX-GZIP device fd are:

        * size should be PAGE_SIZE
        * offset parameter should be 0ULL

    Refer to the mmap(2) man page for additional details/restrictions.
    In addition to the error conditions listed on the mmap(2) man
    page, mmap() can also fail with one of the following error codes:

    ======  ================================================
    EINVAL  fd is not associated with an open window
            (i.e. mmap() does not follow a successful call
            to the VAS_TX_WIN_OPEN ioctl).
    EINVAL  offset field is not 0ULL.
    ======  ================================================

Discovery of available VAS engines
==================================

Each available VAS instance in the system will have a device tree node
like /proc/device-tree/vas@* or /proc/device-tree/xscom@*/vas@*.
Determine the chip or VAS instance and use the corresponding ibm,vas-id
property value in this node to select a specific VAS instance.

Copy/Paste operations
=====================

Applications should use the copy and paste instructions to send a CRB to NX.
Refer to section 4.4 of the Power ISA for the Copy/Paste instructions:
https://openpowerfoundation.org/?resource_lib=power-isa-version-3-0

CRB Specification and use NX
============================

Applications should format requests to the co-processor using the
co-processor Request Block (CRB). Refer to the NX-GZIP User's Manual for
the format of the CRB and for using NX from userspace, such as sending
requests and checking request status.

NX Fault handling
=================

Applications send requests to NX and wait for the status by polling on
co-processor Status Block (CSB) flags. NX updates the status in the CSB
after each request is processed. Refer to the NX-GZIP User's Manual for the
format of the CSB and the status flags.

If NX encounters a translation error (called an NX page fault) on the CSB
address or any request buffer, it raises an interrupt on the CPU to handle
the fault. A page fault can happen if an application passes invalid
addresses or if request buffers are not in memory.
The operating system handles the fault by
updating the CSB with the following data::

    csb.flags = CSB_V;
    csb.cc = CSB_CC_FAULT_ADDRESS;
    csb.ce = CSB_CE_TERMINATION;
    csb.address = fault_address;

When an application receives a translation error, it can touch or access
the page containing the fault address so that this page is brought into
memory. The application can then resend the request to NX.

If the OS cannot update the CSB due to an invalid CSB address, it sends a
SEGV signal to the process that opened the send window on which the
original request was issued. This signal is delivered with the following
siginfo struct::

    siginfo.si_signo = SIGSEGV;
    siginfo.si_errno = EFAULT;
    siginfo.si_code = SEGV_MAPERR;
    siginfo.si_addr = CSB address;

In the case of multi-threaded applications, NX send windows can be shared
across all threads. For example, a child thread can open a send window,
but other threads can send requests to NX using this window. These
requests will be successful even in the case of the OS handling faults, as
long as the CSB address is valid. If the NX request contains an invalid CSB
address, the signal will be sent to the child thread that opened the
window. But if that thread has exited without closing the window and a
request is issued using this window,
the signal is sent to the thread group leader (tgid). It is up to the
application whether to ignore or handle these signals.

NX-GZIP User's Manual:
https://github.com/libnxz/power-gzip/blob/master/power_nx_gzip_um.pdf

Simple example
==============

    ::

        #include <errno.h>
        #include <fcntl.h>
        #include <stdbool.h>
        #include <stdio.h>
        #include <string.h>
        #include <unistd.h>
        #include <sys/ioctl.h>
        #include <sys/mman.h>
        #include <asm/vas-api.h>

        int use_nx_gzip()
        {
                int rc, fd;
                void *addr;
                struct vas_tx_win_open_attr txattr;

                fd = open("/dev/crypto/nx-gzip", O_RDWR);
                if (fd < 0) {
                        fprintf(stderr, "open nx-gzip failed\n");
                        return -1;
                }
                memset(&txattr, 0, sizeof(txattr));
                txattr.version = 1;
                txattr.vas_id = -1;
                rc = ioctl(fd, VAS_TX_WIN_OPEN,
                                (unsigned long)&txattr);
                if (rc < 0) {
                        fprintf(stderr, "ioctl() failed, rc %d, errno %d\n",
                                        rc, errno);
                        close(fd);
                        return rc;
                }
                addr = mmap(NULL, 4096, PROT_READ|PROT_WRITE,
                                MAP_SHARED, fd, 0ULL);
                if (addr == MAP_FAILED) {
                        rc = -errno;
                        fprintf(stderr, "mmap() failed, errno %d\n",
                                        errno);
                        close(fd);
                        return rc;
                }
                do {
                        // Format the CRB (crb, declaration omitted) for a
                        // compression or decompression request.
                        // Refer to the tests for vas_copy/vas_paste.
                        vas_copy(&crb, 0, 1);
                        vas_paste(addr, 0, 1);
                        // Poll on csb.flags with a timeout; the CSB
                        // address is listed in the CRB.
                        // Break out once all requests are done.
                } while (true);
                // The window is also closed upon process exit.
                close(fd);
                return 0;
        }

    Refer to https://github.com/abalib/power-gzip for tests and more
    use cases.
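The example above passes -1 as vas_id. To select a specific instance instead, the "Discovery of available VAS engines" section points at the ibm,vas-id property of a VAS device tree node. The following is a minimal sketch of reading that property. It assumes that ibm,vas-id is stored as a single 32-bit cell and relies on device tree cells being big-endian. ``dt_cell_to_u32`` and ``read_vas_id`` are hypothetical helper names, and the caller is assumed to supply the node directory (one path matching /proc/device-tree/vas@*).

```c
#include <stdint.h>
#include <stdio.h>

/* Decode one big-endian 32-bit device tree cell from raw property bytes. */
uint32_t dt_cell_to_u32(const unsigned char cell[4])
{
	return ((uint32_t)cell[0] << 24) | ((uint32_t)cell[1] << 16) |
	       ((uint32_t)cell[2] << 8)  |  (uint32_t)cell[3];
}

/*
 * Hypothetical helper: read ibm,vas-id from one VAS node directory.
 * Assumption: the property holds exactly one 32-bit cell.
 * Returns the id, or -1 if the property cannot be read.
 */
int read_vas_id(const char *node_dir)
{
	char path[512];
	unsigned char cell[4];
	size_t n;
	FILE *f;
	int len;

	len = snprintf(path, sizeof(path), "%s/ibm,vas-id", node_dir);
	if (len < 0 || (size_t)len >= sizeof(path))
		return -1;      /* path was truncated */

	f = fopen(path, "rb");
	if (!f)
		return -1;
	n = fread(cell, 1, sizeof(cell), f);
	fclose(f);
	if (n != sizeof(cell))
		return -1;      /* short read: not a full cell */

	return (int)dt_cell_to_u32(cell);
}
```

The returned value would then be stored in the vas_id field before issuing VAS_TX_WIN_OPEN. Since vas_id is a __s16, the caller should check that the value fits.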