.. SPDX-License-Identifier: GPL-2.0
.. _VAS-API:

===================================================
Virtual Accelerator Switchboard (VAS) userspace API
===================================================

Introduction
============

The POWER9 processor introduced the Virtual Accelerator Switchboard (VAS),
which allows both userspace and the kernel to communicate with a
co-processor (hardware accelerator) referred to as the Nest Accelerator
(NX). The NX unit comprises one or more hardware engines or co-processor
types, such as 842 compression, GZIP compression and encryption. On POWER9,
userspace applications have access only to the GZIP compression engine,
which supports the ZLIB and GZIP compression algorithms in hardware.

To communicate with NX, the kernel has to establish a channel or window;
requests can then be submitted directly without kernel involvement.
Requests to the GZIP engine must be formatted as a co-processor Request
Block (CRB), and these CRBs must be submitted to the NX using COPY/PASTE
instructions to paste the CRB to the hardware address that is associated
with the engine's request queue.

The GZIP engine provides two priority levels of requests: Normal and
High. Only Normal requests are supported from userspace right now.

This document explains the userspace API that is used to interact with
the kernel to set up a channel / window, which can be used to send
compression requests directly to the NX accelerator.

Overview
========

Application access to the GZIP engine is provided through the
/dev/crypto/nx-gzip device node implemented by the VAS/NX device driver.
An application must open the /dev/crypto/nx-gzip device to obtain a file
descriptor (fd). It should then issue the VAS_TX_WIN_OPEN ioctl with this
fd to establish a connection to the engine, which means that a send window
is opened on the GZIP engine for this process. Once a connection is
established, the application should use the mmap() system call to map the
hardware address of the engine's request queue into the application's
virtual address space.

The application can then submit one or more requests to the engine by
using copy/paste instructions and pasting the CRBs to the virtual address
(aka paste_address) returned by mmap(). User space can close the
established connection or send window by closing the file descriptor
(close(fd)); the window is also closed upon process exit.

Note that applications can send several requests with the same window or
can establish multiple windows, but there is only one window for each
file descriptor.

The following sections provide additional details and references about the
individual steps.

NX-GZIP Device Node
===================

There is one /dev/crypto/nx-gzip node in the system and it provides
access to all GZIP engines in the system. The only valid operations on
/dev/crypto/nx-gzip are:

	* open() the device for read and write.
	* issue the VAS_TX_WIN_OPEN ioctl.
	* mmap() the engine's request queue into the application's virtual
	  address space (i.e. get a paste_address for the co-processor
	  engine).
	* close the device node.

Other file operations on this device node are undefined.

Note that the copy and paste operations go directly to the hardware and
do not go through this device. Refer to the COPY/PASTE document for more
details.

Although a system may have several instances of the NX co-processor
engines (typically, one per POWER9 chip), there is just one
/dev/crypto/nx-gzip device node in the system. When the nx-gzip device
node is opened, the kernel opens a send window on a suitable instance of
the NX accelerator. It finds the CPU on which the user process is
executing and determines the NX instance for the chip to which this CPU
belongs.

Applications may choose a specific instance of the NX co-processor using
the vas_id field in the VAS_TX_WIN_OPEN ioctl as detailed below.

A userspace library, libnxz, is available here but is still in development:

	 https://github.com/abalib/power-gzip

Applications that use inflate / deflate calls can link with libnxz
instead of libz and use NX GZIP compression without any modification.

Open /dev/crypto/nx-gzip
========================

The nx-gzip device should be opened for read and write. No special
privileges are needed to open the device. Each window corresponds to one
file descriptor, so if the userspace process needs multiple windows,
several open() calls have to be issued.

See the open(2) man page for other details such as return values,
error codes and restrictions.
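Since each file descriptor carries at most one window, a process that wants
several windows has to open the node several times. The following is a
minimal sketch of that pattern. The helper name ``open_nx_fds()`` is
hypothetical (it is not part of any library), and the path is passed as a
parameter only so that the helper can be tried on a machine without NX
hardware:

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: open `path` `count` times for read and write,
 * giving one fd per intended send window. Returns 0 on success. On
 * failure it closes the fds opened so far and returns -1, with errno
 * as set by the failing open().
 */
int open_nx_fds(const char *path, int *fds, int count)
{
	int i;

	for (i = 0; i < count; i++) {
		fds[i] = open(path, O_RDWR);
		if (fds[i] < 0) {
			int saved = errno;

			while (--i >= 0)
				close(fds[i]);
			errno = saved;
			return -1;
		}
	}
	return 0;
}
```

On a system with NX, the path would be /dev/crypto/nx-gzip, and each
returned fd still needs its own VAS_TX_WIN_OPEN ioctl before it can be
mapped.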

VAS_TX_WIN_OPEN ioctl
=====================

Applications should use the VAS_TX_WIN_OPEN ioctl as follows to establish
a connection with the NX co-processor engine:

	::

		struct vas_tx_win_open_attr {
			__u32   version;
			__s16   vas_id; /* specific instance of vas or -1
						for default */
			__u16   reserved1;
			__u64   flags;	/* For future use */
			__u64   reserved2[6];
		};

	version:
		The version field must currently be set to 1.
	vas_id:
		If '-1' is passed, the kernel will make a best-effort attempt
		to assign an optimal instance of NX for the process. To
		select a specific VAS instance, refer to the
		"Discovery of available VAS engines" section below.

	The flags, reserved1 and reserved2[6] fields are for future extension
	and must be set to 0.

	The attributes attr for the VAS_TX_WIN_OPEN ioctl are defined as
	follows::

		#define VAS_MAGIC 'v'
		#define VAS_TX_WIN_OPEN _IOW(VAS_MAGIC, 1, \
						struct vas_tx_win_open_attr)

		struct vas_tx_win_open_attr attr;
		rc = ioctl(fd, VAS_TX_WIN_OPEN, &attr);

	The VAS_TX_WIN_OPEN ioctl returns 0 on success. On errors, it
	returns -1 and sets the errno variable to indicate the error.

	Error conditions:

		======	================================================
		EINVAL	fd does not refer to a valid VAS device.
		EINVAL	Invalid VAS ID.
		EINVAL	version is not set to the proper value.
		EINVAL	Reserved fields are not set to 0.
		EEXIST	A window is already opened for the given fd.
		ENOMEM	Memory is not available to allocate a window.
		ENOSPC	System has too many active windows (connections)
			opened.
		======	================================================

	See the ioctl(2) man page for more details, error codes and
	restrictions.
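The field rules above (version set to 1, everything reserved set to 0) are
easy to get wrong when the structure lives on the stack. The following is a
sketch of one way to prepare the attributes and to report the documented
errno values. Both helper names are hypothetical, and the structure is
repeated locally with ``<stdint.h>`` types only so that the sketch is
self-contained; a real program would take the definition from the kernel's
exported headers instead:

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Local copy of the structure shown above, using <stdint.h> types. */
struct vas_tx_win_open_attr {
	uint32_t version;
	int16_t  vas_id;
	uint16_t reserved1;
	uint64_t flags;
	uint64_t reserved2[6];
};

/* Hypothetical helper: fill attr the way the ioctl expects it. */
void vas_attr_init(struct vas_tx_win_open_attr *attr, int16_t vas_id)
{
	memset(attr, 0, sizeof(*attr));	/* flags and reserved fields = 0 */
	attr->version = 1;
	attr->vas_id = vas_id;		/* -1 lets the kernel choose */
}

/* Hypothetical helper: describe the errno values from the table above. */
const char *vas_open_strerror(int err)
{
	switch (err) {
	case EINVAL:
		return "bad fd, VAS ID, version or non-zero reserved field";
	case EEXIST:
		return "window already opened for this fd";
	case ENOMEM:
		return "no memory to allocate window";
	case ENOSPC:
		return "too many active windows";
	default:
		return strerror(err);
	}
}
```

A caller would run ``vas_attr_init(&attr, -1)``, pass ``&attr`` to the
VAS_TX_WIN_OPEN ioctl and, if the ioctl returns -1, print
``vas_open_strerror(errno)``.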

mmap() NX-GZIP device
=====================

The mmap() system call for an NX-GZIP device fd returns a paste_address
that the application can use to copy/paste its CRB to the hardware engines.

	::

		paste_addr = mmap(addr, size, prot, flags, fd, offset);

	The only restrictions on mmap() for an NX-GZIP device fd are:

		* size should be PAGE_SIZE
		* offset parameter should be 0ULL

	Refer to the mmap(2) man page for additional details/restrictions.
	In addition to the error conditions listed on the mmap(2) man
	page, mmap() can also fail with one of the following error codes:

		======	=============================================
		EINVAL	fd is not associated with an open window
			(i.e. mmap() does not follow a successful call
			to the VAS_TX_WIN_OPEN ioctl).
		EINVAL	offset field is not 0ULL.
		======	=============================================
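The two restrictions above can be wrapped in a small helper. The following
is a sketch only: the name ``nx_map_paste_addr()`` is hypothetical, and
querying the page size with sysconf() rather than hard-coding it is a
choice made here, on the assumption that the value reported to userspace
matches the PAGE_SIZE the driver checks against:

```c
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: map the paste address for an fd on which
 * VAS_TX_WIN_OPEN has already succeeded. Returns NULL on failure,
 * with errno as set by mmap().
 */
void *nx_map_paste_addr(int fd)
{
	long page_size = sysconf(_SC_PAGESIZE);
	void *addr;

	if (page_size <= 0)
		return NULL;

	/* size is one page and offset is 0, per the restrictions above */
	addr = mmap(NULL, (size_t)page_size, PROT_READ | PROT_WRITE,
		    MAP_SHARED, fd, 0);
	return addr == MAP_FAILED ? NULL : addr;
}
```

Returning NULL instead of MAP_FAILED is purely a convenience of this
sketch; callers that need the exact reason should inspect errno.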

Discovery of available VAS engines
==================================

Each available VAS instance in the system has a device tree node
like /proc/device-tree/vas@* or /proc/device-tree/xscom@*/vas@*.
Determine the chip or VAS instance and use the corresponding ibm,vas-id
property value in this node to select a specific VAS instance.

Copy/Paste operations
=====================

Applications should use the copy and paste instructions to send a CRB to NX.
Refer to section 4.4 in the Power ISA for the Copy/Paste instructions:
https://openpowerfoundation.org/?resource_lib=power-isa-version-3-0

CRB Specification and use of NX
===============================

Applications should format requests to the co-processor using the
co-processor Request Block (CRB). Refer to the NX-GZIP user's manual for
the format of the CRB and for how to use NX from userspace, such as sending
requests and checking request status.

NX Fault handling
=================

Applications send requests to NX and wait for the status by polling on
the co-processor Status Block (CSB) flags. NX updates the status in the CSB
after each request is processed. Refer to the NX-GZIP user's manual for the
format of the CSB and the status flags.

If NX encounters a translation error (called an NX page fault) on the CSB
address or any request buffer, it raises an interrupt on the CPU to handle
the fault. A page fault can happen if an application passes invalid
addresses or if request buffers are not in memory. The operating system
handles the fault by updating the CSB with the following data::

	csb.flags = CSB_V;
	csb.cc = CSB_CC_FAULT_ADDRESS;
	csb.ce = CSB_CE_TERMINATION;
	csb.address = fault_address;

When an application receives a translation error, it can touch or access
the page that contains the fault address so that this page will be in
memory. Then the application can resend this request to NX.
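The touch-and-resend step can be sketched as a bounded retry loop. Everything
named ``demo_`` or ``DEMO_`` below is a placeholder invented for this
sketch: the simplified CSB view, the constant values and the stand-in
``demo_submit()`` that pretends to be the hardware are all assumptions, and
the real CSB layout and codes must be taken from the NX-GZIP user's manual.
The page is touched with a read here; an output buffer may need a write
instead:

```c
#include <stddef.h>
#include <stdint.h>

/* Placeholder values and CSB view; NOT the real hardware layout. */
#define DEMO_CSB_V		0x80
#define DEMO_CC_OK		0
#define DEMO_CC_FAULT_ADDRESS	250

struct demo_csb {
	volatile uint8_t flags;
	volatile uint8_t cc;
	volatile uint64_t address;	/* assumed already in host order */
};

/* Caller-supplied: paste the CRB, then poll until the CSB is valid. */
typedef void (*demo_submit_fn)(void *crb, struct demo_csb *csb);

/*
 * Submit a request. On a translation fault, touch the faulting page so
 * that it becomes resident and resubmit, for at most max_tries attempts.
 * Returns 0 on success and -1 otherwise.
 */
int nx_submit_with_retry(demo_submit_fn submit, void *crb,
			 struct demo_csb *csb, int max_tries)
{
	while (max_tries-- > 0) {
		volatile const char *page;

		csb->flags = 0;		/* clear valid bit before submitting */
		submit(crb, csb);

		if (!(csb->flags & DEMO_CSB_V))
			return -1;	/* submit() gave up polling */
		if (csb->cc == DEMO_CC_OK)
			return 0;
		if (csb->cc != DEMO_CC_FAULT_ADDRESS)
			return -1;	/* some other error */

		/* Read one byte to bring the faulting page into memory. */
		page = (volatile const char *)(uintptr_t)csb->address;
		(void)*page;
	}
	return -1;
}

/* Stand-in for the hardware: fault once on demo_buf, then succeed. */
char demo_buf[64];
int demo_calls;

void demo_submit(void *crb, struct demo_csb *csb)
{
	(void)crb;
	demo_calls++;
	csb->cc = (demo_calls == 1) ? DEMO_CC_FAULT_ADDRESS : DEMO_CC_OK;
	csb->address = (uint64_t)(uintptr_t)demo_buf;
	csb->flags = DEMO_CSB_V;
}
```

Bounding the number of attempts keeps an application from spinning forever
when the fault address is genuinely invalid rather than merely paged out.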

If the OS cannot update the CSB due to an invalid CSB address, it sends a
SEGV signal to the process that opened the send window on which the
original request was issued. This signal is delivered with the following
siginfo struct::

	siginfo.si_signo = SIGSEGV;
	siginfo.si_errno = EFAULT;
	siginfo.si_code = SEGV_MAPERR;
	siginfo.si_addr = CSB address;
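An application that prefers to handle this signal can install a SIGSEGV
handler with SA_SIGINFO and compare the siginfo fields listed above. The
following is a minimal sketch; the function names and the
``nx_csb_fault_seen`` flag are hypothetical, and treating
``si_errno == EFAULT`` as the distinguishing mark of an NX-originated
signal is an assumption drawn from the values shown above:

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>

/* Does this siginfo match the values listed above for a bad CSB? */
int is_nx_csb_fault(const siginfo_t *si)
{
	return si->si_signo == SIGSEGV && si->si_errno == EFAULT &&
	       si->si_code == SEGV_MAPERR;
}

/* Set by the handler; the application checks it after polling. */
volatile sig_atomic_t nx_csb_fault_seen;

static void nx_segv_handler(int sig, siginfo_t *si, void *ctx)
{
	(void)sig;
	(void)ctx;
	if (is_nx_csb_fault(si)) {
		nx_csb_fault_seen = 1;	/* si->si_addr holds the CSB address */
		return;
	}
	/*
	 * Not an NX fault: restore the default action so that an
	 * ordinary invalid access faults again on return and
	 * terminates the process as usual.
	 */
	signal(SIGSEGV, SIG_DFL);
}

/* Returns 0 on success, -1 on error (as sigaction() does). */
int install_nx_segv_handler(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = nx_segv_handler;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	return sigaction(SIGSEGV, &sa, NULL);
}
```

The handler only records the event; deciding whether to abandon or rebuild
the request is left to the code that submitted it.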

In the case of multi-threaded applications, NX send windows can be shared
across all threads. For example, a child thread can open a send window,
but other threads can send requests to NX using this window. These
requests will be successful even in the case of the OS handling faults, as
long as the CSB address is valid. If the NX request contains an invalid CSB
address, the signal will be sent to the child thread that opened the
window. But if that thread has exited without closing the window and a
request is issued using this window, the signal will be issued to the
thread group leader (tgid). It is up to the application whether to ignore
or handle these signals.

NX-GZIP User's Manual:
https://github.com/libnxz/power-gzip/blob/master/power_nx_gzip_um.pdf

Simple example
==============

	::

		#include <errno.h>
		#include <fcntl.h>
		#include <stdbool.h>
		#include <stdio.h>
		#include <string.h>
		#include <unistd.h>
		#include <sys/ioctl.h>
		#include <sys/mman.h>
		#include <asm/vas-api.h>

		int use_nx_gzip()
		{
			int rc, fd;
			void *addr;
			struct vas_tx_win_open_attr txattr;

			fd = open("/dev/crypto/nx-gzip", O_RDWR);
			if (fd < 0) {
				fprintf(stderr, "open nx-gzip failed\n");
				return -1;
			}
			memset(&txattr, 0, sizeof(txattr));
			txattr.version = 1;
			txattr.vas_id = -1;
			rc = ioctl(fd, VAS_TX_WIN_OPEN,
					(unsigned long)&txattr);
			if (rc < 0) {
				fprintf(stderr, "ioctl() failed, rc %d, errno %d\n",
						rc, errno);
				close(fd);
				return rc;
			}
			addr = mmap(NULL, 4096, PROT_READ|PROT_WRITE,
					MAP_SHARED, fd, 0ULL);
			if (addr == MAP_FAILED) {
				rc = -errno;
				fprintf(stderr, "mmap() failed, errno %d\n",
						errno);
				close(fd);
				return rc;
			}
			do {
				// Format the CRB request (crb) for compression
				// or decompression.
				// Refer to the tests for vas_copy/vas_paste.
				vas_copy(&crb, 0, 1);
				vas_paste(addr, 0, 1);
				// Poll on csb.flags with timeout.
				// The csb address is listed in the CRB.
			} while (true);	// repeat for each request
			// close(fd), or the window is closed upon process exit
			close(fd);
			return 0;
		}

	Refer to https://github.com/abalib/power-gzip for tests or more
	use cases.