Name

    NVX_gpu_multicast2

Name Strings

    GL_NVX_gpu_multicast2

Contact

    Joshua Schnarr, NVIDIA Corporation (jschnarr 'at' nvidia.com)
    Ingo Esser, NVIDIA Corporation (iesser 'at' nvidia.com)

Contributors

    Robert Menzel, NVIDIA
    Ralf Biermann, NVIDIA

Status

    Complete.

Version

    Last Modified Date: July 23, 2019
    Author Revision: 8

Number

    OpenGL Extension #543

Dependencies

    This extension is written against the OpenGL 4.6 specification
    (Compatibility Profile), dated October 24, 2016.

    This extension requires NV_gpu_multicast.

    This extension requires EXT_device_group.

    This extension requires NV_viewport_array.

    This extension requires NV_clip_space_w_scaling.

    This extension requires NVX_progress_fence.

Overview

    This extension provides additional mechanisms that influence multicast rendering, which is
    simultaneous rendering to multiple GPUs.

New Procedures and Functions

    uint AsyncCopyImageSubDataNVX(
        sizei waitSemaphoreCount, const uint *waitSemaphoreArray, const uint64 *waitValueArray,
        uint srcGpu, GLbitfield dstGpuMask,
        uint srcName, GLenum srcTarget, int srcLevel, int srcX, int srcY, int srcZ,
        uint dstName, GLenum dstTarget, int dstLevel, int dstX, int dstY, int dstZ,
        sizei srcWidth, sizei srcHeight, sizei srcDepth,
        sizei signalSemaphoreCount, const uint *signalSemaphoreArray, const uint64 *signalValueArray);

    sync AsyncCopyBufferSubDataNVX(
        sizei waitSemaphoreCount, const uint *waitSemaphoreArray, const uint64 *fenceValueArray,
        uint readGpu, GLbitfield writeGpuMask,
        uint readBuffer, uint writeBuffer,
        GLintptr readOffset, GLintptr writeOffset, sizeiptr size,
        sizei signalSemaphoreCount, const uint *signalSemaphoreArray, const uint64 *signalValueArray);

    void UploadGpuMaskNVX(bitfield mask);

    void MulticastViewportArrayvNVX(uint gpu, uint first, sizei count, const float *v);

    void MulticastScissorArrayvNVX(uint gpu, uint first, sizei count, const int *v);

    void MulticastViewportPositionWScaleNVX(uint gpu, uint index, float xcoeff, float ycoeff);


New Tokens

    Accepted by the <pname> parameter of GetIntegerv and GetInteger64v:

        UPLOAD_GPU_MASK_NVX                  0x954A

Additions to Chapter 20 (Multicast Rendering) added to the OpenGL 4.5 (Compatibility Profile)
Specification by NV_gpu_multicast

    Additions to Section 20.1 (Controlling Individual GPUs)

    Texture data uploads using the functions TexImage1D, TexImage2D, TexImage3D,
    TexSubImage1D, TexSubImage2D and TexSubImage3D are restricted to a specific set of GPUs with

        void UploadGpuMaskNVX(bitfield mask);

    This command also restricts buffer object data uploads using the functions BufferStorage,
    NamedBufferStorage, BufferSubData and NamedBufferSubData to the specified set of GPUs.

    Furthermore, this command restricts buffer object clears using the functions ClearBufferData,
    ClearNamedBufferData, ClearBufferSubData and ClearNamedBufferSubData to the same set of GPUs.

    The following errors apply to UploadGpuMaskNVX:

    INVALID_VALUE is generated
    * if <mask> is zero, or
    * if <mask> is greater than or equal to 2^n, where n is equal to MULTICAST_GPUS_NV.

    If the command does not generate an error, UPLOAD_GPU_MASK_NVX is set to <mask>.
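
    The two INVALID_VALUE conditions above can be sketched as a small host-side check.
    This is only an illustrative model, not driver code: the helper name
    upload_gpu_mask_is_valid and its <multicast_gpus> parameter (standing in for the
    value of MULTICAST_GPUS_NV) are hypothetical, and the 1..32 GPU range is an
    assumption made so that the mask fits a 32-bit bitfield.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the UploadGpuMaskNVX error check; not part of the
 * extension API. multicast_gpus stands in for the value of MULTICAST_GPUS_NV. */
static bool upload_gpu_mask_is_valid(uint32_t mask, unsigned multicast_gpus)
{
    /* Assumption of this sketch: between 1 and 32 multicast GPUs. */
    assert(multicast_gpus >= 1u && multicast_gpus <= 32u);

    if (mask == 0u)
        return false;              /* INVALID_VALUE: <mask> is zero */

    /* Compute 2^n in 64 bits so that n == 32 does not overflow. */
    const uint64_t limit = UINT64_C(1) << multicast_gpus;

    return (uint64_t)mask < limit; /* otherwise INVALID_VALUE: <mask> >= 2^n */
}
```

    With two multicast GPUs, for example, this model accepts the masks 1, 2 and 3 and
    rejects 0 and 4, because bit 2 would name a third GPU that does not exist.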

    The default value of UPLOAD_GPU_MASK_NVX is (2^n)-1, where n is equal to MULTICAST_GPUS_NV.

    If a function restricted by UploadGpuMaskNVX operates on textures or buffer objects
    with GPU-shared storage type (as opposed to per-GPU storage), UPLOAD_GPU_MASK_NVX is ignored.

    Modify Section 20.2 (Multi-GPU Buffer Storage)

    Append the following paragraphs:

    To initiate a copy of buffer data without waiting for it to complete, use the following command:

        void AsyncCopyBufferSubDataNVX(
            sizei waitSemaphoreCount, const uint *waitSemaphoreArray, const uint64 *fenceValueArray,
            uint readGpu, GLbitfield writeGpuMask,
            uint readBuffer, uint writeBuffer,
            GLintptr readOffset, GLintptr writeOffset, sizeiptr size,
            sizei signalSemaphoreCount, const uint *signalSemaphoreArray, const uint64 *signalValueArray);

    This command behaves equivalently to MulticastCopyBufferSubDataNV, except that it may be
    performed concurrently with commands submitted in the future.
    Fence semaphore objects created with CreateProgressFenceNVX are used for synchronization of one or
    multiple copies.
    An array of <waitSemaphoreCount> synchronization objects can be specified in the <waitSemaphoreArray>
    parameter as a pointer to the array of semaphore objects.
    The copy will wait for all fence semaphores in the <waitSemaphoreArray> array to reach or exceed
    their corresponding fence value in <fenceValueArray> before starting the transfer.
    A signal operation for each of the <signalSemaphoreCount> semaphores in <signalSemaphoreArray> is written
    after the copy with the corresponding fence value in <signalValueArray>.
    To wait for the copy to complete, use WaitSemaphoreui64NVX or ClientWaitSemaphoreui64NVX to wait
    for the semaphores in <signalSemaphoreArray> to be signalled with the fence values in <signalValueArray>.

    Modify Section 20.3.1 (Copying Image Data Between GPUs)

    Insert the following paragraphs above the line starting "To copy pixel values":

    To initiate a copy of texel data without waiting for it to complete, use the following command:

        void AsyncCopyImageSubDataNVX(
            sizei waitSemaphoreCount, const uint *waitSemaphoreArray, const uint64 *waitValueArray,
            uint srcGpu, GLbitfield dstGpuMask,
            uint srcName, GLenum srcTarget, int srcLevel, int srcX, int srcY, int srcZ,
            uint dstName, GLenum dstTarget, int dstLevel, int dstX, int dstY, int dstZ,
            sizei srcWidth, sizei srcHeight, sizei srcDepth,
            sizei signalSemaphoreCount, const uint *signalSemaphoreArray, const uint64 *signalValueArray);

    This command behaves equivalently to MulticastCopyImageSubDataNV, except that it may be
    performed concurrently with commands submitted in the future.
    Fence semaphore objects created with CreateProgressFenceNVX are used for synchronization of one or
    multiple copies. An array of <waitSemaphoreCount> synchronization objects can be specified in the
    <waitSemaphoreArray> parameter as a pointer to the array of semaphore objects.
    The copy will wait for all fence semaphores in the <waitSemaphoreArray> array to reach or exceed
    their corresponding fence value in <waitValueArray> before starting the transfer.
    A signal operation for each of the <signalSemaphoreCount> semaphores in <signalSemaphoreArray> is written
    after the copy with the corresponding fence value in <signalValueArray>.
    To wait for the copy to complete, use WaitSemaphoreui64NVX or ClientWaitSemaphoreui64NVX to wait
    for the semaphores in <signalSemaphoreArray> to be signalled with the fence values in <signalValueArray>.

Additions to Chapter 13 (Fixed-Function Vertex Post-Processing) of the OpenGL 4.5 (Compatibility Profile)
Specification

    Modify Section 13.6 (Coordinate transformations)

    Viewport transformation parameters for multiple viewports are specified using

        MulticastViewportArrayvNVX(uint gpu, uint first, sizei count, const float *v);

    where the array of viewport parameters can be controlled separately for each multicast GPU.

    A set of scissor rectangles that are each applied to the corresponding viewport is specified
    using

        MulticastScissorArrayvNVX(uint gpu, uint first, sizei count, const int *v);

    where the rectangle parameters can be controlled separately for each multicast GPU.


    If VIEWPORT_POSITION_W_SCALE_NV is enabled, the w coordinates for each
    primitive sent to a given viewport will be scaled as a function of
    its x and y coordinates using the following equation:

        w' = xcoeff * x + ycoeff * y + w;

    The coefficients for "x" and "y" used in the above equation depend on the
    viewport index and can be controlled separately for each multicast GPU by the command

        MulticastViewportPositionWScaleNVX(uint gpu, uint index, float xcoeff, float ycoeff);

    An INVALID_VALUE error is generated if <gpu> is greater than or equal to MULTICAST_GPUS_NV.

Additions to the OpenGL Shading Language Specification, Version 4.50

    Including the following line in a shader can be used to enumerate multicast GPUs
    by using the shader built-in variable gl_DeviceIndex:

        #extension GL_EXT_device_group : enable

    On each multicast GPU, gl_DeviceIndex holds that GPU's unique device index.
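
    The per-GPU w scaling described under Section 13.6 above can be illustrated with a
    small CPU-side model. This is a sketch under stated assumptions, not the
    implementation: the names WScale, set_w_scale and scaled_w are hypothetical, the
    array bounds MODEL_GPUS and MODEL_VIEWPORTS are arbitrary stand-ins for
    MULTICAST_GPUS_NV and the viewport limit, and the zero initial coefficients and the
    rejection of an out-of-range viewport index are assumptions of the model.

```c
#include <stdbool.h>

#define MODEL_GPUS      4u   /* assumed stand-in for MULTICAST_GPUS_NV */
#define MODEL_VIEWPORTS 16u  /* assumed stand-in for the viewport limit */

typedef struct { float xcoeff, ycoeff; } WScale;

/* One coefficient pair per (gpu, viewport index). Static storage is
 * zero-initialized, so w is left unchanged until coefficients are set
 * (an assumption of this model). */
static WScale g_wscale[MODEL_GPUS][MODEL_VIEWPORTS];

/* Models MulticastViewportPositionWScaleNVX: returns false where the real
 * command would generate INVALID_VALUE for an out-of-range <gpu>. */
static bool set_w_scale(unsigned gpu, unsigned index, float xcoeff, float ycoeff)
{
    if (gpu >= MODEL_GPUS || index >= MODEL_VIEWPORTS)
        return false;
    g_wscale[gpu][index].xcoeff = xcoeff;
    g_wscale[gpu][index].ycoeff = ycoeff;
    return true;
}

/* Applies w' = xcoeff * x + ycoeff * y + w with the coefficients that
 * belong to the given GPU and viewport index. */
static float scaled_w(unsigned gpu, unsigned index, float x, float y, float w)
{
    const WScale *c = &g_wscale[gpu][index];
    return c->xcoeff * x + c->ycoeff * y + w;
}
```

    Setting xcoeff to 0.5 on GPU 0 and to -0.5 on GPU 1 for the same viewport index
    makes the same vertex (x = 1, y = 0, w = 1) come out with w' = 1.5 on GPU 0 and
    w' = 0.5 on GPU 1, which is the kind of per-GPU difference the multicast variant
    of the command allows.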

Errors

    Relaxation of INVALID_ENUM errors
    ---------------------------------
    GetIntegerv and GetInteger64v now accept new tokens as
    described in the "New Tokens" section.

New State

    Additions to Table 23.6 Buffer Object State

                                            Initial
    Get Value            Type  Get Command  Value    Description              Sec.  Attribute
    -------------------  ----  -----------  -------  -----------------------  ----  ---------
    UPLOAD_GPU_MASK_NVX  Z+    GetIntegerv  *        Mask of GPUs that        20.1  -
                                                     restricts buffer data
                                                     writes

    * See section 20.1


New Implementation Dependent State

    None.

Sample Code

    None.

Issues

    None.

Revision History

    Rev.  Date      Author     Changes
    ----  --------  ---------  -----------------------------------------------
    1     09/20/17  jschnarr   initial draft
    2     02/23/18  rbiermann  updated draft with new functions
    3     05/23/18  rbiermann  updated draft with new ViewportArray and AsyncCopy functions
    4     06/08/18  rbiermann  added NVX_progress_fence for synchronization objects
    5     08/15/18  rbiermann  updated draft with gl_DeviceIndex
    6     04/16/19  rbiermann  updated draft with UploadGpuMaskNVX
    7     07/19/19  rbiermann  updated draft with modifications of UploadGpuMaskNVX section
    8     07/23/19  rbiermann  updated draft with support of Clear(Named)Buffer(Sub)Data by UploadGpuMaskNVX