Name

    NVX_gpu_multicast2

Name Strings

    GL_NVX_gpu_multicast2

Contact

    Joshua Schnarr, NVIDIA Corporation (jschnarr 'at' nvidia.com)
    Ingo Esser, NVIDIA Corporation (iesser 'at' nvidia.com)

Contributors

    Robert Menzel, NVIDIA
    Ralf Biermann, NVIDIA

Status

    Complete.

Version

    Last Modified Date: July 23, 2019
    Author Revision: 8

Number

    OpenGL Extension #543

Dependencies

    This extension is written against the OpenGL 4.6 specification
    (Compatibility Profile), dated October 24, 2016.

    This extension requires NV_gpu_multicast.

    This extension requires EXT_device_group.

    This extension requires NV_viewport_array.

    This extension requires NV_clip_space_w_scaling.

    This extension requires NVX_progress_fence.

Overview

    This extension provides additional mechanisms that influence multicast rendering, that is,
    simultaneous rendering to multiple GPUs.

New Procedures and Functions

    uint AsyncCopyImageSubDataNVX(
        sizei waitSemaphoreCount, const uint *waitSemaphoreArray, const uint64 *waitValueArray,
        uint srcGpu, GLbitfield dstGpuMask,
        uint srcName, GLenum srcTarget, int srcLevel, int srcX, int srcY, int srcZ,
        uint dstName, GLenum dstTarget, int dstLevel, int dstX, int dstY, int dstZ,
        sizei srcWidth, sizei srcHeight, sizei srcDepth,
        sizei signalSemaphoreCount, const uint *signalSemaphoreArray, const uint64 *signalValueArray);

    sync AsyncCopyBufferSubDataNVX(
        sizei waitSemaphoreCount, const uint *waitSemaphoreArray, const uint64 *fenceValueArray,
        uint readGpu, GLbitfield writeGpuMask,
        uint readBuffer, uint writeBuffer,
        GLintptr readOffset, GLintptr writeOffset, sizeiptr size,
        sizei signalSemaphoreCount, const uint *signalSemaphoreArray, const uint64 *signalValueArray);

    void UploadGpuMaskNVX(bitfield mask);

    void MulticastViewportArrayvNVX(uint gpu, uint first, sizei count, const float *v);

    void MulticastScissorArrayvNVX(uint gpu, uint first, sizei count, const int *v);

    void MulticastViewportPositionWScaleNVX(uint gpu, uint index, float xcoeff, float ycoeff);

New Tokens

    Accepted by the <pname> parameter of GetIntegerv and GetInteger64v:

        UPLOAD_GPU_MASK_NVX                        0x954A

Additions to Chapter 20 (Multicast Rendering) added to the OpenGL 4.5 (Compatibility Profile)
Specification by NV_gpu_multicast

    Additions to Section 20.1 (Controlling Individual GPUs)

    Texture data uploads using the functions TexImage1D, TexImage2D, TexImage3D,
    TexSubImage1D, TexSubImage2D and TexSubImage3D are restricted to a specific set of GPUs with

      void UploadGpuMaskNVX(bitfield mask);

    This command also restricts buffer object data uploads using the functions BufferStorage,
    NamedBufferStorage, BufferSubData and NamedBufferSubData to the specified set of GPUs.

    Furthermore, this command restricts buffer object clears using the functions ClearBufferData,
    ClearNamedBufferData, ClearBufferSubData and ClearNamedBufferSubData to the specified set of GPUs.

    The following errors apply to UploadGpuMaskNVX:

    INVALID_VALUE is generated
    * if <mask> is zero,
    * if <mask> is greater than or equal to 2^n, where n is equal to MULTICAST_GPUS_NV.

    If the command does not generate an error, UPLOAD_GPU_MASK_NVX is set to <mask>.

    The default value of UPLOAD_GPU_MASK_NVX is (2^n)-1.

    If a function restricted by UploadGpuMaskNVX operates on textures or buffer objects
    with GPU-shared storage type (as opposed to per-GPU storage), UPLOAD_GPU_MASK_NVX is ignored.

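    The mask rules above can be checked on the application side before calling
    UploadGpuMaskNVX. The following is a minimal sketch, not part of the extension: the
    helper names upload_gpu_mask_is_valid and default_upload_gpu_mask are hypothetical, and
    gpu_count stands for the value the application queried for MULTICAST_GPUS_NV (assumed
    here to lie between 1 and 32).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if <mask> satisfies the UploadGpuMaskNVX rules,
 * i.e. it is non-zero and less than 2^n, where n == gpu_count. */
bool upload_gpu_mask_is_valid(uint32_t mask, unsigned gpu_count)
{
    assert(gpu_count >= 1 && gpu_count <= 32);
    /* 2^n is computed in 64 bits so that n == 32 does not overflow. */
    uint64_t limit = (uint64_t)1 << gpu_count;
    return mask != 0 && (uint64_t)mask < limit;
}

/* Hypothetical helper: the default UPLOAD_GPU_MASK_NVX value, (2^n)-1,
 * which has one bit set for each of the n GPUs. */
uint32_t default_upload_gpu_mask(unsigned gpu_count)
{
    assert(gpu_count >= 1 && gpu_count <= 32);
    return (uint32_t)(((uint64_t)1 << gpu_count) - 1);
}
```

    With two multicast GPUs, for example, the masks 0x1, 0x2 and 0x3 pass the check, while
    0x0 and 0x4 correspond to the two INVALID_VALUE cases.
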
    Modify Section 20.2 (Multi-GPU Buffer Storage)

    Append the following paragraphs:

    To initiate a copy of buffer data without waiting for it to complete, use the following command:

    void AsyncCopyBufferSubDataNVX(
        sizei waitSemaphoreCount, const uint *waitSemaphoreArray, const uint64 *fenceValueArray,
        uint readGpu, GLbitfield writeGpuMask,
        uint readBuffer, uint writeBuffer,
        GLintptr readOffset, GLintptr writeOffset, sizeiptr size,
        sizei signalSemaphoreCount, const uint *signalSemaphoreArray, const uint64 *signalValueArray);

    This command behaves equivalently to MulticastCopyBufferSubDataNV, except that it may be
    performed concurrently with commands submitted in the future.
    Fence semaphore objects created with CreateProgressFenceNVX are used to synchronize one or
    more copies.
    An array of <waitSemaphoreCount> semaphore objects can be specified through the
    <waitSemaphoreArray> parameter, which points to the array of semaphore objects.
    The copy waits for every fence semaphore in <waitSemaphoreArray> to reach or exceed
    its corresponding fence value in <fenceValueArray> before starting the transfer.
    After the copy, a signal operation is written for each of the <signalSemaphoreCount> semaphores
    in <signalSemaphoreArray> with the corresponding fence value in <signalValueArray>.
    To wait for the copy to complete, use WaitSemaphoreui64NVX or ClientWaitSemaphoreui64NVX to wait
    for the semaphores in <signalSemaphoreArray> to be signalled with the fence values in <signalValueArray>.

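    The wait and signal rules described above can be modelled in plain C. This is only an
    illustrative sketch under simplifying assumptions: each fence semaphore is represented
    by a bare 64-bit counter rather than by a semaphore object name, no OpenGL calls are
    made, and the function names copy_may_start and signal_after_copy are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True once every modelled semaphore has reached or exceeded its wait
 * value, which is the condition under which the copy may start.
 * With count == 0 there is nothing to wait for in this model. */
bool copy_may_start(size_t count, const uint64_t *current,
                    const uint64_t *wait_values)
{
    for (size_t i = 0; i < count; ++i) {
        if (current[i] < wait_values[i])
            return false;
    }
    return true;
}

/* Models the signal operations written after the copy: each semaphore
 * receives its corresponding signal value. */
void signal_after_copy(size_t count, uint64_t *current,
                       const uint64_t *signal_values)
{
    for (size_t i = 0; i < count; ++i)
        current[i] = signal_values[i];
}
```

    In this model, a consumer that waits on the signal values sees the copy as complete
    exactly when copy_may_start would return true for those values.
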
    Modify Section 20.3.1 (Copying Image Data Between GPUs)

    Insert the following paragraphs above the line starting "To copy pixel values":

    To initiate a copy of texel data without waiting for it to complete, use the following command:

    void AsyncCopyImageSubDataNVX(
        sizei waitSemaphoreCount, const uint *waitSemaphoreArray, const uint64 *waitValueArray,
        uint srcGpu, GLbitfield dstGpuMask,
        uint srcName, GLenum srcTarget, int srcLevel, int srcX, int srcY, int srcZ,
        uint dstName, GLenum dstTarget, int dstLevel, int dstX, int dstY, int dstZ,
        sizei srcWidth, sizei srcHeight, sizei srcDepth,
        sizei signalSemaphoreCount, const uint *signalSemaphoreArray, const uint64 *signalValueArray);

    This command behaves equivalently to MulticastCopyImageSubDataNV, except that it may be
    performed concurrently with commands submitted in the future.
    Fence semaphore objects created with CreateProgressFenceNVX are used to synchronize one or
    more copies.
    An array of <waitSemaphoreCount> semaphore objects can be specified through the
    <waitSemaphoreArray> parameter, which points to the array of semaphore objects.
    The copy waits for every fence semaphore in <waitSemaphoreArray> to reach or exceed
    its corresponding fence value in <waitValueArray> before starting the transfer.
    After the copy, a signal operation is written for each of the <signalSemaphoreCount> semaphores
    in <signalSemaphoreArray> with the corresponding fence value in <signalValueArray>.
    To wait for the copy to complete, use WaitSemaphoreui64NVX or ClientWaitSemaphoreui64NVX to wait
    for the semaphores in <signalSemaphoreArray> to be signalled with the fence values in <signalValueArray>.

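    A common pattern is to copy from one source GPU to all other GPUs. The sketch below
    builds such a destination mask; it assumes, following the mask convention of
    NV_gpu_multicast, that bit i of the mask selects GPU i. The helper name
    dst_mask_all_but is hypothetical, and gpu_count stands for the queried value of
    MULTICAST_GPUS_NV (assumed to lie between 1 and 32).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns a mask with one bit set for every GPU in
 * [0, gpu_count) except src_gpu, suitable as a <dstGpuMask> candidate. */
uint32_t dst_mask_all_but(unsigned src_gpu, unsigned gpu_count)
{
    assert(gpu_count >= 1 && gpu_count <= 32 && src_gpu < gpu_count);
    /* All-GPUs mask, computed in 64 bits so gpu_count == 32 is safe. */
    uint32_t all = (uint32_t)(((uint64_t)1 << gpu_count) - 1);
    return all & ~((uint32_t)1 << src_gpu);
}
```

    On a single-GPU system the result is zero, so an application would skip the copy in
    that case rather than pass an empty mask.
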
Additions to Chapter 13 (Fixed-Function Vertex Post-Processing) of the OpenGL 4.5 (Compatibility Profile)
Specification

    Modify Section 13.6 (Coordinate transformations)

    Viewport transformation parameters for multiple viewports are specified using

        MulticastViewportArrayvNVX(uint gpu, uint first, sizei count, const float *v);

    where the array of viewport parameters can be controlled individually for each multicast GPU.

    A set of scissor rectangles that are each applied to the corresponding viewport is specified
    using

        MulticastScissorArrayvNVX(uint gpu, uint first, sizei count, const int *v);

    where the rectangle parameters can be controlled individually for each multicast GPU.

    If VIEWPORT_POSITION_W_SCALE_NV is enabled, the w coordinates for each
    primitive sent to a given viewport will be scaled as a function of
    its x and y coordinates using the following equation:

        w' = xcoeff * x + ycoeff * y + w;

    The coefficients for "x" and "y" used in the above equation depend on the
    viewport index and can be controlled individually for each multicast GPU by the command

        MulticastViewportPositionWScaleNVX(uint gpu, uint index, float xcoeff, float ycoeff);

    An INVALID_VALUE error is generated if <gpu> is greater than or equal to MULTICAST_GPUS_NV.

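    The w scaling equation and its per-GPU, per-viewport coefficients can be illustrated
    with a small CPU-side model. This is a sketch only: the table sizes GPU_COUNT and
    VIEWPORT_COUNT, the type WScaleCoeffs and the functions set_w_scale and scaled_w are
    hypothetical names chosen for the example, and no OpenGL calls are made.

```c
#include <stdbool.h>

enum { GPU_COUNT = 2, VIEWPORT_COUNT = 4 };  /* illustrative sizes only */

typedef struct { float xcoeff, ycoeff; } WScaleCoeffs;

/* One coefficient pair per (GPU, viewport index). */
static WScaleCoeffs coeffs[GPU_COUNT][VIEWPORT_COUNT];

/* Stores the coefficients for one viewport of one GPU. Returns false for
 * an out-of-range <gpu>, the case in which the real command generates
 * INVALID_VALUE; the index check only protects this model's array. */
bool set_w_scale(unsigned gpu, unsigned index, float xcoeff, float ycoeff)
{
    if (gpu >= GPU_COUNT || index >= VIEWPORT_COUNT)
        return false;
    coeffs[gpu][index].xcoeff = xcoeff;
    coeffs[gpu][index].ycoeff = ycoeff;
    return true;
}

/* Applies w' = xcoeff * x + ycoeff * y + w for the given GPU and viewport.
 * The caller must pass in-range gpu and index values. */
float scaled_w(unsigned gpu, unsigned index, float x, float y, float w)
{
    const WScaleCoeffs c = coeffs[gpu][index];
    return c.xcoeff * x + c.ycoeff * y + w;
}
```

    Giving the two GPUs x coefficients of opposite sign, as a stereo renderer might for its
    left and right views, yields different w' values for the same input vertex.
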
Additions to the OpenGL Shading Language Specification, Version 4.50

    Including the following line in a shader makes the built-in variable gl_DeviceIndex
    available, which can be used to distinguish the multicast GPUs:

        #extension GL_EXT_device_group : enable

    On each multicast GPU, gl_DeviceIndex holds a device index unique to that GPU.

Errors

    Relaxation of INVALID_ENUM errors
    ---------------------------------
    GetIntegerv and GetInteger64v now accept new tokens as
    described in the "New Tokens" section.

New State

    Additions to Table 23.6 Buffer Object State
                                                   Initial
    Get Value                   Type  Get Command Value  Description               Sec.  Attribute
    -------------------------- ------ ----------- -----  -----------------------   ----  ---------
    UPLOAD_GPU_MASK_NVX          Z+   GetIntegerv   *    Mask of GPUs that         20.1     -
                                                         restricts buffer data
                                                         writes
    * See section 20.1

New Implementation Dependent State

    None.

Sample Code

    None.

Issues

    None.

Revision History

    Rev.    Date    Author    Changes
    ----  --------  --------  -----------------------------------------------
     1    09/20/17  jschnarr  initial draft
     2    02/23/18  rbiermann updated draft with new functions
     3    05/23/18  rbiermann updated draft with new ViewportArray and AsyncCopy functions
     4    06/08/18  rbiermann added NVX_progress_fence for synchronization objects
     5    08/15/18  rbiermann updated draft with gl_DeviceIndex
     6    04/16/19  rbiermann updated draft with UploadGpuMaskNVX
     7    07/19/19  rbiermann updated draft with modifications of UploadGpuMaskNVX section
     8    07/23/19  rbiermann updated draft with support of Clear(Named)Buffer(Sub)Data by UploadGpuMaskNVX