Name

    NV_shader_atomic_float64

Name Strings

    GL_NV_shader_atomic_float64

Contact

    Kedarnath Thangudu, NVIDIA Corporation (kthangudu 'at' nvidia.com)

Contributors

    Pat Brown, NVIDIA

Status

    Shipping in NVIDIA release 367.XX drivers and up.

Version

    Last Modified Date:    October 15, 2014
    NVIDIA Revision:       1

Number

    OpenGL Extension #488

Dependencies

    This extension is written against the OpenGL 4.5 (Compatibility Profile)
    Specification.

    This extension is written against version 4.50 (revision 3) of the OpenGL
    Shading Language Specification.

    This extension requires ARB_gpu_shader_fp64 or NV_gpu_program_fp64.

    This extension interacts with NV_shader_buffer_store, NV_gpu_shader5,
    ARB_shader_storage_buffer_object, and ARB_compute_shader.

    This extension interacts with NV_gpu_program5.

Overview

    This extension provides GLSL built-in functions and assembly opcodes
    allowing shaders to perform atomic read-modify-write operations to buffer
    or shared memory with double-precision floating-point components.  The set
    of atomic operations provided by this extension is limited to adds and
    exchanges.  Providing atomic add support allows shaders to atomically
    accumulate the sum of double-precision floating-point values into buffer
    memory across multiple (possibly concurrent) shader invocations.

    This extension provides GLSL support for atomics targeting double-precision
    floating-point pointers (if NV_gpu_shader5 is supported).  Assembly opcodes
    for these operations are also provided if NV_gpu_program5 is supported.

New Procedures and Functions

    None.

New Tokens

    None.

Additions to the OpenGL 4.5 (Compatibility Profile) Specification

    None.

Additions to the AGL/GLX/WGL Specifications

    None.

GLX Protocol

    None.

Modifications to the OpenGL Shading Language Specification, Version 4.50
(revision 3)

    Including the following line in a shader can be used to control the
    language features described in this extension:

      #extension GL_NV_shader_atomic_float64 : <behavior>

    where <behavior> is as specified in section 3.3.

    New preprocessor #defines are added to the OpenGL Shading Language:

      #define GL_NV_shader_atomic_float64    1

    Modify Section 8.11, Atomic Memory Functions (p. 172)

    (add to "atomicAdd" table cell, p. 173)

      double atomicAdd(coherent inout double mem, double data)

    (add to "atomicExchange" table cell, p. 173)

      double atomicExchange(coherent inout double mem, double data)


Dependencies on NV_shader_buffer_store, NV_gpu_shader5,
ARB_shader_storage_buffer_object, and ARB_compute_shader

    If NV_shader_buffer_store and NV_gpu_shader5 are supported, the following
    functions should be added to the "Section 8.Y, Shader Memory Functions"
    language in the NV_shader_buffer_store specification:

      double atomicAdd(double *address, double data);
      double atomicExchange(double *address, double data);

    If ARB_shader_storage_buffer_object or ARB_compute_shader is supported,
    make similar edits to the functions documented in the
    ARB_shader_storage_buffer_object extension.

    These functions are available if and only if GL_NV_shader_atomic_float64 is
    enabled via the "#extension" directive.

Dependencies on NV_gpu_program5

    If NV_gpu_program5 is supported and "OPTION NV_shader_atomic_float64" is
    specified in an assembly program, "F64" should be allowed as a storage
    modifier to the ATOM instruction for the atomic operations "ADD" and
    "EXCH".

    (Add to "Section 2.X.6, Program Options" of the NV_gpu_program4 extension,
    as extended by NV_gpu_program5:)

    + Double-precision Floating-Point Atomic Operations (NV_shader_atomic_float64)

    If a program specifies the "NV_shader_atomic_float64" option, it may use
    the "F64" storage modifier with the "ATOM" opcode to perform atomic
    double-precision floating-point add or exchange operations.

    (Add to the table in "Section 2.X.8.Z, ATOM" in NV_gpu_program5:)

      atomic     storage
      modifier   modifiers                  operation
      --------   -----------------------    ---------------------------------
      ADD        U32, S32, U64, F32, F64    compute a sum
                 F16X2, F16X4
      ...
      EXCH       U32, S32, U64, F32, F64    exchange memory with operand
                 F16X2, F16X4

    Note:
      Storage modifier U64 is provided by NV_shader_atomic_int64
      Storage modifier F32 is provided by NV_shader_atomic_float
      Storage modifiers F16X2 and F16X4 are provided by NV_shader_atomic_fp16_vector

Errors

    None.

New State

    None.

New Implementation Dependent State

    None.
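
Sample Code

    The following compute shader is a minimal, non-normative sketch of how
    the double-precision atomicAdd() overload described in this extension
    might be used to accumulate a sum into buffer memory.  The block names
    (Samples, Result), the variable names (samples, total), the binding
    points, and the work group size are illustrative assumptions and are not
    defined by this extension.

```glsl
#version 450
#extension GL_NV_shader_atomic_float64 : require

// Hypothetical work group size; any valid size would do.
layout(local_size_x = 64) in;

// Hypothetical input buffer holding the values to be summed.
layout(std430, binding = 0) readonly buffer Samples {
    double samples[];
};

// Hypothetical output buffer holding the running sum.
layout(std430, binding = 1) coherent buffer Result {
    double total;
};

void main()
{
    uint i = gl_GlobalInvocationID.x;

    // Skip invocations that fall past the end of the input array.
    if (i < uint(samples.length())) {
        // Atomically add this invocation's value to the shared sum.
        // The value returned (the previous contents of "total") is unused.
        atomicAdd(total, samples[i]);
    }
}
```

    This sketch assumes the application sets "total" to zero before the
    dispatch.  Because floating-point addition is not associative and the
    order in which invocations perform their adds is not defined, the
    low-order bits of the result may differ from one run to the next.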

Issues

    (1) What double-precision floating-point targets are supported for
        atomic operations?

      RESOLVED:  This extension only supports atomic operations on
      double-precision floating-point buffer memory.  Atomic operations on
      double-precision texture memory are not supported since OpenGL provides
      no pixel/texture formats with double-precision components.

    (2) What atomic operations should we support for double-precision
        floating-point targets?

      RESOLVED:  Double-precision floating-point atomic addition is the main
      functionality targeted by this extension.  We provide exchanges because
      the operation needs no special hardware support.

      We chose not to provide support for bitwise operations (AND/OR/XOR);
      it's possible to support these by casting a pointer or aliasing an image
      if required.  Minimum, maximum, and compare-and-swap make sense, but the
      underlying atomic hardware targeted by this extension does not support
      floating-point comparisons.

Revision History

    Revision 1
    - Internal revisions.