Name

    NV_shader_atomic_float

Name Strings

    GL_NV_shader_atomic_float

Contact

    Pat Brown, NVIDIA Corporation (pbrown 'at' nvidia.com)

Contributors

    Cyril Crassin, NVIDIA
    Eric Werness, NVIDIA
    Jeff Bolz, NVIDIA

Status

    Shipping

Version

    Last Modified Date:         August 13, 2012
    NVIDIA Revision:            2

Number

    419

Dependencies

    This extension is written against the OpenGL 4.2 (Compatibility Profile)
    Specification.

    This extension is written against version 4.20 (revision 6) of the OpenGL
    Shading Language Specification.

    This extension interacts with EXT_shader_image_load_store,
    ARB_shader_image_load_store, and GLSL 4.20.

    This extension interacts with NV_shader_buffer_store and NV_gpu_shader5.

    This extension interacts with NV_gpu_program5.

    This extension interacts with EXT_shader_image_load_store and
    NV_gpu_program5.

    This extension interacts with GLSL 4.30, ARB_shader_storage_buffer_object,
    and ARB_compute_shader.

Overview

    This extension provides GLSL built-in functions and assembly opcodes
    allowing shaders to perform atomic read-modify-write operations to buffer
    or texture memory with floating-point components.  The set of atomic
    operations provided by this extension is limited to adds and exchanges.
    Providing atomic add support allows shaders to atomically accumulate the
    sum of floating-point values into buffer or texture memory across multiple
    (possibly concurrent) shader invocations.

    This extension provides GLSL support for atomics targeting image uniforms
    (if GLSL 4.20, ARB_shader_image_load_store, or EXT_shader_image_load_store
    is supported) or floating-point pointers (if NV_gpu_shader5 is supported).
    Additionally, assembly opcodes for these operations are provided if
    NV_gpu_program5 is supported.

New Procedures and Functions

    None.

New Tokens

    None.

Additions to Chapter 2 of the OpenGL 4.2 (Compatibility Profile) Specification
(OpenGL Operation)

    None.

Additions to Chapter 3 of the OpenGL 4.2 (Compatibility Profile) Specification
(Rasterization)

    None.

Additions to Chapter 4 of the OpenGL 4.2 (Compatibility Profile) Specification
(Per-Fragment Operations and the Frame Buffer)

    None.

Additions to Chapter 5 of the OpenGL 4.2 (Compatibility Profile) Specification
(Special Functions)

    None.

Additions to Chapter 6 of the OpenGL 4.2 (Compatibility Profile) Specification
(State and State Requests)

    None.

Additions to the AGL/GLX/WGL Specifications

    None.

GLX Protocol

    None.

Modifications to the OpenGL Shading Language Specification, Version 4.20
(revision 6)

    Including the following line in a shader can be used to control the
    language features described in this extension:

      #extension GL_NV_shader_atomic_float : <behavior>

    where <behavior> is as specified in section 3.3.

    New preprocessor #defines are added to the OpenGL Shading Language:

      #define GL_NV_shader_atomic_float         1

    Modify Section 8.11, Image Functions (p. 149)

    (add to "imageAtomicAdd" table cell, p. 151)

      float imageAtomicAdd(IMAGE_PARAMS, float data)

    (add to "imageAtomicExchange" table cell, p. 152)

      float imageAtomicExchange(IMAGE_PARAMS, float data)

Dependencies on EXT_shader_image_load_store, ARB_shader_image_load_store, and
GLSL 4.20

    If none of EXT_shader_image_load_store, ARB_shader_image_load_store, or
    GLSL 4.20 are supported, the new floating-point variants of the built-in
    functions imageAtomicAdd and imageAtomicExchange should be removed.
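
    For illustration only, a minimal fragment-shader sketch of the
    floating-point imageAtomicAdd variant might look as follows.  It assumes
    GLSL 4.20 image support and assumes that the image is declared and bound
    with the r32f format; the names "accumImage" and "contribution" are
    hypothetical and do not come from this specification.

```glsl
#version 420 core
#extension GL_NV_shader_atomic_float : require

// Hypothetical accumulation target: a single-channel 32-bit float image
// (the r32f format is an assumption of this sketch).
layout(r32f) uniform image2D accumImage;

// Hypothetical per-fragment value to accumulate.
in float contribution;

void main()
{
    ivec2 texel = ivec2(gl_FragCoord.xy);

    // Atomically adds "contribution" to the texel and returns the value
    // the texel held before the add.
    float previous = imageAtomicAdd(accumImage, texel, contribution);
}
```

    Because floating-point addition is not associative, the accumulated sum
    may differ slightly from run to run depending on the order in which
    concurrent invocations perform their adds.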

Dependencies on NV_shader_buffer_store and NV_gpu_shader5

    If NV_shader_buffer_store and NV_gpu_shader5 are supported, the following
    functions should be added to the "Section 8.Y, Shader Memory Functions"
    language in the NV_shader_buffer_store specification:

      float atomicAdd(float *address, float data);
      float atomicExchange(float *address, float data);

Dependencies on NV_gpu_program5

    If NV_gpu_program5 is supported and "OPTION NV_shader_atomic_float" is
    specified in an assembly program, "F32" should be allowed as a storage
    modifier to the ATOM instruction for the atomic operations "ADD" and
    "EXCH".

    (Add to "Section 2.X.6, Program Options" of the NV_gpu_program4 extension,
     as extended by NV_gpu_program5:)

    + Floating-Point Atomic Operations (NV_shader_atomic_float)

    If a program specifies the "NV_shader_atomic_float" option, it may use
    the "F32" storage modifier with the "ATOM" and "ATOMIM" opcodes to perform
    atomic floating-point add or exchange operations.

    (Add to the table in "Section 2.X.8.Z, ATOM" in NV_gpu_program5:)

      atomic     storage
      modifier   modifiers           operation
      --------   ------------------  --------------------------------------
      ADD        U32, S32, U64, F32  compute a sum
      ...
      EXCH       U32, S32, U64, F32  exchange memory with operand

Dependencies on EXT_shader_image_load_store and NV_gpu_program5

    If EXT_shader_image_load_store and NV_gpu_program5 are supported and
    "OPTION NV_shader_atomic_float" is specified in an assembly program, "F32"
    should be allowed as a storage modifier to the ATOMIM instruction for the
    atomic operations "ADD" and "EXCH".

    (Add to the table in "Section 2.X.8.Z, ATOMIM" in the "Dependencies on
    NV_gpu_program5" portion of the EXT_shader_image_load_store specification)

      atomic     storage
      modifier   modifiers      operation
      --------   -------------  --------------------------------------
      ADD        U32, S32, F32  compute a sum
      ...
      EXCH       U32, S32, F32  exchange memory with operand

Dependencies on GLSL 4.30, ARB_shader_storage_buffer_object, and
ARB_compute_shader

    If GLSL 4.30 is supported, add the following atomic memory functions to
    section 8.11 (Atomic Memory Functions) of the GLSL 4.30 specification:

      float atomicAdd(inout float mem, float data);
      float atomicExchange(inout float mem, float data);

    If ARB_shader_storage_buffer_object or ARB_compute_shader is supported,
    make similar edits to the functions documented in the
    ARB_shader_storage_buffer_object extension.

    These functions are available if and only if GL_NV_shader_atomic_float is
    enabled via the "#extension" directive.

Errors

    None.

New State

    None.

New Implementation Dependent State

    None.

Issues

    (1) What atomic operations should we support for floating-point targets?

      RESOLVED:  Floating-point atomic addition is the main functionality
      targeted by this extension.  We provide exchanges because the operation
      needs no special hardware support.

      We chose not to provide support for bitwise operations (AND/OR/XOR);
      it's possible to support these by casting a pointer or aliasing an image
      if required.  Minimum, maximum, and compare-and-swap make sense, but the
      underlying atomic hardware targeted by this extension does not support
      floating-point comparisons.

Revision History

    Rev.    Date    Author    Changes
    ----  --------  --------  -----------------------------------------
     2    08/13/12  pbrown    Add interaction with OpenGL 4.3 (and related ARB
                              extensions) supporting floating-point atomics
                              to shared and shader storage buffer memory.

     1              pbrown    Internal revisions.