Name

    NV_shader_atomic_float

Name Strings

    GL_NV_shader_atomic_float

Contact

    Pat Brown, NVIDIA Corporation (pbrown 'at' nvidia.com)

Contributors

    Cyril Crassin, NVIDIA
    Eric Werness, NVIDIA
    Jeff Bolz, NVIDIA

Status

    Shipping

Version

    Last Modified Date:         August 13, 2012
    NVIDIA Revision:            2

Number

    419

Dependencies

    This extension is written against the OpenGL 4.2 (Compatibility Profile)
    Specification.

    This extension is written against version 4.20 (revision 6) of the OpenGL
    Shading Language Specification.

    This extension interacts with EXT_shader_image_load_store,
    ARB_shader_image_load_store, and GLSL 4.20.

    This extension interacts with NV_shader_buffer_store and NV_gpu_shader5.

    This extension interacts with NV_gpu_program5.

    This extension interacts with EXT_shader_image_load_store and
    NV_gpu_program5.

    This extension interacts with GLSL 4.30, ARB_shader_storage_buffer_object,
    and ARB_compute_shader.

Overview

    This extension provides GLSL built-in functions and assembly opcodes
    allowing shaders to perform atomic read-modify-write operations to buffer
    or texture memory with floating-point components.  The set of atomic
    operations provided by this extension is limited to adds and exchanges.
    Providing atomic add support allows shaders to atomically accumulate the
    sum of floating-point values into buffer or texture memory across multiple
    (possibly concurrent) shader invocations.

    This extension provides GLSL support for atomics targeting image uniforms
    (if GLSL 4.20, ARB_shader_image_load_store, or EXT_shader_image_load_store
    is supported) or floating-point pointers (if NV_gpu_shader5 is supported).
    Assembly opcodes for these operations are also provided if
    NV_gpu_program5 is supported.

New Procedures and Functions

    None.

New Tokens

    None.

Additions to Chapter 2 of the OpenGL 4.2 (Compatibility Profile) Specification
(OpenGL Operation)

    None.

Additions to Chapter 3 of the OpenGL 4.2 (Compatibility Profile) Specification
(Rasterization)

    None.

Additions to Chapter 4 of the OpenGL 4.2 (Compatibility Profile) Specification
(Per-Fragment Operations and the Frame Buffer)

    None.

Additions to Chapter 5 of the OpenGL 4.2 (Compatibility Profile) Specification
(Special Functions)

    None.

Additions to Chapter 6 of the OpenGL 4.2 (Compatibility Profile) Specification
(State and State Requests)

    None.

Additions to the AGL/GLX/WGL Specifications

    None.

GLX Protocol

    None.

Modifications to the OpenGL Shading Language Specification, Version 4.20
(revision 6)

    Including the following line in a shader can be used to control the
    language features described in this extension:

      #extension GL_NV_shader_atomic_float : <behavior>

    where <behavior> is as specified in section 3.3.

    New preprocessor #defines are added to the OpenGL Shading Language:

      #define GL_NV_shader_atomic_float         1

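    As an illustration of the directive and the #define above, a shader
    might request the extension opportunistically and select a code path at
    compile time.  This is only a sketch:  the macro HAVE_FLOAT_ATOMICS is a
    hypothetical application-side name and is not part of this extension.

```glsl
#version 420

// "enable" requests the extension; per section 3.3 the compiler warns
// rather than fails if the extension is not supported.
#extension GL_NV_shader_atomic_float : enable

#ifdef GL_NV_shader_atomic_float
    // The extension macro is defined as 1 by supporting implementations.
    #define HAVE_FLOAT_ATOMICS 1   // hypothetical application macro
#else
    #define HAVE_FLOAT_ATOMICS 0   // use a path without float atomics
#endif

void main()
{
    // Application code would branch on HAVE_FLOAT_ATOMICS at compile time.
}
```

    Using "require" instead of "enable" would make compilation fail outright
    on implementations without the extension, which may be preferable when no
    fallback path exists.
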
    Modify Section 8.11, Image Functions (p. 149)

    (add to "imageAtomicAdd" table cell, p. 151)

      float imageAtomicAdd(IMAGE_PARAMS, float data)

    (add to "imageAtomicExchange" table cell, p. 152)

      float imageAtomicExchange(IMAGE_PARAMS, float data)

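    The two overloads above might be used as in the following fragment
    shader sketch, which accumulates a per-fragment value into a
    single-channel floating-point image.  The names accumImage, contribution,
    fragColor, and drainTexel are hypothetical, and the use of the r32f
    format layout qualifier is an assumption rather than something stated in
    this specification.

```glsl
#version 420
#extension GL_NV_shader_atomic_float : require

// Hypothetical single-channel float image; r32f is assumed to be the
// format needed for floating-point atomics.
layout(r32f) coherent uniform image2D accumImage;

// Hypothetical per-fragment contribution supplied by an earlier stage.
in float contribution;

out vec4 fragColor;

// Atomically reads the texel and resets it to zero in one operation.
float drainTexel(ivec2 texel)
{
    return imageAtomicExchange(accumImage, texel, 0.0);
}

void main()
{
    ivec2 texel = ivec2(gl_FragCoord.xy);

    // Atomically adds "contribution" to the texel and returns the value
    // that was stored there before the add.
    float before = imageAtomicAdd(accumImage, texel, contribution);

    fragColor = vec4(before);
}
```

    As with the integer image atomics, each call returns the value held in
    memory before the operation.  A later pass could call drainTexel to
    collect and clear the accumulated sum.
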
Dependencies on EXT_shader_image_load_store, ARB_shader_image_load_store, and
GLSL 4.20

    If none of EXT_shader_image_load_store, ARB_shader_image_load_store, or
    GLSL 4.20 is supported, the new floating-point variants of the built-in
    functions imageAtomicAdd and imageAtomicExchange should be removed.

Dependencies on NV_shader_buffer_store and NV_gpu_shader5

    If NV_shader_buffer_store and NV_gpu_shader5 are supported, the following
    functions should be added to the "Section 8.Y, Shader Memory Functions"
    language in the NV_shader_buffer_store specification:

      float atomicAdd(float *address, float data);
      float atomicExchange(float *address, float data);

Dependencies on NV_gpu_program5

    If NV_gpu_program5 is supported and "OPTION NV_shader_atomic_float" is
    specified in an assembly program, "F32" should be allowed as a storage
    modifier to the ATOM instruction for the atomic operations "ADD" and
    "EXCH".

    (Add to "Section 2.X.6, Program Options" of the NV_gpu_program4 extension,
    as extended by NV_gpu_program5:)

      + Floating-Point Atomic Operations (NV_shader_atomic_float)

      If a program specifies the "NV_shader_atomic_float" option, it may use
      the "F32" storage modifier with the "ATOM" and "ATOMIM" opcodes to
      perform atomic floating-point add or exchange operations.

    (Add to the table in "Section 2.X.8.Z, ATOM" in NV_gpu_program5:)

      atomic     storage
      modifier   modifiers            operation
      --------   ------------------   --------------------------------------
       ADD       U32, S32, U64, F32   compute a sum
       ...
       EXCH      U32, S32, U64, F32   exchange memory with operand

Dependencies on EXT_shader_image_load_store and NV_gpu_program5

    If EXT_shader_image_load_store and NV_gpu_program5 are supported and
    "OPTION NV_shader_atomic_float" is specified in an assembly program, "F32"
    should be allowed as a storage modifier to the ATOMIM instruction for the
    atomic operations "ADD" and "EXCH".

    (Add to the table in "Section 2.X.8.Z, ATOMIM" in the "Dependencies on
    NV_gpu_program5" portion of the EXT_shader_image_load_store
    specification:)

      atomic     storage
      modifier   modifiers       operation
      --------   -------------   --------------------------------------
       ADD       U32, S32, F32   compute a sum
       ...
       EXCH      U32, S32, F32   exchange memory with operand

Dependencies on GLSL 4.30, ARB_shader_storage_buffer_object, and
ARB_compute_shader

    If GLSL 4.30 is supported, add the following atomic memory functions to
    section 8.11 (Atomic Memory Functions) of the GLSL 4.30 specification:

      float atomicAdd(inout float mem, float data);
      float atomicExchange(inout float mem, float data);

    If ARB_shader_storage_buffer_object or ARB_compute_shader is supported,
    make similar edits to the functions documented in the
    ARB_shader_storage_buffer_object extension.

    These functions are available if and only if GL_NV_shader_atomic_float is
    enabled via the "#extension" directive.

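    A minimal compute shader sketch of these functions, applied to both
    shared memory and a shader storage buffer, is shown here.  The block
    names, member names, bindings, and work group size are all hypothetical
    and chosen only for illustration.

```glsl
#version 430
#extension GL_NV_shader_atomic_float : require

layout(local_size_x = 64) in;

// Hypothetical buffers: an input array and a single running total.
layout(std430, binding = 0) readonly buffer InputValues {
    float inputValues[];
};
layout(std430, binding = 1) buffer Total {
    float total;
};

// Per-work-group partial sum held in shared memory.
shared float groupSum;

void main()
{
    // One invocation clears the partial sum before anyone adds to it.
    if (gl_LocalInvocationIndex == 0u) {
        groupSum = 0.0;
    }
    memoryBarrierShared();
    barrier();

    // Each in-range invocation adds its element to the shared partial sum.
    uint i = gl_GlobalInvocationID.x;
    if (i < uint(inputValues.length())) {
        atomicAdd(groupSum, inputValues[i]);
    }
    memoryBarrierShared();
    barrier();

    // One invocation folds the partial sum into the buffer-wide total.
    if (gl_LocalInvocationIndex == 0u) {
        atomicAdd(total, groupSum);
    }
}
```

    Accumulating into shared memory first reduces the number of atomics
    issued to the storage buffer to one per work group.  Because
    floating-point addition is not associative, the final total can differ
    slightly from run to run depending on the order in which the adds land.
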
Errors

    None.

New State

    None.

New Implementation Dependent State

    None.

Issues

    (1) What atomic operations should we support for floating-point targets?

      RESOLVED:  Floating-point atomic addition is the main functionality
      targeted by this extension.  We provide exchanges because the operation
      needs no special hardware support.

      We chose not to provide support for bitwise operations (AND/OR/XOR);
      it's possible to support these by casting a pointer or aliasing an image
      if required.  Minimum, maximum, and compare-and-swap make sense, but the
      underlying atomic hardware targeted by this extension does not support
      floating-point comparisons.

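    The image-aliasing approach mentioned in issue (1) might look like the
    following sketch.  It assumes the application has bound the same texture
    level that holds the floating-point data to an image unit with an
    unsigned integer format, so that the existing GLSL 4.20 integer atomics
    operate on the raw bits.  The name accumBits is hypothetical.

```glsl
#version 420

// The texture holding the float data is assumed to be bound to this image
// unit with an R32UI format so its texels can be addressed as raw bits.
layout(r32ui) coherent uniform uimage2D accumBits;   // hypothetical name

void main()
{
    ivec2 texel = ivec2(gl_FragCoord.xy);

    // Clearing the IEEE 754 sign bit atomically replaces the stored float
    // with its absolute value; the call returns the previous bit pattern.
    uint oldBits = imageAtomicAnd(accumBits, texel, 0x7FFFFFFFu);

    // Reinterpret the previous bits as the float that was stored.
    float oldValue = uintBitsToFloat(oldBits);
}
```

    This path needs only the integer image atomics and does not depend on
    this extension.
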
Revision History

    Rev.    Date    Author    Changes
    ----  --------  --------  -----------------------------------------
     2    08/13/12  pbrown    Add interaction with OpenGL 4.3 (and related ARB
                              extensions) supporting floating-point atomics
                              to shared and shader storage buffer memory.

     1              pbrown    Internal revisions.