Name

    NV_shader_atomic_float64

Name Strings

    GL_NV_shader_atomic_float64

Contact

    Kedarnath Thangudu, NVIDIA Corporation (kthangudu 'at' nvidia.com)

Contributors

    Pat Brown, NVIDIA

Status

    Shipping in NVIDIA release 367.XX drivers and up.

Version

    Last Modified Date:         October 15, 2014
    NVIDIA Revision:            1

Number

    OpenGL Extension #488

Dependencies

    This extension is written against the OpenGL 4.5 (Compatibility Profile)
    Specification.

    This extension is written against version 4.50 (revision 3) of the OpenGL
    Shading Language Specification.

    This extension requires ARB_gpu_shader_fp64 or NV_gpu_program_fp64.

    This extension interacts with NV_shader_buffer_store, NV_gpu_shader5,
    ARB_shader_storage_buffer_object, and ARB_compute_shader.

    This extension interacts with NV_gpu_program5.

Overview

    This extension provides GLSL built-in functions and assembly opcodes
    allowing shaders to perform atomic read-modify-write operations to buffer
    or shared memory with double-precision floating-point components.  The set
    of atomic operations provided by this extension is limited to adds and
    exchanges.  Providing atomic add support allows shaders to atomically
    accumulate the sum of double-precision floating-point values into buffer
    memory across multiple (possibly concurrent) shader invocations.

    This extension also provides GLSL support for atomics targeting
    double-precision floating-point pointers (if NV_gpu_shader5 is
    supported).  Assembly opcodes for these operations are provided if
    NV_gpu_program5 is supported.
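
    As an illustration of the accumulation use case described above, the
    following is a minimal sketch of a compute shader that sums an array of
    doubles into a single buffer variable.  The block names, binding points,
    and work-group size are assumptions made for this example and are not
    part of the extension.

```glsl
#version 450
#extension GL_NV_shader_atomic_float64 : require

// Hypothetical work-group size and binding points, chosen for illustration.
layout(local_size_x = 64) in;

layout(std430, binding = 0) readonly buffer Samples {
    double samples[];   // input values, one per invocation
};

layout(std430, binding = 1) coherent buffer Result {
    double total;       // accumulator; assumed cleared by the application
};

void main()
{
    uint i = gl_GlobalInvocationID.x;

    // Skip the padding invocations of the last work group.
    if (i < uint(samples.length())) {
        // Atomically add this invocation's value to the shared total.
        // The return value (the previous contents of 'total') is unused.
        atomicAdd(total, samples[i]);
    }
}
```

    Because floating-point addition is not associative and the order in
    which invocations reach the atomic is unspecified, the low-order bits of
    the final sum may differ from run to run.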

New Procedures and Functions

    None.

New Tokens

    None.

Additions to the OpenGL 4.5 (Compatibility Profile) Specification

    None.

Additions to the AGL/GLX/WGL Specifications

    None.

GLX Protocol

    None.

Modifications to the OpenGL Shading Language Specification, Version 4.50
(revision 3)

    Including the following line in a shader can be used to control the
    language features described in this extension:

      #extension GL_NV_shader_atomic_float64 : <behavior>

    where <behavior> is as specified in section 3.3.

    New preprocessor #defines are added to the OpenGL Shading Language:

      #define GL_NV_shader_atomic_float64         1

    Modify Section 8.11, Atomic Memory Functions (p. 172)

    (add to "atomicAdd" table cell, p. 173)

      double atomicAdd(coherent inout double mem, double data)

    (add to "atomicExchange" table cell, p. 173)

      double atomicExchange(coherent inout double mem, double data)
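
    The directive, the preprocessor #define, and the atomicExchange overload
    above might be combined as in the following sketch, which reads and
    resets an accumulator in one step.  The block layout and variable names
    are hypothetical, and the fallback path is only an illustration of
    guarding on the #define.

```glsl
#version 450
#extension GL_NV_shader_atomic_float64 : enable

layout(local_size_x = 1) in;

// Hypothetical buffer layout, chosen for illustration.
layout(std430, binding = 0) coherent buffer Accumulator {
    double total;     // running sum written by other passes
    double drained;   // value captured by this pass
};

void main()
{
#ifdef GL_NV_shader_atomic_float64
    // Swap in 0.0 and receive the previous contents in one atomic step,
    // so no concurrent atomicAdd() contribution is lost.
    drained = atomicExchange(total, 0.0lf);
#else
    // Non-atomic fallback: only safe when nothing else writes 'total'
    // while this shader runs.
    drained = total;
    total   = 0.0lf;
#endif
}
```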


Dependencies on NV_shader_buffer_store, NV_gpu_shader5,
ARB_shader_storage_buffer_object, and ARB_compute_shader

    If NV_shader_buffer_store and NV_gpu_shader5 are supported, the following
    functions should be added to the "Section 8.Y, Shader Memory Functions"
    language in the NV_shader_buffer_store specification:

      double atomicAdd(double *address, double data);
      double atomicExchange(double *address, double data);

    If ARB_shader_storage_buffer_object or ARB_compute_shader is supported,
    make similar edits to the functions documented in the
    ARB_shader_storage_buffer_object extension.

    These functions are available if and only if GL_NV_shader_atomic_float64 is
    enabled via the "#extension" directive.
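
    A sketch of the pointer-based overloads follows.  It assumes that the
    application has made a buffer resident and passed its GPU address in a
    pointer uniform, as described by NV_shader_buffer_load.  The uniform
    names are hypothetical, and the exact set of #extension directives an
    implementation expects for pointer support is also an assumption here.

```glsl
#version 450
// Assumed directives for pointer support; "enable" is used so that an
// unrecognized name produces a warning rather than an error.
#extension GL_NV_gpu_shader5 : enable
#extension GL_NV_shader_buffer_load : enable
#extension GL_NV_shader_buffer_store : enable
#extension GL_NV_shader_atomic_float64 : enable

layout(local_size_x = 64) in;

// Hypothetical uniforms: the GPU address of a resident buffer of doubles
// and the value each invocation contributes.
uniform double *accumulators;
uniform double  weight;

void main()
{
    // Each work group accumulates into its own slot, selected with
    // pointer arithmetic; invocations within a group contend atomically.
    atomicAdd(accumulators + gl_WorkGroupID.x, weight);
}
```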

Dependencies on NV_gpu_program5

    If NV_gpu_program5 is supported and "OPTION NV_shader_atomic_float64" is
    specified in an assembly program, "F64" should be allowed as a storage
    modifier to the ATOM instruction for the atomic operations "ADD" and
    "EXCH".

    (Add to "Section 2.X.6, Program Options" of the NV_gpu_program4 extension,
    as extended by NV_gpu_program5:)

      + Double-precision Floating-Point Atomic Operations (NV_shader_atomic_float64)

      If a program specifies the "NV_shader_atomic_float64" option, it may use
      the "F64" storage modifier with the "ATOM" opcode to perform atomic
      double-precision floating-point add or exchange operations.

    (Add to the table in "Section 2.X.8.Z, ATOM" in NV_gpu_program5:)

      atomic     storage
      modifier   modifiers                 operation
      --------   -----------------------   ---------------------------------
       ADD       U32, S32, U64, F32, F64   compute a sum
                 F16X2, F16X4
       ...
       EXCH      U32, S32, U64, F32, F64   exchange memory with operand
                 F16X2, F16X4

    Note:
      Storage modifier U64 is provided by NV_shader_atomic_int64.
      Storage modifier F32 is provided by NV_shader_atomic_float.
      Storage modifiers F16X2 and F16X4 are provided by
      NV_shader_atomic_fp16_vector.
Errors

    None.

New State

    None.

New Implementation Dependent State

    None.

Issues

    (1) What double-precision floating-point targets are supported for
    atomic operations?

      RESOLVED: This extension only supports atomic operations on
      double-precision floating-point buffer memory.  Atomic operations on
      double-precision texture memory are not supported since OpenGL provides
      no pixel/texture formats with double-precision components.

    (2) What atomic operations should we support for double-precision
    floating-point targets?

      RESOLVED:  Double-precision floating-point atomic addition is the main
      functionality targeted by this extension.  We provide exchanges because
      the operation needs no special hardware support.

      We chose not to provide support for bitwise operations (AND/OR/XOR);
      it's possible to support these by casting a pointer or aliasing an image
      if required.  Minimum, maximum, and compare-and-swap make sense, but the
      underlying atomic hardware targeted by this extension does not support
      floating-point comparisons.

Revision History

    Revision 1
      - Internal revisions.