Name

    NV_shader_atomic_float64

Name Strings

    GL_NV_shader_atomic_float64

Contact

    Kedarnath Thangudu, NVIDIA Corporation (kthangudu 'at' nvidia.com)

Contributors

    Pat Brown, NVIDIA

Status

    Shipping in NVIDIA release 367.XX drivers and up.

Version

    Last Modified Date:         October 15, 2014
    NVIDIA Revision:            1

Number

    OpenGL Extension #488

Dependencies

    This extension is written against the OpenGL 4.5 (Compatibility Profile)
    Specification.

    This extension is written against version 4.50 (revision 3) of the OpenGL
    Shading Language Specification.

    This extension requires ARB_gpu_shader_fp64 or NV_gpu_program_fp64.

    This extension interacts with NV_shader_buffer_store, NV_gpu_shader5,
    ARB_shader_storage_buffer_object, and ARB_compute_shader.

    This extension interacts with NV_gpu_program5.

Overview

    This extension provides GLSL built-in functions and assembly opcodes
    allowing shaders to perform atomic read-modify-write operations to buffer
    or shared memory with double-precision floating-point components.  The set
    of atomic operations provided by this extension is limited to adds and
    exchanges.  Providing atomic add support allows shaders to atomically
    accumulate the sum of double-precision floating-point values into buffer
    memory across multiple (possibly concurrent) shader invocations.
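
    As a non-normative sketch of the accumulation described above, the
    compute shader below sums an array of doubles into a single buffer
    variable.  The block names, member names, bindings, and work group size
    are assumptions made for illustration; none of them are defined by this
    extension.

      #version 450
      #extension GL_NV_shader_atomic_float64 : require

      layout(local_size_x = 64) in;

      // Hypothetical input block: one double per invocation.
      layout(std430, binding = 0) readonly buffer Values {
          double values[];
      };

      // Hypothetical output block: the running sum, shared by all
      // invocations.
      layout(std430, binding = 1) coherent buffer Accumulator {
          double total;
      };

      void main()
      {
          uint i = gl_GlobalInvocationID.x;
          if (i < uint(values.length())) {
              // Atomically add this invocation's value to the running sum.
              atomicAdd(total, values[i]);
          }
      }

    Because floating-point addition is not associative, the low-order bits of
    the sum may vary with the order in which invocations perform their adds.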

    This extension provides GLSL support for atomics targeting double-precision
    floating-point pointers (if NV_gpu_shader5 is supported).  Assembly opcodes
    for these operations are also provided if NV_gpu_program5 is supported.

New Procedures and Functions

    None.

New Tokens

    None.

Additions to the OpenGL 4.5 (Compatibility Profile) Specification

    None.

Additions to the AGL/GLX/WGL Specifications

    None.

GLX Protocol

    None.

Modifications to the OpenGL Shading Language Specification, Version 4.50
(revision 3)

    Including the following line in a shader can be used to control the
    language features described in this extension:

      #extension GL_NV_shader_atomic_float64 : <behavior>

    where <behavior> is as specified in section 3.3.

    New preprocessor #defines are added to the OpenGL Shading Language:

      #define GL_NV_shader_atomic_float64         1
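
    A shader that must also compile on implementations lacking this extension
    might test the #define before enabling it.  The following is a sketch
    only; HAVE_ATOMIC_FLOAT64 is a hypothetical macro name chosen for
    illustration and is not defined by this extension.

      #version 450

      #ifdef GL_NV_shader_atomic_float64
      #extension GL_NV_shader_atomic_float64 : enable
      // Hypothetical helper macro: later code can test it to choose
      // between the atomic path and a fallback.
      #define HAVE_ATOMIC_FLOAT64 1
      #else
      #define HAVE_ATOMIC_FLOAT64 0
      #endif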

    Modify Section 8.11, Atomic Memory Functions (p. 172)

    (add to "atomicAdd" table cell, p. 173)

      double atomicAdd(coherent inout double mem, double data)

    (add to "atomicExchange" table cell, p. 173)

      double atomicExchange(coherent inout double mem, double data)
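
    Like the other atomic memory functions in Section 8.11, both functions
    return the value that was stored in memory before the operation.  The
    non-normative sketch below relies on that to read and reset an
    accumulator in a single step; the block name, member names, and binding
    are assumptions made for illustration.

      #version 450
      #extension GL_NV_shader_atomic_float64 : require

      // Intended to be dispatched as a single invocation.
      layout(local_size_x = 1) in;

      layout(std430, binding = 0) coherent buffer Accumulator {
          double total;      // running sum written by earlier work
          double lastTotal;  // snapshot taken by this shader
      };

      void main()
      {
          // Store 0.0 into 'total'; the return value is the contents of
          // 'total' immediately before the exchange.
          double previous = atomicExchange(total, 0.0lf);
          lastTotal = previous;
      }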


Dependencies on NV_shader_buffer_store, NV_gpu_shader5,
ARB_shader_storage_buffer_object, and ARB_compute_shader

    If NV_shader_buffer_store and NV_gpu_shader5 are supported, the following
    functions should be added to the "Section 8.Y, Shader Memory Functions"
    language in the NV_shader_buffer_store specification:

      double atomicAdd(double *address, double data);
      double atomicExchange(double *address, double data);

    If ARB_shader_storage_buffer_object or ARB_compute_shader is supported,
    make similar edits to the functions documented in the
    ARB_shader_storage_buffer_object extension.

    These functions are available if and only if GL_NV_shader_atomic_float64 is
    enabled via the "#extension" directive.

Dependencies on NV_gpu_program5

    If NV_gpu_program5 is supported and "OPTION NV_shader_atomic_float64" is
    specified in an assembly program, "F64" should be allowed as a storage
    modifier to the ATOM instruction for the atomic operations "ADD" and
    "EXCH".

    (Add to "Section 2.X.6, Program Options" of the NV_gpu_program4 extension,
    as extended by NV_gpu_program5:)

      + Double-precision Floating-Point Atomic Operations (NV_shader_atomic_float64)

      If a program specifies the "NV_shader_atomic_float64" option, it may use
      the "F64" storage modifier with the "ATOM" opcode to perform atomic
      double-precision floating-point add or exchange operations.

    (Add to the table in "Section 2.X.8.Z, ATOM" in NV_gpu_program5:)

      atomic     storage
      modifier   modifiers                 operation
      --------   -----------------------   ---------------------------------
       ADD       U32, S32, U64, F32, F64   compute a sum
                 F16X2, F16X4
       ...
       EXCH      U32, S32, U64, F32, F64   exchange memory with operand
                 F16X2, F16X4

    Note:
      Storage modifier U64 is provided by NV_shader_atomic_int64.
      Storage modifier F32 is provided by NV_shader_atomic_float.
      Storage modifiers F16X2 and F16X4 are provided by
      NV_shader_atomic_fp16_vector.

Errors

    None.

New State

    None.

New Implementation Dependent State

    None.

Issues

    (1) What double-precision floating-point targets are supported for
    atomic operations?

      RESOLVED:  This extension only supports atomic operations on
      double-precision floating-point buffer memory.  Atomic operations on
      double-precision texture memory are not supported since OpenGL provides
      no pixel/texture formats with double-precision components.

    (2) What atomic operations should we support for double-precision
    floating-point targets?

      RESOLVED:  Double-precision floating-point atomic addition is the main
      functionality targeted by this extension.  We provide exchanges because
      the operation needs no special hardware support.

      We chose not to provide support for bitwise operations (AND/OR/XOR);
      it's possible to support these by casting a pointer or aliasing an image
      if required.  Minimum, maximum, and compare-and-swap make sense, but the
      underlying atomic hardware targeted by this extension does not support
      floating-point comparisons.

Revision History

    Revision 1
      - Internal revisions.