Name

    NV_compute_program5

Name Strings

    GL_NV_compute_program5

Contact

    Pat Brown, NVIDIA Corporation (pbrown 'at' nvidia.com)

Status

    Complete

Version

    Last Modified Date:         10/23/2012
    NVIDIA Revision:            2

Number

    421

Dependencies

    OpenGL 4.0 (Core or Compatibility Profile) is required.

    This extension is written against the OpenGL 4.2 Specification
    (Compatibility Profile).

    NV_gpu_program4 and NV_gpu_program5 are required.

    ARB_compute_shader is required.

    This specification interacts with NV_shader_atomic_float.

    This specification interacts with EXT_shader_image_load_store.

Overview

    This extension builds on the ARB_compute_shader extension to provide new
    assembly compute program capability for OpenGL.  ARB_compute_shader adds
    the basic functionality, including the ability to dispatch compute work.
    This extension provides the ability to write a compute program in
    assembly, using the same basic syntax and capability set found in the
    NV_gpu_program4 and NV_gpu_program5 extensions.

New Procedures and Functions

    None.

New Tokens

    Accepted by the <cap> parameter of Disable, Enable, and IsEnabled,
    by the <pname> parameter of GetBooleanv, GetIntegerv, GetFloatv,
    and GetDoublev, and by the <target> parameter of ProgramStringARB,
    BindProgramARB, ProgramEnvParameter4[df][v]ARB,
    ProgramLocalParameter4[df][v]ARB, GetProgramEnvParameter[df]vARB,
    GetProgramLocalParameter[df]vARB, GetProgramivARB and
    GetProgramStringARB:

        COMPUTE_PROGRAM_NV                              0x90FB

    Accepted by the <target> parameter of ProgramBufferParametersfvNV,
    ProgramBufferParametersIivNV, ProgramBufferParametersIuivNV,
    BindBufferRangeNV, BindBufferOffsetNV, BindBufferBaseNV, and BindBuffer,
    and by the <value> parameter of GetIntegerIndexedvEXT:

        COMPUTE_PROGRAM_PARAMETER_BUFFER_NV             0x90FC

    (Note:  Various enumerants from ARB_compute_shader will also be used by
     this extension.)

Additions to Chapter 2 of the OpenGL 4.2 (Compatibility Profile) Specification
(OpenGL Operation)

    Modify Section 2.X, GPU Programs, of NV_gpu_program4 (as modified by
    NV_gpu_program5)

    (insert after second paragraph)

    Compute Programs

    Compute programs are used to perform general purpose computations using a
    three-dimensional array of program invocations (threads).  The compute
    shader invocations are arranged into work groups specified by the
    mandatory GROUP_SIZE declaration, each of which comprises a fixed-size,
    three-dimensional array of program invocations.  One or more work groups
    are scheduled for execution using the DispatchCompute or
    DispatchComputeIndirect commands.

    Each work group scheduled for execution will launch a separate program
    invocation for each work group member.  While the program invocations in a
    work group are launched together, they run independently after launch.
    The BAR (barrier) instruction is available to synchronize program
    invocations; an invocation stops at each BAR instruction until all
    invocations in the work group have executed the BAR instruction.  Each
    work group has an optional shared memory allocation (specified by the
    SHARED_MEMORY declaration) that can be read or written by any invocations
    of the work group.

    Unlike other program types, compute program invocations have no inputs or
    outputs interfacing with the rest of the pipeline.  Compute programs may
    obtain inputs using mechanisms such as global loads, image loads, atomic
    counter reads, shader storage buffer reads, and program parameters.
    Built-in inputs are also provided to allow a compute shader invocation to
    determine its position in the work group, the position of its work group
    in the full dispatch, as well as the work group and full dispatch sizes.
    Compute program results are expected to be written to globally accessible
    memory using mechanisms such as global stores, image stores, atomic
    counters, and shader storage buffers.


    Modify Section 2.X.2, Program Grammar

    (replace third paragraph)

    Compute programs are required to begin with the header string "!!NVcp5.0".
    This header string identifies the subsequent program body as being a
    compute program and indicates that it should be parsed according to the
    base NV_gpu_program5 grammar plus the additions below.  Program string
    parsing begins with the character immediately following the header string.
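
    The header rule above can be illustrated with a minimal C sketch.  The
    helper name compute_program_body is hypothetical and is not part of this
    extension or of any GL API; it only shows where parsing of the program
    body would begin under the rule stated above.

```c
#include <stddef.h>
#include <string.h>

/* Header string required at the start of every compute program. */
static const char NVCP5_HEADER[] = "!!NVcp5.0";

/*
 * Hypothetical helper (not a GL entry point): returns a pointer to the
 * character immediately following the header string, which is where
 * program string parsing begins, or NULL if the string does not start
 * with the compute program header.
 */
static const char *compute_program_body(const char *program)
{
    size_t header_len = sizeof(NVCP5_HEADER) - 1; /* exclude the NUL */

    if (program == NULL || strncmp(program, NVCP5_HEADER, header_len) != 0)
        return NULL;
    return program + header_len;
}
```

    A string beginning with any other header is rejected by this sketch; a
    real implementation would instead select the grammar for that program
    type.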

    (add the following grammar rules to the NV_gpu_program5 base grammar for
     compute programs)

    <declSequence>          ::= <declaration> <declSequence>

    <instruction>           ::= <SpecialInstruction>

    <opModifier>            ::= "CTA"

    <namingStatement>       ::= <SHARED_statement>

    <SHARED_statement>      ::= "SHARED" <establishName> <sharedSingleInit>
                              | "SHARED" <establishName> <optArraySize>
                                <sharedMultipleInit>

    <sharedSingleInit>      ::= "=" <sharedUseDS>

    <sharedMultipleInit>    ::= "=" "{" <sharedItemList> "}"

    <sharedItemList>        ::= <sharedUseDM>
                              | <sharedUseDM> "," <sharedItemList>

    <sharedUseV>            ::= <sharedVarName> <optArrayMem>

    <sharedUseDS>           ::= <sharedBaseBinding> <arrayMemAbs>

    <sharedUseDM>           ::= <sharedUseDS>
                              | <sharedBaseBinding> <arrayRange>

    <sharedBaseBinding>     ::= "program" "." "sharedmem"

    <SpecialInstruction>    ::= "BAR"
                              | "ATOMS" <opModifiers> <instResult> ","
                                <instOperandV> "," <sharedUseV>
                              | "LDS" <opModifiers> <instResult> ","
                                <sharedUseV>
                              | "STS" <opModifiers> <instOperandV> ","
                                <sharedUseV>

    <declaration>           ::= "GROUP_SIZE" <int>
                              | "GROUP_SIZE" <int> <int>
                              | "GROUP_SIZE" <int> <int> <int>
                              | "SHARED_MEMORY" <int>

    <attribBasic>           ::= "invocation" "." "localid"
                              | "invocation" "." "globalid"
                              | "invocation" "." "groupid"
                              | "invocation" "." "groupcount"
                              | "invocation" "." "groupsize"
                              | "invocation" "." "localindex"


    (add the following subsection to Section 2.X.3.2, Program Attribute
     Variables)

    Compute program attribute variables describe the attributes of the current
    program invocation.  Each DispatchCompute command produces a set of
    program invocations arranged as a one-, two-, or three-dimensional array.
    Figure X.1 illustrates a two-dimensional dispatch with a local work group
    size of 8x4, and a total dispatch of 5x4 local work groups.  Each
    individual program invocation has a one-, two-, or three-dimensional
    global coordinate, which can be further decomposed into a work group
    offset (in fixed-size work groups) and a local offset relative to the
    origin of the invocation's work group.

                +-------+-------+-------+-------+-------+
                |       |       | work  |       |       |
                |       |       | group |       |       |
                |       |       | (2,3) |       |       |
         (0,12) +-------+-------+-------+-------+-------+
                |       |       |       |       |       |
                |       |       |       |       |       |
                |       | *     |       |       |       |
          (0,8) +-------+-------+-------+-------+-------+
                |       |       |       |       | work  |
                |       |       |       |       | group |
                |       |       |       |       | (4,1) |
          (0,4) +-------+-------+-------+-------+-------+
                | work  |       |       |       |       |
                | group |       |       |       |       |
                | (0,0) |       |       |       |       |
                +-------+-------+-------+-------+-------+
              (0,0)   (8,0)   (16,0)  (24,0)  (32,0)

      Figure X.1, Compute Dispatch.  The single invocation at the location
      labeled "*" has a location (invocation.globalid) of (10,9).  The offset
      relative to its local work group (invocation.localid) is (2,1).  Its
      local work group has an offset (invocation.groupid) of (1,2), in units
      of work groups.
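
    The decomposition shown in Figure X.1 can be sketched in C as follows.
    The uvec3 type and the function names are assumptions made for this
    illustration only; they are not defined by this extension.  The sketch
    relies on the relationship globalid = groupid * groupsize + localid,
    applied per component.

```c
/* Three-component unsigned vector (illustrative type, not from the spec). */
typedef struct { unsigned x, y, z; } uvec3;

/* Work group offset, in units of work groups, of a global coordinate. */
static uvec3 group_id(uvec3 globalid, uvec3 groupsize)
{
    uvec3 g = { globalid.x / groupsize.x,
                globalid.y / groupsize.y,
                globalid.z / groupsize.z };
    return g;
}

/* Offset of a global coordinate relative to the origin of its work group. */
static uvec3 local_id(uvec3 globalid, uvec3 groupsize)
{
    uvec3 l = { globalid.x % groupsize.x,
                globalid.y % groupsize.y,
                globalid.z % groupsize.z };
    return l;
}

/* Recombine a work group offset and a local offset into a global one. */
static uvec3 global_id(uvec3 groupid, uvec3 localid, uvec3 groupsize)
{
    uvec3 g = { groupid.x * groupsize.x + localid.x,
                groupid.y * groupsize.y + localid.y,
                groupid.z * groupsize.z + localid.z };
    return g;
}
```

    For the invocation labeled "*" in Figure X.1, a global coordinate of
    (10,9,0) with a group size of (8,4,1) gives a work group offset of
    (1,2,0) and a local offset of (2,1,0), matching the figure caption.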

    The set of available compute program attribute bindings is enumerated in
    Table X.1.  All bindings are considered four-component unsigned integer
    vectors with the value of the fourth component undefined.

      Attribute Binding          Components  Underlying State
      -------------------------  ----------  ------------------------------
      invocation.localid         (x,y,z,-)   offset relative to base of
                                             work group

      invocation.globalid        (x,y,z,-)   offset relative to the base
                                             of the dispatched work

      invocation.groupid         (x,y,z,-)   offset (in groups) of local work
                                             group

      invocation.groupcount      (x,y,z,-)   total local work group count

      invocation.groupsize       (x,y,z,-)   number of invocations in each
                                             dimension of the local work group

      invocation.localindex      (x,-,-,-)   one-dimensional (flattened) index
                                             in local workgroup

      Table X.1, Compute Program Attribute Bindings.

    If a compute attribute binding matches "invocation.localid", the "x", "y",
    and "z" components of the invocation attribute variable are filled with
    the "x", "y", and "z" components, respectively, of the offset of the
    invocation relative to the base of its local work group.  The "w"
    component of the attribute is undefined.

    If a compute attribute binding matches "invocation.globalid", the "x",
    "y", and "z" components of the invocation attribute variable are filled
    with the "x", "y", and "z" components, respectively, of the offset of the
    invocation relative to the full compute dispatch.  The "w" component of
    the attribute is undefined.

    If a compute attribute binding matches "invocation.groupid", the "x", "y",
    and "z" components of the invocation attribute variable are filled with
    the "x", "y", and "z" components, respectively, of the offset of the local
    work group (in groups) relative to the full compute dispatch.  The "w"
    component of the attribute is undefined.

    If a compute attribute binding matches "invocation.groupcount", the "x",
    "y", and "z" components of the invocation attribute variable are filled
    with the "x", "y", and "z" dimensions, respectively, in local work groups
    of the full compute dispatch.  The "w" component of the attribute is
    undefined.

    If a compute attribute binding matches "invocation.groupsize", the "x",
    "y", and "z" components of the invocation attribute variable are filled
    with the "x", "y", and "z" dimensions, respectively, of the local work
    group, as specified by the GROUP_SIZE declaration.  The "w" component of
    the attribute is undefined.

    If a compute attribute binding matches "invocation.localindex", the "x"
    component of the invocation attribute variable is filled with a flattened
    one-dimensional index of the invocation, which is derived as:

      invocation.localid.z * invocation.groupsize.x * invocation.groupsize.y +
      invocation.localid.y * invocation.groupsize.x +
      invocation.localid.x

    The "y", "z", and "w" components of the attribute are undefined.
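
    The flattening formula above can be written as a short C function.  The
    uvec3 type and the name local_index are assumptions introduced for this
    sketch and are not part of the extension.

```c
/* Three-component unsigned vector (illustrative type, not from the spec). */
typedef struct { unsigned x, y, z; } uvec3;

/*
 * Flattened one-dimensional index of an invocation within its local work
 * group, following the formula given for invocation.localindex.
 */
static unsigned local_index(uvec3 localid, uvec3 groupsize)
{
    return localid.z * groupsize.x * groupsize.y
         + localid.y * groupsize.x
         + localid.x;
}
```

    For example, a local offset of (2,1,0) in an 8x4x1 work group gives
    0 * 8 * 4 + 1 * 8 + 2 = 10.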

    For one-dimensional dispatches, the "y" components of
    "invocation.localid", "invocation.globalid", and "invocation.groupid" will
    be zero.  For one- and two- dimensional dispatches, the "z" components of
    "invocation.localid", "invocation.globalid", and "invocation.groupid" will
    be zero.  The same components of "invocation.groupcount" and
    "invocation.groupsize" will be one in these cases.


    (add the following subsection to section 2.X.3.5, Program Results.)

    Compute programs have no result variables; all shader results must be
    written to memory.


    Add New Section 2.X.3.Y, Compute Program Shared Memory, after Section
    2.X.3.6, Program Parameter Buffers

    Compute program shared memory variables are arrays of basic machine units
    from which data can be read or written using the LDS and STS instructions.
    Compute program shared memory also supports atomic memory operations using
    the ATOMS instruction.  The GL allocates a single block of shared memory
    for each local work group, whose size in basic machine units is specified
    by the "SHARED_MEMORY" statement.  The contents of compute program shared
    memory are undefined when program execution for the local work group
    begins and can be changed only by using the ATOMS or STS instructions.
    Compute program shared memory variables are shared between all invocations
    of a local work group.  Writes performed by one invocation will be visible
    for any reads of the same memory from any other invocation executed after
    the write.  Note that the order of reads and writes between different
    invocations in a local work group is largely undefined, although the BAR
    instruction can be used to introduce synchronization points for all
    invocations in a local work group.

    Shared memory variables may only be used as operands in the ATOMS, LDS,
    and STS instructions; they may not be used as results or operands in
    general instructions.  Shared memory variables must be declared
    explicitly via the <SHARED_statement> grammar rule.  Shared memory
    bindings cannot be used directly in executable instructions.

    Shared memory variables may be declared as arrays, but all bindings
    assigned to the array must use the same binding point(s) and must
    increase consecutively.

      Binding                        Components  Underlying State
      -----------------------------  ----------  -----------------------------
      program.sharedmem[a]           (x,x,x,x)   compute shared memory,
                                                   element a
      program.sharedmem[a..b]        (x,x,x,x)   compute shared memory,
                                                   elements a through b
      program.sharedmem              (x,x,x,x)   compute shared memory,
                                                   all elements

      Table X.3: Shared Memory Bindings.  <a> and <b> indicate individual
      elements of shared memory.

    If a shared memory binding matches "program.sharedmem[a]", the shared
    memory variable is associated with basic machine element <a> of compute
    shared memory.

    For shared memory declarations, "program.sharedmem[a..b]" is equivalent to
    specifying elements <a> through <b> of compute shared memory in order.

    For shared memory declarations, "program.sharedmem" is equivalent to
    specifying elements zero through <N>-1 of compute shared memory in order,
    where <N> is the total shared memory size declared by the "SHARED_MEMORY"
    statement.


    Modify Section 2.X.4, Program Execution Environment

    (add to the opcode table)

                  Modifiers
      Instruction F I C S H D  Out Inputs    Description
      ----------- - - - - - -  --- --------  --------------------------------
      ATOMS       - - X - - -  s   v,su      atomic transaction to shared mem
      BAR         - - - - - -  -   -         work group execution barrier
      LDS         - - X X - F  v   su        load from shared memory
      STS         - - - - - -  -   v,su      store to shared memory


    Modify Section 2.X.4.1, Program Instruction Modifiers

      Modifier  Description
      --------  -----------------------------------------------
      CTA       Memory barrier orders only memory transactions
                relative to invocations within local work group

    (add to descriptions of opcode modifiers)

    For the MEMBAR (memory barrier) instruction, the "CTA" modifier specifies
    that memory transactions before and after the barrier are strongly ordered
    as observed by any other shader invocation in the local work group.


    Modify Section 2.X.4.5, Program Memory Access, from NV_gpu_program5

    (add to the end of the first paragraph) ... Additionally, programs may
    load from or store to shared memory via the ATOMS (atomic shared memory
    operation), LDS (load from shared memory), and STS (store to shared
    memory) instructions.

    (modify miscellaneous other language referring to "buffer object memory"
    to instead refer to "buffer object and shared memory")

    (add hypothetical built-in functions SharedMemoryLoad() and
    SharedMemoryStore() that behave similarly to BufferMemoryLoad() and
    BufferMemoryStore(), except that they access local work group shared
    memory instead of buffer object memory)


    Add the following subsection to section 2.X.7, Program Declarations

    Section 2.X.7.Y, Compute Program Declarations

    Compute programs support two types of declaration statement, as described
    below.

    - Shader Thread Group Size (GROUP_SIZE)

    The GROUP_SIZE statement declares the number of shader threads in a one-,
    two-, or three-dimensional local work group.  The statement must have one
    to three unsigned integer arguments.  Each argument must be less than or
    equal to the value of the implementation-dependent limit
    MAX_COMPUTE_LOCAL_WORK_SIZE for its corresponding dimension (X, Y, or Z).
    A program will fail to load unless it contains exactly one GROUP_SIZE
    declaration.
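
    The load-time rules for GROUP_SIZE can be summarized with a small C
    sketch.  The function name group_size_ok and its parameters are
    hypothetical; the per-dimension limits are assumed to be supplied by the
    caller, and the sketch checks only the conditions stated above.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical validity check (not a GL entry point).
 *   declaration_count: number of GROUP_SIZE statements in the program
 *   args, arg_count:   arguments of the GROUP_SIZE statement
 *   max_size:          implementation-dependent limit for X, Y, and Z
 */
static bool group_size_ok(unsigned declaration_count,
                          const unsigned *args, size_t arg_count,
                          const unsigned max_size[3])
{
    /* A program must contain exactly one GROUP_SIZE declaration. */
    if (declaration_count != 1)
        return false;

    /* The statement takes one to three arguments. */
    if (arg_count < 1 || arg_count > 3)
        return false;

    /* Each argument is bounded by the limit for its own dimension. */
    for (size_t i = 0; i < arg_count; i++) {
        if (args[i] > max_size[i])
            return false;
    }
    return true;
}
```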


    - Shared Memory Storage Size (SHARED_MEMORY)

    The SHARED_MEMORY statement declares the size of the shared memory, in
    basic machine units, available to the threads of each local work group.
    The SHARED_MEMORY statement is optional, but a program will fail to load
    if it includes multiple SHARED_MEMORY declarations, if it uses the
    ATOMS, LDS, or STS instructions in a program without a SHARED_MEMORY
    declaration, if it uses these instructions with an offset that would
    access memory beyond the declared shared memory size, or if the declared
    shared memory size is greater than the implementation-dependent limit
    MAX_COMPUTE_SHARED_VARIABLE_SIZE.
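
    The four failure conditions listed above can be sketched as a single C
    predicate.  The shared_memory_usage structure and the function name
    shared_memory_ok are assumptions made for this illustration; how an
    implementation actually tracks this information is not specified here.

```c
#include <stdbool.h>

/* Hypothetical summary of a program's shared memory usage. */
typedef struct {
    unsigned declaration_count;    /* number of SHARED_MEMORY statements   */
    unsigned declared_size;        /* declared size, basic machine units   */
    bool     uses_shared_insts;    /* any ATOMS, LDS, or STS in program    */
    unsigned highest_access_end;   /* one past the last unit accessed      */
} shared_memory_usage;

/* Returns true if none of the load failure conditions applies. */
static bool shared_memory_ok(const shared_memory_usage *u,
                             unsigned max_shared_size)
{
    /* Multiple SHARED_MEMORY declarations are an error. */
    if (u->declaration_count > 1)
        return false;

    /* Without a declaration, shared memory instructions are an error. */
    if (u->declaration_count == 0)
        return !u->uses_shared_insts;

    /* The declared size may not exceed the implementation limit. */
    if (u->declared_size > max_shared_size)
        return false;

    /* No access may reach beyond the declared size. */
    return u->highest_access_end <= u->declared_size;
}
```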


    (add the following subsection to section 2.X.8, Program Instruction Set.)

    Section 2.X.8.Z, ATOMS:  Atomic Memory Operation (Shared Memory)

    The ATOMS instruction performs an atomic memory operation by reading from
    shared memory specified by the second unsigned integer scalar operand,
    computing a new value based on the value read from memory and the first
    (vector) operand, and then writing the result back to the same memory
    address.  The memory transaction is atomic, guaranteeing that no other
    write to the memory accessed will occur between the time it is read and
    written by the ATOMS instruction.  The result of the ATOMS instruction is
    the scalar value read from memory.  The second operand used for the ATOMS
    instruction must correspond to a shared memory variable declared using the
    "SHARED" statement; a program will fail to load if any other type of
    operand is used for the second operand of an ATOMS instruction.

    The ATOMS instruction has two required instruction modifiers.  The atomic
    modifier specifies the type of operation to be performed.  The storage
    modifier specifies the size and data type of the operand read from memory
    and the base data type of the operation used to compute the value to be
    written to memory.

      atomic     storage
      modifier   modifiers            operation
      --------   ------------------   --------------------------------------
       ADD       U32, S32, U64, F32   compute a sum
       MIN       U32, S32             compute minimum
       MAX       U32, S32             compute maximum
       IWRAP     U32                  increment memory, wrapping at operand
       DWRAP     U32                  decrement memory, wrapping at operand
       AND       U32, S32             compute bit-wise AND
       OR        U32, S32             compute bit-wise OR
       XOR       U32, S32             compute bit-wise XOR
       EXCH      U32, S32, U64, F32   exchange memory with operand
       CSWAP     U32, S32, U64        compare-and-swap

     Table X.Y, Supported atomic and storage modifiers for the ATOMS
     instruction.

    Not all storage modifiers are supported by ATOMS, and the set of modifiers
    allowed for any given instruction depends on the atomic modifier
    specified.  Table X.Y enumerates the set of atomic modifiers supported by
    the ATOMS instruction, and the storage modifiers allowed for each.
4665bd8deadSopenharmony_ci
4675bd8deadSopenharmony_ci      tmp0 = VectorLoad(op0);
4685bd8deadSopenharmony_ci      result = SharedMemoryLoad(op1, storageModifier);
4695bd8deadSopenharmony_ci      switch (atomicModifier) {
4705bd8deadSopenharmony_ci      case ADD:
4715bd8deadSopenharmony_ci        writeval = tmp0.x + result;
4725bd8deadSopenharmony_ci        break;
4735bd8deadSopenharmony_ci      case MIN:
4745bd8deadSopenharmony_ci        writeval = min(tmp0.x, result);
4755bd8deadSopenharmony_ci        break;
4765bd8deadSopenharmony_ci      case MAX:
4775bd8deadSopenharmony_ci        writeval = max(tmp0.x, result);
4785bd8deadSopenharmony_ci        break;
4795bd8deadSopenharmony_ci      case IWRAP:
4805bd8deadSopenharmony_ci        writeval = (result >= tmp0.x) ? 0 : result+1; 
4815bd8deadSopenharmony_ci        break;
4825bd8deadSopenharmony_ci      case DWRAP:
4835bd8deadSopenharmony_ci        writeval = (result == 0 || result > tmp0.x) ? tmp0.x : result-1;
4845bd8deadSopenharmony_ci        break;
4855bd8deadSopenharmony_ci      case AND:
4865bd8deadSopenharmony_ci        writeval = tmp0.x & result;
4875bd8deadSopenharmony_ci        break;
4885bd8deadSopenharmony_ci      case OR:
4895bd8deadSopenharmony_ci        writeval = tmp0.x | result;
4905bd8deadSopenharmony_ci        break;
4915bd8deadSopenharmony_ci      case XOR:
4925bd8deadSopenharmony_ci        writeval = tmp0.x ^ result;
4935bd8deadSopenharmony_ci        break;
4945bd8deadSopenharmony_ci      case EXCH:
4955bd8deadSopenharmony_ci        break;
4965bd8deadSopenharmony_ci      case CSWAP:
4975bd8deadSopenharmony_ci        if (result == tmp0.x) {
4985bd8deadSopenharmony_ci          writeval = tmp0.y;
4995bd8deadSopenharmony_ci        } else {
5005bd8deadSopenharmony_ci          return result;  // no memory store
5015bd8deadSopenharmony_ci        }
5025bd8deadSopenharmony_ci        break;
5035bd8deadSopenharmony_ci      }
5045bd8deadSopenharmony_ci      SharedMemoryStore(op1, writeval, storageModifier);
5055bd8deadSopenharmony_ci
5065bd8deadSopenharmony_ci    ATOMS performs a scalar atomic operation.  The <y>, <z>, and <w>
5075bd8deadSopenharmony_ci    components of the result vector are undefined.
5085bd8deadSopenharmony_ci      
5095bd8deadSopenharmony_ci    ATOMS supports no base data type modifiers, but requires exactly one
5105bd8deadSopenharmony_ci    storage modifier.  The base data types of the result vector, and the first
5115bd8deadSopenharmony_ci    (vector) operand are derived from the storage modifier.  The second
5125bd8deadSopenharmony_ci    operand is always interpreted as a scalar unsigned integer.
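    The ATOMS pseudocode above can be exercised with a small reference model.
    The following is a minimal Python sketch for the "U32" storage modifier
    only, not part of this specification.  The name atoms_u32, the
    list-of-words view of shared memory, and the modulo-2^32 wrapping of ADD
    are assumptions made for illustration.  The model is single-threaded, so
    it shows the arithmetic of each atomic modifier but not atomicity itself.

```python
U32_MASK = 0xFFFFFFFF  # ATOMS.U32 values are modeled as 32-bit unsigned words


def atoms_u32(shared, index, atomic_modifier, x, y=0):
    """Apply one ATOMS.<atomic_modifier>.U32 to shared[index].

    `x` and `y` stand for the <x> and <y> components of the first operand.
    Returns the value read from memory, as the instruction does.
    """
    result = shared[index] & U32_MASK
    x &= U32_MASK
    if atomic_modifier == "ADD":
        writeval = (x + result) & U32_MASK  # assumed to wrap modulo 2^32
    elif atomic_modifier == "MIN":
        writeval = min(x, result)
    elif atomic_modifier == "MAX":
        writeval = max(x, result)
    elif atomic_modifier == "IWRAP":
        writeval = 0 if result >= x else result + 1
    elif atomic_modifier == "DWRAP":
        writeval = x if (result == 0 or result > x) else result - 1
    elif atomic_modifier == "AND":
        writeval = x & result
    elif atomic_modifier == "OR":
        writeval = x | result
    elif atomic_modifier == "XOR":
        writeval = x ^ result
    elif atomic_modifier == "EXCH":
        writeval = x  # exchange memory with operand
    elif atomic_modifier == "CSWAP":
        if result != x:
            return result  # comparison failed: no memory store
        writeval = y & U32_MASK
    else:
        raise ValueError("unknown atomic modifier: " + atomic_modifier)
    shared[index] = writeval
    return result


# IWRAP with an operand of 3 cycles a counter through 0, 1, 2, 3, 0, ...
counter = [0]
seen = [atoms_u32(counter, 0, "IWRAP", 3) for _ in range(6)]
```

    Note that IWRAP and DWRAP treat the operand as an inclusive upper bound:
    with an operand of 3, the stored value takes the four values 0 through 3.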


    Section 2.X.8.Z, BAR:  Execution Barrier

    The BAR instruction synchronizes the execution of compute shader
    invocations within a local work group.  When a compute shader invocation
    executes the BAR instruction, it pauses until the same BAR instruction has
    been executed by all invocations in the current local work group.  Once
    all invocations have executed the BAR instruction, processing continues
    with the instruction following the BAR instruction.

    There is no compile-time restriction on the locations in a program where
    BAR is allowed.  However, BAR instructions are not allowed in divergent
    flow control; if any compute shader invocation in the work group executes
    the BAR instruction, all compute shader invocations must execute the
    instruction.  Results of executing a BAR instruction are undefined and can
    result in application hangs and/or program termination if the instruction
    is issued:

      * inside any IF/ELSE/ENDIF block where the results of the condition
        evaluated by the IF instruction are not identical across the work
        group;

      * inside any iteration of a REP/ENDREP block where at least one
        invocation in the work group has skipped to the next iteration using
        the CONT instruction, exited the loop using a BRK or RET instruction,
        or exited the loop due to having completed the requested number of
        loop iterations; or

      * inside any subroutine (including main) where at least one invocation
        in the work group has exited the subroutine using the RET instruction.

    BAR has no operands and generates no result.
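    The rendezvous behavior of BAR can be illustrated on the CPU.  The
    following is a rough Python analogy, not an implementation of this
    extension:  each thread stands in for one compute shader invocation, and
    threading.Barrier plays the role of BAR.  The function name
    run_work_group and the neighbor-exchange pattern are hypothetical.  The
    sketch does not model GPU memory ordering, so it says nothing about when
    a MEMBAR instruction would also be needed.

```python
import threading


def run_work_group(local_size):
    """Each invocation writes its own slot, waits, then reads a neighbor's."""
    shared = [0] * local_size            # stands in for SHARED memory
    out = [0] * local_size
    barrier = threading.Barrier(local_size)

    def invocation(i):
        shared[i] = i * i                # store to this invocation's slot
        barrier.wait()                   # every invocation reaches this point
        out[i] = shared[(i + 1) % local_size]  # read the neighbor's slot

    threads = [threading.Thread(target=invocation, args=(i,))
               for i in range(local_size)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out
```

    All threads call barrier.wait() unconditionally, mirroring the
    requirement that BAR be executed in uniform flow control.  If one thread
    skipped the call, the others would block indefinitely, which is the CPU
    analog of the hangs described above.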


    Section 2.X.8.Z, LDS:  Load from Shared Memory

    The LDS instruction generates a result vector by fetching data from the
    shared memory for the current local work group identified by its single
    operand, as described in Section 2.X.4.5.  This operand must correspond
    to a shared memory variable declared using the "SHARED" statement; a
    program will fail to load if any other type of operand is used in an LDS
    instruction.

      result = SharedMemoryLoad(op0, storageModifier);

    LDS supports no base data type modifiers, but requires exactly one storage
    modifier.  The base data type of the result vector is derived from the
    storage modifier.


    Replace Section 2.X.8.Z, MEMBAR:  Memory Barrier, as added by
    EXT_shader_image_load_store

    The MEMBAR instruction synchronizes memory transactions to ensure that
    memory transactions resulting from any instruction executed by the shader
    invocation prior to the MEMBAR instruction complete prior to any memory
    transactions issued after the instruction, as observed by other shader
    invocations.

    The MEMBAR instruction has one optional instruction modifier.  If the CTA
    instruction modifier is specified, memory transactions before and after
    the barrier will be strongly ordered as observed by other shader
    invocations in the same local work group.  However, the CTA modifier does
    not order transactions as viewed by any other shader invocation:  shader
    invocations not in the local work group may observe the results of memory
    transactions issued after the MEMBAR instruction before those issued
    before the MEMBAR instruction.  If the CTA instruction modifier is not
    specified, all shader invocations will see the results of any memory
    transaction issued before the MEMBAR instruction before those issued after
    the MEMBAR instruction.

    MEMBAR has no operands and generates no result.


    Section 2.X.8.Z, STS:  Store to Shared Memory

    The STS instruction writes the contents of the first vector operand to
    shared memory for the current local work group identified by the second
    operand, as described in Section 2.X.4.5.  This instruction generates no
    result.  The second operand for the STS instruction must correspond to a
    shared memory variable declared using the "SHARED" statement; a program
    will fail to load if any other type of operand is used in an STS
    instruction.

      tmp0 = VectorLoad(op0);
      SharedMemoryStore(op1, tmp0, storageModifier);

    STS supports no base data type modifiers, but requires exactly one storage
    modifier.  The base data type of the vector components of the first
    operand is derived from the storage modifier.
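    The way a storage modifier selects both the size and the interpretation
    of the data moved by LDS and STS can be sketched in Python.  This is an
    illustrative model only.  The FORMATS table covers a small, hand-picked
    subset of storage modifiers, and the byte-offset addressing,
    little-endian layout, and function names are assumptions rather than
    requirements of this extension.

```python
import struct

# Hypothetical subset of storage modifiers:
# name -> (struct format code per component, component count)
FORMATS = {
    "U32": ("I", 1),
    "S32": ("i", 1),
    "F32": ("f", 1),
    "U32X2": ("I", 2),
    "U32X4": ("I", 4),
}


def shared_memory_store(shared, offset, values, storage_modifier):
    """Model of STS: write the leading components of `values` to memory."""
    fmt, count = FORMATS[storage_modifier]
    struct.pack_into("<" + fmt * count, shared, offset, *values[:count])


def shared_memory_load(shared, offset, storage_modifier):
    """Model of LDS: return the components read, as a tuple."""
    fmt, count = FORMATS[storage_modifier]
    return struct.unpack_from("<" + fmt * count, shared, offset)


# A 16-byte block standing in for one SHARED variable.
shared = bytearray(16)
shared_memory_store(shared, 0, (1, 2, 3, 4), "U32X4")
second_word = shared_memory_load(shared, 4, "U32")
```

    Storing with one modifier and loading with another reinterprets the same
    bytes; for example, a value written as S32 -1 reads back as 0xFFFFFFFF
    when loaded as U32 in this model.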


Additions to Chapter 3 of the OpenGL 4.2 (Compatibility Profile) Specification
(Rasterization)

    None.

Additions to Chapter 4 of the OpenGL 4.2 (Compatibility Profile) Specification
(Per-Fragment Operations and the Frame Buffer)

    None.

Additions to Chapter 5 of the OpenGL 4.2 (Compatibility Profile) Specification
(Special Functions)

    None.

Additions to Chapter 6 of the OpenGL 4.2 (Compatibility Profile) Specification
(State and State Requests)

    None.

Additions to the AGL/GLX/WGL Specifications

    None.

GLX Protocol

    None.

Dependencies on NV_shader_atomic_float

    If NV_shader_atomic_float is not supported, the ADD and EXCH atomic
    operations in the ATOMS instruction do not support the "F32" storage
    modifier.
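    The combined effect of Table X.Y and this dependency can be written as a
    small lookup.  The following Python sketch is illustrative only.  The
    names ATOMS_STORAGE and atoms_modifiers_valid, and the has_atomic_float
    flag, are hypothetical; the table contents are transcribed from Table
    X.Y.

```python
# Storage modifiers allowed for each ATOMS atomic modifier (from Table X.Y).
ATOMS_STORAGE = {
    "ADD": {"U32", "S32", "U64", "F32"},
    "MIN": {"U32", "S32"},
    "MAX": {"U32", "S32"},
    "IWRAP": {"U32"},
    "DWRAP": {"U32"},
    "AND": {"U32", "S32"},
    "OR": {"U32", "S32"},
    "XOR": {"U32", "S32"},
    "EXCH": {"U32", "S32", "U64", "F32"},
    "CSWAP": {"U32", "S32", "U64"},
}


def atoms_modifiers_valid(atomic, storage, has_atomic_float=True):
    """Return True if ATOMS.<atomic>.<storage> is a permitted combination.

    `has_atomic_float` models whether NV_shader_atomic_float is supported.
    """
    if storage == "F32" and not has_atomic_float:
        return False  # F32 is unavailable without NV_shader_atomic_float
    return storage in ATOMS_STORAGE.get(atomic, set())
```

    For example, atoms_modifiers_valid("ADD", "F32") is true by default but
    false when has_atomic_float is False, and
    atoms_modifiers_valid("CSWAP", "F32") is false in either case.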

Dependencies on EXT_shader_image_load_store

    If EXT_shader_image_load_store is not supported, language describing the
    "CTA" instruction modifier and modifying the MEMBAR instruction (as added
    by EXT_shader_image_load_store) should be removed.

Errors

    None.

New State

    (Modify ARB_vertex_program, Table X.6 -- Program State)

                                                      Initial
    Get Value                    Type    Get Command  Value   Description               Sec.    Attribute
    ---------                    ------- -----------  ------- ------------------------  ------  ---------
    COMPUTE_PROGRAM_PARAMETER_   Z+      GetIntegerv  0       Active compute program    2.14.1  -
      BUFFER_NV                                               buffer object binding
    COMPUTE_PROGRAM_PARAMETER_   nxZ+    GetInteger-  0       Buffer objects bound for  2.14.1  -
      BUFFER_NV                          IndexedvEXT          compute program use

    Also shares buffer bindings and other state with the ARB_compute_shader
    extension.

New Implementation Dependent State

    None, but shares implementation-dependent state with the
    ARB_compute_shader extension.

Issues

    None.

Revision History

    Rev.    Date    Author    Changes
    ----  --------  --------  --------------------------------------------
     2    10/23/12  pbrown    Remove the restriction forbidding the use of BAR
                              inside potentially divergent flow control.
                              Instead, we will allow BAR to be executed
                              anywhere, but specify undefined results
                              (including hangs or program termination) if the
                              flow control is divergent (bug 9367).

     1              pbrown    Internal spec development.