Name

    NV_shader_buffer_load

Name Strings

    GL_NV_shader_buffer_load

Contact

    Jeff Bolz, NVIDIA Corporation (jbolz 'at' nvidia.com)

Contributors

    Pat Brown, NVIDIA
    Chris Dodd, NVIDIA
    Mark Kilgard, NVIDIA
    Eric Werness, NVIDIA

Status

    Complete

Version

    Last Modified Date:         August 8, 2010
    Author Revision:            8

Number

    379

Dependencies

    Written against the OpenGL 3.0 Specification.

    Written against the GLSL 1.30 Specification (Revision 09).

    This extension interacts with NV_gpu_program4.


Overview

    At a very coarse level, GL has evolved in a way that allows
    applications to replace many of the original state machine variables
    with blocks of user-defined data. For example, the current vertex
    state has been augmented by vertex buffer objects, fixed-function
    shading state and parameters have been replaced by shaders/programs
    and constant buffers, etc. Applications switch between coarse sets
    of state by binding objects to the context or to other container
    objects (e.g. vertex array objects) instead of manipulating state
    variables of the context. In terms of the number of GL commands
    required to draw an object, modern applications are orders of
    magnitude more efficient than legacy applications, but this explosion
    of objects bound to other objects has led to a new bottleneck:
    pointer chasing and CPU L2 cache misses in the driver, and general
    L2 cache pollution.

    This extension provides a mechanism to read from a flat, 64-bit GPU
    address space from programs/shaders, to query GPU addresses of buffer
    objects at the API level, and to bind buffer objects to the context in
    such a way that they can be accessed via their GPU addresses in any
    shader stage.

    The intent is that applications can avoid re-binding buffer objects
    or updating constants between each Draw call and instead simply use
    a VertexAttrib (or TexCoord, or InstanceID, or...) to "point" to the
    new object's state. In this way, one of the cheapest "state" updates
    (from the CPU's point of view) can be used to effect a significant
    state change in the shader, much as changing a pointer does on the
    CPU. At the same time, this relieves the limits on how many
    buffer objects can be accessed at once by shaders, and allows these
    buffer object accesses to be exposed as C-style pointer dereferences
    in the shading language.

    As a very simple example, imagine packing a group of similar objects'
    constants into a single buffer object and pointing your program
    at object <i> by setting "glVertexAttribI1iEXT(attrLoc, i);"
    and using a shader like this:

        struct MyObjectType {
            mat4x4 modelView;
            vec4 materialPropertyX;
            // etc.
        };
        uniform MyObjectType *allObjects;
        in int objectID; // bound to attrLoc

        ...

        mat4x4 thisObjectsMatrix = allObjects[objectID].modelView;
        // do transform, shading, etc.

    This is beneficial in much the same way that texture arrays allow
    choosing between similar, but independent, texture maps with a single
    coordinate identifying which slice of the texture to use. It also
    resembles instancing, where a lightweight change (incrementing the
    instance ID) can be used to generate a different and interesting
    result, but with additional flexibility over instancing because the
    values are app-controlled and not a single incrementing counter.

    Dependent pointer fetches are allowed, so more complex scene graph
    structures can be built into buffer objects, providing significant new
    flexibility in the use of shaders. Another simple example, showing
    something you can't do with existing functionality, is to do dependent
    fetches into many buffer objects:

        GenBuffers(N, dataBuffers);
        GenBuffers(1, &pointerBuffer);

        GLuint64EXT gpuAddrs[N];
        for (i = 0; i < N; ++i) {
            BindBuffer(target, dataBuffers[i]);
            BufferData(target, size[i], myData[i], STATIC_DRAW);

            // get the address of this buffer and make it resident.
            GetBufferParameterui64vNV(target, BUFFER_GPU_ADDRESS_NV,
                                      &gpuAddrs[i]);
            MakeBufferResidentNV(target, READ_ONLY);
        }

        GLuint64EXT pointerBufferAddr;
        BindBuffer(target, pointerBuffer);
        BufferData(target, sizeof(GLuint64EXT)*N, gpuAddrs, STATIC_DRAW);
        GetBufferParameterui64vNV(target, BUFFER_GPU_ADDRESS_NV,
                                  &pointerBufferAddr);
        MakeBufferResidentNV(target, READ_ONLY);

        // now in the shader, we can use a double indirection
        vec4 **ptrToBuffers = pointerBufferAddr;
        vec4 *ptrToBufferI = ptrToBuffers[i];

    This allows simultaneous access to more buffers than
    EXT_bindable_uniform (MAX_VERTEX_BINDABLE_UNIFORMS, etc.) and each
    can be larger than MAX_BINDABLE_UNIFORM_SIZE.

New Procedures and Functions

    void MakeBufferResidentNV(enum target, enum access);
    void MakeBufferNonResidentNV(enum target);
    boolean IsBufferResidentNV(enum target);
    void MakeNamedBufferResidentNV(uint buffer, enum access);
    void MakeNamedBufferNonResidentNV(uint buffer);
    boolean IsNamedBufferResidentNV(uint buffer);

    void GetBufferParameterui64vNV(enum target, enum pname,
                                   uint64EXT *params);
    void GetNamedBufferParameterui64vNV(uint buffer, enum pname,
                                        uint64EXT *params);

    void GetIntegerui64vNV(enum value, uint64EXT *result);

    void Uniformui64NV(int location, uint64EXT value);
    void Uniformui64vNV(int location, sizei count,
                        const uint64EXT *value);
    void GetUniformui64vNV(uint program, int location, uint64EXT *params);
    void ProgramUniformui64NV(uint program, int location, uint64EXT value);
    void ProgramUniformui64vNV(uint program, int location, sizei count,
                               const uint64EXT *value);

New Tokens

    Accepted by the <pname> parameter of GetBufferParameterui64vNV,
    GetNamedBufferParameterui64vNV:

        BUFFER_GPU_ADDRESS_NV                          0x8F1D

    Returned by the <type> parameter of GetActiveUniform:

        GPU_ADDRESS_NV                                 0x8F34

    Accepted by the <value> parameter of GetIntegerui64vNV:

        MAX_SHADER_BUFFER_ADDRESS_NV                   0x8F35


Additions to Chapter 2 of the OpenGL 3.0 Specification (OpenGL Operation)

    Append to Section 2.9 (p. 45)

    The data store of a buffer object may be made accessible to the GL
    via shader buffer loads by calling:

        void MakeBufferResidentNV(enum target, enum access);

    <access> may only be READ_ONLY, but is provided for future
    extensibility to indicate to the driver that the GPU may write to the
    memory. <target> may be any of the buffer targets accepted by
    BindBuffer.  The error INVALID_OPERATION will be generated if no
    buffer is bound to <target>, if the buffer bound to <target> is
    already resident in the current GL context, or if the buffer bound to
    <target> has no data store.

    While the buffer object is resident, it is legal to use GPU addresses
    in the range
    [BUFFER_GPU_ADDRESS_NV, BUFFER_GPU_ADDRESS_NV + BUFFER_SIZE)
    in any shader stage.

    The data store of a buffer object may be made inaccessible to the GL
    via shader buffer loads by calling:

        void MakeBufferNonResidentNV(enum target);

    A buffer is also made non-resident implicitly as a result of being
    respecified via BufferData or being deleted. <target> may be any of
    the buffer targets accepted by BindBuffer.  The error
    INVALID_OPERATION will be generated if no buffer is bound to <target>
    or if the buffer bound to <target> is not resident in the current
    GL context.

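    The residency rules above can be summarized with a small host-side
    model. This is only an illustrative sketch in C, not driver code and
    not part of the extension: the BufferModel type, the model_*
    functions and the MODEL_* error values are hypothetical names
    invented here, a NULL pointer stands in for "no buffer bound to
    <target>", and a single flag stands in for residency in one context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical error codes standing in for NO_ERROR / INVALID_OPERATION. */
typedef enum { MODEL_NO_ERROR, MODEL_INVALID_OPERATION } ModelError;

/* Hypothetical model of one buffer object as seen by one context. */
typedef struct {
    bool     has_data_store; /* a data store has been specified       */
    bool     resident;       /* resident in the single modelled context */
    uint64_t gpu_address;    /* models BUFFER_GPU_ADDRESS_NV            */
    uint64_t size;           /* models BUFFER_SIZE, in machine units    */
} BufferModel;

/* Mirrors the error conditions listed for MakeBufferResidentNV. */
ModelError model_make_resident(BufferModel *buf)
{
    if (buf == NULL || buf->resident || !buf->has_data_store)
        return MODEL_INVALID_OPERATION;
    assert(buf->gpu_address != 0); /* the spec reserves address zero */
    buf->resident = true;
    return MODEL_NO_ERROR;
}

/* Mirrors the error conditions listed for MakeBufferNonResidentNV. */
ModelError model_make_non_resident(BufferModel *buf)
{
    if (buf == NULL || !buf->resident)
        return MODEL_INVALID_OPERATION;
    buf->resident = false;
    return MODEL_NO_ERROR;
}

/* True if addr lies in [gpu_address, gpu_address + size) of a buffer
   that is currently resident. */
bool model_address_is_legal(const BufferModel *buf, uint64_t addr)
{
    return buf != NULL && buf->resident &&
           addr >= buf->gpu_address &&
           addr - buf->gpu_address < buf->size;
}
```

    In this model, making a buffer resident twice, or making a buffer
    without a data store resident, reports the modelled
    INVALID_OPERATION, and an address is legal only while the owning
    buffer is resident.
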
    The function:

        void GetBufferParameterui64vNV(enum target, enum pname,
                                       uint64EXT *params);

    may be used to query the GPU address of a buffer object's data store.
    This address remains valid until the buffer object is deleted or its
    data store is respecified via BufferData. The address "zero"
    is reserved for convenience, so no buffer object will ever have an
    address of zero.  The error INVALID_OPERATION will be generated if no
    buffer is bound to <target>, or if the buffer bound to <target> has no
    data store.

    The functions:

        void MakeNamedBufferResidentNV(uint buffer, enum access);
        void MakeNamedBufferNonResidentNV(uint buffer);
        void GetNamedBufferParameterui64vNV(uint buffer, enum pname,
                                            uint64EXT *params);

    operate identically to the non-"Named" functions except that, rather
    than using currently bound buffers, they use the buffer object
    identified by <buffer>.  If the buffer object named by the buffer
    parameter has not been previously bound or has been deleted since the
    last binding, the GL first creates a new state vector, initialized
    with a zero-sized memory buffer and comprising the state values listed
    in table 2.6.  There is no buffer corresponding to the name zero;
    these commands generate the INVALID_OPERATION error if the buffer
    parameter is zero.

    Add to Section 2.20.3 (p. 98)

        void Uniformui64NV(int location, uint64EXT value);
        void Uniformui64vNV(int location, sizei count,
                            const uint64EXT *value);

    The Uniformui64{v}NV commands will load <count> uint64EXT values into
    a uniform location defined as a GPU_ADDRESS_NV or an array of
    GPU_ADDRESS_NVs.

    The functions:

        void ProgramUniformui64NV(uint program, int location,
                                  uint64EXT value);
        void ProgramUniformui64vNV(uint program, int location, sizei count,
                                   const uint64EXT *value);

    operate identically to the non-"Program" functions except that, rather
    than updating the program object currently in use, these "Program"
    commands update the program object named by the initial program
    parameter.


    Insert a new subsection after Section 2.20.4, Shader Execution (Vertex
    Shaders), p. 103.

    Section 2.20.X, Shader Memory Access

    Shaders may load from buffer object memory by dereferencing pointer
    variables.  Pointer variables are 64-bit unsigned integer values referring
    to the GPU addresses of data stored in buffer objects made resident by
    MakeBufferResidentNV.  The GPU addresses of such buffer objects may be
    queried using GetBufferParameterui64vNV with a <pname> of
    BUFFER_GPU_ADDRESS_NV.

    When a shader dereferences a pointer variable, data are read from buffer
    object memory according to the following rules:

    - Data of type "bool" are stored in memory as one uint-typed value at the
      specified GPU address.  All non-zero values correspond to true, and zero
      corresponds to false.

    - Data of type "int" are stored in memory as one int-typed value at the
      specified GPU address.

    - Data of type "uint" are stored in memory as one uint-typed value at the
      specified GPU address.

    - Data of type "float" are stored in memory as one float-typed value at
      the specified GPU address.

    - Vectors with <N> elements with any of the above basic element types are
      stored in memory as <N> values in consecutive memory locations beginning
      at the specified GPU address, with components stored in order with the
      first (X) component at the lowest offset.  The data type used for
      individual components is derived according to the rules for scalar
      members above.

    - Data with any pointer type are stored in memory as a single 64-bit
      unsigned integer value at the specified GPU address.

    - Column-major matrices with <C> columns and <R> rows (using the type
      "mat<C>x<R>", or simply "mat<C>" if <C>==<R>) are treated as an array of
      <C> floating-point column vectors, each consisting of <R> components.
      The column vectors will be stored in order, with column zero at the
      lowest offset.  The difference in offsets between consecutive columns of
      the matrix will be referred to as the column stride, and is constant
      across the matrix.

    - Row-major matrices with <C> columns and <R> rows (using the type
      "mat<C>x<R>", or simply "mat<C>" if <C>==<R>) are treated as an array of
      <R> floating-point row vectors, each consisting of <C> components. The
      row vectors will be stored in order, with row zero at the lowest offset.
      The difference in offsets between consecutive rows of the matrix will be
      referred to as the row stride, and is constant across the matrix.

    - Arrays of scalars, vectors, pointers, and matrices are stored in memory
      by element order, with array member zero at the lowest offset.  The
      difference in offsets between each pair of elements in the array in
      basic machine units is referred to as the array stride, and is constant
      across the entire array.

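    As an illustration of the scalar, vector and pointer rules in the
    list above, the following host-side C sketch decodes values from a
    raw copy of buffer memory. It is a sketch under assumptions that this
    section does not state: uint and float values are taken to be 32
    bits wide, and the host is taken to use the same byte order as the
    GPU. The load_* helper names are hypothetical and are not part of
    the extension.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read one uint-typed value at the given offset (assumed 32-bit). */
uint32_t load_uint(const unsigned char *mem, size_t offset)
{
    uint32_t v;
    assert(mem != NULL);
    memcpy(&v, mem + offset, sizeof v);
    return v;
}

/* "bool" is stored as one uint; any non-zero value means true. */
bool load_bool(const unsigned char *mem, size_t offset)
{
    return load_uint(mem, offset) != 0u;
}

/* A vector of n floats occupies consecutive locations, with the first
   (X) component at the lowest offset. */
void load_float_vector(const unsigned char *mem, size_t offset,
                       size_t n, float *out)
{
    assert(mem != NULL && out != NULL);
    memcpy(out, mem + offset, n * sizeof(float));
}

/* Any pointer type is stored as a single 64-bit unsigned integer. */
uint64_t load_pointer(const unsigned char *mem, size_t offset)
{
    uint64_t v;
    assert(mem != NULL);
    memcpy(&v, mem + offset, sizeof v);
    return v;
}
```

    The helpers take byte offsets into a host copy of the data store;
    a GPU address maps to such an offset by subtracting the buffer's
    BUFFER_GPU_ADDRESS_NV.
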
    For matrix and array variables, the matrix and/or array strides
    corresponding to the variable may be derived according to the structure
    layout rules specified immediately below.

    When dereferencing a pointer to a structure, its individual members will
    be laid out in memory in monotonically increasing order based on their
    location in the structure declaration.  Each structure member has a base
    offset and a base alignment, from which an aligned offset is computed by
    rounding the base offset up to the next multiple of the base alignment.
    The base offset of the first member of a structure is taken from the
    aligned offset of the structure itself.  The base offset of all other
    structure members is derived by taking the offset of the last basic
    machine unit consumed by the previous member and adding one.  Each
    structure member is stored in memory at its aligned offset.

      (1) If the member is a scalar consuming <N> basic machine units, the
          base alignment is <N>.

      (2) If the member is a two- or four-component vector with components
          consuming <N> basic machine units, the base alignment is 2<N> or
          4<N>, respectively.

      (3) If the member is a three-component vector with components consuming
          <N> basic machine units, the base alignment is 4<N>.

      (4) If the member is an array of scalars or vectors, the base alignment
          and array stride are set to match the base alignment of a single
          array element, according to rules (1), (2), and (3). The array may
          have padding at the end; the base offset of the member following the
          array is rounded up to the next multiple of the base alignment.

      (5) If the member is a column-major matrix with <C> columns and <R>
          rows, the matrix is stored identically to an array of <C> column
          vectors with <R> components each, according to rule (4).

      (6) If the member is an array of <S> column-major matrices with <C>
          columns and <R> rows, the matrix is stored identically to a row of
          <S>*<C> column vectors with <R> components each, according to rule
          (4).

      (7) If the member is a row-major matrix with <C> columns and <R> rows,
          the matrix is stored identically to an array of <R> row vectors
          with <C> components each, according to rule (4).

      (8) If the member is an array of <S> row-major matrices with <C> columns
          and <R> rows, the matrix is stored identically to a row of <S>*<R>
          row vectors with <C> components each, according to rule (4).

      (9) If the member is a structure, the base alignment of the structure is
          <N>, where <N> is the largest base alignment value of any of its
          members.  The individual members of this sub-structure are then
          assigned offsets by applying this set of rules recursively, where
          the base offset of the first member of the sub-structure is equal to
          the aligned offset of the structure. The structure may have padding
          at the end; the base offset of the member following the
          sub-structure is rounded up to the next multiple of the base
          alignment of the structure.

      (10) If the member is an array of <S> structures, the <S> elements of
           the array are laid out in order, according to rule (9).

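    The layout rules above can be sketched as a small host-side offset
    calculator. This is a minimal C sketch covering rules (1) through
    (5) and (7) only; nested structures and arrays of matrices are left
    out. The Member type and the function names are hypothetical, and
    the worked numbers assume that one basic machine unit is one byte
    and that each float component consumes four of them. A matrix is
    described as an array of its column (or row) vectors, as rules (5)
    and (7) prescribe.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one structure member. */
typedef struct {
    size_t component_size; /* <N>: machine units per component          */
    size_t components;     /* 1 for a scalar, 2 to 4 for a vector       */
    size_t array_length;   /* 0 if not an array, else the element count */
} Member;

/* Round offset up to the next multiple of alignment. */
size_t round_up(size_t offset, size_t alignment)
{
    assert(alignment != 0);
    return (offset + alignment - 1) / alignment * alignment;
}

/* Rules (1)-(3): scalar N, vec2 2N, vec3 and vec4 4N. */
size_t base_alignment(const Member *m)
{
    size_t c = (m->components == 3) ? 4 : m->components;
    return c * m->component_size;
}

/* Assign an aligned offset to each member, starting from base offset 0.
   Returns the base offset a further member would start from. */
size_t layout_members(const Member *members, size_t count, size_t *offsets)
{
    size_t base = 0;
    for (size_t i = 0; i < count; ++i) {
        const Member *m = &members[i];
        size_t align = base_alignment(m);
        offsets[i] = round_up(base, align);
        if (m->array_length == 0) {
            /* Next base offset is one past the last unit consumed. */
            base = offsets[i] + m->components * m->component_size;
        } else {
            /* Rule (4): the stride equals the element alignment, so the
               padded end of the array is array_length strides away. */
            base = offsets[i] + m->array_length * align;
        }
    }
    return base;
}
```

    Under these assumptions, the MyObjectType structure from the
    Overview (a mat4x4 followed by a vec4) is described as
    { {4, 4, 4}, {4, 4, 0} }; the sketch places modelView at offset 0
    and materialPropertyX at offset 64.
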
    If a shader reads from a GPU address that does not correspond to a buffer
    object made resident by MakeBufferResidentNV, the results of the operation
    are undefined and may result in application termination.

    Any variable, array element, or structure member accessed using a pointer
    has a required base alignment, which may be derived according to the
    structure layout rules above.  If a variable, array member, or structure
    member is accessed using a pointer that is not a multiple of its base
    alignment, the results of the access will be undefined.  To store multiple
    variables in a single buffer object, an application must ensure that each
    variable is properly aligned.  Storing a single scalar, vector, matrix,
    array, or structure variable using a pointer set to the base GPU address
    of a resident buffer object requires no special alignment.  The base GPU
    address of a buffer object is guaranteed to be sufficiently aligned to
    satisfy the base alignment requirement of any variable, and the layout
    rules above ensure that individual matrix rows/columns, array elements,
    and structure members are properly aligned as long as the base pointer
    meets alignment requirements.


Additions to Chapter 5 of the OpenGL 3.0 Specification (Special Functions)

    Add to Section 5.4, p. 310 (Display Lists)

    Edit the list of commands that are executed immediately when compiling
    a display list to include MakeBufferResidentNV,
    MakeBufferNonResidentNV, MakeNamedBufferResidentNV,
    MakeNamedBufferNonResidentNV, GetBufferParameterui64vNV,
    GetNamedBufferParameterui64vNV, IsBufferResidentNV, and
    IsNamedBufferResidentNV.

Additions to Chapter 6 of the OpenGL 3.0 Specification (Querying GL State)

    Add to Section 6.1.11, p. 314 (Pointer, String, and 64-bit Queries)

    The command:

        void GetIntegerui64vNV(enum value, uint64EXT *result);

    obtains 64-bit unsigned integer state variables. Legal values of
    <value> are only those that specify GetIntegerui64vNV in the state
    tables in Chapter 6.

    Add to Section 6.1.13, p. 332 (Buffer Object Queries)

    The commands:

        boolean IsBufferResidentNV(enum target);
        boolean IsNamedBufferResidentNV(uint buffer);

    return TRUE if the specified buffer is resident in the current context.
    The error INVALID_OPERATION will be generated by IsBufferResidentNV if no
    buffer is bound to <target>.  If the buffer object named by the buffer
    parameter of IsNamedBufferResidentNV has not been previously bound or has
    been deleted since the last binding, the GL first creates a new state
    vector, initialized with a zero-sized memory buffer and comprising the
    state values listed in table 2.6.  There is no buffer corresponding to the
    name zero; IsNamedBufferResidentNV generates the INVALID_OPERATION error
    if the buffer parameter is zero.

    Add to Section 6.1.15, p. 337 (Shader and Program Queries)

        void GetUniformui64vNV(uint program, int location, uint64EXT *params);

Additions to Appendix D of the OpenGL 3.0 Specification (Shared Objects and Multiple Contexts)

    Add a new section D.X (Object Use by GPU Address)

    A buffer object's GPU address is valid in all contexts in the share
    group that the buffer belongs to. A buffer should be made resident in
    each context that will use it via GPU address, so that the GL knows
    it is used in each command stream.

Additions to the NV_gpu_program4 specification:

    Change Section 2.X.2, Program Grammar

    If a program specifies the NV_shader_buffer_load program option,
    the following modifications apply to the program grammar:

    Append to <opModifier> list: | "F32" | "F32X2" | "F32X4" | "S8" | "S16" |
    "S32" | "S32X2" | "S32X4" | "U8" | "U16" | "U32" | "U32X2" | "U32X4".

    Append to <SCALARop> list: | "LOAD".

    Modify Section 2.X.4, Program Execution Environment

    (Add to the set of opcodes in Table X.13)

                  Modifiers
      Instruction F I C S H D  Out Inputs    Description
      ----------- - - - - - -  --- --------  --------------------------------
      LOAD        X X X X - F  v   su        Global load


    (Add to Table X.14, Instruction Modifiers, and to the corresponding
    description following the table)

      Modifier  Description
      --------  -----------------------------------------------
      F32       Access one 32-bit floating-point value
      F32X2     Access two 32-bit floating-point values
      F32X4     Access four 32-bit floating-point values
      S8        Access one 8-bit signed integer value
      S16       Access one 16-bit signed integer value
      S32       Access one 32-bit signed integer value
4875bd8deadSopenharmony_ci      S32X2     Access two 32-bit signed integer values
4885bd8deadSopenharmony_ci      S32X4     Access four 32-bit signed integer values
4895bd8deadSopenharmony_ci      U8        Access one 8-bit unsigned integer value
4905bd8deadSopenharmony_ci      U16       Access one 16-bit unsigned integer value
4915bd8deadSopenharmony_ci      U32       Access one 32-bit unsigned integer value
4925bd8deadSopenharmony_ci      U32X2     Access two 32-bit unsigned integer values
4935bd8deadSopenharmony_ci      U32X4     Access four 32-bit unsigned integer values
4945bd8deadSopenharmony_ci
4955bd8deadSopenharmony_ci    For memory load operations, the "F32", "F32X2", "F32X4", "S8", "S16",
4965bd8deadSopenharmony_ci    "S32", "S32X2", "S32X4", "U8", "U16", "U32", "U32X2", and "U32X4" storage
    modifiers control how data are loaded from memory.  Storage modifiers are
    supported by the LOAD instruction and are covered in more detail in the
    description of that instruction.  LOAD must specify exactly one of these
5005bd8deadSopenharmony_ci    modifiers, and may not specify any of the base data type modifiers (F,U,S)
5015bd8deadSopenharmony_ci    described above.  The base data type of the result vector of a LOAD
5025bd8deadSopenharmony_ci    instruction is trivially derived from the storage modifier.
5035bd8deadSopenharmony_ci
5045bd8deadSopenharmony_ci
5055bd8deadSopenharmony_ci    Add New Section 2.X.4.5, Program Memory Access
5065bd8deadSopenharmony_ci
5075bd8deadSopenharmony_ci    Programs may load from buffer object memory via the LOAD (global load)
5085bd8deadSopenharmony_ci    instruction.
5095bd8deadSopenharmony_ci
5105bd8deadSopenharmony_ci    Load instructions read 8, 16, 32, 64, or 128 bits of data from a source
5115bd8deadSopenharmony_ci    address to produce a four-component vector, according to the storage
5125bd8deadSopenharmony_ci    modifier specified with the instruction.  The storage modifier has three
5135bd8deadSopenharmony_ci    parts:
5145bd8deadSopenharmony_ci
5155bd8deadSopenharmony_ci      - a base data type, "F", "S", or "U", specifying that the instruction
5165bd8deadSopenharmony_ci        fetches floating-point, signed integer, or unsigned integer values,
5175bd8deadSopenharmony_ci        respectively;
5185bd8deadSopenharmony_ci
5195bd8deadSopenharmony_ci      - a component size, specifying that the components fetched by the
5205bd8deadSopenharmony_ci        instruction have 8, 16, or 32 bits; and
5215bd8deadSopenharmony_ci
5225bd8deadSopenharmony_ci      - an optional component count, where "X2" and "X4" indicate that two or
5235bd8deadSopenharmony_ci        four components be fetched, and no count indicates a single component
5245bd8deadSopenharmony_ci        fetch.
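
    The three-part structure of a storage modifier can be illustrated with a
    small C sketch.  The StorageModifier struct and both function names are
    hypothetical helpers introduced here for illustration; they are not part
    of this extension or of NV_gpu_program4.  The sketch accepts only the
    thirteen modifier strings listed in Table X.14.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical decoded form of a storage modifier (illustration only). */
typedef struct {
    char base;   /* 'F', 'S', or 'U' */
    int  bits;   /* component size: 8, 16, or 32 */
    int  count;  /* components fetched: 1, 2, or 4 */
} StorageModifier;

/* Decode a modifier string such as "S16" or "F32X4".
   Returns 1 on success, 0 if the string is not one of the thirteen
   storage modifiers listed in Table X.14. */
static int decode_storage_modifier(const char *s, StorageModifier *out)
{
    static const char *const valid[] = {
        "F32", "F32X2", "F32X4", "S8", "S16", "S32", "S32X2", "S32X4",
        "U8", "U16", "U32", "U32X2", "U32X4"
    };
    size_t i;
    int known = 0;

    assert(s != NULL && out != NULL);
    for (i = 0; i < sizeof valid / sizeof valid[0]; i++) {
        if (strcmp(s, valid[i]) == 0) {
            known = 1;
            break;
        }
    }
    if (!known)
        return 0;

    /* Part 1: base data type. */
    out->base = s[0];

    /* Part 2: component size; skip past the digits just consumed. */
    if (s[1] == '8') {
        out->bits = 8;
        s += 2;
    } else if (s[1] == '1') {
        out->bits = 16;
        s += 3;
    } else {
        out->bits = 32;
        s += 3;
    }

    /* Part 3: optional component count ("X2" or "X4"); none means one. */
    out->count = (s[0] == 'X') ? (s[1] - '0') : 1;
    return 1;
}

/* Total number of bits read from memory by a load with this modifier. */
static int fetch_bits(const StorageModifier *m)
{
    return m->bits * m->count;
}
```

    Multiplying the component size by the component count gives the 8, 16,
    32, 64, or 128 bits that a load reads from memory.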
5255bd8deadSopenharmony_ci
5265bd8deadSopenharmony_ci    When the storage modifier specifies that fewer than four components should
5275bd8deadSopenharmony_ci    be fetched, remaining components are filled with zeroes.  When performing
5285bd8deadSopenharmony_ci    a global load (LOAD), the GPU address is specified as an instruction
5295bd8deadSopenharmony_ci    operand.  Given a GPU address <address> and a storage modifier <modifier>,
5305bd8deadSopenharmony_ci    the memory load can be described by the following code:
5315bd8deadSopenharmony_ci
      result_t_vec BufferMemoryLoad(char *address, OpModifier modifier)
      {
        result_t_vec result = { 0, 0, 0, 0 };
        switch (modifier) {
        case F32:
            result.x = ((float32_t *)address)[0];
            break;
        case F32X2:
            result.x = ((float32_t *)address)[0];
            result.y = ((float32_t *)address)[1];
            break;
        case F32X4:
            result.x = ((float32_t *)address)[0];
            result.y = ((float32_t *)address)[1];
            result.z = ((float32_t *)address)[2];
            result.w = ((float32_t *)address)[3];
            break;
        case S8:
            result.x = ((int8_t *)address)[0];
            break;
        case S16:
            result.x = ((int16_t *)address)[0];
            break;
        case S32:
            result.x = ((int32_t *)address)[0];
            break;
        case S32X2:
            result.x = ((int32_t *)address)[0];
            result.y = ((int32_t *)address)[1];
            break;
        case S32X4:
            result.x = ((int32_t *)address)[0];
            result.y = ((int32_t *)address)[1];
            result.z = ((int32_t *)address)[2];
            result.w = ((int32_t *)address)[3];
            break;
        case U8:
            result.x = ((uint8_t *)address)[0];
            break;
        case U16:
            result.x = ((uint16_t *)address)[0];
            break;
        case U32:
            result.x = ((uint32_t *)address)[0];
            break;
        case U32X2:
            result.x = ((uint32_t *)address)[0];
            result.y = ((uint32_t *)address)[1];
            break;
        case U32X4:
            result.x = ((uint32_t *)address)[0];
            result.y = ((uint32_t *)address)[1];
            result.z = ((uint32_t *)address)[2];
            result.w = ((uint32_t *)address)[3];
            break;
        }
        return result;
      }
5905bd8deadSopenharmony_ci
5915bd8deadSopenharmony_ci    If a global load accesses a memory address that does not correspond to a
5925bd8deadSopenharmony_ci    buffer object made resident by MakeBufferResidentNV, the results of the
5935bd8deadSopenharmony_ci    operation are undefined and may result in application termination.
5945bd8deadSopenharmony_ci
    The address used for the buffer memory loads must be aligned to the fetch
    size corresponding to the storage opcode modifier.  For S8 and U8, the
    address has no alignment requirements.  For S16 and U16, the address must
    be a multiple of two basic machine units.  For F32, S32, and U32, the
    address must be a multiple of four.  For F32X2, S32X2, and U32X2, the
    address must be a multiple of eight.  For F32X4, S32X4, and U32X4, the
    address must be a multiple of sixteen.  If an address is not correctly
    aligned, the values returned by a buffer memory load will be undefined.
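
    The alignment rule above reduces to "the address must be a multiple of
    the fetch size".  The following C sketch shows one way an application
    might check an address before handing it to a shader.  The function names
    are hypothetical, and the sketch assumes a basic machine unit is one
    8-bit byte.

```c
#include <assert.h>
#include <stdint.h>

/* Fetch size in basic machine units (assumed to be bytes):
   component size in bytes times the number of components. */
static unsigned fetch_size_bytes(unsigned component_bits, unsigned count)
{
    assert(component_bits == 8 || component_bits == 16 ||
           component_bits == 32);
    assert(count == 1 || count == 2 || count == 4);
    /* Only 32-bit components have X2 and X4 forms. */
    assert(count == 1 || component_bits == 32);
    return (component_bits / 8u) * count;
}

/* Returns nonzero if <address> satisfies the alignment requirement of a
   load with the given component size and component count. */
static int is_load_aligned(uint64_t address, unsigned component_bits,
                           unsigned count)
{
    return address % fetch_size_bytes(component_bits, count) == 0;
}
```

    For example, an F32X4 load has a fetch size of 16, so an address ending
    in 0x8 would fail the check, while the same address is acceptable for an
    F32X2 load.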
6035bd8deadSopenharmony_ci
6045bd8deadSopenharmony_ci
6055bd8deadSopenharmony_ci    Modify Section 2.X.6, Program Options
6065bd8deadSopenharmony_ci
6075bd8deadSopenharmony_ci    + Shader Buffer Load Support (NV_shader_buffer_load)
6085bd8deadSopenharmony_ci
6095bd8deadSopenharmony_ci    If a program specifies the "NV_shader_buffer_load" option, it may use the
6105bd8deadSopenharmony_ci    LOAD instruction to load data from a resident buffer object given a GPU
6115bd8deadSopenharmony_ci    address.
6125bd8deadSopenharmony_ci
6135bd8deadSopenharmony_ci
6145bd8deadSopenharmony_ci    Section 2.X.8.Z, LOAD:  Global Load
6155bd8deadSopenharmony_ci
6165bd8deadSopenharmony_ci    The LOAD instruction generates a result vector by reading an address from
6175bd8deadSopenharmony_ci    the single unsigned integer scalar operand and fetching data from buffer
6185bd8deadSopenharmony_ci    object memory, as described in Section 2.X.4.5.
6195bd8deadSopenharmony_ci
6205bd8deadSopenharmony_ci      address = ScalarLoad(op0);
6215bd8deadSopenharmony_ci      result = BufferMemoryLoad(address, storageModifier);
6225bd8deadSopenharmony_ci
6235bd8deadSopenharmony_ci    LOAD supports no base data type modifiers, but requires exactly one
6245bd8deadSopenharmony_ci    storage modifier.  The base data type of the result vector is derived from
6255bd8deadSopenharmony_ci    the storage modifier.  The single scalar operand is always interpreted as
6265bd8deadSopenharmony_ci    an unsigned integer.
6275bd8deadSopenharmony_ci
6285bd8deadSopenharmony_ci    The range of GPU addresses supported by the LOAD instruction may be
6295bd8deadSopenharmony_ci    subject to an implementation-dependent limit.  If any component fetched by
6305bd8deadSopenharmony_ci    the LOAD instruction corresponds to memory with an address larger than the
6315bd8deadSopenharmony_ci    value of MAX_SHADER_BUFFER_ADDRESS_NV, the value fetched for that
6325bd8deadSopenharmony_ci    component will be undefined.
6335bd8deadSopenharmony_ci
6345bd8deadSopenharmony_ci
6355bd8deadSopenharmony_ciModifications to The OpenGL Shading Language Specification, Version 1.30.09
6365bd8deadSopenharmony_ci
6375bd8deadSopenharmony_ci    Modify Section 3.6, Keywords, p. 14
6385bd8deadSopenharmony_ci
6395bd8deadSopenharmony_ci    (add the following to the list of reserved keywords)
6405bd8deadSopenharmony_ci
6415bd8deadSopenharmony_ci    intptr_t 
6425bd8deadSopenharmony_ci    uintptr_t
6435bd8deadSopenharmony_ci
6445bd8deadSopenharmony_ci
6455bd8deadSopenharmony_ci    Modify Section 4.1, Basic Types, p. 18
6465bd8deadSopenharmony_ci
6475bd8deadSopenharmony_ci    (add to the basic "Transparent Types" table, p. 18)
6485bd8deadSopenharmony_ci
6495bd8deadSopenharmony_ci      Types       Meaning
6505bd8deadSopenharmony_ci      --------    ----------------------------------------------------------
6515bd8deadSopenharmony_ci      intptr_t    a signed integer with the same precision as a pointer
6525bd8deadSopenharmony_ci      uintptr_t   an unsigned integer with the same precision as a pointer
6535bd8deadSopenharmony_ci
6545bd8deadSopenharmony_ci    (replace the last paragraph of the section with the following)
6555bd8deadSopenharmony_ci
6565bd8deadSopenharmony_ci    Pointers to any of the transparent types, user-defined structs, or other
6575bd8deadSopenharmony_ci    pointer types are supported.
6585bd8deadSopenharmony_ci
6595bd8deadSopenharmony_ci
6605bd8deadSopenharmony_ci    Modify Section 4.1.3, Integers, p. 18
6615bd8deadSopenharmony_ci
6625bd8deadSopenharmony_ci    (add to the end of the first paragraph) Signed and unsigned integer
6635bd8deadSopenharmony_ci    variables are fully supported.  ... intptr_t and uintptr_t variables have
6645bd8deadSopenharmony_ci    the same number of bits of precision as the native size of a pointer in
6655bd8deadSopenharmony_ci    the underlying implementation.
6665bd8deadSopenharmony_ci
6675bd8deadSopenharmony_ci
6685bd8deadSopenharmony_ci    (Insert new section immediately before Section 4.1.10, Implicit
6695bd8deadSopenharmony_ci    Conversions, p. 27)
6705bd8deadSopenharmony_ci
6715bd8deadSopenharmony_ci    Section 4.1.X, Pointers
6725bd8deadSopenharmony_ci
6735bd8deadSopenharmony_ci    Pointers are 64-bit unsigned integer values that represent the address of
6745bd8deadSopenharmony_ci    some "global" memory (i.e. not local to this invocation of a shader).
6755bd8deadSopenharmony_ci    Pointers to any of the transparent types, user-defined structures, or
6765bd8deadSopenharmony_ci    pointer types are supported.  Pointers are dereferenced with the operators
6775bd8deadSopenharmony_ci    (*), (->), and ([]) and a variety of operators performing addition and
6785bd8deadSopenharmony_ci    subtraction are supported.  There is no mechanism to assign a pointer to
6795bd8deadSopenharmony_ci    the address of a local variable or array, nor is there a mechanism to
6805bd8deadSopenharmony_ci    allocate or free memory from within a shader.  There are no function
6815bd8deadSopenharmony_ci    pointers.
6825bd8deadSopenharmony_ci
6835bd8deadSopenharmony_ci    The underlying memory read using pointer variables may also be accessed
6845bd8deadSopenharmony_ci    using the OpenGL API commands.  To communicate between shaders and other
6855bd8deadSopenharmony_ci    OpenGL API commands, variables read through pointers are arranged in
6865bd8deadSopenharmony_ci    memory in the manner described in Section 2.20.X of the OpenGL
6875bd8deadSopenharmony_ci    Specification.
6885bd8deadSopenharmony_ci
6895bd8deadSopenharmony_ci
6905bd8deadSopenharmony_ci    Modify Section 4.1.10, Implicit Conversions, p. 27
6915bd8deadSopenharmony_ci
6925bd8deadSopenharmony_ci    (add before the final paragraph of the section, p. 27) 
6935bd8deadSopenharmony_ci
6945bd8deadSopenharmony_ci    Pointers to any type may be implicitly converted to pointers to void.
    Pointers to any type (including void) are never implicitly converted to
6965bd8deadSopenharmony_ci    pointers to any other non-void type.
6975bd8deadSopenharmony_ci
6985bd8deadSopenharmony_ci
6995bd8deadSopenharmony_ci    Modify Section 5.1, Operators, p. 39
7005bd8deadSopenharmony_ci
7015bd8deadSopenharmony_ci    (add new entries to the precedence table; for a full spec, renumber the
7025bd8deadSopenharmony_ci    new precedence row "3.5" to "4", and renumber all subsequent rows)
7035bd8deadSopenharmony_ci
7045bd8deadSopenharmony_ci    Precedence  Operator Class               Operators    Associativity
7055bd8deadSopenharmony_ci    ----------  --------------------------   ---------    -------------
7065bd8deadSopenharmony_ci      2         field access from pointer       ->        left to right
7075bd8deadSopenharmony_ci      3         pointer dereference             *         right to left
7085bd8deadSopenharmony_ci      3.5       typecast                        ()        right to left    
7095bd8deadSopenharmony_ci
7105bd8deadSopenharmony_ci    (modify the last paragraph, p.39, to delete language saying that
7115bd8deadSopenharmony_ci     dereferences and typecast operators are not supported)  
7125bd8deadSopenharmony_ci
7135bd8deadSopenharmony_ci    There is no address-of operator.
7145bd8deadSopenharmony_ci
7155bd8deadSopenharmony_ci
7165bd8deadSopenharmony_ci    (Insert new section immediately after Section 5.7, Structure and Array
7175bd8deadSopenharmony_ci     Operations, p. 46)
7185bd8deadSopenharmony_ci
7195bd8deadSopenharmony_ci    Section 5.X, Pointer Operations
7205bd8deadSopenharmony_ci
7215bd8deadSopenharmony_ci    The following operators are allowed to operate on pointer types:
7225bd8deadSopenharmony_ci
7235bd8deadSopenharmony_ci        pointer dereference                     *
7245bd8deadSopenharmony_ci        additive                                + -
7255bd8deadSopenharmony_ci        array subscript                         []
7265bd8deadSopenharmony_ci        arithmetic assignments                  += -=
7275bd8deadSopenharmony_ci        postfix increment and decrement         ++ --
7285bd8deadSopenharmony_ci        prefix increment and decrement          ++ --
7295bd8deadSopenharmony_ci        equality                                == !=
7305bd8deadSopenharmony_ci        assignment                              =
7315bd8deadSopenharmony_ci        field or method selector                ->
7325bd8deadSopenharmony_ci
7335bd8deadSopenharmony_ci    The pointer dereference operator is a unary operator that converts a
7345bd8deadSopenharmony_ci    pointer expression into an l-value designating data of the type pointed to
7355bd8deadSopenharmony_ci    by the pointer expression.  The result of a pointer dereference may not be
7365bd8deadSopenharmony_ci    used as the left-hand side of an assignment.
7375bd8deadSopenharmony_ci
7385bd8deadSopenharmony_ci    The pointer binary addition (+) and subtraction (-) operators produce a
7395bd8deadSopenharmony_ci    pointer result from one pointer operand and one scalar signed or unsigned
7405bd8deadSopenharmony_ci    integer operand.  For subtraction, the pointer must be the first operand;
7415bd8deadSopenharmony_ci    for addition, the pointer may be either operand.  The type of the result
7425bd8deadSopenharmony_ci    is the same type as the pointer operand.  A new pointer is computed by
7435bd8deadSopenharmony_ci    adding or subtracting <I>*<S> basic machine units to the value of the
7445bd8deadSopenharmony_ci    pointer operand, where <I> is the integer operand and <S> is the stride
7455bd8deadSopenharmony_ci    that would be derived by applying the rules specified in Section 2.20.X of
7465bd8deadSopenharmony_ci    the OpenGL Specification to an array with elements of the type pointed to
7475bd8deadSopenharmony_ci    by the pointer.
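
    Pointer addition and subtraction can be sketched in C by treating a
    pointer as a 64-bit unsigned address.  The function name is hypothetical,
    and the stride <S> is passed in as a parameter rather than derived from
    the rules of Section 2.20.X, which this sketch does not implement.

```c
#include <assert.h>
#include <stdint.h>

/* Computes pointer + index (use a negative index for subtraction).
   <stride> is the array stride <S>, in basic machine units, of the
   pointed-to type; the caller supplies it (an assumption of this sketch).
   Unsigned arithmetic wraps modulo 2^64, so negative indices work. */
static uint64_t pointer_offset(uint64_t pointer, int64_t index,
                               uint64_t stride)
{
    assert(stride > 0);
    return pointer + (uint64_t)index * stride;
}
```

    With an assumed stride of 16, adding 3 to a pointer at 0x1000 yields
    0x1030, and subtracting 2 yields 0x0FE0.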
7485bd8deadSopenharmony_ci
7495bd8deadSopenharmony_ci    The binary subtraction (-) operator may also operate on a pair of pointers
7505bd8deadSopenharmony_ci    of identical type.  In this operation, the second operand is subtracted
7515bd8deadSopenharmony_ci    from the first, yielding a signed integer result of type <intptr_t>.  The
7525bd8deadSopenharmony_ci    result is in units of the type being pointed to.  The result is the
7535bd8deadSopenharmony_ci    integer value that would yield the first pointer operand if added to the
7545bd8deadSopenharmony_ci    second pointer operand in the manner described above.  If no such integer
7555bd8deadSopenharmony_ci    value exists, the result of the operation is undefined.  Pointer
7565bd8deadSopenharmony_ci    subtraction is not supported for pointers to the type <void>.
7575bd8deadSopenharmony_ci
7585bd8deadSopenharmony_ci    The array subscript operator ([]) adds a signed or unsigned integer
7595bd8deadSopenharmony_ci    expression specified inside the brackets to a pointer expression specified
7605bd8deadSopenharmony_ci    to the left of the brackets, and then dereferences the pointer produced by
7615bd8deadSopenharmony_ci    the addition.  The array subscript operation "P[i]" is functionally
7625bd8deadSopenharmony_ci    equivalent to "(*(P+i))".
7635bd8deadSopenharmony_ci
7645bd8deadSopenharmony_ci    The add into (+=) and subtract from (-=) are binary operations, where the
7655bd8deadSopenharmony_ci    first operand must be one that could be assigned to (an l-value) and the
7665bd8deadSopenharmony_ci    second operand must be a signed or unsigned integer scalar.  These
7675bd8deadSopenharmony_ci    operations add the integer operand into or subtract the integer operand
7685bd8deadSopenharmony_ci    from the pointer operand, as defined for pointer addition and subtraction.
7695bd8deadSopenharmony_ci
7705bd8deadSopenharmony_ci    The arithmetic unary operators post- and pre-increment and decrement (--
7715bd8deadSopenharmony_ci    and ++) operate on pointers.  For post- and pre-increment and decrement,
    the expression must be one that could be assigned to (an l-value).  Pre-
    and post-increment and decrement add 1 to or subtract 1 from the contents
    of the expression they operate on, as defined for pointer addition and
    subtraction.  The value of the pre-increment or pre-decrement expression
7765bd8deadSopenharmony_ci    is the resulting value of that modification.  The value of the
7775bd8deadSopenharmony_ci    post-increment or post-decrement expression is the value of the expression
7785bd8deadSopenharmony_ci    before modification.
7795bd8deadSopenharmony_ci
7805bd8deadSopenharmony_ci    The equality operators equal (==) and not equal (!=) operate on pointer
7815bd8deadSopenharmony_ci    types and produce a scalar Boolean result.  The two operands must either
7825bd8deadSopenharmony_ci    be pointers to the same type, or one of the two operands must point to
7835bd8deadSopenharmony_ci    void.  Two pointers are considered equal if and only if they point to the
7845bd8deadSopenharmony_ci    same global memory address.
7855bd8deadSopenharmony_ci
7865bd8deadSopenharmony_ci    The field or method selection operator (->) operates on a pointer to a
7875bd8deadSopenharmony_ci    structure of any type and is used to select a field of the structure
    pointed to by the pointer.  This selector also operates on a pointer to a
    vector of any type, where the right-hand side of the operator must be a
    valid string using the vector component selection suffix described in
7915bd8deadSopenharmony_ci    Section 5.5.  In both cases, the field or method selection operation
7925bd8deadSopenharmony_ci    "p->s" is functionally equivalent to "((*p).s)".
7935bd8deadSopenharmony_ci
7945bd8deadSopenharmony_ci    Pointer addition and subtraction, including the add into, subtract from,
7955bd8deadSopenharmony_ci    and pre- and post-increment and decrement operators, are not supported on
7965bd8deadSopenharmony_ci    pointers to a void type.
7975bd8deadSopenharmony_ci
7985bd8deadSopenharmony_ci    The assignment operator may be used to update the value of a pointer
7995bd8deadSopenharmony_ci    variable, as described in Section 5.8.
8005bd8deadSopenharmony_ci
8015bd8deadSopenharmony_ci
8025bd8deadSopenharmony_ci    (Insert after Section 5.10, Vector and Matrix Operations, p. 50)
8035bd8deadSopenharmony_ci
8045bd8deadSopenharmony_ci    Section 5.11, Typecast Operations
8055bd8deadSopenharmony_ci
8065bd8deadSopenharmony_ci    The typecast operator may be used to convert an expression from one type
8075bd8deadSopenharmony_ci    to another, operating in a manner similar to scalar, vector, and matrix
8085bd8deadSopenharmony_ci    constructors.  The typecast operator specifies a new data type in
8095bd8deadSopenharmony_ci    parentheses, followed by an expression, as in the following examples:
8105bd8deadSopenharmony_ci
8115bd8deadSopenharmony_ci      float a = (float) 2U;
8125bd8deadSopenharmony_ci      vec3 b = (vec3) 1.0;
8135bd8deadSopenharmony_ci      vec4 c = (vec4) b;
8145bd8deadSopenharmony_ci      mat2 d = (mat2) 1.0;
8155bd8deadSopenharmony_ci      mat4 e = (mat4) d;
8165bd8deadSopenharmony_ci
8175bd8deadSopenharmony_ci    For scalar, vector, and matrix data types, the set of typecasts supported
8185bd8deadSopenharmony_ci    is equivalent to the set of single-operand constructors supported, and a
8195bd8deadSopenharmony_ci    typecast operates identically to an equivalent constructor.  A scalar
8205bd8deadSopenharmony_ci    expression may be typecast to any scalar, vector, or matrix data type.  A
    vector expression may be typecast to any vector type, except vectors with
    a larger number of components.  Additionally, four-component vector
    expressions may also be cast to a mat2 type.  A matrix expression may be
    typecast to any other matrix data type.
8255bd8deadSopenharmony_ci
8265bd8deadSopenharmony_ci    Expressions with structure type may only be typecast to a structure of
8275bd8deadSopenharmony_ci    identical type, which has no effect.  Typecast operators are not supported
8285bd8deadSopenharmony_ci    for array types.
8295bd8deadSopenharmony_ci
    Note that the typecast operator takes only a single expression.  Unlike a
    constructor, it cannot be used to generate a vector, structure, or
    matrix from multiple inputs.  For example,
8335bd8deadSopenharmony_ci
8345bd8deadSopenharmony_ci      vec3 f = (vec3) (1.0, 2.0, 3.0);
8355bd8deadSopenharmony_ci
8365bd8deadSopenharmony_ci    generates a three-component vector <f>.  But all three components
8375bd8deadSopenharmony_ci    are set to 3.0, which is the scalar value of the expression "(1.0, 2.0,
8385bd8deadSopenharmony_ci    3.0)".  The commas in that expression are sequence operators, not list
8395bd8deadSopenharmony_ci    delimiters.
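
    C has the same sequence (comma) operator, so the pitfall can be
    reproduced in plain C.  In the sketch below, Vec3 and vec3_from_scalar
    are hypothetical stand-ins for the GLSL vec3 type and for a scalar
    typecast to vec3; they only model the behavior described above.

```c
#include <assert.h>

/* Hypothetical stand-in for the GLSL vec3 type. */
typedef struct { float x, y, z; } Vec3;

/* Models "(vec3) s": a scalar typecast sets every component to s. */
static Vec3 vec3_from_scalar(float s)
{
    Vec3 v = { s, s, s };
    return v;
}

/* Models "vec3 f = (vec3) (1.0, 2.0, 3.0);". */
static Vec3 cast_of_sequence(void)
{
    /* The parenthesized expression is a sequence expression: the first
       two operands are evaluated and discarded (the void casts make that
       explicit), and the value is the last operand. */
    float s = ((void)1.0f, (void)2.0f, 3.0f);
    assert(s == 3.0f);
    return vec3_from_scalar(s);
}
```

    A constructor call with three arguments would instead assign 1.0, 2.0,
    and 3.0 to the three components.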
8405bd8deadSopenharmony_ci
8415bd8deadSopenharmony_ci    Additionally, typecast operators may also be used to cast values to a
8425bd8deadSopenharmony_ci    pointer type.  In this case, the expression being typecast must be either
8435bd8deadSopenharmony_ci    a pointer (to any type) or a scalar of type intptr_t or uintptr_t.
8445bd8deadSopenharmony_ci
      vec4      *v4ptr;
      intptr_t  iptr;
      vec3      *v3ptr = (vec3 *) v4ptr;
      ivec2     *iv2ptr = (ivec2 *) iptr;
8495bd8deadSopenharmony_ci
8505bd8deadSopenharmony_ci    Note that function call-style constructors are not supported for pointers.
8515bd8deadSopenharmony_ci
8525bd8deadSopenharmony_ci
8535bd8deadSopenharmony_ci    Add to the end of Section 8.3, Common Functions, p. 72
8545bd8deadSopenharmony_ci
8555bd8deadSopenharmony_ci    (add support for pointer packing functions)
8565bd8deadSopenharmony_ci
8575bd8deadSopenharmony_ci    Syntax:
8585bd8deadSopenharmony_ci
8595bd8deadSopenharmony_ci      void *packPtr(uvec2 a);
8605bd8deadSopenharmony_ci      uvec2 unpackPtr(void *a);
8615bd8deadSopenharmony_ci
8625bd8deadSopenharmony_ci    The function packPtr() returns a pointer to void by constructing a 64-bit
8635bd8deadSopenharmony_ci    void pointer from the two 32-bit components of an unsigned integer vector.
8645bd8deadSopenharmony_ci    The first vector component specifies the 32 least significant bits of the
8655bd8deadSopenharmony_ci    pointer; the second component specifies the 32 most significant bits.
8665bd8deadSopenharmony_ci
8675bd8deadSopenharmony_ci    The function unpackPtr() returns a two-component unsigned integer vector
8685bd8deadSopenharmony_ci    built from a 64-bit void pointer.  The first component of the vector
8695bd8deadSopenharmony_ci    consists of the 32 least significant bits of the pointer value; the second
8705bd8deadSopenharmony_ci    component consists of the 32 most significant bits.
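
    The bit layout used by packPtr() and unpackPtr() can be sketched in C,
    with a 64-bit unsigned integer standing in for the void pointer.  The
    UVec2 struct and the names pack_ptr and unpack_ptr are hypothetical
    illustrations, not the GLSL built-ins themselves.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the GLSL uvec2 type. */
typedef struct { uint32_t x, y; } UVec2;

/* Models packPtr(): x supplies the 32 least significant bits,
   y supplies the 32 most significant bits. */
static uint64_t pack_ptr(UVec2 a)
{
    return ((uint64_t)a.y << 32) | (uint64_t)a.x;
}

/* Models unpackPtr(): the inverse of pack_ptr(). */
static UVec2 unpack_ptr(uint64_t p)
{
    UVec2 r;
    r.x = (uint32_t)(p & 0xFFFFFFFFu);  /* low 32 bits */
    r.y = (uint32_t)(p >> 32);          /* high 32 bits */
    return r;
}
```

    Packing the components (0x89ABCDEF, 0x01234567) produces the 64-bit
    value 0x0123456789ABCDEF, and unpacking that value returns the same two
    components.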
8715bd8deadSopenharmony_ci
8725bd8deadSopenharmony_ci
8735bd8deadSopenharmony_ci    Modify Chapter 9, Shading Language Grammar, p.92
8745bd8deadSopenharmony_ci
8755bd8deadSopenharmony_ci    (change comment in the grammar disallowing pointer dereferences)
8765bd8deadSopenharmony_ci
8775bd8deadSopenharmony_ci    Change the sentence:
8785bd8deadSopenharmony_ci
8795bd8deadSopenharmony_ci      // Grammar Note: No '*' or '&' unary ops. Pointers are not supported.
8805bd8deadSopenharmony_ci
8815bd8deadSopenharmony_ci    to
8825bd8deadSopenharmony_ci
8835bd8deadSopenharmony_ci      // Grammar Note: No '&' unary.
8845bd8deadSopenharmony_ci
8855bd8deadSopenharmony_ci
8865bd8deadSopenharmony_ciAdditions to the AGL/EGL/GLX/WGL Specifications
8875bd8deadSopenharmony_ci
8885bd8deadSopenharmony_ci    None
8895bd8deadSopenharmony_ci
8905bd8deadSopenharmony_ciErrors
8915bd8deadSopenharmony_ci
8925bd8deadSopenharmony_ci    INVALID_ENUM is generated by MakeBufferResidentNV if <access> is not
8935bd8deadSopenharmony_ci    READ_ONLY.
8945bd8deadSopenharmony_ci    
8955bd8deadSopenharmony_ci    INVALID_ENUM is generated by GetBufferParameterui64vNV if <pname> is
8965bd8deadSopenharmony_ci    not BUFFER_GPU_ADDRESS_NV.
8975bd8deadSopenharmony_ci
8985bd8deadSopenharmony_ci    INVALID_OPERATION is generated by MakeBufferResidentNV,
8995bd8deadSopenharmony_ci    MakeBufferNonResidentNV, IsBufferResidentNV, and GetBufferParameterui64vNV
9005bd8deadSopenharmony_ci    if no buffer is bound to <target>.
9015bd8deadSopenharmony_ci
9025bd8deadSopenharmony_ci    INVALID_OPERATION is generated by MakeBufferResidentNV if the buffer bound
9035bd8deadSopenharmony_ci    to <target> is already resident in the current GL context.
9045bd8deadSopenharmony_ci
9055bd8deadSopenharmony_ci    INVALID_OPERATION is generated by MakeBufferNonResidentNV if the buffer
9065bd8deadSopenharmony_ci    bound to <target> is not resident in the current GL context.
9075bd8deadSopenharmony_ci
9085bd8deadSopenharmony_ci    INVALID_OPERATION is generated by MakeNamedBufferResidentNV if <buffer> is
9095bd8deadSopenharmony_ci    already resident in the current GL context.
9105bd8deadSopenharmony_ci
9115bd8deadSopenharmony_ci    INVALID_OPERATION is generated by MakeNamedBufferNonResidentNV if <buffer>
9125bd8deadSopenharmony_ci    is not resident in the current GL context.
9135bd8deadSopenharmony_ci
9145bd8deadSopenharmony_ci    INVALID_OPERATION is generated by GetBufferParameterui64vNV or
9155bd8deadSopenharmony_ci    MakeBufferResidentNV if the buffer bound to <target> has no data store.
9165bd8deadSopenharmony_ci
9175bd8deadSopenharmony_ci    INVALID_OPERATION is generated by GetNamedBufferParameterui64vNV or
9185bd8deadSopenharmony_ci    MakeNamedBufferResidentNV if <buffer> has no data store.
9195bd8deadSopenharmony_ci
9205bd8deadSopenharmony_ciExamples
9215bd8deadSopenharmony_ci
9225bd8deadSopenharmony_ci    (1) Layout of a complex structure using the rules from the new Section
9235bd8deadSopenharmony_ci        2.20.X added to the OpenGL spec:
9245bd8deadSopenharmony_ci
9255bd8deadSopenharmony_ci    struct  Example {
9265bd8deadSopenharmony_ci                    // bytes used            rules
9275bd8deadSopenharmony_ci      float a;      //  0-3                  
9285bd8deadSopenharmony_ci      vec2 b;       //  8-15                 1   // bumped to a multiple of 8
9295bd8deadSopenharmony_ci      vec3 c;       //  16-27                1
9305bd8deadSopenharmony_ci      struct {
9315bd8deadSopenharmony_ci        int d;      //  32-35                2   // bumped to a multiple of 8 (bvec2)
9325bd8deadSopenharmony_ci        bvec2 e;    //  40-47                1
9335bd8deadSopenharmony_ci      } f;
9345bd8deadSopenharmony_ci      float g;      //  48-51                
9355bd8deadSopenharmony_ci      float h[2];   //  52-55 (h[0])         5   // multiple of 4 (float) with no additional padding
9365bd8deadSopenharmony_ci                    //  56-59 (h[1])         6   // tightly packed
9375bd8deadSopenharmony_ci      mat2x3 i;     //  64-75 (i[0])         
9385bd8deadSopenharmony_ci                    //  80-91 (i[1])         6   // bumped to a multiple of 16 (vec3)
9395bd8deadSopenharmony_ci      struct {
9405bd8deadSopenharmony_ci        uvec3 j;    //   96-107 (m[0].j)     
9415bd8deadSopenharmony_ci        vec2 k;     //  112-119 (m[0].k)     1   // bumped to a multiple of 8 (vec2)
9425bd8deadSopenharmony_ci        float l[2]; //  120-123 (m[0].l[0])  1,5 // simply float aligned
9435bd8deadSopenharmony_ci                    //  124-127 (m[0].l[1])  6   // tightly packed
9445bd8deadSopenharmony_ci                    //  128-139 (m[1].j)
9455bd8deadSopenharmony_ci                    //  144-151 (m[1].k)
9465bd8deadSopenharmony_ci                    //  152-155 (m[1].l[0])
9475bd8deadSopenharmony_ci                    //  156-159 (m[1].l[1])
9485bd8deadSopenharmony_ci      } m[2];
9495bd8deadSopenharmony_ci    };
9505bd8deadSopenharmony_ci    // sizeof(Example) == 160
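
    The offsets in example (1) can be cross-checked on the CPU with a small
    calculator. The sketch below is a hedged illustration that covers only
    the members from "a" through "i"; the helper names (align_up, place,
    layout_example_prefix) are hypothetical, and the sizes and alignments
    passed in are read off the table above rather than derived from the
    full rule set.

```c
#include <stddef.h>

/* Round `offset` up to the next multiple of `align` (a power of two). */
size_t align_up(size_t offset, size_t align)
{
    return (offset + align - 1) & ~(align - 1);
}

/* Place an item of `size` bytes with alignment `align` at the cursor,
 * return its offset, and advance the cursor past it. */
size_t place(size_t *cursor, size_t size, size_t align)
{
    size_t offset = align_up(*cursor, align);
    *cursor = offset + size;
    return offset;
}

/* Byte offsets of the leading members of Example. */
struct example_offsets {
    size_t a, b, c, d, e, g, h0, h1, i0, i1;
};

struct example_offsets layout_example_prefix(void)
{
    struct example_offsets o;
    size_t cur = 0;

    o.a  = place(&cur, 4, 4);    /* float a                              */
    o.b  = place(&cur, 8, 8);    /* vec2 b: 8-byte aligned               */
    o.c  = place(&cur, 12, 16);  /* vec3 c: aligned like a vec4          */
    cur  = align_up(cur, 8);     /* struct f: aligned to its bvec2       */
    o.d  = place(&cur, 4, 4);    /* int d                                */
    o.e  = place(&cur, 8, 8);    /* bvec2 e                              */
    o.g  = place(&cur, 4, 4);    /* float g                              */
    o.h0 = place(&cur, 4, 4);    /* float h[0]: no extra array padding   */
    o.h1 = place(&cur, 4, 4);    /* float h[1]: tightly packed           */
    o.i0 = place(&cur, 12, 16);  /* mat2x3 i, column 0 (a vec3)          */
    o.i1 = place(&cur, 12, 16);  /* mat2x3 i, column 1                   */
    return o;
}
```

    Comparing the returned offsets against the "bytes used" column is a
    cheap way to validate a hand-written CPU-side mirror of a shader
    structure.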
9515bd8deadSopenharmony_ci
9525bd8deadSopenharmony_ci    (2) Replacing bindable_uniform with an array of pointers:
9535bd8deadSopenharmony_ci
9545bd8deadSopenharmony_ci        #version 120
9555bd8deadSopenharmony_ci        #extension GL_NV_shader_buffer_load : require
9565bd8deadSopenharmony_ci        #extension GL_EXT_bindable_uniform : require
9575bd8deadSopenharmony_ci
9585bd8deadSopenharmony_ci        in vec4 **ptr;
9595bd8deadSopenharmony_ci        in uvec2 whichbuf;
9605bd8deadSopenharmony_ci
9615bd8deadSopenharmony_ci        void main() {
9625bd8deadSopenharmony_ci            gl_FrontColor = ptr[whichbuf.x][whichbuf.y];
9635bd8deadSopenharmony_ci            gl_Position = ftransform();
9645bd8deadSopenharmony_ci        }
9655bd8deadSopenharmony_ci
        In the GL code, assuming the buffer object setup in the Overview:
9675bd8deadSopenharmony_ci
9685bd8deadSopenharmony_ci        glBindAttribLocation(program, 8, "ptr");    
9695bd8deadSopenharmony_ci        glBindAttribLocation(program, 9, "whichbuf");    
9705bd8deadSopenharmony_ci        glLinkProgram(program);
9715bd8deadSopenharmony_ci        glBegin(...);
9725bd8deadSopenharmony_ci        glVertexAttribI2iEXT(8, (unsigned int)pointerBufferAddr, 
9735bd8deadSopenharmony_ci                                (unsigned int)(pointerBufferAddr>>32));
9745bd8deadSopenharmony_ci        for (i = ...) {
9755bd8deadSopenharmony_ci            for (j = ...) {
9765bd8deadSopenharmony_ci                glVertexAttribI2iEXT(9, i, j);
9775bd8deadSopenharmony_ci                glVertex3f(...);
9785bd8deadSopenharmony_ci            }
9795bd8deadSopenharmony_ci        }
9805bd8deadSopenharmony_ci        glEnd();
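
    The GL code above passes the 64-bit address as two 32-bit attribute
    components, low word first. A minimal sketch of that split, and of the
    reverse operation, is shown below; split_gpu_address and
    join_gpu_address are hypothetical helper names, not part of this
    extension.

```c
#include <stdint.h>

/* Split a 64-bit GPU address into the two 32-bit words that are passed
 * as the x and y components of the generic attribute. */
void split_gpu_address(uint64_t address, uint32_t *low, uint32_t *high)
{
    *low  = (uint32_t)(address & 0xFFFFFFFFu);
    *high = (uint32_t)(address >> 32);
}

/* Rebuild the 64-bit address from its low and high words. */
uint64_t join_gpu_address(uint32_t low, uint32_t high)
{
    return ((uint64_t)high << 32) | (uint64_t)low;
}
```

    Keeping the split in one helper avoids accidentally swapping the two
    words at different call sites.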
9815bd8deadSopenharmony_ci
9825bd8deadSopenharmony_ci
9835bd8deadSopenharmony_ciNew State
9845bd8deadSopenharmony_ci
9855bd8deadSopenharmony_ci    Update Table 6.11, p. 349 (Buffer Object State)
9865bd8deadSopenharmony_ci
9875bd8deadSopenharmony_ci    Get Value                   Type    Get Command                  Initial Value   Sec     Attribute
9885bd8deadSopenharmony_ci    ---------                   ----    -----------                  -------------   ---     ---------
9895bd8deadSopenharmony_ci    BUFFER_GPU_ADDRESS_NV       Z64+    GetBufferParameterui64vNV    0               2.9     none
9905bd8deadSopenharmony_ci
9915bd8deadSopenharmony_ci    Update Table 6.46, p. 384 (Implementation Dependent Values)
9925bd8deadSopenharmony_ci
9935bd8deadSopenharmony_ci    Get Value                   Type    Get Command                  Minimum Value   Sec     Attribute
9945bd8deadSopenharmony_ci    ---------                   ----    -----------                  -------------   ---     ---------
9955bd8deadSopenharmony_ci    MAX_SHADER_BUFFER_ADDRESS_NV Z64+   GetIntegerui64vNV            0xFFFFFFFF      2.X.2   none
9965bd8deadSopenharmony_ci
9975bd8deadSopenharmony_ciDependencies on NV_gpu_program4:
9985bd8deadSopenharmony_ci
9995bd8deadSopenharmony_ci    This extension is generally written against the NV_gpu_program4 
10005bd8deadSopenharmony_ci    wording, program grammar, etc., but doesn't have specific 
10015bd8deadSopenharmony_ci    dependencies on its functionality. 
10025bd8deadSopenharmony_ci
10035bd8deadSopenharmony_ci    
10045bd8deadSopenharmony_ciIssues
10055bd8deadSopenharmony_ci
10065bd8deadSopenharmony_ci    1) Only buffer objects?
10075bd8deadSopenharmony_ci
10085bd8deadSopenharmony_ci    RESOLVED: YES, for now. Buffer objects are unformatted memory and 
10095bd8deadSopenharmony_ci    easily mapped to a "pointer"-style shading language. 
10105bd8deadSopenharmony_ci
10115bd8deadSopenharmony_ci    2) Should we allow writes?
10125bd8deadSopenharmony_ci
10135bd8deadSopenharmony_ci    RESOLVED: NO, deferred to a later extension. Writes involve 
10145bd8deadSopenharmony_ci    specifying many kinds of synchronization primitives. Writes are also
10155bd8deadSopenharmony_ci    a "side effect" which makes program execution "observable" in cases
10165bd8deadSopenharmony_ci    where it may not have otherwise been (e.g. early-Z can kill fragments
10175bd8deadSopenharmony_ci    before shading, or a post-transform cache may prevent vertex program
10185bd8deadSopenharmony_ci    execution).
10195bd8deadSopenharmony_ci    
10205bd8deadSopenharmony_ci    3) What happens if an invalid pointer is fetched?
10215bd8deadSopenharmony_ci
10225bd8deadSopenharmony_ci    UNRESOLVED: Unpredictable results, including program termination?
10235bd8deadSopenharmony_ci    Make the driver trap the error and report it (still unpredictable
    results, but no program termination)? My preference would be to
    at least report the faulting address (roughly), whether it was
    a read or a write, and which shader stage faulted. I'd like to not
    terminate the program, but the app has to assume all of its data
    stored in the GL is lost.
10295bd8deadSopenharmony_ci
10305bd8deadSopenharmony_ci    4) What should this extension be named?
10315bd8deadSopenharmony_ci
10325bd8deadSopenharmony_ci    RESOLVED: NV_shader_buffer_load. Rather than trying to choose an
10335bd8deadSopenharmony_ci    overly-general name and naming future extensions "GL_XXX2", let's 
10345bd8deadSopenharmony_ci    name this according to the specific functionality it provides.
10355bd8deadSopenharmony_ci
10365bd8deadSopenharmony_ci    5) What are the performance characteristics of buffer loads?
10375bd8deadSopenharmony_ci
10385bd8deadSopenharmony_ci    RESOLVED: Likely somewhere between uniforms and texture fetches, 
10395bd8deadSopenharmony_ci    but totally implementation-dependent. Uniforms still serve a purpose
10405bd8deadSopenharmony_ci    for "program locals". Buffer loads may have different caching 
10415bd8deadSopenharmony_ci    behavior than either uniforms or texture fetches, but the expectation
10425bd8deadSopenharmony_ci    is that they will be cached reads of memory and all the common sense
10435bd8deadSopenharmony_ci    guidelines to try to maintain locality of reference apply.
10445bd8deadSopenharmony_ci
10455bd8deadSopenharmony_ci    6) What does MakeBufferResidentNV do? Why not just have a 
10465bd8deadSopenharmony_ci    MapBufferGPUNV?
10475bd8deadSopenharmony_ci
10485bd8deadSopenharmony_ci    RESOLVED: Reserving virtual address space only requires knowing the 
10495bd8deadSopenharmony_ci    size of the data store, so an explicit MapBufferGPU call isn't 
10505bd8deadSopenharmony_ci    necessary. If all GPUs supported demand paging, a GPU address might
10515bd8deadSopenharmony_ci    be sufficient, but without that assumption MakeBufferResidentNV serves
10525bd8deadSopenharmony_ci    as a hint to the driver that it needs to page lock memory, download 
10535bd8deadSopenharmony_ci    the buffer contents into GPU-accessible memory, or other similar 
10545bd8deadSopenharmony_ci    preparation. MapBufferGPU would also imply that a different address
10555bd8deadSopenharmony_ci    may be returned each time it is mapped, which could be cumbersome
10565bd8deadSopenharmony_ci    for the application to handle.
10575bd8deadSopenharmony_ci
10585bd8deadSopenharmony_ci    7) Is it an error to render while any resident buffer is mapped?
10595bd8deadSopenharmony_ci    
10605bd8deadSopenharmony_ci    RESOLVED: No. As the number of attachment points in the context grows,
10615bd8deadSopenharmony_ci    even the existing error check is falling out of favor.
10625bd8deadSopenharmony_ci
10635bd8deadSopenharmony_ci    8) Does MapBuffer stall on pending use of a resident buffer?
10645bd8deadSopenharmony_ci
10655bd8deadSopenharmony_ci    RESOLVED: No. The existing language is:
10665bd8deadSopenharmony_ci    
10675bd8deadSopenharmony_ci        "If the GL is able to map the buffer object's data store into the 
10685bd8deadSopenharmony_ci         client's address space, MapBuffer returns the pointer value to 
10695bd8deadSopenharmony_ci         the data store once all pending operations on that buffer have
10705bd8deadSopenharmony_ci         completed."
10715bd8deadSopenharmony_ci
10725bd8deadSopenharmony_ci    However, since the implementation has no information about how the 
10735bd8deadSopenharmony_ci    buffer is used, "all pending operations" amounts to a Finish. In 
10745bd8deadSopenharmony_ci    terms of sharing across contexts/threads, ARB_vertex_buffer_object 
10755bd8deadSopenharmony_ci    says:
10765bd8deadSopenharmony_ci
10775bd8deadSopenharmony_ci        "How is synchronization enforced when buffer objects are shared by
10785bd8deadSopenharmony_ci         multiple OpenGL contexts?
10795bd8deadSopenharmony_ci
10805bd8deadSopenharmony_ci         RESOLVED: It is generally the clients' responsibility to
10815bd8deadSopenharmony_ci         synchronize modifications made to shared buffer objects."
10825bd8deadSopenharmony_ci
    We therefore shouldn't dictate any additional shared object
    synchronization. The best we could do is a Finish, but it's not clear
    that this accomplishes anything for the application, since it can just
    as easily call Finish itself. If the application doesn't want
    synchronization, it can use MAP_UNSYNCHRONIZED_BIT. The resolution to
    this issue seems inconsequential, as GL already provides the tools to
    achieve either behavior. Hence, don't bother stalling.
10905bd8deadSopenharmony_ci
    However, if a buffer was previously resident and has since been made
    non-resident, the implementation should enforce the stalling behavior
    for those pending operations from before it was made non-resident.
10955bd8deadSopenharmony_ci
10965bd8deadSopenharmony_ci    9) Given issue (8), what are some effective ways to load data into 
10975bd8deadSopenharmony_ci    a buffer that is resident?
10985bd8deadSopenharmony_ci
10995bd8deadSopenharmony_ci    RESOLVED: There are several possibilities:
11005bd8deadSopenharmony_ci
11015bd8deadSopenharmony_ci    - BufferSubData.
11025bd8deadSopenharmony_ci    
11035bd8deadSopenharmony_ci    - The application may track using Fences which parts of the buffer 
11045bd8deadSopenharmony_ci      are actually in use and update them with CPU writes using 
11055bd8deadSopenharmony_ci      MAP_UNSYNCHRONIZED_BIT. This is potentially error-prone, as 
11065bd8deadSopenharmony_ci      described in ARB_copy_buffer.
11075bd8deadSopenharmony_ci
11085bd8deadSopenharmony_ci    - CopyBufferSubData. ARB_copy_buffer describes a simple usage example
11095bd8deadSopenharmony_ci      for a single-threaded application. Since this extension is targeted
11105bd8deadSopenharmony_ci      at reducing the CPU bottleneck in the rendering thread, offloading
11115bd8deadSopenharmony_ci      some of the work to other threads may be useful.
11125bd8deadSopenharmony_ci
11135bd8deadSopenharmony_ci      Example with a single Loading thread and Rendering thread:
11145bd8deadSopenharmony_ci
11155bd8deadSopenharmony_ci          Loading thread:
11165bd8deadSopenharmony_ci              while (1) {
11175bd8deadSopenharmony_ci                  WaitForEvent(something to do);
11185bd8deadSopenharmony_ci
11195bd8deadSopenharmony_ci                  NamedBufferData(tempBuffer, updateSize, NULL, STREAM_DRAW);
11205bd8deadSopenharmony_ci                  ptr = MapNamedBuffer(tempBuffer, WRITE_ONLY);
11215bd8deadSopenharmony_ci                  // fill ptr
11225bd8deadSopenharmony_ci                  UnmapNamedBuffer(tempBuffer);
11235bd8deadSopenharmony_ci                  // the buffer could have been filled via BufferData, if 
11245bd8deadSopenharmony_ci                  // that's more natural.
11255bd8deadSopenharmony_ci                  
11265bd8deadSopenharmony_ci                  // send tempBuffer name to Rendering thread
11275bd8deadSopenharmony_ci              }
11285bd8deadSopenharmony_ci          Rendering thread:
11295bd8deadSopenharmony_ci              foreach (obj in scene) {
11305bd8deadSopenharmony_ci                  if (obj has changed) {
11315bd8deadSopenharmony_ci                      // get tempBuffer name from Loading thread
11325bd8deadSopenharmony_ci                      
                      NamedCopyBufferSubData(tempBuffer, objBuf, 0, objOffset, updateSize);
11345bd8deadSopenharmony_ci                  }
11355bd8deadSopenharmony_ci                  Draw(obj);
11365bd8deadSopenharmony_ci              }
11375bd8deadSopenharmony_ci
      If we further desire to offload the data transfer to another
      thread, and the implementation supports concurrent data transfers
      in one context/thread while rendering in another context/thread,
      this may also be accomplished as follows:
11425bd8deadSopenharmony_ci
11435bd8deadSopenharmony_ci          Loading thread:
11445bd8deadSopenharmony_ci              while (1) {
11455bd8deadSopenharmony_ci                  WaitForEvent(something to do);
11465bd8deadSopenharmony_ci
11475bd8deadSopenharmony_ci                  NamedBufferData(sysBuffer, updateSize, NULL, STREAM_DRAW);
11485bd8deadSopenharmony_ci                  ptr = MapNamedBuffer(sysBuffer, WRITE_ONLY);
11495bd8deadSopenharmony_ci                  // fill ptr
11505bd8deadSopenharmony_ci                  UnmapNamedBuffer(sysBuffer);
11515bd8deadSopenharmony_ci                  
11525bd8deadSopenharmony_ci                  NamedBufferData(vidBuffer, updateSize, NULL, STREAM_COPY);
11535bd8deadSopenharmony_ci                  // This is a sysmem->vidmem blit.
                  NamedCopyBufferSubData(sysBuffer, vidBuffer, 0, 0, updateSize);
11555bd8deadSopenharmony_ci                  SetFence(fenceId, ALL_COMPLETED);
11565bd8deadSopenharmony_ci
11575bd8deadSopenharmony_ci                  // send vidBuffer name and fenceId to Rendering thread
11585bd8deadSopenharmony_ci
11595bd8deadSopenharmony_ci                  // This could have been a BufferSubData directly into 
11605bd8deadSopenharmony_ci                  // vidBuffer, if that's more natural.
11615bd8deadSopenharmony_ci              }
11625bd8deadSopenharmony_ci          Rendering thread:
11635bd8deadSopenharmony_ci              foreach (obj in scene) {
11645bd8deadSopenharmony_ci                  if (obj has changed) {
11655bd8deadSopenharmony_ci                      // get vidBuffer name and fenceId from Loading thread
11665bd8deadSopenharmony_ci                      
11675bd8deadSopenharmony_ci                      // note: there aren't any sharable fences currently,
11685bd8deadSopenharmony_ci                      // actually need to ask the loading thread when it
11695bd8deadSopenharmony_ci                      // has finished.
11705bd8deadSopenharmony_ci                      FinishFence(fenceId);
11715bd8deadSopenharmony_ci                      
11725bd8deadSopenharmony_ci                      // This is hopefully a fast vidmem->vidmem blit.
                      NamedCopyBufferSubData(vidBuffer, objBuffer, 0, objOffset, updateSize);
11745bd8deadSopenharmony_ci                  }
11755bd8deadSopenharmony_ci                  Draw(obj);
11765bd8deadSopenharmony_ci              }
11775bd8deadSopenharmony_ci
11785bd8deadSopenharmony_ci      In both of these examples, the point at which the data is written to 
11795bd8deadSopenharmony_ci      the resident buffer's data store is clearly specified in order
11805bd8deadSopenharmony_ci      with rendering commands. This resolves a whole class of 
11815bd8deadSopenharmony_ci      synchronization bugs (Write After Read hazard) that 
11825bd8deadSopenharmony_ci      MAP_UNSYNCHRONIZED_BIT is prone to.
11835bd8deadSopenharmony_ci
11845bd8deadSopenharmony_ci    10) What happens if BufferData is called on a buffer that is resident? 
11855bd8deadSopenharmony_ci    
11865bd8deadSopenharmony_ci    RESOLVED: BufferData is specified to "delete the existing data store", 
11875bd8deadSopenharmony_ci    so the GPU address of that data should become invalid. The buffer is
11885bd8deadSopenharmony_ci    therefore made non-resident in the current context.
11895bd8deadSopenharmony_ci
11905bd8deadSopenharmony_ci    11) Should residency be a property of the buffer object, or should
11915bd8deadSopenharmony_ci    a buffer be "made resident to a context"?
11925bd8deadSopenharmony_ci
    RESOLVED: Made resident to a context. If a shared buffer is used in
    two threads/contexts, it may be difficult for the application to know
    when the residency state actually changes on the shared object,
    particularly if there is a large latency between commands being
    submitted on the client and processed on the server. Allowing the
    buffer to be made resident to each context individually allows the
    state to be reliably toggled in-order in each command stream. This
    also allows MakeBufferNonResidentNV to serve as an indication to the
    GL that the buffer is no longer in use in each command stream.
12025bd8deadSopenharmony_ci
12035bd8deadSopenharmony_ci    This leads to an unfortunate orphaning issue. For example, if the 
12045bd8deadSopenharmony_ci    buffer is resident in context A and then deleted in context B, how 
12055bd8deadSopenharmony_ci    can the app make it non-resident in context A? Given the name-based 
12065bd8deadSopenharmony_ci    object model, it is impossible. It would be complex from an 
12075bd8deadSopenharmony_ci    implementation point of view for DeleteBuffers (or BufferData) to 
12085bd8deadSopenharmony_ci    either make it non-resident or throw an error if it is resident in 
12095bd8deadSopenharmony_ci    some other context. 
12105bd8deadSopenharmony_ci    
12115bd8deadSopenharmony_ci    An ideal solution would be a (separate) extension that allows the 
12125bd8deadSopenharmony_ci    application to increment the refcount on the object and to decrement
12135bd8deadSopenharmony_ci    the refcount without necessarily deleting the object's name. Until 
12145bd8deadSopenharmony_ci    such an extension exists, the unsatisfying proposed resolution is that
    a buffer can be "stuck" resident until the context is deleted. Note
    that DeleteBuffers should make the buffer non-resident in the context
    that does the delete, so this problem only applies to rare
    multi-context corner cases.
12195bd8deadSopenharmony_ci
12205bd8deadSopenharmony_ci    12) Is there any value in requiring an "immutable structure" bit of 
12215bd8deadSopenharmony_ci    state to be set in order to query the address? 
12225bd8deadSopenharmony_ci    
12235bd8deadSopenharmony_ci    RESOLVED: NO. Given that the BufferData behavior is fairly 
12245bd8deadSopenharmony_ci    straightforward to specify and implement, it's not clear that this 
12255bd8deadSopenharmony_ci    would be useful.
12265bd8deadSopenharmony_ci
12275bd8deadSopenharmony_ci    13) What should the program syntax look like?
12285bd8deadSopenharmony_ci
12295bd8deadSopenharmony_ci    RESOLVED: Support 1-, 2-, 4-vec fetches of float/int/uint types, as 
12305bd8deadSopenharmony_ci    well as 8- and 16-bit int/uint fetches via a new LOAD instruction 
12315bd8deadSopenharmony_ci    with a slew of suffixes. Handling 8/16bit sizes will be useful for 
12325bd8deadSopenharmony_ci    high-level languages compiling to the assembly. Addresses are required
12335bd8deadSopenharmony_ci    to be a multiple of the size of the data, as some implementations may 
12345bd8deadSopenharmony_ci    require this.
12355bd8deadSopenharmony_ci
    Other options include a more x86-style pointer dereference
    ("MOV R0, DWORD PTR[R1];") or a complement to program.local
    ("MOV R0, program.global[R1];"), but neither of these provides the
    simple granularity of the explicit type suffixes, and a new
    instruction is convenient in terms of implementation and avoids
    muddling the clean definition of MOV.
12425bd8deadSopenharmony_ci
12435bd8deadSopenharmony_ci    14) How does the GL know to invalidate caches when data has changed?
12445bd8deadSopenharmony_ci
12455bd8deadSopenharmony_ci    RESOLVED: Any entry points that can write to buffer objects should 
12465bd8deadSopenharmony_ci    trigger the necessary invalidation. A new entry point may only be 
12475bd8deadSopenharmony_ci    necessary once there is a way to write to a buffer by GPU address.
12485bd8deadSopenharmony_ci
12495bd8deadSopenharmony_ci    15) Does this extension require 64bit register/operation support in 
12505bd8deadSopenharmony_ci        programs and shaders?
12515bd8deadSopenharmony_ci
12525bd8deadSopenharmony_ci    RESOLVED: NO. At the API level, GPU addresses are always 64bit values
12535bd8deadSopenharmony_ci    and when they are stored in uniforms, attribs, parameters, etc. they
12545bd8deadSopenharmony_ci    should always be stored at full precision. However, if programs and 
12555bd8deadSopenharmony_ci    shaders don't support 64bit registers/operations via another 
12565bd8deadSopenharmony_ci    programmability extension, then they will need to use only 32 bits.
12575bd8deadSopenharmony_ci    On such implementations, the usable address space is therefore limited
12585bd8deadSopenharmony_ci    to 4GB. Such a limit should be reflected in the value of 
12595bd8deadSopenharmony_ci    MAX_SHADER_BUFFER_ADDRESS_NV.
12605bd8deadSopenharmony_ci
12615bd8deadSopenharmony_ci    It is expected that GLSL shaders will be compiled in such a way as to 
12625bd8deadSopenharmony_ci    generate 64bit pointers on implementations that support it and 32bit
12635bd8deadSopenharmony_ci    pointers on implementations that don't. So GLSL shaders written against
12645bd8deadSopenharmony_ci    a 32bit implementation can be expected to be forward-compatible when 
12655bd8deadSopenharmony_ci    run against a 64bit implementation. (u)intptr_t types are provided to 
12665bd8deadSopenharmony_ci    ease this compatibility.
12675bd8deadSopenharmony_ci    
12685bd8deadSopenharmony_ci    Built-in functions are provided to convert pointers to and from a pair
12695bd8deadSopenharmony_ci    of integers. These can be used to pass pointers as two components of a
12705bd8deadSopenharmony_ci    generic attrib, to construct a pointer from an RGUI32 texture fetch, 
12715bd8deadSopenharmony_ci    or to write a pointer to a fragment shader output.
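
    An application that wants to respect the limit described in this issue
    could test a buffer's address range against the queried value. The
    sketch below is an illustration only: range_is_addressable is a
    hypothetical helper, and it assumes that MAX_SHADER_BUFFER_ADDRESS_NV
    is the highest byte address a shader may use.

```c
#include <stdbool.h>
#include <stdint.h>

/* Return true if every byte of [base, base + size) lies at or below
 * `max_address`. Written to avoid overflow in base + size. */
bool range_is_addressable(uint64_t base, uint64_t size, uint64_t max_address)
{
    if (base > max_address)
        return false;
    if (size == 0)
        return true;
    /* The last byte is base + size - 1. */
    return size - 1 <= max_address - base;
}
```

    With the minimum limit of 0xFFFFFFFF from the New State table, a buffer
    whose last byte falls at or beyond address 0x100000000 would be
    rejected by this check.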
12725bd8deadSopenharmony_ci
12735bd8deadSopenharmony_ci    16) What assumption can applications make about the alignment of 
12745bd8deadSopenharmony_ci    addresses returned by GetBufferParameterui64vNV?
12755bd8deadSopenharmony_ci
12765bd8deadSopenharmony_ci    RESOLVED: All buffers will begin at an address that is a multiple of 
12775bd8deadSopenharmony_ci    16 bytes.
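
    Because every buffer starts on a 16-byte boundary, the alignment of a
    fetch of up to 16 bytes depends only on the offset within the buffer.
    A hedged sketch of the corresponding check follows; load_is_aligned is
    a hypothetical helper, and it applies the rule from issue (13) that an
    address must be a multiple of the size of the data fetched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Return true if a fetch of `access_size` bytes at base + offset is
 * naturally aligned. `access_size` must be nonzero. */
bool load_is_aligned(uint64_t base, uint64_t offset, uint32_t access_size)
{
    assert(access_size != 0);
    return (base + offset) % access_size == 0;
}
```

    For a base address that is a multiple of 16, an 8-byte fetch at offset
    8 passes this check, while an 8-byte fetch at offset 4 or a 16-byte
    fetch at offset 12 does not.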
12785bd8deadSopenharmony_ci
12795bd8deadSopenharmony_ci    17) How can the application guarantee that the layout of a structure
12805bd8deadSopenharmony_ci        on the CPU matches the layout used by the GLSL compiler?
12815bd8deadSopenharmony_ci
12825bd8deadSopenharmony_ci    RESOLVED: Provide a standard set of packing rules designed around 
12835bd8deadSopenharmony_ci    naturally aligning simple types. This spec will define pointer fetches
    in GLSL to use these rules, but does not explicitly guarantee that
    other extensions (like EXT_bindable_uniform) will use the same packing
    rules for their buffer object fetches. These packing rules are
    different from the ARB_uniform_buffer_object rules - in particular,
    these rules do not require vec4 padding of the array stride.
12895bd8deadSopenharmony_ci
12905bd8deadSopenharmony_ci    18) Is the address space per-context, per-share-group, or global?
12915bd8deadSopenharmony_ci
12925bd8deadSopenharmony_ci    RESOLVED: It is per-share-group. Using addresses from one share group
12935bd8deadSopenharmony_ci    in another share group will cause undefined results.
12945bd8deadSopenharmony_ci
12955bd8deadSopenharmony_ci    19) Is there risk of using invalid pointers for "killed" fragments, 
12965bd8deadSopenharmony_ci    fragments that don't take a certain branch of an "if" block, or 
12975bd8deadSopenharmony_ci    fragments whose shader is conceptually never executed due to pixel 
12985bd8deadSopenharmony_ci    ownership, stipple, etc.?
12995bd8deadSopenharmony_ci
13005bd8deadSopenharmony_ci    RESOLVED: NO. OpenGL implementations sometimes run fragment programs 
13015bd8deadSopenharmony_ci    on "helper" pixels that have no coverage, or continue to run fragment
13025bd8deadSopenharmony_ci    programs on killed pixels in order to be able to compute sane partial
13035bd8deadSopenharmony_ci    derivatives for fragment program instructions (DDX, DDY) or automatic
13045bd8deadSopenharmony_ci    level-of-detail calculations for texturing.  In this approach,
13055bd8deadSopenharmony_ci    derivatives are approximated by computing the difference in a quantity
13065bd8deadSopenharmony_ci    computed for a given fragment at (x,y) and a fragment at a neighboring
    pixel.  When a fragment program is executed on a "helper" pixel or
    killed pixel, global loads must not be executed, so that spurious
    faults are prevented. Helper pixels aren't explicitly mentioned in the
    spec body; instead, partial derivatives are obtained by magic.
13115bd8deadSopenharmony_ci
13125bd8deadSopenharmony_ci    If a fragment program contains a KIL instruction, compilers may not
13135bd8deadSopenharmony_ci    reorder code such that a LOAD instruction is executed before a KIL
13145bd8deadSopenharmony_ci    instruction that logically precedes it in flow control.  Once a 
13155bd8deadSopenharmony_ci    fragment is killed, subsequent loads should never be executed if they
13165bd8deadSopenharmony_ci    could cause any observable side effects.
13175bd8deadSopenharmony_ci
13185bd8deadSopenharmony_ci    As a result, if a shader uses instructions that explicitly or 
13195bd8deadSopenharmony_ci    implicitly do LOD calculations dependent on the result of a global 
13205bd8deadSopenharmony_ci    load, those instructions will have undefined results.
13215bd8deadSopenharmony_ci
13225bd8deadSopenharmony_ci    20) How are structures and arrays stored in buffer object memory?
13235bd8deadSopenharmony_ci
13245bd8deadSopenharmony_ci    RESOLVED:  Individual structure members and array elements are stored
13255bd8deadSopenharmony_ci    "packed" in memory, subject to an alignment requirement.  Structure
13265bd8deadSopenharmony_ci    members are stored according to the order of declaration.  Array elements
13275bd8deadSopenharmony_ci    are stored consecutively by element number.  Unreferenced structure
13285bd8deadSopenharmony_ci    members or array elements are never eliminated.  
13295bd8deadSopenharmony_ci
13305bd8deadSopenharmony_ci    The alignment requirement of individual structure members or array
13315bd8deadSopenharmony_ci    elements is usually equal to the size of the item.  For the purposes of
13325bd8deadSopenharmony_ci    this requirement, vector types are treated atomically (i.e., a "vec4" with
13335bd8deadSopenharmony_ci    32-bit floats will be 16-byte aligned).  One exception is that the
13345bd8deadSopenharmony_ci    required alignment of three-component vectors is the same as the required
13355bd8deadSopenharmony_ci    alignment of a four-component vector of the same base type.
13365bd8deadSopenharmony_ci
13375bd8deadSopenharmony_ci    21) How do the memory layout rules relate to the similar layout rules
13385bd8deadSopenharmony_ci    specified for the uniform buffer object (UBO) feature incorporated in
13395bd8deadSopenharmony_ci    OpenGL 3.1?
13405bd8deadSopenharmony_ci
13415bd8deadSopenharmony_ci    RESOLVED:  This extension was completed prior to OpenGL 3.1, but the
13425bd8deadSopenharmony_ci    layout rules for this extension and for UBO were developed roughly
13435bd8deadSopenharmony_ci    concurrently.  The layout rules here are nearly identical to those for the
13445bd8deadSopenharmony_ci    "std140" layout for uniform blocks.  The main difference here is that
13455bd8deadSopenharmony_ci    "std140" requires arrays of small types (e.g., "float") to be padded out
13465bd8deadSopenharmony_ci    to vec4 alignment (16B), while this extension does not.
13475bd8deadSopenharmony_ci
13485bd8deadSopenharmony_ci    Note that this extension does NOT allow shaders to use the layout()
13495bd8deadSopenharmony_ci    qualifier added by GLSL 1.40 to achieve fine-grained control of structure
13505bd8deadSopenharmony_ci    or array layout using pointers.  A subsequent extension could provide this
13515bd8deadSopenharmony_ci    capability.
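
    The array stride difference described in this issue can be made
    concrete for arrays of vectors with 4-byte components. The sketch below
    is illustrative only; the function names are hypothetical, and it
    models just the two rules stated in this document (three-component
    vectors align like four-component ones, and "std140" pads array
    elements out to 16 bytes).

```c
#include <stddef.h>

/* Round `value` up to the next multiple of `multiple`. */
size_t round_up(size_t value, size_t multiple)
{
    return (value + multiple - 1) / multiple * multiple;
}

/* Alignment in bytes of a vector with 1 to 4 components of 4 bytes each.
 * A three-component vector is aligned like a four-component vector. */
size_t vector_alignment(size_t components)
{
    return (components == 3 ? 4 : components) * 4;
}

/* Array stride under the rules of this extension: elements are packed
 * subject only to their own alignment. */
size_t array_stride_nv(size_t components)
{
    return round_up(components * 4, vector_alignment(components));
}

/* Array stride under "std140": additionally padded to vec4 alignment. */
size_t array_stride_std140(size_t components)
{
    return round_up(array_stride_nv(components), 16);
}
```

    Under these assumptions a float array has a stride of 4 bytes here and
    16 bytes under "std140", while vec3 and vec4 arrays have a 16-byte
    stride under both sets of rules.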
13525bd8deadSopenharmony_ci
13535bd8deadSopenharmony_ci    22) Should we provide a mechanism for tighter packing of an array of
13545bd8deadSopenharmony_ci    three-component vectors?
13555bd8deadSopenharmony_ci
    RESOLVED:  This could be desirable, but it won't be provided in this
    extension.  A subsequent extension could support alternate layouts by
    allowing shaders to use the GLSL 1.40 layout() modifier to qualify
    pointer types.
13605bd8deadSopenharmony_ci
13615bd8deadSopenharmony_ci    If tight packing of vec3's is strongly required, a three component array
13625bd8deadSopenharmony_ci    element could be constructed using three single component loads or by
13635bd8deadSopenharmony_ci    selecting/swizzling components of one or more larger loads.  The former
13645bd8deadSopenharmony_ci    technique could be done using GLSL by replacing:

      vec3 *pointer;
      vec3 elementN;
      int n;
      elementN = pointer[n];

    with

      float *pointer;
      vec3 elementN;
      int n;
      elementN = vec3(pointer[n*3], pointer[n*3+1], pointer[n*3+2]);
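
    The same indexing can be mirrored on the host side, for example to
    check a tightly packed buffer before uploading it.  The C sketch below
    is illustrative only: the "vec3" struct and all function names are
    hypothetical, and the padded size assumes that a vec3 array has a
    16-byte stride as a consequence of its vec4-like alignment.

```c
#include <stddef.h>

/* Illustrative host-side stand-in for a GLSL vec3. */
typedef struct { float x, y, z; } vec3;

/* Mirrors the second GLSL snippet: build element `n` from three
 * single-component loads out of a tightly packed float array. */
vec3 load_packed_vec3(const float *pointer, size_t n)
{
    vec3 v = { pointer[n * 3], pointer[n * 3 + 1], pointer[n * 3 + 2] };
    return v;
}

/* Bytes used by `count` tightly packed vec3 elements (3 floats each). */
size_t packed_vec3_bytes(size_t count)
{
    return count * 3 * 4;
}

/* Bytes used by `count` vec3 elements at an assumed 16-byte stride. */
size_t padded_vec3_bytes(size_t count)
{
    return count * 16;
}
```

    Tight packing saves four bytes per element at the cost of three loads
    in place of one.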


Revision History

    Rev.    Date    Author    Changes
    ----  --------  --------  -----------------------------------------
     8    08/06/10  istewart  Modify behavior of named buffer functions
                              to match those of EXT_direct_state_access.
                              Add INVALID_OPERATION error to
                              MakeBufferResidentNV and GetBufferParameterui64vNV
                              if the buffer object has no data store.

     7    06/22/10  pbrown    Document INVALID_OPERATION errors on
                              residency management and query APIs when a
                              non-existent buffer object is referenced,
                              when trying to make an already resident buffer
                              resident, or when trying to make an already
                              non-resident buffer non-resident.

     6    09/21/09  groth     Fix non-conformant DSA function names.

     5    09/10/09  Jon Leech Add 'const' to type of Uniformui64vNV and
                              ProgramUniformui64vNV 'count' argument.

     4    09/09/09  mjk       Fix typos

     3    08/21/09  pbrown    Add explicit spec language describing the
                              typecast operator implemented here.  The
                              previous spec language said it was allowed
                              but didn't say what it did.

     2    08/05/09  pbrown    Update section describing memory layout of
                              variables pointed to; moved to the core
                              specification as with OpenGL 3.1's uniform
                              buffer layout.  Added a few issues on memory
                              layout.  Explicitly documented the set of
                              operations and implicit conversions allowed
                              on pointers.

     1              jbolz     Internal revisions.