Name

    NV_shader_buffer_load

Name Strings

    GL_NV_shader_buffer_load

Contact

    Jeff Bolz, NVIDIA Corporation (jbolz 'at' nvidia.com)

Contributors

    Pat Brown, NVIDIA
    Chris Dodd, NVIDIA
    Mark Kilgard, NVIDIA
    Eric Werness, NVIDIA

Status

    Complete

Version

    Last Modified Date:   August 8, 2010
    Author Revision:      8

Number

    379

Dependencies

    Written against the OpenGL 3.0 Specification.

    Written against the GLSL 1.30 Specification (Revision 09).

    This extension interacts with NV_gpu_program4.


Overview

    At a very coarse level, GL has evolved in a way that allows
    applications to replace many of the original state machine variables
    with blocks of user-defined data.
    For example, the current vertex
    state has been augmented by vertex buffer objects, fixed-function
    shading state and parameters have been replaced by shaders/programs
    and constant buffers, etc. Applications switch between coarse sets
    of state by binding objects to the context or to other container
    objects (e.g. vertex array objects) instead of manipulating state
    variables of the context. In terms of the number of GL commands
    required to draw an object, modern applications are orders of
    magnitude more efficient than legacy applications, but this explosion
    of objects bound to other objects has led to a new bottleneck:
    pointer chasing and CPU L2 cache misses in the driver, and general
    L2 cache pollution.

    This extension provides a mechanism to read from a flat, 64-bit GPU
    address space from programs/shaders, to query GPU addresses of buffer
    objects at the API level, and to bind buffer objects to the context in
    such a way that they can be accessed via their GPU addresses in any
    shader stage.

    The intent is that applications can avoid re-binding buffer objects
    or updating constants between each Draw call and instead simply use
    a VertexAttrib (or TexCoord, or InstanceID, or...) to "point" to the
    new object's state.
    In this way, one of the cheapest "state" updates
    (from the CPU's point of view) can be used to effect a significant
    state change in the shader, much as changing a pointer does on
    the CPU. At the same time, this relieves the limits on how many
    buffer objects can be accessed at once by shaders, and allows these
    buffer object accesses to be exposed as C-style pointer dereferences
    in the shading language.

    As a very simple example, imagine packing a group of similar objects'
    constants into a single buffer object and pointing your program
    at object <i> by setting "glVertexAttribI1iEXT(attrLoc, i);"
    and using a shader as such:

        struct MyObjectType {
            mat4x4 modelView;
            vec4 materialPropertyX;
            // etc.
        };
        uniform MyObjectType *allObjects;
        in int objectID; // bound to attrLoc

        ...

        mat4x4 thisObjectsMatrix = allObjects[objectID].modelView;
        // do transform, shading, etc.

    This is beneficial in much the same way that texture arrays allow
    choosing between similar, but independent, texture maps with a single
    coordinate identifying which slice of the texture to use.
    It also
    resembles instancing, where a lightweight change (incrementing the
    instance ID) can be used to generate a different and interesting
    result, but with additional flexibility over instancing because the
    values are app-controlled and not a single incrementing counter.

    Dependent pointer fetches are allowed, so more complex scene graph
    structures can be built into buffer objects providing significant new
    flexibility in the use of shaders. Another simple example, showing
    something you can't do with existing functionality, is to do dependent
    fetches into many buffer objects:

        GenBuffers(N, dataBuffers);
        GenBuffers(1, &pointerBuffer);

        GLuint64EXT gpuAddrs[N];
        for (i = 0; i < N; ++i) {
            BindBuffer(target, dataBuffers[i]);
            BufferData(target, size[i], myData[i], STATIC_DRAW);

            // get the address of this buffer and make it resident.
            GetBufferParameterui64vNV(target, BUFFER_GPU_ADDRESS_NV,
                                      &gpuAddrs[i]);
            MakeBufferResidentNV(target, READ_ONLY);
        }

        GLuint64EXT pointerBufferAddr;
        BindBuffer(target, pointerBuffer);
        BufferData(target, sizeof(GLuint64EXT)*N, gpuAddrs, STATIC_DRAW);
        GetBufferParameterui64vNV(target, BUFFER_GPU_ADDRESS_NV,
                                  &pointerBufferAddr);
        MakeBufferResidentNV(target, READ_ONLY);

        // now in the shader, we can use a double indirection
        vec4 **ptrToBuffers = pointerBufferAddr;
        vec4 *ptrToBufferI = ptrToBuffers[i];

    This allows simultaneous access to more buffers than
    EXT_bindable_uniform (MAX_VERTEX_BINDABLE_UNIFORMS, etc.) and each
    can be larger than MAX_BINDABLE_UNIFORM_SIZE.

New Procedures and Functions

    void MakeBufferResidentNV(enum target, enum access);
    void MakeBufferNonResidentNV(enum target);
    boolean IsBufferResidentNV(enum target);
    void MakeNamedBufferResidentNV(uint buffer, enum access);
    void MakeNamedBufferNonResidentNV(uint buffer);
    boolean IsNamedBufferResidentNV(uint buffer);

    void GetBufferParameterui64vNV(enum target, enum pname,
                                   uint64EXT *params);
    void GetNamedBufferParameterui64vNV(uint buffer, enum pname,
                                        uint64EXT *params);

    void GetIntegerui64vNV(enum value, uint64EXT *result);

    void Uniformui64NV(int location, uint64EXT value);
    void Uniformui64vNV(int location, sizei count,
                        const uint64EXT *value);
    void GetUniformui64vNV(uint program, int location, uint64EXT *params);
    void ProgramUniformui64NV(uint program, int location, uint64EXT value);
    void ProgramUniformui64vNV(uint program, int location, sizei count,
                               const uint64EXT *value);

New Tokens

    Accepted by the <pname> parameter of GetBufferParameterui64vNV,
    GetNamedBufferParameterui64vNV:

        BUFFER_GPU_ADDRESS_NV                          0x8F1D

    Returned by the <type> parameter of GetActiveUniform:

        GPU_ADDRESS_NV                                 0x8F34

    Accepted by the <value> parameter of GetIntegerui64vNV:

        MAX_SHADER_BUFFER_ADDRESS_NV                   0x8F35


Additions to Chapter 2 of the OpenGL 3.0 Specification (OpenGL Operation)

    Append to Section 2.9 (p. 45)

        The data store of a buffer object may be made accessible to the GL
        via shader buffer loads by calling:

            void MakeBufferResidentNV(enum target, enum access);

        <access> must be READ_ONLY; the parameter is provided for future
        extensibility, to indicate to the driver that the GPU may write to
        the memory. <target> may be any of the buffer targets accepted by
        BindBuffer. The error INVALID_OPERATION will be generated if no
        buffer is bound to <target>, if the buffer bound to <target> is
        already resident in the current GL context, or if the buffer bound to
        <target> has no data store.

        While the buffer object is resident, it is legal to use GPU addresses
        in the range
        [BUFFER_GPU_ADDRESS_NV, BUFFER_GPU_ADDRESS_NV + BUFFER_SIZE)
        in any shader stage.
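
The half-open address range above can be checked on the CPU before an address is handed to a shader. The following is a minimal sketch in C, not part of the extension: the ResidentBuffer record and the address_range_is_legal helper are hypothetical names introduced here for illustration, and the sketch assumes the application has already queried the buffer's GPU address and size through the GL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical CPU-side record of one resident buffer (illustration only;
 * the extension defines no such structure). */
typedef struct {
    uint64_t gpu_address; /* value queried with BUFFER_GPU_ADDRESS_NV */
    uint64_t size;        /* value of BUFFER_SIZE, in basic machine units */
} ResidentBuffer;

/* Returns true when the access [addr, addr + bytes) lies entirely inside
 * the legal range [gpu_address, gpu_address + size). The comparison is
 * done on offsets so that no 64-bit addition can overflow. */
static bool address_range_is_legal(const ResidentBuffer *buf,
                                   uint64_t addr, uint64_t bytes)
{
    assert(buf != NULL);
    if (addr < buf->gpu_address) {
        return false;               /* starts below the data store */
    }
    uint64_t offset = addr - buf->gpu_address;
    if (offset >= buf->size) {
        return false;               /* starts at or past the end */
    }
    return bytes <= buf->size - offset;
}
```

A check like this only validates the arithmetic; it cannot tell whether the buffer is actually resident in the current context, which still has to be established with MakeBufferResidentNV.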

        The data store of a buffer object may be made inaccessible to the GL
        via shader buffer loads by calling:

            void MakeBufferNonResidentNV(enum target);

        A buffer is also made non-resident implicitly as a result of being
        respecified via BufferData or being deleted. <target> may be any of
        the buffer targets accepted by BindBuffer. The error
        INVALID_OPERATION will be generated if no buffer is bound to <target>
        or if the buffer bound to <target> is not resident in the current
        GL context.

        The function:

            void GetBufferParameterui64vNV(enum target, enum pname,
                                           uint64EXT *params);

        may be used to query the GPU address of a buffer object's data store.
        This address remains valid until the buffer object is deleted or
        its data store is respecified via BufferData. The address "zero"
        is reserved for convenience, so no buffer object will ever have an
        address of zero. The error INVALID_OPERATION will be generated if no
        buffer is bound to <target>, or if the buffer bound to <target> has no
        data store.

        The functions:

            void MakeNamedBufferResidentNV(uint buffer, enum access);
            void MakeNamedBufferNonResidentNV(uint buffer);
            void GetNamedBufferParameterui64vNV(uint buffer, enum pname,
                                                uint64EXT *params);

        operate identically to the non-"Named" functions except that, rather
        than using currently bound buffers, they use the buffer object
        identified by <buffer>. If the buffer object named by the buffer
        parameter has not been previously bound or has been deleted since the
        last binding, the GL first creates a new state vector, initialized
        with a zero-sized memory buffer and comprising the state values
        listed in table 2.6. There is no buffer corresponding to the name
        zero; these commands generate the INVALID_OPERATION error if the
        buffer parameter is zero.

    Add to Section 2.20.3 (p. 98)

            void Uniformui64NV(int location, uint64EXT value);
            void Uniformui64vNV(int location, sizei count,
                                const uint64EXT *value);

        The Uniformui64{v}NV commands will load <count> uint64EXT values into
        a uniform location defined as a GPU_ADDRESS_NV or an array of
        GPU_ADDRESS_NVs.

        The functions:

            void ProgramUniformui64NV(uint program, int location,
                                      uint64EXT value);
            void ProgramUniformui64vNV(uint program, int location, sizei count,
                                       const uint64EXT *value);

        operate identically to the non-"Program" functions except that,
        rather than updating the program object currently in use, these
        "Program" commands update the program object named by the initial
        program parameter.


    Insert a new subsection after Section 2.20.4, Shader Execution (Vertex
    Shaders), p. 103.

    Section 2.20.X, Shader Memory Access

    Shaders may load from buffer object memory by dereferencing pointer
    variables. Pointer variables are 64-bit unsigned integer values referring
    to the GPU addresses of data stored in buffer objects made resident by
    MakeBufferResidentNV. The GPU addresses of such buffer objects may be
    queried using GetBufferParameterui64vNV with a <pname> of
    BUFFER_GPU_ADDRESS_NV.

    When a shader dereferences a pointer variable, data are read from buffer
    object memory according to the following rules:

    - Data of type "bool" are stored in memory as one uint-typed value at the
      specified GPU address.
      All non-zero values correspond to true, and zero
      corresponds to false.

    - Data of type "int" are stored in memory as one int-typed value at the
      specified GPU address.

    - Data of type "uint" are stored in memory as one uint-typed value at the
      specified GPU address.

    - Data of type "float" are stored in memory as one float-typed value at
      the specified GPU address.

    - Vectors with <N> elements with any of the above basic element types are
      stored in memory as <N> values in consecutive memory locations beginning
      at the specified GPU address, with components stored in order with the
      first (X) component at the lowest offset. The data type used for
      individual components is derived according to the rules for scalar
      members above.

    - Data with any pointer type are stored in memory as a single 64-bit
      unsigned integer value at the specified GPU address.

    - Column-major matrices with <C> columns and <R> rows (using the type
      "mat<C>x<R>", or simply "mat<C>" if <C>==<R>) are treated as an array of
      <C> floating-point column vectors, each consisting of <R> components.
      The column vectors will be stored in order, with column zero at the
      lowest offset.
      The difference in offsets between consecutive columns of
      the matrix will be referred to as the column stride, and is constant
      across the matrix.

    - Row-major matrices with <C> columns and <R> rows (using the type
      "mat<C>x<R>", or simply "mat<C>" if <C>==<R>) are treated as an array of
      <R> floating-point row vectors, each consisting of <C> components. The
      row vectors will be stored in order, with row zero at the lowest offset.
      The difference in offsets between consecutive rows of the matrix will be
      referred to as the row stride, and is constant across the matrix.

    - Arrays of scalars, vectors, pointers, and matrices are stored in memory
      by element order, with array member zero at the lowest offset. The
      difference in offsets between each pair of elements in the array in
      basic machine units is referred to as the array stride, and is constant
      across the entire array.

    For matrix and array variables, the matrix and/or array strides
    corresponding to the variable may be derived according to the structure
    layout rules specified immediately below.

    When a pointer to a structure is dereferenced, the structure's individual
    members are laid out in memory in monotonically increasing order based on
    their location in the structure declaration.
    Each structure member has a base
    offset and a base alignment, from which an aligned offset is computed by
    rounding the base offset up to the next multiple of the base alignment.
    The base offset of the first member of a structure is taken from the
    aligned offset of the structure itself. The base offset of all other
    structure members is derived by taking the offset of the last basic
    machine unit consumed by the previous member and adding one. Each
    structure member is stored in memory at its aligned offset.

      (1) If the member is a scalar consuming <N> basic machine units, the
          base alignment is <N>.

      (2) If the member is a two- or four-component vector with components
          consuming <N> basic machine units, the base alignment is 2<N> or
          4<N>, respectively.

      (3) If the member is a three-component vector with components consuming
          <N> basic machine units, the base alignment is 4<N>.

      (4) If the member is an array of scalars or vectors, the base alignment
          and array stride are set to match the base alignment of a single
          array element, according to rules (1), (2), and (3). The array may
          have padding at the end; the base offset of the member following the
          array is rounded up to the next multiple of the base alignment.

      (5) If the member is a column-major matrix with <C> columns and <R>
          rows, the matrix is stored identically to an array of <C> column
          vectors with <R> components each, according to rule (4).

      (6) If the member is an array of <S> column-major matrices with <C>
          columns and <R> rows, the array is stored identically to an array
          of <S>*<C> column vectors with <R> components each, according to
          rule (4).

      (7) If the member is a row-major matrix with <C> columns and <R> rows,
          the matrix is stored identically to an array of <R> row vectors
          with <C> components each, according to rule (4).

      (8) If the member is an array of <S> row-major matrices with <C> columns
          and <R> rows, the array is stored identically to an array of
          <S>*<R> row vectors with <C> components each, according to rule (4).

      (9) If the member is a structure, the base alignment of the structure is
          <N>, where <N> is the largest base alignment value of any of its
          members. The individual members of this sub-structure are then
          assigned offsets by applying this set of rules recursively, where
          the base offset of the first member of the sub-structure is equal to
          the aligned offset of the structure.
          The structure may have padding
          at the end; the base offset of the member following the
          sub-structure is rounded up to the next multiple of the base
          alignment of the structure.

     (10) If the member is an array of <S> structures, the <S> elements of
          the array are laid out in order, according to rule (9).

    If a shader reads from a GPU address that does not correspond to a buffer
    object made resident by MakeBufferResidentNV, the results of the operation
    are undefined and may result in application termination.

    Any variable, array element, or structure member accessed using a pointer
    has a required base alignment, which may be derived according to the
    structure layout rules above. If a variable, array element, or structure
    member is accessed using a pointer that is not a multiple of its base
    alignment, the results of the access will be undefined. To store multiple
    variables in a single buffer object, an application must ensure that each
    variable is properly aligned. Storing a single scalar, vector, matrix,
    array, or structure variable using a pointer set to the base GPU address
    of a resident buffer object requires no special alignment.
    The base GPU
    address of a buffer object is guaranteed to be sufficiently aligned to
    satisfy the base alignment requirement of any variable, and the layout
    rules above ensure that individual matrix rows/columns, array elements,
    and structure members are properly aligned as long as the base pointer
    meets alignment requirements.


Additions to Chapter 5 of the OpenGL 3.0 Specification (Special Functions)

    Add to Section 5.4, p. 310 (Display Lists)

        Edit the list of commands that are executed immediately when compiling
        a display list to include MakeBufferResidentNV,
        MakeBufferNonResidentNV, MakeNamedBufferResidentNV,
        MakeNamedBufferNonResidentNV, GetBufferParameterui64vNV,
        GetNamedBufferParameterui64vNV, IsBufferResidentNV, and
        IsNamedBufferResidentNV.

Additions to Chapter 6 of the OpenGL 3.0 Specification (Querying GL State)

    Add to Section 6.1.11, p. 314 (Pointer, String, and 64-bit Queries)

        The command:

            void GetIntegerui64vNV(enum value, uint64EXT *result);

        obtains 64-bit unsigned integer state variables. Legal values of
        <value> are only those that specify GetIntegerui64vNV in the state
        tables in Chapter 6.

    Add to Section 6.1.13, p. 332 (Buffer Object Queries)

        The commands:

            boolean IsBufferResidentNV(enum target);
            boolean IsNamedBufferResidentNV(uint buffer);

        return TRUE if the specified buffer is resident in the current
        context. The error INVALID_OPERATION will be generated by
        IsBufferResidentNV if no buffer is bound to <target>. If the buffer
        object named by the buffer parameter of IsNamedBufferResidentNV has
        not been previously bound or has been deleted since the last binding,
        the GL first creates a new state vector, initialized with a
        zero-sized memory buffer and comprising the state values listed in
        table 2.6. There is no buffer corresponding to the name zero;
        IsNamedBufferResidentNV generates the INVALID_OPERATION error if the
        buffer parameter is zero.

    Add to Section 6.1.15, p. 337 (Shader and Program Queries)

        void GetUniformui64vNV(uint program, int location, uint64EXT *params);

Additions to Appendix D of the OpenGL 3.0 Specification (Shared Objects and Multiple Contexts)

    Add a new section D.X (Object Use by GPU Address)

        A buffer object's GPU address is valid in all contexts in the share
        group that the buffer belongs to.
        A buffer should be made resident in
        each context that will use it via GPU address, so that the GL
        knows that it is used in each command stream.

Additions to the NV_gpu_program4 specification:

    Change Section 2.X.2, Program Grammar

    If a program specifies the NV_shader_buffer_load program option,
    the following modifications apply to the program grammar:

      Append to <opModifier> list: | "F32" | "F32X2" | "F32X4" | "S8" | "S16" |
        "S32" | "S32X2" | "S32X4" | "U8" | "U16" | "U32" | "U32X2" | "U32X4".

      Append to <SCALARop> list: | "LOAD".

    Modify Section 2.X.4, Program Execution Environment

    (Add to the set of opcodes in Table X.13)

                  Modifiers
      Instruction F I C S H D  Out Inputs    Description
      ----------- - - - - - -  --- --------  --------------------------------
      LOAD        X X X X - F  v   su        Global load


    (Add to Table X.14, Instruction Modifiers, and to the corresponding
    description following the table)

      Modifier  Description
      --------  -----------------------------------------------
      F32       Access one 32-bit floating-point value
      F32X2     Access two 32-bit floating-point values
      F32X4     Access four 32-bit floating-point values
      S8        Access one 8-bit signed integer value
      S16       Access one 16-bit signed integer value
      S32       Access one 32-bit signed integer value
      S32X2     Access two 32-bit signed integer values
      S32X4     Access four 32-bit signed integer values
      U8        Access one 8-bit unsigned integer value
      U16       Access one 16-bit unsigned integer value
      U32       Access one 32-bit unsigned integer value
      U32X2     Access two 32-bit unsigned integer values
      U32X4     Access four 32-bit unsigned integer values

    For memory load operations, the "F32", "F32X2", "F32X4", "S8", "S16",
    "S32", "S32X2", "S32X4", "U8", "U16", "U32", "U32X2", and "U32X4" storage
    modifiers control how data are loaded from memory. Storage modifiers are
    supported by the LOAD instruction and are covered in more detail in the
    description of that instruction. LOAD must specify exactly one of these
    modifiers, and may not specify any of the base data type modifiers (F,U,S)
    described above. The base data type of the result vector of a LOAD
    instruction is trivially derived from the storage modifier.


    Add New Section 2.X.4.5, Program Memory Access

    Programs may load from buffer object memory via the LOAD (global load)
    instruction.
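
As an illustration of how the storage modifiers in the table above break down, the sketch below decodes a modifier name into a base data type, a component size, and a component count, and derives the number of bytes a LOAD with that modifier would read. It is a hedged example only: StorageModifier, parse_storage_modifier, and storage_modifier_bytes are hypothetical names chosen for this sketch and are not defined by NV_gpu_program4 or by this extension.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical decoded form of a storage modifier (illustration only). */
typedef struct {
    char base;       /* 'F' float, 'S' signed integer, 'U' unsigned integer */
    int  bits;       /* bits per component: 8, 16, or 32 */
    int  components; /* 1, 2, or 4 */
} StorageModifier;

/* Looks up one of the thirteen modifier names from the table.
 * Returns 0 and fills *out on success, or -1 for any other string. */
static int parse_storage_modifier(const char *text, StorageModifier *out)
{
    static const struct { const char *name; StorageModifier m; } table[] = {
        { "F32",   { 'F', 32, 1 } }, { "F32X2", { 'F', 32, 2 } },
        { "F32X4", { 'F', 32, 4 } },
        { "S8",    { 'S',  8, 1 } }, { "S16",   { 'S', 16, 1 } },
        { "S32",   { 'S', 32, 1 } }, { "S32X2", { 'S', 32, 2 } },
        { "S32X4", { 'S', 32, 4 } },
        { "U8",    { 'U',  8, 1 } }, { "U16",   { 'U', 16, 1 } },
        { "U32",   { 'U', 32, 1 } }, { "U32X2", { 'U', 32, 2 } },
        { "U32X4", { 'U', 32, 4 } },
    };
    assert(text != NULL && out != NULL);
    for (size_t i = 0; i < sizeof table / sizeof table[0]; ++i) {
        if (strcmp(text, table[i].name) == 0) {
            *out = table[i].m;
            return 0;
        }
    }
    return -1;
}

/* Number of bytes read from memory by a load using this modifier. */
static int storage_modifier_bytes(const StorageModifier *m)
{
    assert(m != NULL);
    return (m->bits / 8) * m->components;
}
```

A table lookup is used rather than a general parser because the set of legal modifiers is closed: combinations such as 8-bit or 16-bit components with an X2 or X4 count do not appear in the table and are rejected.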

    Load instructions read 8, 16, 32, 64, or 128 bits of data from a source
    address to produce a four-component vector, according to the storage
    modifier specified with the instruction. The storage modifier has three
    parts:

      - a base data type, "F", "S", or "U", specifying that the instruction
        fetches floating-point, signed integer, or unsigned integer values,
        respectively;

      - a component size, specifying that the components fetched by the
        instruction have 8, 16, or 32 bits; and

      - an optional component count, where "X2" and "X4" indicate that two or
        four components be fetched, and no count indicates a single component
        fetch.

    When the storage modifier specifies that fewer than four components should
    be fetched, remaining components are filled with zeroes. When performing
    a global load (LOAD), the GPU address is specified as an instruction
    operand.
    Given a GPU address <address> and a storage modifier <modifier>,
    the memory load can be described by the following code:

      result_t_vec BufferMemoryLoad(char *address, OpModifier modifier)
      {
        result_t_vec result = { 0, 0, 0, 0 };
        switch (modifier) {
        case F32:
            result.x = ((float32_t *)address)[0];
            break;
        case F32X2:
            result.x = ((float32_t *)address)[0];
            result.y = ((float32_t *)address)[1];
            break;
        case F32X4:
            result.x = ((float32_t *)address)[0];
            result.y = ((float32_t *)address)[1];
            result.z = ((float32_t *)address)[2];
            result.w = ((float32_t *)address)[3];
            break;
        case S8:
            result.x = ((int8_t *)address)[0];
            break;
        case S16:
            result.x = ((int16_t *)address)[0];
            break;
        case S32:
            result.x = ((int32_t *)address)[0];
            break;
        case S32X2:
            result.x = ((int32_t *)address)[0];
            result.y = ((int32_t *)address)[1];
            break;
        case S32X4:
            result.x = ((int32_t *)address)[0];
            result.y = ((int32_t *)address)[1];
            result.z = ((int32_t *)address)[2];
            result.w = ((int32_t *)address)[3];
            break;
        case U8:
            result.x = ((uint8_t *)address)[0];
            break;
        case U16:
            result.x = ((uint16_t *)address)[0];
            break;
        case U32:
            result.x = ((uint32_t *)address)[0];
            break;
        case U32X2:
            result.x = ((uint32_t *)address)[0];
            result.y = ((uint32_t *)address)[1];
            break;
        case U32X4:
            result.x = ((uint32_t *)address)[0];
            result.y = ((uint32_t *)address)[1];
            result.z = ((uint32_t *)address)[2];
            result.w = ((uint32_t *)address)[3];
            break;
        }
        return result;
      }

    If a global load accesses a memory address that does not correspond to a
    buffer object made resident by MakeBufferResidentNV, the results of the
    operation are undefined and may result in application termination.

    The address used for the buffer memory loads must be aligned to the fetch
    size corresponding to the storage opcode modifier. For S8 and U8, the
    offset has no alignment requirements. For S16 and U16, the offset must be
    a multiple of two basic machine units. For F32, S32, and U32, the offset
    must be a multiple of four. For F32X2, S32X2, and U32X2, the offset must
    be a multiple of eight.
    For F32X4, S32X4, and U32X4, the offset must be a
    multiple of sixteen. If an offset is not correctly aligned, the values
    returned by a buffer memory load will be undefined.


    Modify Section 2.X.6, Program Options

    + Shader Buffer Load Support (NV_shader_buffer_load)

    If a program specifies the "NV_shader_buffer_load" option, it may use the
    LOAD instruction to load data from a resident buffer object given a GPU
    address.


    Section 2.X.8.Z, LOAD: Global Load

    The LOAD instruction generates a result vector by reading an address from
    the single unsigned integer scalar operand and fetching data from buffer
    object memory, as described in Section 2.X.4.5.

      address = ScalarLoad(op0);
      result = BufferMemoryLoad(address, storageModifier);

    LOAD supports no base data type modifiers, but requires exactly one
    storage modifier. The base data type of the result vector is derived from
    the storage modifier. The single scalar operand is always interpreted as
    an unsigned integer.

    The range of GPU addresses supported by the LOAD instruction may be
    subject to an implementation-dependent limit.
    If any component fetched by
    the LOAD instruction corresponds to memory with an address larger than the
    value of MAX_SHADER_BUFFER_ADDRESS_NV, the value fetched for that
    component will be undefined.


Modifications to The OpenGL Shading Language Specification, Version 1.30.09

    Modify Section 3.6, Keywords, p. 14

    (add the following to the list of reserved keywords)

      intptr_t
      uintptr_t


    Modify Section 4.1, Basic Types, p. 18

    (add to the basic "Transparent Types" table, p. 18)

      Types       Meaning
      --------    ----------------------------------------------------------
      intptr_t    a signed integer with the same precision as a pointer
      uintptr_t   an unsigned integer with the same precision as a pointer

    (replace the last paragraph of the section with the following)

    Pointers to any of the transparent types, user-defined structs, or other
    pointer types are supported.


    Modify Section 4.1.3, Integers, p. 18

    (add to the end of the first paragraph) Signed and unsigned integer
    variables are fully supported. ...
    intptr_t and uintptr_t variables have
    the same number of bits of precision as the native size of a pointer in
    the underlying implementation.


    (Insert new section immediately before Section 4.1.10, Implicit
    Conversions, p. 27)

    Section 4.1.X, Pointers

    Pointers are 64-bit unsigned integer values that represent the address of
    some "global" memory (i.e. not local to this invocation of a shader).
    Pointers to any of the transparent types, user-defined structures, or
    pointer types are supported. Pointers are dereferenced with the operators
    (*), (->), and ([]), and a variety of operators performing addition and
    subtraction are supported. There is no mechanism to assign a pointer to
    the address of a local variable or array, nor is there a mechanism to
    allocate or free memory from within a shader. There are no function
    pointers.

    The underlying memory read using pointer variables may also be accessed
    using the OpenGL API commands. To communicate between shaders and other
    OpenGL API commands, variables read through pointers are arranged in
    memory in the manner described in Section 2.20.X of the OpenGL
    Specification.


    Modify Section 4.1.10, Implicit Conversions, p. 27

    (add before the final paragraph of the section, p. 27)

    Pointers to any type may be implicitly converted to pointers to void.
    Pointers to any type (including void) are never implicitly converted to
    pointers to any other non-void type.


    Modify Section 5.1, Operators, p. 39

    (add new entries to the precedence table; for a full spec, renumber the
    new precedence row "3.5" to "4", and renumber all subsequent rows)

      Precedence  Operator Class              Operators  Associativity
      ----------  --------------------------  ---------  -------------
          2       field access from pointer   ->         left to right
          3       pointer dereference         *          right to left
          3.5     typecast                    ()         right to left

    (modify the last paragraph, p.39, to delete language saying that
    dereferences and typecast operators are not supported)

    There is no address-of operator.


    (Insert new section immediately after Section 5.7, Structure and Array
    Operations, p. 46)

    Section 5.X, Pointer Operations

    The following operators are allowed to operate on pointer types:

      pointer dereference                 *
      additive                            + -
      array subscript                     []
      arithmetic assignments              += -=
      postfix increment and decrement     ++ --
      prefix increment and decrement      ++ --
      equality                            == !=
      assignment                          =
      field or method selector            ->

    The pointer dereference operator is a unary operator that converts a
    pointer expression into an l-value designating data of the type pointed to
    by the pointer expression. The result of a pointer dereference may not be
    used as the left-hand side of an assignment.

    The pointer binary addition (+) and subtraction (-) operators produce a
    pointer result from one pointer operand and one scalar signed or unsigned
    integer operand. For subtraction, the pointer must be the first operand;
    for addition, the pointer may be either operand. The type of the result
    is the same type as the pointer operand.
    A new pointer is computed by
    adding <I>*<S> basic machine units to, or subtracting them from, the
    value of the pointer operand, where <I> is the integer operand and <S> is
    the stride that would be derived by applying the rules specified in
    Section 2.20.X of the OpenGL Specification to an array with elements of
    the type pointed to by the pointer.

    The binary subtraction (-) operator may also operate on a pair of pointers
    of identical type. In this operation, the second operand is subtracted
    from the first, yielding a signed integer result of type <intptr_t>. The
    result is in units of the type being pointed to. The result is the
    integer value that would yield the first pointer operand if added to the
    second pointer operand in the manner described above. If no such integer
    value exists, the result of the operation is undefined. Pointer
    subtraction is not supported for pointers to the type <void>.

    The array subscript operator ([]) adds a signed or unsigned integer
    expression specified inside the brackets to a pointer expression specified
    to the left of the brackets, and then dereferences the pointer produced by
    the addition. The array subscript operation "P[i]" is functionally
    equivalent to "(*(P+i))".
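
    The pointer arithmetic rules above can be modelled on the CPU with
    plain 64-bit integers. The following C sketch is illustrative only:
    ptr_add() and ptr_diff() are hypothetical helper names, and the
    stride <S> is passed in by the caller rather than derived from the
    Section 2.20.X layout rules. The 16-byte stride used in the usage
    note (a pointer to vec4) is an assumption made for the example.

```c
#include <stdint.h>

/* Model of "pointer + integer": advance the address by <I>*<S> basic
 * machine units. Arithmetic wraps modulo 2^64, as unsigned math does. */
static uint64_t ptr_add(uint64_t ptr, int64_t i, uint64_t stride)
{
    return ptr + (uint64_t)i * stride;
}

/* Model of "pointer - pointer": on success, stores the element count that
 * would yield <a> when added to <b> and returns 1. Returns 0 when no such
 * integer exists (the case the text above calls undefined). */
static int ptr_diff(uint64_t a, uint64_t b, uint64_t stride, int64_t *out)
{
    int64_t bytes = (int64_t)(a - b);  /* signed distance in machine units */
    if (stride == 0 || bytes % (int64_t)stride != 0)
        return 0;
    *out = bytes / (int64_t)stride;
    return 1;
}
```

    With a stride of 16, ptr_add(0x1000, 3, 16) models the address read by
    "P[3]" for a pointer P holding 0x1000, and ptr_diff() inverts that
    computation.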

    The add into (+=) and subtract from (-=) operators are binary operations,
    where the first operand must be one that could be assigned to (an l-value)
    and the second operand must be a signed or unsigned integer scalar. These
    operations add the integer operand into or subtract the integer operand
    from the pointer operand, as defined for pointer addition and subtraction.

    The arithmetic unary operators post- and pre-increment and decrement (--
    and ++) operate on pointers. For post- and pre-increment and decrement,
    the expression must be one that could be assigned to (an l-value). Pre-
    and post-increment and decrement add 1 to or subtract 1 from the contents
    of the expression they operate on, as defined for pointer addition and
    subtraction. The value of the pre-increment or pre-decrement expression
    is the resulting value of that modification. The value of the
    post-increment or post-decrement expression is the value of the expression
    before modification.

    The equality operators equal (==) and not equal (!=) operate on pointer
    types and produce a scalar Boolean result. The two operands must either
    be pointers to the same type, or one of the two operands must point to
    void. Two pointers are considered equal if and only if they point to the
    same global memory address.

    The field or method selection operator (->) operates on a pointer to a
    structure of any type and is used to select a field of the structure
    pointed to by the pointer. This selector also operates on a pointer to a
    vector of any type, where the right-hand side of the operator must be a
    valid string using the vector component selection suffix described in
    Section 5.5. In both cases, the field or method selection operation
    "p->s" is functionally equivalent to "((*p).s)".

    Pointer addition and subtraction, including the add into, subtract from,
    and pre- and post-increment and decrement operators, are not supported on
    pointers to a void type.

    The assignment operator may be used to update the value of a pointer
    variable, as described in Section 5.8.


    (Insert after Section 5.10, Vector and Matrix Operations, p. 50)

    Section 5.11, Typecast Operations

    The typecast operator may be used to convert an expression from one type
    to another, operating in a manner similar to scalar, vector, and matrix
    constructors.
    The typecast operator specifies a new data type in
    parentheses, followed by an expression, as in the following examples:

      float a = (float) 2U;
      vec3 b = (vec3) 1.0;
      vec4 c = (vec4) b;
      mat2 d = (mat2) 1.0;
      mat4 e = (mat4) d;

    For scalar, vector, and matrix data types, the set of typecasts supported
    is equivalent to the set of single-operand constructors supported, and a
    typecast operates identically to an equivalent constructor. A scalar
    expression may be typecast to any scalar, vector, or matrix data type. A
    vector expression may be typecast to any vector type, except vectors with
    a larger number of components. Additionally, four-component vector
    expressions may also be cast to a mat2 type. A matrix expression may be
    typecast to any other matrix data type.

    Expressions with structure type may only be typecast to a structure of
    identical type, which has no effect. Typecast operators are not supported
    for array types.

    Note that the typecast operator takes only a single expression. Unlike
    constructors, it cannot be used to generate a vector, structure, or
    matrix from multiple inputs. For example,

      vec3 f = (vec3) (1.0, 2.0, 3.0);

    generates a three-component vector <f>.
    But all three components
    are set to 3.0, which is the scalar value of the expression "(1.0, 2.0,
    3.0)". The commas in that expression are sequence operators, not list
    delimiters.

    Additionally, typecast operators may also be used to cast values to a
    pointer type. In this case, the expression being typecast must be either
    a pointer (to any type) or a scalar of type intptr_t or uintptr_t.

      vec4 *v4ptr;
      intptr_t iptr;
      vec3 *v3ptr = (vec3 *) v4ptr;
      ivec2 *iv2ptr = (ivec2 *) iptr;

    Note that function call-style constructors are not supported for pointers.


    Add to the end of Section 8.3, Common Functions, p. 72

    (add support for pointer packing functions)

    Syntax:

      void *packPtr(uvec2 a);
      uvec2 unpackPtr(void *a);

    The function packPtr() returns a pointer to void by constructing a 64-bit
    void pointer from the two 32-bit components of an unsigned integer vector.
    The first vector component specifies the 32 least significant bits of the
    pointer; the second component specifies the 32 most significant bits.

    The function unpackPtr() returns a two-component unsigned integer vector
    built from a 64-bit void pointer.
    The first component of the vector
    consists of the 32 least significant bits of the pointer value; the second
    component consists of the 32 most significant bits.


    Modify Chapter 9, Shading Language Grammar, p.92

    (change comment in the grammar disallowing pointer dereferences)

    Change the sentence:

    // Grammar Note: No '*' or '&' unary ops. Pointers are not supported.

    to

    // Grammar Note: No '&' unary.


Additions to the AGL/EGL/GLX/WGL Specifications

    None

Errors

    INVALID_ENUM is generated by MakeBufferResidentNV if <access> is not
    READ_ONLY.

    INVALID_ENUM is generated by GetBufferParameterui64vNV if <pname> is
    not BUFFER_GPU_ADDRESS_NV.

    INVALID_OPERATION is generated by MakeBufferResidentNV,
    MakeBufferNonResidentNV, IsBufferResidentNV, and GetBufferParameterui64vNV
    if no buffer is bound to <target>.

    INVALID_OPERATION is generated by MakeBufferResidentNV if the buffer bound
    to <target> is already resident in the current GL context.

    INVALID_OPERATION is generated by MakeBufferNonResidentNV if the buffer
    bound to <target> is not resident in the current GL context.

    INVALID_OPERATION is generated by MakeNamedBufferResidentNV if <buffer> is
    already resident in the current GL context.

    INVALID_OPERATION is generated by MakeNamedBufferNonResidentNV if <buffer>
    is not resident in the current GL context.

    INVALID_OPERATION is generated by GetBufferParameterui64vNV or
    MakeBufferResidentNV if the buffer bound to <target> has no data store.

    INVALID_OPERATION is generated by GetNamedBufferParameterui64vNV or
    MakeNamedBufferResidentNV if <buffer> has no data store.
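
    The residency-related errors above can be summarized as a small state
    model. The C sketch below is a CPU-side illustration only and makes no
    GL calls: struct model_buffer and the model_* functions are
    hypothetical names, the numeric values mirror the standard GL tokens
    NO_ERROR, INVALID_ENUM, INVALID_OPERATION, and READ_ONLY, and the
    order in which the checks are made is an assumption, since the list
    above does not specify which error takes precedence.

```c
#include <stdbool.h>
#include <stddef.h>

enum {
    MODEL_NO_ERROR          = 0,
    MODEL_INVALID_ENUM      = 0x0500,
    MODEL_INVALID_OPERATION = 0x0502,
    MODEL_READ_ONLY         = 0x88B8
};

/* Hypothetical stand-in for the buffer object bound to <target>. */
struct model_buffer {
    bool has_data_store;
    bool resident;       /* residency in the current context */
};

/* Models the error checks listed for MakeBufferResidentNV.
 * <bound> is NULL when no buffer is bound to <target>. */
static unsigned model_make_buffer_resident(struct model_buffer *bound,
                                           unsigned access)
{
    if (access != MODEL_READ_ONLY)
        return MODEL_INVALID_ENUM;
    if (bound == NULL)
        return MODEL_INVALID_OPERATION;   /* no buffer bound */
    if (!bound->has_data_store)
        return MODEL_INVALID_OPERATION;   /* no data store */
    if (bound->resident)
        return MODEL_INVALID_OPERATION;   /* already resident */
    bound->resident = true;
    return MODEL_NO_ERROR;
}

/* Models the error checks listed for MakeBufferNonResidentNV. */
static unsigned model_make_buffer_non_resident(struct model_buffer *bound)
{
    if (bound == NULL || !bound->resident)
        return MODEL_INVALID_OPERATION;   /* not bound or not resident */
    bound->resident = false;
    return MODEL_NO_ERROR;
}
```

    In this model, making a buffer resident twice, or non-resident twice,
    reports INVALID_OPERATION on the second call.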

Examples

    (1) Layout of a complex structure using the rules from the new Section
    2.20.X added to the OpenGL spec:

      struct Example {
                      // bytes used             rules
        float a;      //   0-3
        vec2 b;       //   8-15                 1   // bumped to a multiple of 8
        vec3 c;       //  16-27                 1
        struct {
          int d;      //  32-35                 2   // bumped to a multiple of 8 (bvec2)
          bvec2 e;    //  40-47                 1
        } f;
        float g;      //  48-51
        float h[2];   //  52-55 (h[0])          5   // multiple of 4 (float) with no additional padding
                      //  56-59 (h[1])          6   // tightly packed
        mat2x3 i;     //  64-75 (i[0])
                      //  80-91 (i[1])          6   // bumped to a multiple of 16 (vec3)
        struct {
          uvec3 j;    //  96-107 (m[0].j)
          vec2 k;     // 112-119 (m[0].k)       1   // bumped to a multiple of 8 (vec2)
          float l[2]; // 120-123 (m[0].l[0])    1,5 // simply float aligned
                      // 124-127 (m[0].l[1])    6   // tightly packed
                      // 128-139 (m[1].j)
                      // 144-151 (m[1].k)
                      // 152-155 (m[1].l[0])
                      // 156-159 (m[1].l[1])
        } m[2];
      };
      // sizeof(Example) == 160

    (2) Replacing bindable_uniform with an array of pointers:

      #version 120
      #extension GL_NV_shader_buffer_load : require
      #extension GL_EXT_bindable_uniform : require

      in vec4 **ptr;
      in uvec2 whichbuf;

      void main() {
          gl_FrontColor = ptr[whichbuf.x][whichbuf.y];
          gl_Position = ftransform();
      }

    In the GL code, assuming the buffer object setup in the Overview:

      glBindAttribLocation(program, 8, "ptr");
      glBindAttribLocation(program, 9, "whichbuf");
      glLinkProgram(program);
      glBegin(...);
      glVertexAttribI2iEXT(8, (unsigned int)pointerBufferAddr,
                              (unsigned int)(pointerBufferAddr>>32));
      for (i = ...) {
          for (j = ...) {
              glVertexAttribI2iEXT(9, i, j);
              glVertex3f(...);
          }
      }
      glEnd();


New State

    Update Table 6.11, p. 349 (Buffer Object State)

    Get Value              Type  Get Command                Initial Value  Sec  Attribute
    ---------              ----  -----------                -------------  ---  ---------
    BUFFER_GPU_ADDRESS_NV  Z64+  GetBufferParameterui64vNV  0              2.9  none

    Update Table 6.46, p. 384 (Implementation Dependent Values)

    Get Value                     Type  Get Command        Minimum Value  Sec    Attribute
    ---------                     ----  -----------        -------------  ---    ---------
    MAX_SHADER_BUFFER_ADDRESS_NV  Z64+  GetIntegerui64vNV  0xFFFFFFFF     2.X.2  none

Dependencies on NV_gpu_program4:

    This extension is generally written against the NV_gpu_program4
    wording, program grammar, etc., but doesn't have specific
    dependencies on its functionality.


Issues

    1) Only buffer objects?

      RESOLVED: YES, for now. Buffer objects are unformatted memory and
      easily mapped to a "pointer"-style shading language.

    2) Should we allow writes?

      RESOLVED: NO, deferred to a later extension. Writes involve
      specifying many kinds of synchronization primitives. Writes are also
      a "side effect" which makes program execution "observable" in cases
      where it may not have otherwise been (e.g. early-Z can kill fragments
      before shading, or a post-transform cache may prevent vertex program
      execution).

    3) What happens if an invalid pointer is fetched?

      UNRESOLVED: Unpredictable results, including program termination?
      Make the driver trap the error and report it (still unpredictable
      results, but no program termination)? My preference would be to
      at least report the faulting address (roughly), whether it was
      a read or a write, and which shader stage faulted. I'd like to not
      terminate the program, but the app has to assume all their data
      stored in the GL is lost.

    4) What should this extension be named?

      RESOLVED: NV_shader_buffer_load. Rather than trying to choose an
      overly-general name and naming future extensions "GL_XXX2", let's
      name this according to the specific functionality it provides.

    5) What are the performance characteristics of buffer loads?

      RESOLVED: Likely somewhere between uniforms and texture fetches,
      but totally implementation-dependent. Uniforms still serve a purpose
      for "program locals". Buffer loads may have different caching
      behavior than either uniforms or texture fetches, but the expectation
      is that they will be cached reads of memory and all the common sense
      guidelines to try to maintain locality of reference apply.

    6) What does MakeBufferResidentNV do? Why not just have a
       MapBufferGPUNV?

      RESOLVED: Reserving virtual address space only requires knowing the
      size of the data store, so an explicit MapBufferGPU call isn't
      necessary. If all GPUs supported demand paging, a GPU address might
      be sufficient, but without that assumption MakeBufferResidentNV serves
      as a hint to the driver that it needs to page lock memory, download
      the buffer contents into GPU-accessible memory, or other similar
      preparation. MapBufferGPU would also imply that a different address
      may be returned each time it is mapped, which could be cumbersome
      for the application to handle.

    7) Is it an error to render while any resident buffer is mapped?

      RESOLVED: No. As the number of attachment points in the context grows,
      even the existing error check is falling out of favor.

    8) Does MapBuffer stall on pending use of a resident buffer?

      RESOLVED: No. The existing language is:

          "If the GL is able to map the buffer object's data store into the
          client's address space, MapBuffer returns the pointer value to
          the data store once all pending operations on that buffer have
          completed."

      However, since the implementation has no information about how the
      buffer is used, "all pending operations" amounts to a Finish.
      In terms of sharing across contexts/threads, ARB_vertex_buffer_object
      says:

          "How is synchronization enforced when buffer objects are shared by
          multiple OpenGL contexts?

          RESOLVED: It is generally the clients' responsibility to
          synchronize modifications made to shared buffer objects."

      So we shouldn't dictate any additional shared object synchronization.
      So the best we could do is a Finish, but it's not clear that this
      accomplishes anything for the application since they can just as
      easily call Finish. Or if they don't want synchronization, they can
      use MAP_UNSYNCHRONIZED_BIT. It seems the resolution to this is
      inconsequential as GL already provides the tools to achieve either
      behavior. Hence, don't bother stalling.

      However, if a buffer was previously resident and has since been made
      non-resident, the implementation should enforce the stalling
      behavior for those pending operations from before it was made
      non-resident.

    9) Given issue (8), what are some effective ways to load data into
       a buffer that is resident?

      RESOLVED: There are several possibilities:

      - BufferSubData.

      - The application may use Fences to track which parts of the buffer
        are actually in use and update the unused parts with CPU writes
        using MAP_UNSYNCHRONIZED_BIT. This is potentially error-prone, as
        described in ARB_copy_buffer.

      - CopyBufferSubData. ARB_copy_buffer describes a simple usage example
        for a single-threaded application. Since this extension is targeted
        at reducing the CPU bottleneck in the rendering thread, offloading
        some of the work to other threads may be useful.

        Example with a single Loading thread and Rendering thread:

          Loading thread:
            while (1) {
                WaitForEvent(something to do);

                NamedBufferData(tempBuffer, updateSize, NULL, STREAM_DRAW);
                ptr = MapNamedBuffer(tempBuffer, WRITE_ONLY);
                // fill ptr
                UnmapNamedBuffer(tempBuffer);
                // the buffer could have been filled via BufferData, if
                // that's more natural.

                // send tempBuffer name to Rendering thread
            }
          Rendering thread:
            foreach (obj in scene) {
                if (obj has changed) {
                    // get tempBuffer name from Loading thread

                    // copy from offset 0 of tempBuffer to objOffset of objBuffer
                    NamedCopyBufferSubData(tempBuffer, objBuffer, 0, objOffset, updateSize);
                }
                Draw(obj);
            }

        If we further desire to offload the data transfer to another
        thread, and the implementation supports concurrent data transfers
        in one context/thread while rendering in another context/thread,
        this may also be accomplished as follows:

          Loading thread:
            while (1) {
                WaitForEvent(something to do);

                NamedBufferData(sysBuffer, updateSize, NULL, STREAM_DRAW);
                ptr = MapNamedBuffer(sysBuffer, WRITE_ONLY);
                // fill ptr
                UnmapNamedBuffer(sysBuffer);

                NamedBufferData(vidBuffer, updateSize, NULL, STREAM_COPY);
                // This is a sysmem->vidmem blit.
                NamedCopyBufferSubData(sysBuffer, vidBuffer, 0, 0, updateSize);
                SetFence(fenceId, ALL_COMPLETED);

                // send vidBuffer name and fenceId to Rendering thread

                // This could have been a BufferSubData directly into
                // vidBuffer, if that's more natural.
            }
          Rendering thread:
            foreach (obj in scene) {
                if (obj has changed) {
                    // get vidBuffer name and fenceId from Loading thread

                    // note: there aren't any sharable fences currently,
                    // actually need to ask the loading thread when it
                    // has finished.
                    FinishFence(fenceId);

                    // This is hopefully a fast vidmem->vidmem blit.
                    NamedCopyBufferSubData(vidBuffer, objBuffer, 0, objOffset, updateSize);
                }
                Draw(obj);
            }

      In both of these examples, the point at which the data is written to
      the resident buffer's data store is clearly specified in order
      with rendering commands. This resolves a whole class of
      synchronization bugs (Write After Read hazard) that
      MAP_UNSYNCHRONIZED_BIT is prone to.

    10) What happens if BufferData is called on a buffer that is resident?

      RESOLVED: BufferData is specified to "delete the existing data store",
      so the GPU address of that data should become invalid. The buffer is
      therefore made non-resident in the current context.

    11) Should residency be a property of the buffer object, or should
        a buffer be "made resident to a context"?

      RESOLVED: Made resident to a context. If a shared buffer is used in
      two threads/contexts, it may be difficult for the application to know
      when the residency state actually changes on the shared object
      particularly if there is a large latency between commands being
      submitted on the client and processed on the server. Allowing the
      buffer to be made resident to each context individually allows the
      state to be reliably toggled in-order in each command stream. This
      also allows MakeBufferNonResident to serve as indication to the GL
      that the buffer is no longer in use in each command stream.

      This leads to an unfortunate orphaning issue. For example, if the
      buffer is resident in context A and then deleted in context B, how
      can the app make it non-resident in context A? Given the name-based
      object model, it is impossible.
      It would be complex from an implementation point of view for
      DeleteBuffers (or BufferData) to either make it non-resident or
      throw an error if it is resident in some other context.

      An ideal solution would be a (separate) extension that allows the
      application to increment the refcount on the object and to decrement
      the refcount without necessarily deleting the object's name. Until
      such an extension exists, the unsatisfying proposed resolution is that
      a buffer can be "stuck" resident until the context is deleted. Note
      that DeleteBuffers should make the buffer non-resident in the context
      that does the delete, so this problem only applies to rare
      multi-context corner cases.

    12) Is there any value in requiring an "immutable structure" bit of
        state to be set in order to query the address?

      RESOLVED: NO. Given that the BufferData behavior is fairly
      straightforward to specify and implement, it's not clear that this
      would be useful.

    13) What should the program syntax look like?

      RESOLVED: Support 1-, 2-, 4-vec fetches of float/int/uint types, as
      well as 8- and 16-bit int/uint fetches via a new LOAD instruction
      with a slew of suffixes. Handling 8/16bit sizes will be useful for
      high-level languages compiling to the assembly.
      Addresses are required to be a multiple of the size of the data, as
      some implementations may require this.

      Other options include a more x86-style pointer dereference
      ("MOV R0, DWORD PTR[R1];") or a complement to program.local
      ("MOV R0, program.global[R1];") but neither of these provides the
      simple granularity of the explicit type suffixes, and a new
      instruction is convenient in terms of implementation and not muddling
      the clean definition of MOV.

    14) How does the GL know to invalidate caches when data has changed?

      RESOLVED: Any entry points that can write to buffer objects should
      trigger the necessary invalidation. A new entry point may only be
      necessary once there is a way to write to a buffer by GPU address.

    15) Does this extension require 64bit register/operation support in
        programs and shaders?

      RESOLVED: NO. At the API level, GPU addresses are always 64bit values
      and when they are stored in uniforms, attribs, parameters, etc. they
      should always be stored at full precision. However, if programs and
      shaders don't support 64bit registers/operations via another
      programmability extension, then they will need to use only 32 bits.
      On such implementations, the usable address space is therefore limited
      to 4GB.
      Such a limit should be reflected in the value of
      MAX_SHADER_BUFFER_ADDRESS_NV.

      It is expected that GLSL shaders will be compiled in such a way as to
      generate 64bit pointers on implementations that support it and 32bit
      pointers on implementations that don't. So GLSL shaders written against
      a 32bit implementation can be expected to be forward-compatible when
      run against a 64bit implementation. (u)intptr_t types are provided to
      ease this compatibility.

      Built-in functions are provided to convert pointers to and from a pair
      of integers. These can be used to pass pointers as two components of a
      generic attrib, to construct a pointer from an RGUI32 texture fetch,
      or to write a pointer to a fragment shader output.

    16) What assumption can applications make about the alignment of
        addresses returned by GetBufferParameterui64vNV?

      RESOLVED: All buffers will begin at an address that is a multiple of
      16 bytes.

    17) How can the application guarantee that the layout of a structure
        on the CPU matches the layout used by the GLSL compiler?

      RESOLVED: Provide a standard set of packing rules designed around
      naturally aligning simple types.
      This spec will define pointer fetches in GLSL to use these rules,
      but does not explicitly guarantee that other extensions (like
      EXT_bindable_uniform) will use the same packing rules for their
      buffer object fetches. These packing rules are different from the
      ARB_uniform_buffer_object rules - in particular, these rules do not
      require vec4 padding of the array stride.

    18) Is the address space per-context, per-share-group, or global?

      RESOLVED: It is per-share-group. Using addresses from one share group
      in another share group will cause undefined results.

    19) Is there risk of using invalid pointers for "killed" fragments,
        fragments that don't take a certain branch of an "if" block, or
        fragments whose shader is conceptually never executed due to pixel
        ownership, stipple, etc.?

      RESOLVED: NO. OpenGL implementations sometimes run fragment programs
      on "helper" pixels that have no coverage, or continue to run fragment
      programs on killed pixels in order to be able to compute sane partial
      derivatives for fragment program instructions (DDX, DDY) or automatic
      level-of-detail calculations for texturing. In this approach,
      derivatives are approximated by computing the difference in a quantity
      computed for a given fragment at (x,y) and a fragment at a neighboring
      pixel.
      When a fragment program is executed on a "helper" pixel or
      killed pixel, global loads may not be executed in order to prevent
      spurious faults. Helper pixels aren't explicitly mentioned in the spec
      body; instead, partial derivatives are obtained by magic.

      If a fragment program contains a KIL instruction, compilers may not
      reorder code such that a LOAD instruction is executed before a KIL
      instruction that logically precedes it in flow control. Once a
      fragment is killed, subsequent loads should never be executed if they
      could cause any observable side effects.

      As a result, if a shader uses instructions that explicitly or
      implicitly do LOD calculations dependent on the result of a global
      load, those instructions will have undefined results.

    20) How are structures and arrays stored in buffer object memory?

      RESOLVED: Individual structure members and array elements are stored
      "packed" in memory, subject to an alignment requirement. Structure
      members are stored according to the order of declaration. Array elements
      are stored consecutively by element number. Unreferenced structure
      members or array elements are never eliminated.

      The alignment requirement of individual structure members or array
      elements is usually equal to the size of the item.
      For the purposes of this requirement, vector types are treated
      atomically (i.e., a "vec4" with 32-bit floats will be 16-byte
      aligned). One exception is that the required alignment of
      three-component vectors is the same as the required alignment of a
      four-component vector of the same base type.

    21) How do the memory layout rules relate to the similar layout rules
        specified for the uniform buffer object (UBO) feature incorporated in
        OpenGL 3.1?

      RESOLVED: This extension was completed prior to OpenGL 3.1, but the
      layout rules for this extension and for UBO were developed roughly
      concurrently. The layout rules here are nearly identical to those for the
      "std140" layout for uniform blocks. The main difference here is that
      "std140" requires arrays of small types (e.g., "float") to be padded out
      to vec4 alignment (16B), while this extension does not.

      Note that this extension does NOT allow shaders to use the layout()
      qualifier added by GLSL 1.40 to achieve fine-grained control of structure
      or array layout using pointers. A subsequent extension could provide this
      capability.

    22) Should we provide a mechanism for tighter packing of an array of
        three-component vectors?

      RESOLVED: This could be desirable, but it won't be provided in this
      extension.
      A subsequent extension could support alternate layouts by
      allowing shaders to use the GLSL 1.40 layout() modifier to qualify
      pointer types.

      If tight packing of vec3's is strongly required, a three component array
      element could be constructed using three single component loads or by
      selecting/swizzling components of one or more larger loads. The former
      technique could be done using GLSL by replacing:

        vec3 *pointer;
        vec3 elementN;
        int n;
        elementN = pointer[n];

      with

        float *pointer;
        vec3 elementN;
        int n;
        elementN = vec3(pointer[n*3], pointer[n*3+1], pointer[n*3+2]);


Revision History

    Rev.  Date      Author     Changes
    ----  --------  ---------  -----------------------------------------
    8     08/06/10  istewart   Modify behavior of named buffer functions
                               to match those of EXT_direct_state_access.
                               Add INVALID_OPERATION error to
                               MakeBufferResidentNV and GetBufferParameterui64vNV
                               if the buffer object has no data store.

    7     06/22/10  pbrown     Document INVALID_OPERATION errors on
                               residency management and query APIs when a
                               non-existent buffer object is referenced,
                               when trying to make an already resident buffer
                               resident, or when trying to make an already
                               non-resident buffer non-resident.

    6     09/21/09  groth      Fix non-conformant DSA function names.

    5     09/10/09  Jon Leech  Add 'const' to type of Uniformui64vNV and
                               ProgramUniformui64vNV 'count' argument.

    4     09/09/09  mjk        Fix typos

    3     08/21/09  pbrown     Add explicit spec language describing the
                               typecast operator implemented here. The
                               previous spec language said it was allowed
                               but didn't say what it did.

    2     08/05/09  pbrown     Update section describing memory layout of
                               variables pointed to; moved to the core
                               specification as with OpenGL 3.1's uniform
                               buffer layout. Added a few issues on memory
                               layout. Explicitly documented the set of
                               operations and implicit conversions allowed
                               on pointers.

    1               jbolz      Internal revisions.
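
Non-normative illustration of the layout rules discussed in issue (20)

    The following is a minimal C sketch, not part of this extension,
    of how an application might compute member offsets on the CPU under
    the alignment rules summarized in issue (20). The names MemberLayout,
    align_up, vector_align and layout_members are hypothetical. The sketch
    assumes that a three-component vector occupies only three components
    of storage, so that a following scalar may land in the fourth slot;
    that assumption should be checked against the spec body.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical descriptor for one structure member: its size and its
 * required alignment, both in bytes. */
typedef struct {
    size_t size;
    size_t align;
} MemberLayout;

/* Round offset up to the next multiple of align. */
static size_t align_up(size_t offset, size_t align)
{
    assert(align != 0);
    return (offset + align - 1) / align * align;
}

/* Alignment of a vector of `components` elements of `elem_size` bytes.
 * Vectors are treated atomically, so the alignment equals the vector
 * size, except that a three-component vector aligns like a
 * four-component vector of the same base type. */
static size_t vector_align(size_t elem_size, size_t components)
{
    return elem_size * (components == 3 ? 4 : components);
}

/* Assign offsets to members in declaration order. Writes one offset per
 * member into offsets[] and returns the offset just past the last member. */
static size_t layout_members(const MemberLayout *members, size_t count,
                             size_t *offsets)
{
    size_t cursor = 0;
    for (size_t i = 0; i < count; ++i) {
        cursor = align_up(cursor, members[i].align);
        offsets[i] = cursor;
        cursor += members[i].size;
    }
    return cursor;
}
```

    Under these assumptions, a structure declared as
    { float a; vec3 b; float c; vec2 d; } with 32-bit floats would place
    its members at byte offsets 0, 16, 28 and 32.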