Name

    NV_texture_barrier

Name Strings

    GL_NV_texture_barrier

Contact

    Jeff Bolz, NVIDIA Corporation (jbolz 'at' nvidia.com)

Contributors

    Mark Kilgard, NVIDIA
    Shazia Rahman, NVIDIA

Status

    Shipping (August 2009, Release 190)

Version

    Last Modified Date:  September 29, 2016
    NVIDIA Revision:     4

Number

    OpenGL Extension #381
    OpenGL ES Extension #271

Dependencies

    This extension is written against the OpenGL 3.0 specification.

    Also written based on the wording of the OpenGL ES 3.2 specification.

Overview

    This extension relaxes the restrictions on rendering to a currently
    bound texture and provides a mechanism to avoid read-after-write
    hazards.

New Procedures and Functions

    void TextureBarrierNV(void);

New Tokens

    None.

Additions to Chapter 2 of the OpenGL 3.0 Specification (OpenGL Operation)

    None.

Additions to Chapter 3 of the OpenGL 3.0 Specification (Rasterization)

    None.

Additions to Chapter 4 of the OpenGL 3.0 Specification (Per-Fragment
Operations and the Frame Buffer)

    Modify Section 4.4.3, Rendering When an Image of a Bound Texture Object
    is Also Attached to the Framebuffer, p. 288

    (Replace the complicated set of conditions with the following)

    Specifically, the values of rendered fragments are undefined if any
    shader stage fetches texels and the same texels are written via fragment
    shader outputs, even if the reads and writes are not in the same Draw
    call, unless any of the following exceptions apply:

    - The reads and writes are from/to disjoint sets of texels (after
      accounting for texture filtering rules).

    - There is only a single read and write of each texel, and the read is in
      the fragment shader invocation that writes the same texel (e.g. using
      "texelFetch2D(sampler, ivec2(gl_FragCoord.xy), 0);").

    - If a texel has been written, then in order to safely read the result
      a texel fetch must be in a subsequent Draw separated by the command

        void TextureBarrierNV(void);

      TextureBarrierNV() will guarantee that writes have completed and caches
      have been invalidated before subsequent Draws are executed.

Additions to Chapter 5 of the OpenGL 3.0 Specification (Special Functions)

    None.

Additions to Chapter 6 of the OpenGL 3.0 Specification (State and
State Requests)

    None.

Additions to the AGL/GLX/WGL Specifications

    None.

Additions to Chapter 9 of the OpenGL ES 3.2 Specification (Framebuffers
and Framebuffer Objects)

    Modify section 9.3.1, Rendering Feedback Loops:

    (Replace the complicated 2nd and 3rd paragraphs
    "Specifically... ...only be executed conditionally."
    with the following)

    Specifically, the values of rendered fragments are undefined if any
    shader stage fetches texels and the same texels are written via fragment
    shader outputs, even if the reads and writes are not in the same Draw
    call, unless any of the following exceptions apply:

    - The reads and writes are from/to disjoint sets of texels (after
      accounting for texture filtering rules).

    - There is only a single read and write of each texel, and the read is in
      the fragment shader invocation that writes the same texel (e.g. using
      "texelFetch2D(sampler, ivec2(gl_FragCoord.xy), 0);").

    - If a texel has been written, then in order to safely read the result
      a texel fetch must be in a subsequent Draw separated by the command

        void TextureBarrierNV(void);

      TextureBarrierNV() will guarantee that writes have completed and caches
      have been invalidated before subsequent Draws are executed.

Errors

New State

    None.

New Implementation Dependent State

    None.
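
Example: Modeling the Feedback Loop Rule (non-normative)

    The exceptions in the modified feedback-loop language can be modeled on
    the CPU as a per-texel "written since the last barrier" flag. The C
    sketch below is illustrative only and is not part of this extension:
    HazardTracker, Rect, and the tracker_* functions are hypothetical names,
    the 8x8 texture size is arbitrary, and tracker_barrier() merely stands in
    for a TextureBarrierNV() call. It covers the disjoint-texel and barrier
    exceptions and ignores the same-invocation exception.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TEX_W 8
#define TEX_H 8

/* Half-open texel rectangle: [x0, x1) x [y0, y1). Hypothetical helper type. */
typedef struct { int x0, y0, x1, y1; } Rect;

/* dirty[y][x] is true when texel (x, y) was written since the last barrier. */
typedef struct { bool dirty[TEX_H][TEX_W]; } HazardTracker;

/* Rectangles must lie inside the modeled texture. */
static void check_rect(Rect r) {
    assert(0 <= r.x0 && r.x0 <= r.x1 && r.x1 <= TEX_W);
    assert(0 <= r.y0 && r.y0 <= r.y1 && r.y1 <= TEX_H);
}

/* Stands in for TextureBarrierNV(): earlier writes become safe to fetch.
 * Also used to initialize a tracker. */
static void tracker_barrier(HazardTracker *t) {
    memset(t->dirty, 0, sizeof t->dirty);
}

/* Record the texels written by a Draw's fragment shader outputs. */
static void tracker_record_writes(HazardTracker *t, Rect r) {
    check_rect(r);
    for (int y = r.y0; y < r.y1; ++y)
        for (int x = r.x0; x < r.x1; ++x)
            t->dirty[y][x] = true;
}

/* True when a subsequent Draw may fetch every texel in r (the footprint
 * after accounting for filtering) without reading an unsynchronized write. */
static bool tracker_reads_are_defined(const HazardTracker *t, Rect r) {
    check_rect(r);
    for (int y = r.y0; y < r.y1; ++y)
        for (int x = r.x0; x < r.x1; ++x)
            if (t->dirty[y][x])
                return false;
    return true;
}
```

    In this model, after writes to the left half of the texture, fetches
    from the right half are reported as defined, fetches from the left half
    are not, and the same left-half fetches become defined once
    tracker_barrier() has been called.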

GLX Protocol

    The following rendering command is sent to the server as
    a glXRender request:

        TextureBarrierNV

            2      4               rendering command length
            2      4348            rendering command opcode

Issues

    (1) What algorithms can take advantage of TextureBarrierNV?

    This can be used to accomplish a limited form of programmable blending
    for applications where a single Draw call does not self-intersect, by
    binding the same texture as both render target and texture and applying
    blending operations in the fragment shader. Additionally, bounding-box
    optimizations can be used to minimize the number of TextureBarrierNV
    calls between Draws. For example:

        dirtybbox.empty();
        foreach (object in scene) {
            if (dirtybbox.intersects(object.bbox())) {
                TextureBarrierNV();
                dirtybbox.empty();
            }
            object.draw();
            dirtybbox = bound(dirtybbox, object.bbox());
        }

    Another application is to render-to-texture algorithms that ping-pong
    between two textures, using the result of one rendering pass as the input
    to the next.
    Existing mechanisms require expensive FBO Binds, DrawBuffer
    changes, or FBO attachment changes to safely swap the render target and
    texture. With texture barriers, layered geometry shader rendering, and
    texture arrays, an application can very cheaply ping-pong between two
    layers of a single texture. For example:

        X = 0;
        // Bind the array texture to a texture unit
        // Attach the array texture to an FBO using FramebufferTextureARB
        while (!done) {
            // Stuff X in a constant, vertex attrib, etc.
            Draw -
                Texturing from layer X;
                Writing gl_Layer = 1 - X in the geometry shader;

            TextureBarrierNV();
            X = 1 - X;
        }

    However, be warned that this requires geometry shaders and hence adds
    the overhead that all geometry must pass through an additional program
    stage, so an application using large amounts of geometry could become
    geometry-limited or more shader-limited.

    (2) Does this support OpenGL ES?

    RESOLVED: Yes. ES specification language has been added, written
    against the OpenGL ES 3.2 specification. The added language is
    identical to the regular OpenGL language.

    As this specification has no dependencies other than assuming
    framebuffer objects, this extension could support any version of ES
    from 2.0 up. However, the texelFetch operation for fetching from a
    texture is introduced by OpenGL ES 3.0's GLSL or the NV_gpu_shader4
    extension.

Revision History

    Rev.    Date      Author    Changes
    ----  --------   --------  -----------------------------------------
     1               jbolz     Initial revision.
     2               mjk       Assign number.
     3               srahman   Add GLX protocol specification.
     4    9/29/16    mjk       Add ES support.
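
Example: Batching Draws Between Barriers (non-normative)

    The bounding-box pseudocode in Issue (1) can be written out as a small C
    sketch. Everything here is an assumption made for illustration: BBox,
    Object, and the bbox_* and draw_scene() names are hypothetical,
    texture_barrier_stub() stands in for TextureBarrierNV(), and
    draw_object_stub() stands in for issuing a Draw. Boxes that only touch
    along an edge are treated as non-overlapping, which is a choice of this
    sketch rather than something the extension states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Axis-aligned box in window coordinates; "empty" marks the null box. */
typedef struct { float x0, y0, x1, y1; bool empty; } BBox;

/* Hypothetical scene object: only its screen-space bounds matter here. */
typedef struct { BBox bounds; } Object;

static BBox bbox_make(float x0, float y0, float x1, float y1) {
    BBox b = { x0, y0, x1, y1, false };
    return b;
}

static BBox bbox_empty(void) {
    BBox b = { 0.0f, 0.0f, 0.0f, 0.0f, true };
    return b;
}

/* Strict overlap test; an empty box intersects nothing. */
static bool bbox_intersects(BBox a, BBox b) {
    if (a.empty || b.empty)
        return false;
    return a.x0 < b.x1 && b.x0 < a.x1 && a.y0 < b.y1 && b.y0 < a.y1;
}

/* Smallest box containing both inputs (the pseudocode's bound()). */
static BBox bbox_union(BBox a, BBox b) {
    if (a.empty) return b;
    if (b.empty) return a;
    return bbox_make(a.x0 < b.x0 ? a.x0 : b.x0, a.y0 < b.y0 ? a.y0 : b.y0,
                     a.x1 > b.x1 ? a.x1 : b.x1, a.y1 > b.y1 ? a.y1 : b.y1);
}

/* Stubs: a real application would call TextureBarrierNV() and issue a Draw. */
static void texture_barrier_stub(void) {}
static void draw_object_stub(const Object *object) { (void)object; }

/* Draws the objects in order and returns how many barriers were issued. */
static int draw_scene(const Object *objects, size_t count) {
    assert(objects != NULL || count == 0);
    BBox dirty = bbox_empty();
    int barriers = 0;
    for (size_t i = 0; i < count; ++i) {
        /* A barrier is needed only if this Draw may touch texels that
         * were written since the previous barrier. */
        if (bbox_intersects(dirty, objects[i].bounds)) {
            texture_barrier_stub();
            ++barriers;
            dirty = bbox_empty();
        }
        draw_object_stub(&objects[i]);
        dirty = bbox_union(dirty, objects[i].bounds);
    }
    return barriers;
}
```

    For four objects with bounds (0,0)-(2,2), (3,0)-(5,2), (1,1)-(4,3), and
    (6,6)-(7,7), draw_scene() issues a single barrier, before the third
    object. The test is conservative: because only one accumulated box is
    kept, an object lying in the gap between two earlier, non-overlapping
    objects still triggers a barrier.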