Name

    ARB_shader_ballot

Name Strings

    GL_ARB_shader_ballot

Contact

    Timothy Lottes (timothy.lottes 'at' amd.com)

Contributors

    Timothy Lottes, AMD
    Graham Sellers, AMD
    Daniel Rakos, AMD
    Jeannot Breton, NVIDIA
    Pat Brown, NVIDIA
    Eric Werness, NVIDIA
    Mark Kilgard, NVIDIA
    Jeff Bolz, NVIDIA

Notice

    Copyright (c) 2015 The Khronos Group Inc. Copyright terms at
        http://www.khronos.org/registry/speccopyright.html

Specification Update Policy

    Khronos-approved extension specifications are updated in response to
    issues and bugs prioritized by the Khronos OpenGL Working Group. For
    extensions which have been promoted to a core Specification, fixes will
    first appear in the latest version of that core Specification, and will
    eventually be backported to the extension document. This policy is
    described in more detail at
        https://www.khronos.org/registry/OpenGL/docs/update_policy.php

Status

    Complete. Approved by the ARB on June 26, 2015.
    Ratified by the Khronos Board of Promoters on August 7, 2015.

Version

    Last Modified Date: 03/18/2017
    Revision: 8

Number

    ARB Extension #183

Dependencies

    This extension is written against Revision 5 of version 4.50 of the
    OpenGL Shading Language Specification, dated January 30, 2015.

    This extension requires GL_ARB_gpu_shader_int64.

Overview

    This extension provides the ability for a group of invocations which
    execute in lockstep to do limited forms of cross-invocation communication
    via a group broadcast of an invocation value, or broadcast of a bitarray
    representing a predicate value from each invocation in the group.

New Procedures and Functions

    None.

New Tokens

    None.

IP Status

    None.

Modifications to the OpenGL Shading Language Specification, Version 4.50

    Including the following line in a shader can be used to control the
    language features described in this extension:

        #extension GL_ARB_shader_ballot : <behavior>

    where <behavior> is as specified in section 3.3.

    New preprocessor #defines are added to the OpenGL Shading Language:

        #define GL_ARB_shader_ballot 1

Additions to Chapter 7 of the OpenGL Shading Language Specification
(Built-in Variables)

    Modify Section 7.4, Built-In Uniform State, p. 133

    (Add to the list of built-in uniform variable declarations)

        uniform uint  gl_SubGroupSizeARB;

    (Add this paragraph at the end of this section)

    A sub-group is a collection of invocations which execute in lockstep.
    The variable <gl_SubGroupSizeARB> is the maximum number of invocations
    in a sub-group. The maximum <gl_SubGroupSizeARB> supported in this
    extension is 64.

    Modify Section 7.1, Built-In Language Variables, p. 110

    (Add to the list of built-in variables for the compute, vertex, geometry,
    tessellation control, tessellation evaluation and fragment languages)

        in uint      gl_SubGroupInvocationARB;
        in uint64_t  gl_SubGroupEqMaskARB;
        in uint64_t  gl_SubGroupGeMaskARB;
        in uint64_t  gl_SubGroupGtMaskARB;
        in uint64_t  gl_SubGroupLeMaskARB;
        in uint64_t  gl_SubGroupLtMaskARB;

    (Add these paragraphs at the end of this section)

    The variable <gl_SubGroupInvocationARB> holds the index of the invocation
    within the sub-group. This variable is in the range 0 to
    <gl_SubGroupSizeARB>-1, where <gl_SubGroupSizeARB> is the total number of
    invocations in a sub-group.
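
    As an illustration only (not part of the extension), the mask built-ins
    declared above can be modeled on the host. The sketch below computes the
    value each gl_SubGroup??MaskARB variable would hold for a given
    gl_SubGroupInvocationARB, following the comparison sense in the table of
    this section (bit index compared against the invocation index). The type
    SubGroupMasks and the function subgroup_masks() are hypothetical names,
    and the model assumes that bits above the sub-group size are not clipped,
    which this specification does not state.

```c
#include <stdint.h>

/* Hypothetical host-side model, not part of the extension: the value of each
 * gl_SubGroup??MaskARB built-in for an invocation whose
 * gl_SubGroupInvocationARB equals `id` (assumed to be in the range 0..63). */
typedef struct {
    uint64_t eq, ge, gt, le, lt;
} SubGroupMasks;

static SubGroupMasks subgroup_masks(unsigned id)
{
    SubGroupMasks m;
    uint64_t bit = (uint64_t)1 << id;  /* only bit `id` is set      */

    m.eq = bit;                        /* bit index == id           */
    m.lt = bit - 1;                    /* bit index <  id           */
    m.le = m.lt | bit;                 /* bit index <= id           */
    m.gt = ~m.le;                      /* bit index >  id           */
    m.ge = ~m.lt;                      /* bit index >= id           */
    return m;
}
```

    For example, subgroup_masks(3) yields lt = 0x7 and le = 0xF, so "LtMask"
    selects the invocations with a lower index than the current one.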

    The <gl_SubGroup??MaskARB> variables provide a bitmask for all invocations,
    with one bit per invocation starting with the least significant bit,
    according to the following table:

        variable               equation for bit values
        --------------------   ------------------------------------
        gl_SubGroupEqMaskARB   bit index == gl_SubGroupInvocationARB
        gl_SubGroupGeMaskARB   bit index >= gl_SubGroupInvocationARB
        gl_SubGroupGtMaskARB   bit index >  gl_SubGroupInvocationARB
        gl_SubGroupLeMaskARB   bit index <= gl_SubGroupInvocationARB
        gl_SubGroupLtMaskARB   bit index <  gl_SubGroupInvocationARB

Additions to Chapter 8 of the OpenGL Shading Language Specification
(Built-in Functions)

    Add Section 8.18, Shader Invocation Group Functions

    Syntax:

        uint64_t ballotARB(bool value);

    The function ballotARB() returns a bitfield containing the result of
    evaluating the expression <value> in all active invocations in the
    sub-group. An active invocation is one that is executing the ballotARB()
    call. The sub-group may have inactive invocations, for example due to
    exit of the shader or divergent branching. Sub-groups of up to 64
    invocations may be represented by the return value of ballotARB().
    Bits for each invocation are packed in least significant bit ordering.
    If <value> evaluates to true for an active invocation then the
    corresponding bit is set to one in the result, otherwise it is zero.
    Bits corresponding to invocations that are not active or that do not
    exist in the sub-group (because, for example, they are at bit positions
    beyond the sub-group size) are set to zero. The following trivial
    assumptions can be made:

      * ballotARB(true) returns a bitfield where the corresponding bits are
        set for all active invocations in the sub-group.

      * ballotARB(false) returns zero.

    Syntax:

        genType  readInvocationARB(genType value, uint invocationIndex);
        genIType readInvocationARB(genIType value, uint invocationIndex);
        genUType readInvocationARB(genUType value, uint invocationIndex);

        genType  readFirstInvocationARB(genType value);
        genIType readFirstInvocationARB(genIType value);
        genUType readFirstInvocationARB(genUType value);

    The function readInvocationARB() returns the <value> from a given
    <invocationIndex> to all active invocations in the sub-group.
    The <invocationIndex> must be the same for all active invocations
    in the sub-group, otherwise results are undefined.
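
    The rules above for ballotARB() and readInvocationARB() can be sketched
    as a host-side model. This is an illustration under stated assumptions,
    not an implementation of the extension: model_ballot(),
    model_read_invocation() and SUBGROUP_MAX are hypothetical names, `active`
    carries one bit per invocation executing the call, and `size` stands in
    for gl_SubGroupSizeARB. The model reports failure for the cases this
    specification leaves undefined (active invocations disagreeing on
    <invocationIndex>) and, as an additional assumption, for an index outside
    the sub-group.

```c
#include <stdbool.h>
#include <stdint.h>

#define SUBGROUP_MAX 64  /* largest gl_SubGroupSizeARB allowed by this spec */

/* Hypothetical model of ballotARB(): value[i] is the predicate evaluated by
 * invocation i. Inactive or non-existent invocations contribute a zero bit. */
static uint64_t model_ballot(uint64_t active, const bool value[SUBGROUP_MAX],
                             unsigned size)
{
    uint64_t result = 0;
    for (unsigned i = 0; i < size && i < SUBGROUP_MAX; ++i) {
        bool is_active = ((active >> i) & 1u) != 0;
        if (is_active && value[i])
            result |= (uint64_t)1 << i;  /* least significant bit first */
    }
    return result;
}

/* Hypothetical model of readInvocationARB() for a uint value. index[i] is the
 * invocationIndex passed by invocation i. On success, *out receives the value
 * broadcast to every active invocation. Returns false when no invocation is
 * active, when active invocations pass different indices (undefined in the
 * spec), or when the index is out of range (an assumption of this model). */
static bool model_read_invocation(uint64_t active,
                                  const uint32_t value[SUBGROUP_MAX],
                                  const unsigned index[SUBGROUP_MAX],
                                  unsigned size, uint32_t *out)
{
    bool have_index = false;
    unsigned common = 0;

    for (unsigned i = 0; i < size && i < SUBGROUP_MAX; ++i) {
        if (((active >> i) & 1u) == 0)
            continue;                    /* inactive: its index is ignored */
        if (!have_index) {
            common = index[i];
            have_index = true;
        } else if (index[i] != common) {
            return false;                /* non-uniform index */
        }
    }
    if (!have_index || common >= size || common >= SUBGROUP_MAX)
        return false;

    *out = value[common];
    return true;
}
```

    With four active invocations (active = 0xF) and predicates that are true
    only for invocations 0 and 2, model_ballot() returns 0x5; passing a
    predicate that is true everywhere returns the active mask itself, which
    matches the ballotARB(true) assumption listed above.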

    The function readFirstInvocationARB() returns the <value> from the first
    active invocation to all active invocations in the sub-group.

Issues

    1) How are the values of gl_SubGroup??MaskARB defined?

    RESOLVED. Earlier versions of this specification defined a bitmask
    such as "LtMask" ("less than mask") as having bits set if
    "gl_SubGroupInvocationARB < bit index". However, this was reversed
    from the definition in GL_NV_shader_thread_group that these built-ins
    were derived from, and also mismatched a recent Vulkan/SPIR-V extension.

    Fortunately, all known implementations of this extension had implemented
    the "wrong" behavior (matching the sense of the original built-ins in
    GL_NV_shader_thread_group), so the best thing to do is change the
    definition in the spec.

Revision History

    Rev  Date        Author    Changes
    ---  ----------  --------  ---------------------------------------------
    8    03/18/2017  jbolz     Reversed the sense of the comparison in the
                               definition of gl_SubGroup??MaskARB.
    7    08/25/2015  nhenning  Add ARB suffix on documentation for
                               readInvocation and readFirstInvocation
                               functions.
    6    07/31/2015  pdaniell  Add ARB suffix on the readInvocation and
                               readFirstInvocation functions.
    5    07/30/2015  pdaniell  Update the function definition syntax to use
                               our standard gen*Type conventions.
    4    06/23/2015  tlottes   More precise spec language.
    3    06/22/2015  tlottes   Deferred GPU processor another spec.
                               Cleaned up spec language.
    2    04/20/2015  tlottes   Updated spec language.
    1    03/09/2015  tlottes   Initial revision based on AMD_gcn_shader and
                               NV_shader_thread_group.