// Copyright 2017-2021 The Khronos Group, Inc.
//
// SPDX-License-Identifier: CC-BY-4.0

[appendix]
[[memory-model]]
= Memory Model


[[memory-model-agent]]
== Agent

_Operation_ is a general term for any task that is executed on the system.

[NOTE]
.Note
====
An operation is by definition something that is executed.
Thus if an instruction is skipped due to control flow, it does not
constitute an operation.
====

Each operation is executed by a particular _agent_.
Possible agents include each shader invocation, each host thread, and each
fixed-function stage of the pipeline.


[[memory-model-memory-location]]
== Memory Location

A _memory location_ identifies unique storage for 8 bits of data.
Memory operations access a _set of memory locations_ consisting of one or
more memory locations at a time, e.g. an operation accessing a 32-bit
integer in memory would read/write a set of four memory locations.
Memory operations that access whole aggregates may: access any padding bytes
between elements or members, but no padding bytes at the end of the
aggregate.
Two sets of memory locations _overlap_ if the intersection of their sets of
memory locations is non-empty.
A memory operation must: not affect memory at a memory location not within
its set of memory locations.

Memory locations for buffers and images are explicitly allocated in
slink:VkDeviceMemory objects, and are implicitly allocated for SPIR-V
variables in each shader invocation.

ifdef::VK_KHR_workgroup_memory_explicit_layout[]
Variables with code:Workgroup storage class that point to a block-decorated
type share a set of memory locations.
endif::VK_KHR_workgroup_memory_explicit_layout[]


[[memory-model-allocation]]
== Allocation

The values stored in newly allocated memory locations are determined by a
SPIR-V variable's initializer, if present, or else are undefined:.
At the time an allocation is created there have been no
<<memory-model-memory-operation,memory operations>> to any of its memory
locations.
The initialization is not considered to be a memory operation.

[NOTE]
.Note
====
For tessellation control shader output variables, a consequence of
initialization not being considered a memory operation is that some
implementations may need to insert a barrier between the initialization of
the output variables and any reads of those variables.
====


[[memory-model-memory-operation]]
== Memory Operation

For an operation A and memory location M:

  * [[memory-model-access-read]] A _reads_ M if and only if the data stored
    in M is an input to A.
  * [[memory-model-access-write]] A _writes_ M if and only if the data
    output from A is stored to M.
  * [[memory-model-access-access]] A _accesses_ M if and only if it either
    reads or writes (or both) M.

[NOTE]
.Note
====
A write whose value is the same as what was already in those memory
locations is still considered to be a write and has all the same effects.
====


[[memory-model-references]]
== Reference

A _reference_ is an object that a particular agent can: use to access a set
of memory locations.
On the host, a reference is a host virtual address.
On the device, a reference is:

  * The descriptor that a variable is bound to, for variables in Image,
    Uniform, or StorageBuffer storage classes.
    If the variable is an array (or array of arrays, etc.) then each element
    of the array may: be a unique reference.
ifdef::VK_VERSION_1_2,VK_EXT_buffer_device_address,VK_KHR_buffer_device_address[]
  * The address range for a buffer in code:PhysicalStorageBuffer storage
    class, where the base of the address range is queried with
ifndef::VK_VERSION_1_2,VK_KHR_buffer_device_address[]
    flink:vkGetBufferDeviceAddressEXT
endif::VK_VERSION_1_2,VK_KHR_buffer_device_address[]
ifdef::VK_VERSION_1_2,VK_KHR_buffer_device_address[]
    flink:vkGetBufferDeviceAddress
endif::VK_VERSION_1_2,VK_KHR_buffer_device_address[]
    and the length of the range is the size of the buffer.
endif::VK_VERSION_1_2,VK_EXT_buffer_device_address,VK_KHR_buffer_device_address[]
ifdef::VK_KHR_workgroup_memory_explicit_layout[]
  * A single common reference for all variables with code:Workgroup storage
    class that point to a block-decorated type.
  * The variable itself for non-block-decorated type variables in
    code:Workgroup storage class.
endif::VK_KHR_workgroup_memory_explicit_layout[]
  * The variable itself for variables in other storage classes.

Two memory accesses through distinct references may: require availability
and visibility operations as defined
<<memory-model-location-ordered,below>>.


[[memory-model-program-order]]
== Program-Order

A _dynamic instance_ of an instruction is defined in SPIR-V
(https://www.khronos.org/registry/spir-v/specs/unified1/SPIRV.html#DynamicInstance)
as a way of referring to a particular execution of a static instruction.
Program-order is an ordering on dynamic instances of instructions executed
by a single shader invocation:

  * (Basic block): If instructions A and B are in the same basic block, and
    A is listed in the module before B, then the n'th dynamic instance of A
    is program-ordered before the n'th dynamic instance of B.
  * (Branch): The dynamic instance of a branch or switch instruction is
    program-ordered before the dynamic instance of the OpLabel instruction
    to which it transfers control.
  * (Call entry): The dynamic instance of an code:OpFunctionCall instruction
    is program-ordered before the dynamic instances of the
    code:OpFunctionParameter instructions and the body of the called
    function.
  * (Call exit): The dynamic instance of the instruction following an
    code:OpFunctionCall instruction is program-ordered after the dynamic
    instance of the return instruction executed by the called function.
  * (Transitive Closure): If dynamic instance A of any instruction is
    program-ordered before dynamic instance B of any instruction and B is
    program-ordered before dynamic instance C of any instruction then A is
    program-ordered before C.
  * (Complete definition): No other dynamic instances are program-ordered.

For instructions executed on the host, the source language defines the
program-order relation (e.g. as "`sequenced-before`").


ifdef::VK_KHR_ray_tracing_pipeline,VK_NV_ray_tracing[]
[[shader-call-related]]
== Shader Call Related

Shader-call-related is an equivalence relation on invocations defined as the
symmetric and transitive closure of:

  * A is shader-call-related to B if A is created by an
    <<ray-tracing-repack,invocation repack>> instruction executed by B.


[[shader-call-order]]
== Shader Call Order

Shader-call-order is a partial order on dynamic instances of instructions
executed by invocations that are shader-call-related:

  * (Program order): If dynamic instance A is program-ordered before B, then
    A is shader-call-ordered before B.
  * (Shader call entry): If A is a dynamic instance of an
    <<ray-tracing-repack,invocation repack>> instruction and B is a dynamic
    instance executed by an invocation that is created by A, then A is
    shader-call-ordered before B.
  * (Shader call exit): If A is a dynamic instance of an
    <<ray-tracing-repack,invocation repack>> instruction, B is the next
    dynamic instance executed by the same invocation, and C is a dynamic
    instance executed by an invocation that is created by A, then C is
    shader-call-ordered before B.
  * (Transitive closure): If A is shader-call-ordered-before B and B is
    shader-call-ordered-before C, then A is shader-call-ordered-before C.
  * (Complete definition): No other dynamic instances are
    shader-call-ordered.
endif::VK_KHR_ray_tracing_pipeline,VK_NV_ray_tracing[]


[[memory-model-scope]]
== Scope

Atomic and barrier instructions include scopes which identify sets of shader
invocations that must: obey the requested ordering and atomicity rules of
the operation, as defined below.

The various scopes are described in detail in <<shaders-scope, the Shaders
chapter>>.


[[memory-model-atomic-operation]]
== Atomic Operation

An _atomic operation_ on the device is any SPIR-V operation whose name
begins with code:OpAtomic.
An atomic operation on the host is any operation performed with an
std::atomic typed object.

Each atomic operation has a memory <<memory-model-scope,scope>> and a
<<memory-model-memory-semantics,semantics>>.
Informally, the scope determines which other agents it is atomic with
respect to, and the <<memory-model-memory-semantics,semantics>> constrains
its ordering against other memory accesses.
Device atomic operations have explicit scopes and semantics.
Each host atomic operation implicitly uses the code:CrossDevice scope, and
uses a memory semantics equivalent to a C++ std::memory_order value of
relaxed, acquire, release, acq_rel, or seq_cst.
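
As an informal illustration of the host side only, the following C++ sketch performs one host atomic operation for each of the std::memory_order values listed above.
The variable name `counter` and the function name `exercise_host_atomics` are hypothetical, and the sketch is single-threaded, so it shows how the semantics are spelled rather than any cross-agent ordering.
C++ has no notion of memory scope, which is why the scope does not appear in the code.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical host-side counter; the name is illustrative only.
std::atomic<std::uint32_t> counter{0};

// Performs one host atomic operation per std::memory_order value.
std::uint32_t exercise_host_atomics() {
    counter.store(1, std::memory_order_relaxed);   // relaxed atomic write
    counter.store(2, std::memory_order_release);   // write with release semantics

    // Read with acquire semantics.
    std::uint32_t seen = counter.load(std::memory_order_acquire);
    assert(seen == 2);  // single-threaded, so the latest write is observed

    counter.fetch_add(3, std::memory_order_acq_rel);     // read-modify-write, acq_rel
    counter.fetch_add(seen, std::memory_order_seq_cst);  // read-modify-write, seq_cst
    return counter.load(std::memory_order_seq_cst);
}
```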

Two atomic operations A and B are _potentially-mutually-ordered_ if and only
if all of the following are true:

  * They access the same set of memory locations.
  * They use the same reference.
  * A is in the instance of B's memory scope.
  * B is in the instance of A's memory scope.
  * A and B are not the same operation (irreflexive).

Two atomic operations A and B are _mutually-ordered_ if and only if they are
potentially-mutually-ordered and any of the following are true:

  * A and B are both device operations.
  * A and B are both host operations.
  * A is a device operation, B is a host operation, and the implementation
    supports concurrent host- and device-atomics.

[NOTE]
.Note
====
If two atomic operations are not mutually-ordered, and if their sets of
memory locations overlap, then each must: be synchronized against the other
as if they were non-atomic operations.
====


[[memory-model-scoped-modification-order]]
== Scoped Modification Order

For a given atomic write A, all atomic writes that are mutually-ordered with
A occur in an order known as A's _scoped modification order_.
A's scoped modification order relates no other operations.

[NOTE]
.Note
====
Invocations outside the instance of A's memory scope may: observe the values
at A's set of memory locations becoming visible to them in an order that
disagrees with the scoped modification order.
====

[NOTE]
.Note
====
It is valid to have non-atomic operations or atomics in a different scope
instance to the same set of memory locations, as long as they are
synchronized against each other as if they were non-atomic (if they are not,
it is treated as a <<memory-model-access-data-race,data race>>).
That means this definition of A's scoped modification order could include
atomic operations that occur much later, after intervening non-atomics.
That is a bit non-intuitive, but it helps to keep this definition simple and
non-circular.
====


[[memory-model-memory-semantics]]
== Memory Semantics

Non-atomic memory operations, by default, may: be observed by one agent in a
different order than they were written by another agent.

Atomics and some synchronization operations include _memory semantics_,
which are flags that constrain the order in which other memory accesses
(including non-atomic memory accesses and
<<memory-model-availability-visibility,availability and visibility
operations>>) performed by the same agent can: be observed by other agents,
or can: observe accesses by other agents.

Device instructions that include semantics are code:OpAtomic*,
code:OpControlBarrier, code:OpMemoryBarrier, and code:OpMemoryNamedBarrier.
Host instructions that include semantics are some std::atomic methods and
memory fences.

SPIR-V supports the following memory semantics:

  * Relaxed: No constraints on order of other memory accesses.
  * Acquire: A memory read with this semantic performs an _acquire
    operation_.
    A memory barrier with this semantic is an _acquire barrier_.
  * Release: A memory write with this semantic performs a _release
    operation_.
    A memory barrier with this semantic is a _release barrier_.
  * AcquireRelease: A memory read-modify-write operation with this semantic
    performs both an acquire operation and a release operation, and inherits
    the limitations on ordering from both of those operations.
    A memory barrier with this semantic is both a release and acquire
    barrier.

[NOTE]
.Note
====
SPIR-V does not support "`consume`" semantics on the device.
====

The memory semantics operand also includes _storage class semantics_ which
indicate which storage classes are constrained by the synchronization.
SPIR-V storage class semantics include:

  * UniformMemory
  * WorkgroupMemory
  * ImageMemory
  * OutputMemory

Each SPIR-V memory operation accesses a single storage class.
Semantics in synchronization operations can include a combination of storage
classes.

The UniformMemory storage class semantic applies to accesses to memory in
the
ifdef::VK_VERSION_1_2,VK_EXT_buffer_device_address,VK_KHR_buffer_device_address[]
PhysicalStorageBuffer,
endif::VK_VERSION_1_2,VK_EXT_buffer_device_address,VK_KHR_buffer_device_address[]
ifdef::VK_KHR_ray_tracing_pipeline,VK_NV_ray_tracing[]
code:ShaderRecordBufferKHR,
endif::VK_KHR_ray_tracing_pipeline,VK_NV_ray_tracing[]
Uniform and StorageBuffer storage classes.
The WorkgroupMemory storage class semantic applies to accesses to memory in
the Workgroup storage class.
The ImageMemory storage class semantic applies to accesses to memory in the
Image storage class.
The OutputMemory storage class semantic applies to accesses to memory in the
Output storage class.

[NOTE]
.Note
====
Informally, these constraints limit how memory operations can be reordered,
and these limits apply not only to the order of accesses as performed in the
agent that executes the instruction, but also to the order the effects of
writes become visible to all other agents within the same instance of the
instruction's memory scope.
====

[NOTE]
.Note
====
Release and acquire operations in different threads can: act as
synchronization operations, to guarantee that writes that happened before
the release are visible after the acquire.
(This is not a formal definition, just an informative forward reference.)
====

[NOTE]
.Note
====
The OutputMemory storage class semantic is only useful in tessellation
control shaders, which is the only execution model where output variables
are shared between invocations.
====

The memory semantics operand can: also include availability and visibility
flags, which apply availability and visibility operations as described in
<<memory-model-availability-visibility,availability and visibility>>.
The availability/visibility flags are:

  * MakeAvailable: Semantics must: be Release or AcquireRelease.
    Performs an availability operation before the release operation or
    barrier.
  * MakeVisible: Semantics must: be Acquire or AcquireRelease.
    Performs a visibility operation after the acquire operation or barrier.

The specifics of these operations are defined in
<<memory-model-availability-visibility-semantics,Availability and Visibility
Semantics>>.

Host atomic operations may: support a different list of memory semantics and
synchronization operations, depending on the host architecture and source
language.
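
The two flag rules stated in this section (MakeAvailable needs Release or AcquireRelease, MakeVisible needs Acquire or AcquireRelease) can be sketched as a small check on a semantics bitmask.
This is only an illustration: the helper `av_vis_flags_valid` and the `k`-prefixed constant names are hypothetical, and the bit values are quoted from the SPIR-V MemorySemantics enumeration as an assumption that should be verified against the SPIR-V headers.

```cpp
#include <cstdint>

// Bit values assumed from the SPIR-V MemorySemantics enumeration;
// check them against the official SPIR-V headers before relying on them.
enum : std::uint32_t {
    kAcquire         = 0x2,
    kRelease         = 0x4,
    kAcquireRelease  = 0x8,
    kUniformMemory   = 0x40,
    kWorkgroupMemory = 0x100,
    kImageMemory     = 0x800,
    kOutputMemory    = 0x1000,
    kMakeAvailable   = 0x2000,
    kMakeVisible     = 0x4000,
};

// Hypothetical helper: checks only the two availability/visibility flag rules.
constexpr bool av_vis_flags_valid(std::uint32_t semantics) {
    const bool releases = (semantics & (kRelease | kAcquireRelease)) != 0;
    const bool acquires = (semantics & (kAcquire | kAcquireRelease)) != 0;
    if ((semantics & kMakeAvailable) != 0 && !releases) return false;
    if ((semantics & kMakeVisible) != 0 && !acquires) return false;
    return true;
}

// A release on uniform memory may carry MakeAvailable ...
static_assert(av_vis_flags_valid(kRelease | kUniformMemory | kMakeAvailable),
              "MakeAvailable with Release is allowed");
// ... but an acquire-only operation may not.
static_assert(!av_vis_flags_valid(kAcquire | kUniformMemory | kMakeAvailable),
              "MakeAvailable without a release is rejected");
```

The check deliberately ignores every other validity rule for the semantics operand; it covers only the pairing of the flags with Acquire and Release.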


[[memory-model-release-sequence]]
== Release Sequence

After an atomic operation A performs a release operation on a set of memory
locations M, the _release sequence headed by A_ is the longest continuous
subsequence of A's scoped modification order that consists of:

  * the atomic operation A as its first element
  * atomic read-modify-write operations on M by any agent

[NOTE]
.Note
====
The atomics in the last bullet must: be mutually-ordered with A by virtue of
being in A's scoped modification order.
====

[NOTE]
.Note
====
This intentionally omits "`atomic writes to M performed by the same agent
that performed A`", which is present in the corresponding C++ definition.
====


[[memory-model-synchronizes-with]]
== Synchronizes-With

_Synchronizes-with_ is a relation between operations, where each operation
is either an atomic operation or a memory barrier (also known as a fence on
the host).

If A and B are atomic operations, then A synchronizes-with B if and only if
all of the following are true:

  * A performs a release operation
  * B performs an acquire operation
  * A and B are mutually-ordered
  * B reads a value written by A or by an operation in the release sequence
    headed by A

code:OpControlBarrier, code:OpMemoryBarrier, and code:OpMemoryNamedBarrier
are _memory barrier_ instructions in SPIR-V.

If A is a release barrier and B is an atomic operation that performs an
acquire operation, then A synchronizes-with B if and only if all of the
following are true:

  * there exists an atomic write X (with any memory semantics)
  * A is program-ordered before X
  * X and B are mutually-ordered
  * B reads a value written by X or by an operation in the release sequence
    headed by X
  ** If X is relaxed, it is still considered to head a hypothetical release
     sequence for this rule
  * A and B are in the instance of each other's memory scopes
  * X's storage class is in A's semantics.
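
The release-barrier-to-acquire-atomic rule has a close analogue in host C++, which the following sketch uses for illustration.
It is not SPIR-V and says nothing about memory scopes or storage class semantics, which C++ cannot express; the names `payload`, `flag`, and `fence_then_atomic` are hypothetical.
The comments label which statement plays the role of A, X, and B.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

int payload = 0;            // ordinary (non-atomic) data
std::atomic<int> flag{0};   // target of the atomic write X

// Returns the payload value observed by the consumer thread.
int fence_then_atomic() {
    payload = 0;
    flag.store(0, std::memory_order_relaxed);
    int observed = -1;

    std::thread producer([] {
        payload = 42;                                         // non-atomic write
        std::atomic_thread_fence(std::memory_order_release);  // A: release barrier
        flag.store(1, std::memory_order_relaxed);             // X: relaxed atomic write
    });
    std::thread consumer([&observed] {
        // B: atomic read with acquire semantics that reads the value written by X.
        while (flag.load(std::memory_order_acquire) != 1) {
            std::this_thread::yield();
        }
        observed = payload;  // safe: A synchronizes-with B
    });

    producer.join();
    consumer.join();
    assert(observed == 42);  // sanity check on the ordering guarantee
    return observed;
}
```

Note that X itself is relaxed: the ordering comes from the fence that is program-ordered before it, matching the hypothetical release sequence bullet.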

If A is an atomic operation that performs a release operation and B is an
acquire barrier, then A synchronizes-with B if and only if all of the
following are true:

  * there exists an atomic read X (with any memory semantics)
  * X is program-ordered before B
  * X and A are mutually-ordered
  * X reads a value written by A or by an operation in the release sequence
    headed by A
  * A and B are in the instance of each other's memory scopes
  * X's storage class is in B's semantics.

If A is a release barrier and B is an acquire barrier, then A
synchronizes-with B if all of the following are true:

  * there exists an atomic write X (with any memory semantics)
  * A is program-ordered before X
  * there exists an atomic read Y (with any memory semantics)
  * Y is program-ordered before B
  * X and Y are mutually-ordered
  * Y reads the value written by X or by an operation in the release
    sequence headed by X
  ** If X is relaxed, it is still considered to head a hypothetical release
     sequence for this rule
  * A and B are in the instance of each other's memory scopes
  * X's and Y's storage class is in A's and B's semantics.
  ** NOTE: X and Y must have the same storage class, because they are
     mutually-ordered.
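
The barrier-to-barrier rule also has a host C++ analogue, sketched below under the same caveats: it is an illustration only, memory scopes and storage class semantics are not modeled, and the names `data`, `ready`, and `fence_to_fence` are hypothetical.
Both atomic accesses are relaxed, so all ordering comes from the two fences.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

int data = 0;                // ordinary (non-atomic) data
std::atomic<int> ready{0};   // accessed by the atomic write X and atomic read Y

// Returns the data value observed by the consumer thread.
int fence_to_fence() {
    data = 0;
    ready.store(0, std::memory_order_relaxed);
    int observed = -1;

    std::thread producer([] {
        data = 7;                                             // non-atomic write
        std::atomic_thread_fence(std::memory_order_release);  // A: release barrier
        ready.store(1, std::memory_order_relaxed);            // X: relaxed atomic write
    });
    std::thread consumer([&observed] {
        // Y: relaxed atomic read that eventually reads the value written by X.
        while (ready.load(std::memory_order_relaxed) != 1) {
            std::this_thread::yield();
        }
        std::atomic_thread_fence(std::memory_order_acquire);  // B: acquire barrier
        observed = data;  // safe: A synchronizes-with B
    });

    producer.join();
    consumer.join();
    assert(observed == 7);  // sanity check on the ordering guarantee
    return observed;
}
```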

If A is a release barrier, B is an acquire barrier, and C is a control
barrier (where A can: equal C, and B can: equal C), then A synchronizes-with
B if all of the following are true:

  * A is program-ordered before (or equals) C
  * C is program-ordered before (or equals) B
  * A and B are in the instance of each other's memory scopes
  * A and B are in the instance of C's execution scope

[NOTE]
.Note
====
This is similar to the barrier-barrier synchronization above, but with a
control barrier filling the role of the relaxed atomics.
====

ifdef::VK_EXT_fragment_shader_interlock[]

Let F be an ordering of fragment shader invocations, such that invocation
F~1~ is ordered before invocation F~2~ if and only if F~1~ and F~2~ overlap
as described in <<shaders-scope-fragment-interlock,Fragment Shader
Interlock>> and F~1~ executes the interlocked code before F~2~.

If A is an code:OpEndInvocationInterlockEXT instruction and B is an
code:OpBeginInvocationInterlockEXT instruction, then A synchronizes-with B
if the agent that executes A is ordered before the agent that executes B in
F.
A and B are both considered to have code:FragmentInterlock memory scope
and semantics of UniformMemory and ImageMemory, and A is considered to have
Release semantics and B is considered to have Acquire semantics.

[NOTE]
.Note
====
code:OpBeginInvocationInterlockEXT and code:OpEndInvocationInterlockEXT do
not perform implicit availability or visibility operations.
Usually, shaders using fragment shader interlock will declare the relevant
resources as `coherent` to get implicit
<<memory-model-instruction-av-vis,per-instruction availability and
visibility operations>>.
====

endif::VK_EXT_fragment_shader_interlock[]

ifdef::VK_KHR_ray_tracing_pipeline,VK_NV_ray_tracing[]
If A is a release barrier and B is an acquire barrier, then A
synchronizes-with B if all of the following are true:

  * A is shader-call-ordered-before B
  * A and B are in the instance of each other's memory scopes

endif::VK_KHR_ray_tracing_pipeline,VK_NV_ray_tracing[]

No other release and acquire barriers synchronize-with each other.


[[memory-model-system-synchronizes-with]]
== System-Synchronizes-With

_System-synchronizes-with_ is a relation between arbitrary operations on the
device or host.
Certain operations system-synchronize-with each other, which informally
means the first operation occurs before the second and that the
synchronization is performed without using application-visible memory
accesses.

If there is an <<synchronization-dependencies-execution,execution
dependency>> between two operations A and B, then the operation in the first
synchronization scope system-synchronizes-with the operation in the second
synchronization scope.

[NOTE]
.Note
====
This covers all Vulkan synchronization primitives, including device
operations executing before a synchronization primitive is signaled, wait
operations happening before subsequent device operations, signal operations
happening before host operations that wait on them, and host operations
happening before flink:vkQueueSubmit.
The list is spread throughout the synchronization chapter, and is not
repeated here.
====

System-synchronizes-with implicitly includes all storage class semantics and
has code:CrossDevice scope.

If A system-synchronizes-with B, we also say A is
_system-synchronized-before_ B and B is _system-synchronized-after_ A.


[[memory-model-non-private]]
== Private vs. Non-Private

By default, non-atomic memory operations are treated as _private_, meaning
such a memory operation is not intended to be used for communication with
other agents.
Memory operations with the NonPrivatePointer/NonPrivateTexel bit set are
treated as _non-private_, and are intended to be used for communication with
other agents.

More precisely, for private memory operations to be
<<memory-model-location-ordered,Location-Ordered>> between distinct agents
requires using system-synchronizes-with rather than shader-based
synchronization.
Private memory operations still obey program-order.

Atomic operations are always considered non-private.


[[memory-model-inter-thread-happens-before]]
== Inter-Thread-Happens-Before

Let SC be a non-empty set of storage class semantics.
Then (using template syntax) operation A _inter-thread-happens-before_<SC>
operation B if and only if any of the following is true:

  * A system-synchronizes-with B
  * A synchronizes-with B, and both A and B have all of SC in their
    semantics
  * A is an operation on memory in a storage class in SC or that has all of
    SC in its semantics, B is a release barrier or release atomic with all
    of SC in its semantics, and A is program-ordered before B
  * A is an acquire barrier or acquire atomic with all of SC in its
    semantics, B is an operation on memory in a storage class in SC or that
    has all of SC in its semantics, and A is program-ordered before B
  * A and B are both host operations and A inter-thread-happens-before B as
    defined in the host language specification
  * A inter-thread-happens-before<SC> some X and X
    inter-thread-happens-before<SC> B


[[memory-model-happens-before]]
== Happens-Before

Operation A _happens-before_ operation B if and only if any of the following
is true:

  * A is program-ordered before B
  * A inter-thread-happens-before<SC> B for some set of storage classes SC

_Happens-after_ is defined similarly.
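
Because happens-before is defined as a union of separately closed relations, a chain that mixes program-order with inter-thread-happens-before<SC> does not by itself produce a happens-before edge.
The following Python sketch illustrates this under simplifying assumptions: operations are plain strings, each SC is encoded as a dictionary key, and the caller supplies the base pairs of each relation.
The names `transitive_closure` and `happens_before` are hypothetical helpers and are not defined by this specification.

```python
def transitive_closure(edges):
    """Return the transitive closure of a set of (before, after) pairs."""
    closure = set(edges)
    while True:
        new_pairs = {
            (a, d)
            for (a, b) in closure
            for (c, d) in closure
            if b == c
        } - closure
        if not new_pairs:
            return closure
        closure |= new_pairs


def happens_before(program_order, ithb_by_sc):
    """Union of program order and each inter-thread-happens-before<SC>.

    program_order: set of (A, B) pairs where A is program-ordered before B.
    ithb_by_sc: dict mapping an SC label (a hypothetical string encoding)
    to the base pairs of inter-thread-happens-before<SC>.
    Each relation is closed on its own; the relations are never chained.
    """
    result = transitive_closure(program_order)
    for edges in ithb_by_sc.values():
        result |= transitive_closure(edges)
    return result
```

For example, with `program_order = {("A", "B")}` and `ithb_by_sc = {"StorageBuffer": {("B", "C")}}`, where A is assumed to be an operation that the inter-thread rules for that SC do not cover, the result contains `("A", "B")` and `("B", "C")` but not `("A", "C")`.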

[NOTE]
.Note
====
Unlike C++, happens-before is not always sufficient for a write to be
visible to a read.
Additional <<memory-model-availability-visibility,availability and
visibility>> operations may: be required for writes to be
<<memory-model-visible-to,visible-to>> other memory accesses.
====

[NOTE]
.Note
====
Happens-before is not transitive, but each of program-order and
inter-thread-happens-before<SC> is transitive.
These can be thought of as covering the "`single-threaded`" case and the
"`multi-threaded`" case, and it is not necessary (and not valid) to form
chains between the two.
====


[[memory-model-availability-visibility]]
== Availability and Visibility

_Availability_ and _visibility_ are states of a write operation, which
(informally) track how far the write has permeated the system, i.e. which
agents and references are able to observe the write.
Availability state is per _memory domain_.
Visibility state is per (agent,reference) pair.
Availability and visibility states are per-memory location for each write.

Memory domains are named according to the agents whose memory accesses use
the domain.
Domains used by shader invocations are organized hierarchically into
multiple smaller memory domains which correspond to the different
<<shaders-scope, scopes>>.
Each memory domain is considered the _dual_ of a scope, and vice versa.
The memory domains defined in Vulkan include:

  * _host_ - accessible by host agents
  * _device_ - accessible by all device agents for a particular device
  * _shader_ - accessible by shader agents for a particular device,
    corresponding to the code:Device scope
  * _queue family instance_ - accessible by shader agents in a single queue
    family, corresponding to the code:QueueFamily scope.
ifdef::VK_EXT_fragment_shader_interlock[]
  * _fragment interlock instance_ - accessible by fragment shader agents
    that <<shaders-scope-fragment-interlock,overlap>>, corresponding to the
    code:FragmentInterlock scope.
endif::VK_EXT_fragment_shader_interlock[]
ifdef::VK_KHR_ray_tracing_pipeline[]
  * _shader call instance_ - accessible by shader agents that are
    <<shader-call-related,shader-call-related>>, corresponding to the
    code:ShaderCallKHR scope.
endif::VK_KHR_ray_tracing_pipeline[]
  * _workgroup instance_ - accessible by shader agents in the same
    workgroup, corresponding to the code:Workgroup scope.
  * _subgroup instance_ - accessible by shader agents in the same subgroup,
    corresponding to the code:Subgroup scope.

The memory domains are nested in the order listed above,
ifdef::VK_KHR_ray_tracing_pipeline[]
except for the shader call instance domain,
endif::VK_KHR_ray_tracing_pipeline[]
with memory domains later in the list nested in the domains earlier in the
list.
ifdef::VK_KHR_ray_tracing_pipeline[]
The shader call instance domain is at an implementation-dependent location
in the list, and is nested according to that location.
The shader call instance domain is not broader than the queue family
instance domain.
endif::VK_KHR_ray_tracing_pipeline[]

[NOTE]
.Note
====
Memory domains do not correspond to storage classes or device-local and
host-local slink:VkDeviceMemory allocations; rather, they indicate whether a
write can be made visible only to agents in the same subgroup, same
workgroup,
ifdef::VK_EXT_fragment_shader_interlock[]
overlapping fragment shader invocation,
endif::VK_EXT_fragment_shader_interlock[]
ifdef::VK_KHR_ray_tracing_pipeline[]
shader-call-related ray tracing invocation,
endif::VK_KHR_ray_tracing_pipeline[]
in any shader invocation, or anywhere on the device, or host.
The shader, queue family instance,
ifdef::VK_EXT_fragment_shader_interlock[]
fragment interlock instance,
endif::VK_EXT_fragment_shader_interlock[]
ifdef::VK_KHR_ray_tracing_pipeline[]
shader call instance,
endif::VK_KHR_ray_tracing_pipeline[]
workgroup instance, and subgroup instance domains are only used for
shader-based availability/visibility operations; in other cases writes can
be made available from/visible to the shader via the device domain.
====

_Availability operations_, _visibility operations_, and _memory domain
operations_ alter the state of the write operations that happen-before them,
and which are included in their _source scope_ to be available or visible to
their _destination scope_.

  * For an availability operation, the source scope is a set of
    (agent,reference,memory location) tuples, and the destination scope is a
    set of memory domains.
  * For a memory domain operation, the source scope is a memory domain and
    the destination scope is a memory domain.
  * For a visibility operation, the source scope is a set of memory domains
    and the destination scope is a set of (agent,reference,memory location)
    tuples.

How the scopes are determined depends on the specific operation.
Availability and memory domain operations expand the set of memory domains
to which the write is available.
Visibility operations expand the set of (agent,reference,memory location)
tuples to which the write is visible.

Recall that availability and visibility states are per-memory location, and
let W be a write operation to one or more locations performed by agent A via
reference R.
Let L be one of the locations written.
(W,L) (the write W to L), is initially not available to any memory domain
and only visible to (A,R,L).
An availability operation AV that happens-after W and that includes (A,R,L)
in its source scope makes (W,L) _available_ to the memory domains in its
destination scope.

A memory domain operation DOM that happens-after AV and for which (W,L) is
available in the source scope makes (W,L) available in the destination
memory domain.

A visibility operation VIS that happens-after AV (or DOM) and for which
(W,L) is available in any domain in the source scope makes (W,L) _visible_
to all (agent,reference,L) tuples included in its destination scope.

If write W~2~ happens-after W, and their sets of memory locations overlap,
then W will not be available/visible to all agents/references for those
memory locations that overlap (and future AV/DOM/VIS ops cannot revive W's
write to those locations).

Availability, memory domain, and visibility operations are treated like
other non-atomic memory accesses for the purpose of
<<memory-model-memory-semantics,memory semantics>>, meaning they can be
ordered by release-acquire sequences or memory barriers.

An _availability chain_ is a sequence of availability operations to
increasingly broad memory domains, where element N+1 of the chain is
performed in the dual scope instance of the destination memory domain of
element N and element N happens-before element N+1.
An example is an availability operation with destination scope of the
workgroup instance domain that happens-before an availability operation to
the shader domain performed by an invocation in the same workgroup.
An availability chain AVC that happens-after W and that includes (A,R,L) in
the source scope makes (W,L) _available_ to the memory domains in its final
destination scope.
An availability chain with a single element is just the availability
operation.

Similarly, a _visibility chain_ is a sequence of visibility operations from
increasingly narrow memory domains, where element N of the chain is
performed in the dual scope instance of the source memory domain of element
N+1 and element N happens-before element N+1.
An example is a visibility operation with source scope of the shader domain
that happens-before a visibility operation with source scope of the
workgroup instance domain performed by an invocation in the same workgroup.
A visibility chain VISC that happens-after AVC (or DOM) and for which (W,L)
is available in any domain in the source scope makes (W,L) _visible_ to all
(agent,reference,L) tuples included in its final destination scope.
A visibility chain with a single element is just the visibility operation.
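
The state transitions described in this section can be sketched as a small state machine for a single (W,L) pair.
The Python below is a hypothetical model, not an implementation of any Vulkan interface: the `WriteState` class and its method names are invented for illustration, memory domains are plain strings, and the location L is implicit.
It assumes that each method is called in happens-before order after W, and it does not model a later overlapping write W~2~.

```python
class WriteState:
    """Availability and visibility state of one (W, L) pair."""

    def __init__(self, agent, reference):
        self.writer = (agent, reference)
        # Initially not available to any memory domain.
        self.available = set()
        # Initially visible only to the writing (agent, reference).
        self.visible = {self.writer}

    def availability_op(self, source_scope, dest_domains):
        # Takes effect only if the writer is in the source scope.
        if self.writer in source_scope:
            self.available |= set(dest_domains)

    def domain_op(self, source_domain, dest_domain):
        # Takes effect only if the write is already available in the source.
        if source_domain in self.available:
            self.available.add(dest_domain)

    def visibility_op(self, source_domains, dest_scope):
        # Takes effect only if the write is available in some source domain.
        if self.available & set(source_domains):
            self.visible |= set(dest_scope)
```

In this model a visibility operation issued before any availability operation leaves `visible` unchanged, which mirrors the requirement that (W,L) be available in a source domain before it can be made visible.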


[[memory-model-vulkan-availability-visibility]]
== Availability, Visibility, and Domain Operations

The following operations generate availability, visibility, and domain
operations.
When multiple availability/visibility/domain operations are described, they
are system-synchronized-with each other in the order listed.

An operation that performs a <<synchronization-dependencies-memory,memory
dependency>> generates:

  * If the source access mask includes ename:VK_ACCESS_HOST_WRITE_BIT, then
    the dependency includes a memory domain operation from host domain to
    device domain.
  * An availability operation with source scope of all writes in the first
    <<synchronization-dependencies-access-scopes,access scope>> of the
    dependency and a destination scope of the device domain.
  * A visibility operation with source scope of the device domain and
    destination scope of the second access scope of the dependency.
  * If the destination access mask includes ename:VK_ACCESS_HOST_READ_BIT or
    ename:VK_ACCESS_HOST_WRITE_BIT, then the dependency includes a memory
    domain operation from device domain to host domain.
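
The list above can be read as a procedure that turns a pair of access masks into an ordered sequence of operations.
The following Python sketch shows that reading; it is a hypothetical model in which an access mask is a set of flag-name strings rather than a `VkAccessFlags` bitmask, and `memory_dependency_ops` and its tuple encoding are invented for illustration.

```python
HOST_READ = "VK_ACCESS_HOST_READ_BIT"
HOST_WRITE = "VK_ACCESS_HOST_WRITE_BIT"


def memory_dependency_ops(src_access, dst_access):
    """Return the operations generated by a memory dependency, in order.

    Each entry is (kind, source scope, destination scope), with scopes
    described by short strings for readability.
    """
    ops = []
    if HOST_WRITE in src_access:
        ops.append(("domain", "host", "device"))
    ops.append(("availability", "first access scope", "device"))
    ops.append(("visibility", "device", "second access scope"))
    if HOST_READ in dst_access or HOST_WRITE in dst_access:
        ops.append(("domain", "device", "host"))
    return ops
```

A dependency between two shader accesses therefore yields only the availability and visibility operations, while a dependency whose masks involve host access adds a domain operation at the corresponding end.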

flink:vkFlushMappedMemoryRanges performs an availability operation, with a
source scope of (agents,references) = (all host threads, all mapped memory
ranges passed to the command), and destination scope of the host domain.

flink:vkInvalidateMappedMemoryRanges performs a visibility operation, with a
source scope of the host domain and a destination scope of
(agents,references) = (all host threads, all mapped memory ranges passed to
the command).

flink:vkQueueSubmit performs a memory domain operation from host to device,
and a visibility operation with source scope of the device domain and
destination scope of all agents and references on the device.


[[memory-model-availability-visibility-semantics]]
== Availability and Visibility Semantics

A memory barrier or atomic operation via agent A that includes MakeAvailable
in its semantics performs an availability operation whose source scope
includes agent A and all references in the storage classes in that
instruction's storage class semantics, and all memory locations, and whose
destination scope is a set of memory domains selected as specified below.
The implicit availability operation is program-ordered between the barrier
or atomic and all other operations program-ordered before the barrier or
atomic.

A memory barrier or atomic operation via agent A that includes MakeVisible
in its semantics performs a visibility operation whose source scope is a set
of memory domains selected as specified below, and whose destination scope
includes agent A and all references in the storage classes in that
instruction's storage class semantics, and all memory locations.
The implicit visibility operation is program-ordered between the barrier or
atomic and all other operations program-ordered after the barrier or atomic.

The memory domains are selected based on the memory scope of the instruction
as follows:

  * code:Device scope uses the shader domain
  * code:QueueFamily scope uses the queue family instance domain
ifdef::VK_EXT_fragment_shader_interlock[]
  * code:FragmentInterlock scope uses the fragment interlock instance domain
endif::VK_EXT_fragment_shader_interlock[]
ifdef::VK_KHR_ray_tracing_pipeline[]
  * code:ShaderCallKHR scope uses the shader call instance domain
endif::VK_KHR_ray_tracing_pipeline[]
  * code:Workgroup scope uses the workgroup instance domain
  * code:Subgroup scope uses the subgroup instance domain
  * code:Invocation scope performs no availability/visibility operations.
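
The selection rules above amount to a lookup from memory scope to memory domain.
A minimal Python sketch follows; the table name `SCOPE_TO_DOMAIN` and the helper `domain_for_scope` are hypothetical, scopes and domains are represented as plain strings, and the `FragmentInterlock` and `ShaderCallKHR` entries only apply when the corresponding extensions are enabled.

```python
# Memory scope of the instruction -> memory domain used for its
# availability/visibility operations. None means no operation is performed.
SCOPE_TO_DOMAIN = {
    "Device": "shader",
    "QueueFamily": "queue family instance",
    "FragmentInterlock": "fragment interlock instance",
    "ShaderCallKHR": "shader call instance",
    "Workgroup": "workgroup instance",
    "Subgroup": "subgroup instance",
    "Invocation": None,
}


def domain_for_scope(scope):
    """Return the memory domain selected by a memory scope, or None."""
    return SCOPE_TO_DOMAIN[scope]
```

Note that `Device` scope selects the shader domain rather than the device domain.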

When an availability operation performed by an agent A includes a memory
domain D in its destination scope, where D corresponds to scope instance S,
it also includes the memory domains that correspond to each smaller scope
instance S' that is a subset of S and that includes A.
Similarly for visibility operations.


[[memory-model-instruction-av-vis]]
== Per-Instruction Availability and Visibility Semantics

A memory write instruction that includes MakePointerAvailable, or an image
write instruction that includes MakeTexelAvailable, performs an availability
operation whose source scope includes the agent and reference used to
perform the write and the memory locations written by the instruction, and
whose destination scope is a set of memory domains selected by the Scope
operand as specified in <<memory-model-availability-visibility-semantics,
Availability and Visibility Semantics>>.
The implicit availability operation is program-ordered between the write and
all other operations program-ordered after the write.

A memory read instruction that includes MakePointerVisible, or an image read
instruction that includes MakeTexelVisible, performs a visibility operation
whose source scope is a set of memory domains selected by the Scope operand
as specified in <<memory-model-availability-visibility-semantics,
Availability and Visibility Semantics>>, and whose destination scope
includes the agent and reference used to perform the read and the memory
locations read by the instruction.
The implicit visibility operation is program-ordered between the read and
all other operations program-ordered before the read.

[NOTE]
.Note
====
Although reads with per-instruction visibility only perform visibility ops
from the shader or
ifdef::VK_EXT_fragment_shader_interlock[]
fragment interlock instance or
endif::VK_EXT_fragment_shader_interlock[]
ifdef::VK_KHR_ray_tracing_pipeline[]
shader call instance or
endif::VK_KHR_ray_tracing_pipeline[]
workgroup instance or subgroup instance domain, they will also see writes
that were made visible via the device domain, i.e. those writes previously
performed by non-shader agents and made visible via API commands.
====

[NOTE]
.Note
====
It is expected that all invocations in a subgroup execute on the same
processor with the same path to memory, and thus availability and visibility
operations with subgroup scope can be expected to be "`free`".
====


[[memory-model-location-ordered]]
== Location-Ordered

Let X and Y be memory accesses to overlapping sets of memory locations M,
where X != Y.
Let (A~X~,R~X~) be the agent and reference used for X, and (A~Y~,R~Y~) be
the agent and reference used for Y.
For now, let "`->`" denote happens-before and "`->^rcpo^`" denote the
reflexive closure of program-ordered before.

If D~1~ and D~2~ are different memory domains, then let DOM(D~1~,D~2~) be a
memory domain operation from D~1~ to D~2~.
Otherwise, let DOM(D,D) be a placeholder such that X->DOM(D,D)->Y if and
only if X->Y.

X is _location-ordered_ before Y for a location L in M if and only if any of
the following is true:

  * A~X~ == A~Y~ and R~X~ == R~Y~ and X->Y
  ** NOTE: this case means no availability/visibility ops are required when
     it is the same (agent,reference).

  * X is a read, both X and Y are non-private, and X->Y
  * X is a read, and X (transitively) system-synchronizes-with Y

  * If R~X~ == R~Y~ and A~X~ and A~Y~ access a common memory domain D (e.g.
    are in the same workgroup instance if D is the workgroup instance
    domain), and both X and Y are non-private:
  ** X is a write, Y is a write, AVC(A~X~,R~X~,D,L) is an availability chain
     making (X,L) available to domain D, and X->^rcpo^AVC(A~X~,R~X~,D,L)->Y
  ** X is a write, Y is a read, AVC(A~X~,R~X~,D,L) is an availability chain
     making (X,L) available to domain D, VISC(A~Y~,R~Y~,D,L) is a visibility
     chain making writes to L available in domain D visible to Y, and
     X->^rcpo^AVC(A~X~,R~X~,D,L)->VISC(A~Y~,R~Y~,D,L)->^rcpo^Y
  ** If
     slink:VkPhysicalDeviceVulkanMemoryModelFeatures::pname:vulkanMemoryModelAvailabilityVisibilityChains
     is ename:VK_FALSE, then AVC and VISC must: each only have a single
     element in the chain, in each sub-bullet above.

  * Let D~X~ and D~Y~ each be either the device domain or the host domain,
    depending on whether A~X~ and A~Y~ execute on the device or host:
  ** X is a write and Y is a write, and
     X->AV(A~X~,R~X~,D~X~,L)->DOM(D~X~,D~Y~)->Y
  ** X is a write and Y is a read, and
     X->AV(A~X~,R~X~,D~X~,L)->DOM(D~X~,D~Y~)->VIS(A~Y~,R~Y~,D~Y~,L)->Y

[NOTE]
.Note
====
The final bullet (synchronization through device/host domain) requires
API-level synchronization operations, since the device/host domains are not
accessible via shader instructions.
And "`device domain`" is not to be confused with "`device scope`", which
synchronizes through the "`shader domain`".
====


[[memory-model-access-data-race]]
== Data Race

Let X and Y be operations that access overlapping sets of memory locations
M, where X != Y, and at least one of X and Y is a write, and X and Y are not
mutually-ordered atomic operations.
If there does not exist a location-ordered relation between X and Y for each
location in M, then there is a _data race_.

Applications must: ensure that no data races occur during the execution of
their application.
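
The data race definition above can be expressed as a check over the overlapping locations of two accesses.
The Python sketch below is illustrative only: `Access` and `is_data_race` are hypothetical names, memory locations are modeled as integers, and the location-ordered relation is assumed to be supplied by the caller as a set of `(first, second, location)` triples rather than derived from the rules in the previous section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Access:
    name: str
    is_write: bool
    locations: frozenset  # memory locations touched by the access


def is_data_race(x, y, location_ordered, mutually_ordered_atomics=False):
    """Return True if accesses x and y form a data race.

    location_ordered holds (first name, second name, location) triples,
    meaning first is location-ordered before second for that location.
    """
    overlap = x.locations & y.locations
    if x == y or not overlap:
        return False
    if not (x.is_write or y.is_write):
        return False
    if mutually_ordered_atomics:
        return False
    # A race exists if some overlapping location is unordered both ways.
    return any(
        (x.name, y.name, loc) not in location_ordered
        and (y.name, x.name, loc) not in location_ordered
        for loc in overlap
    )
```

Ordering must hold for every location in the overlap: a write and a read that overlap in two locations but are location-ordered for only one of them still race.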

[NOTE]
.Note
====
Data races can only occur due to instructions that are actually executed.
For example, an instruction skipped due to control flow does not contribute
to a data race.
====


[[memory-model-visible-to]]
== Visible-To

Let X be a write and Y be a read whose sets of memory locations overlap, and
let M be the set of memory locations that overlap.
Let M~2~ be a non-empty subset of M. Then X is _visible-to_ Y for memory
locations M~2~ if and only if all of the following are true:

  * X is location-ordered before Y for each location L in M~2~.
  * There does not exist another write Z to any location L in M~2~ such that
    X is location-ordered before Z for location L and Z is location-ordered
    before Y for location L.

If X is visible-to Y, then Y reads the value written by X for locations
M~2~.

[NOTE]
.Note
====
It is possible for there to be a write between X and Y that overwrites a
subset of the memory locations, but the remaining memory locations (M~2~)
will still be visible-to Y.
====


[[memory-model-acyclicity]]
== Acyclicity

_Reads-from_ is a relation between operations, where the first operation is
a write, the second operation is a read, and the second operation reads the
value written by the first operation.
_From-reads_ is a relation between operations, where the first operation is
a read, the second operation is a write, and the first operation reads a
value written earlier than the second operation in the second operation's
scoped modification order (or the first operation reads from the initial
value, and the second operation is any write to the same locations).

Then the implementation must: guarantee that no cycles exist in the union of
the following relations:

  * location-ordered
  * scoped modification order (over all atomic writes)
  * reads-from
  * from-reads

[NOTE]
.Note
====
This is a "`consistency`" axiom, which informally guarantees that sequences
of operations cannot violate causality.
====


[[memory-model-scoped-modification-order-coherence]]
=== Scoped Modification Order Coherence

Let A and B be mutually-ordered atomic operations, where A is
location-ordered before B. Then the following rules are a consequence of
acyclicity:

  * If A and B are both reads and A does not read the initial value, then
    the write that A takes its value from must: be earlier in its own scoped
    modification order than (or the same as) the write that B takes its
    value from (no cycles between location-order, reads-from, and
    from-reads).
  * If A is a read and B is a write and A does not read the initial value,
    then A must: take its value from a write earlier than B in B's scoped
    modification order (no cycles between location-order, scoped
    modification order, and reads-from).
  * If A is a write and B is a read, then B must: take its value from A or a
    write later than A in A's scoped modification order (no cycles between
    location-order, scoped modification order, and from-reads).
  * If A and B are both writes, then A must: be earlier than B in A's scoped
    modification order (no cycles between location-order and scoped
    modification order).
  * If A is a write and B is a read-modify-write and B reads the value
    written by A, then B comes immediately after A in A's scoped
    modification order (no cycles between scoped modification order and
    from-reads).


[[memory-model-shader-io]]
== Shader I/O

If a shader invocation A in a shader stage other than code:Vertex performs a
memory read operation X from an object in storage class
ifdef::VK_KHR_ray_tracing_pipeline,VK_NV_ray_tracing[]
code:CallableDataKHR, code:IncomingCallableDataKHR, code:RayPayloadKHR,
code:HitAttributeKHR, code:IncomingRayPayloadKHR, or
endif::VK_KHR_ray_tracing_pipeline,VK_NV_ray_tracing[]
code:Input, then X is system-synchronized-after all writes to the
corresponding
ifdef::VK_KHR_ray_tracing_pipeline,VK_NV_ray_tracing[]
code:CallableDataKHR, code:IncomingCallableDataKHR, code:RayPayloadKHR,
code:HitAttributeKHR, code:IncomingRayPayloadKHR, or
endif::VK_KHR_ray_tracing_pipeline,VK_NV_ray_tracing[]
code:Output storage variable(s) in the shader invocation(s) that contribute
to generating invocation A, and those writes are all visible-to X.

[NOTE]
.Note
====
It is not necessary for the upstream shader invocations to have completed
execution; they only need to have generated the output that is being read.
====


[[memory-model-deallocation]]
== Deallocation

A call to flink:vkFreeMemory must: happen-after all memory operations on all
memory locations in that slink:VkDeviceMemory object.

[NOTE]
.Note
====
Normally, device memory operations in a given queue are synchronized with
flink:vkFreeMemory by having a host thread wait on a fence signaled by that
queue, and the wait happens-before the call to flink:vkFreeMemory on the
host.
====

The deallocation of SPIR-V variables is managed by the system and
happens-after all operations on those variables.


[[memory-model-informative-descriptions]]
== Descriptions (Informative)

This subsection offers more easily understandable consequences of the memory
model for application and compiler developers.

Let SC be the storage class(es) specified by a release or acquire operation
or barrier.

  * An atomic write with release semantics must not be reordered against any
    read or write to SC that is program-ordered before it (regardless of the
    storage class the atomic is in).

  * An atomic read with acquire semantics must not be reordered against any
    read or write to SC that is program-ordered after it (regardless of the
    storage class the atomic is in).

  * Any write to SC program-ordered after a release barrier must not be
    reordered against any read or write to SC program-ordered before that
    barrier.

  * Any read from SC program-ordered before an acquire barrier must not be
    reordered against any read or write to SC program-ordered after the
    barrier.

A control barrier (even if it has no memory semantics) must not be reordered
against any memory barriers.

This memory model allows memory accesses with and without availability and
visibility operations, as well as atomic operations, all to be performed on
the same memory location.
This is critical to allow the model to reason about memory that is reused in
multiple ways, e.g. across the lifetime of different shader invocations or
draw calls.
While GLSL (and legacy SPIR-V) applies the "`coherent`" decoration to
variables (for historical reasons), this model treats each memory access
instruction as having optional implicit availability/visibility operations.
GLSL to SPIR-V compilers should map all (non-atomic) operations on a
coherent variable to Make{Pointer,Texel}{Available}{Visible} flags in this
model.

Atomic operations implicitly have availability/visibility operations, and
the scope of those operations is taken from the atomic operation's scope.


[[memory-model-tessellation-output-ordering]]
== Tessellation Output Ordering

For SPIR-V that uses the Vulkan Memory Model, the code:OutputMemory storage
class is used to synchronize accesses to tessellation control output
variables.
For legacy SPIR-V that does not enable the Vulkan Memory Model via
code:OpMemoryModel, tessellation outputs can be ordered using a control
barrier with no particular memory scope or semantics, as defined below.

Let X and Y be memory operations performed by shader invocations A~X~ and
A~Y~.
Operation X is _tessellation-output-ordered_ before operation Y if and only
if all of the following are true:

  * There is a dynamic instance of an code:OpControlBarrier instruction C
    such that X is program-ordered before C in A~X~ and C is program-ordered
    before Y in A~Y~.
  * A~X~ and A~Y~ are in the same instance of C's execution scope.

If shader invocations A~X~ and A~Y~ in the code:TessellationControl
execution model execute memory operations X and Y, respectively, on the
code:Output storage class, and X is tessellation-output-ordered before Y
with a scope of code:Workgroup, then X is location-ordered before Y, and if
X is a write and Y is a read then X is visible-to Y.


ifdef::VK_NV_cooperative_matrix[]

[[memory-model-cooperative-matrix]]
== Cooperative Matrix Memory Access

For each dynamic instance of a cooperative matrix load or store instruction
(code:OpCooperativeMatrixLoadNV or code:OpCooperativeMatrixStoreNV), a
single implementation-dependent invocation within the instance of the
matrix's scope performs a non-atomic load or store (respectively) to each
memory location that is defined to be accessed by the instruction.

endif::VK_NV_cooperative_matrix[]