// Copyright 2017-2024 The Khronos Group Inc.
//
// SPDX-License-Identifier: CC-BY-4.0

[appendix]
[[memory-model]]
= Memory Model

[NOTE]
.Note
====
This memory model describes synchronizations provided by all
implementations; however, some of the synchronizations defined require extra
features to be supported by the implementation.
ifdef::VK_VERSION_1_2,VK_KHR_vulkan_memory_model[]
See slink:VkPhysicalDeviceVulkanMemoryModelFeatures.
endif::VK_VERSION_1_2,VK_KHR_vulkan_memory_model[]
====

[[memory-model-agent]]
== Agent

_Operation_ is a general term for any task that is executed on the system.

[NOTE]
.Note
====
An operation is by definition something that is executed.
Thus if an instruction is skipped due to control flow, it does not
constitute an operation.
====

Each operation is executed by a particular _agent_.
Possible agents include each shader invocation, each host thread, and each
fixed-function stage of the pipeline.


[[memory-model-memory-location]]
== Memory Location

A _memory location_ identifies unique storage for 8 bits of data.
Memory operations access a _set of memory locations_ consisting of one or
more memory locations at a time, e.g. an operation accessing a 32-bit
integer in memory would read/write a set of four memory locations.
Memory operations that access whole aggregates may: access any padding bytes
between elements or members, but no padding bytes at the end of the
aggregate.
Two sets of memory locations _overlap_ if the intersection of their sets of
memory locations is non-empty.
A memory operation must: not affect memory at a memory location not within
its set of memory locations.

Memory locations for buffers and images are explicitly allocated in
slink:VkDeviceMemory objects, and are implicitly allocated for SPIR-V
variables in each shader invocation.

ifdef::VK_KHR_workgroup_memory_explicit_layout[]
Variables with code:Workgroup storage class that point to a block-decorated
type share a set of memory locations.
endif::VK_KHR_workgroup_memory_explicit_layout[]


[[memory-model-allocation]]
== Allocation

The values stored in newly allocated memory locations are determined by a
SPIR-V variable's initializer, if present, or else are undefined:.
At the time an allocation is created there have been no
<<memory-model-memory-operation,memory operations>> to any of its memory
locations.
The initialization is not considered to be a memory operation.

[NOTE]
.Note
====
For tessellation control shader output variables, a consequence of
initialization not being considered a memory operation is that some
implementations may need to insert a barrier between the initialization of
the output variables and any reads of those variables.
====


[[memory-model-memory-operation]]
== Memory Operation

For an operation A and memory location M:

  * [[memory-model-access-read]] A _reads_ M if and only if the data stored
    in M is an input to A.
  * [[memory-model-access-write]] A _writes_ M if and only if the data
    output from A is stored to M.
  * [[memory-model-access-access]] A _accesses_ M if and only if it either
    reads or writes (or both) M.
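
The access definitions above apply to a single memory location, while an operation as a whole touches a set of memory locations.
As an informal illustration, the following C++ sketch models such a set as a contiguous byte range and tests whether two sets overlap.
The names `LocationSet`, `overlaps`, `access_u32`, and `check_examples` are hypothetical and introduced only for this sketch, and the contiguous-range representation is a simplifying assumption, since the model itself permits arbitrary sets.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model: a set of memory locations represented as a contiguous,
// half-open byte range [begin, begin + size). One memory location is 8 bits.
struct LocationSet {
    std::uint64_t begin;
    std::uint64_t size;
};

// Two sets overlap if their intersection is non-empty.
// An empty set (size == 0) never overlaps anything.
constexpr bool overlaps(LocationSet a, LocationSet b) {
    return a.size > 0 && b.size > 0 &&
           a.begin < b.begin + b.size &&
           b.begin < a.begin + a.size;
}

// An operation accessing a 32-bit integer covers a set of four memory locations.
constexpr LocationSet access_u32(std::uint64_t address) {
    return LocationSet{address, 4};
}

inline void check_examples() {
    assert(overlaps(access_u32(0), access_u32(2)));   // bytes 2 and 3 are shared
    assert(!overlaps(access_u32(0), access_u32(4)));  // adjacent, no shared byte
}
```

Under this assumption, two operations only need to be considered against each other when `overlaps` holds for their sets.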

[NOTE]
.Note
====
A write whose value is the same as what was already in those memory
locations is still considered to be a write and has all the same effects.
====


[[memory-model-references]]
== Reference

A _reference_ is an object that a particular agent can: use to access a set
of memory locations.
On the host, a reference is a host virtual address.
On the device, a reference is:

  * The descriptor that a variable is bound to, for variables in Image,
    Uniform, or StorageBuffer storage classes.
    If the variable is an array (or array of arrays, etc.) then each element
    of the array may: be a unique reference.
ifdef::VK_VERSION_1_2,VK_EXT_buffer_device_address,VK_KHR_buffer_device_address[]
  * The address range for a buffer in code:PhysicalStorageBuffer storage
    class, where the base of the address range is queried with
ifndef::VK_VERSION_1_2,VK_KHR_buffer_device_address[]
    flink:vkGetBufferDeviceAddressEXT
endif::VK_VERSION_1_2,VK_KHR_buffer_device_address[]
ifdef::VK_VERSION_1_2,VK_KHR_buffer_device_address[]
    flink:vkGetBufferDeviceAddress
endif::VK_VERSION_1_2,VK_KHR_buffer_device_address[]
    and the length of the range is the size of the buffer.
endif::VK_VERSION_1_2,VK_EXT_buffer_device_address,VK_KHR_buffer_device_address[]
ifdef::VK_KHR_workgroup_memory_explicit_layout[]
  * A single common reference for all variables with code:Workgroup storage
    class that point to a block-decorated type.
  * The variable itself for non-block-decorated type variables in
    code:Workgroup storage class.
endif::VK_KHR_workgroup_memory_explicit_layout[]
  * The variable itself for variables in other storage classes.

Two memory accesses through distinct references may: require availability
and visibility operations as defined
<<memory-model-location-ordered,below>>.


[[memory-model-program-order]]
== Program-Order

A _dynamic instance_ of an instruction is defined in SPIR-V
(https://registry.khronos.org/spir-v/specs/unified1/SPIRV.html#DynamicInstance)
as a way of referring to a particular execution of a static instruction.
Program-order is an ordering on dynamic instances of instructions executed
by a single shader invocation:

  * (Basic block): If instructions A and B are in the same basic block, and
    A is listed in the module before B, then the n'th dynamic instance of A
    is program-ordered before the n'th dynamic instance of B.
  * (Branch): The dynamic instance of a branch or switch instruction is
    program-ordered before the dynamic instance of the code:OpLabel
    instruction to which it transfers control.
  * (Call entry): The dynamic instance of an code:OpFunctionCall instruction
    is program-ordered before the dynamic instances of the
    code:OpFunctionParameter instructions and the body of the called
    function.
  * (Call exit): The dynamic instance of the instruction following an
    code:OpFunctionCall instruction is program-ordered after the dynamic
    instance of the return instruction executed by the called function.
  * (Transitive Closure): If dynamic instance A of any instruction is
    program-ordered before dynamic instance B of any instruction and B is
    program-ordered before dynamic instance C of any instruction then A is
    program-ordered before C.
  * (Complete definition): No other dynamic instances are program-ordered.

For instructions executed on the host, the source language defines the
program-order relation (e.g. as "`sequenced-before`").


ifdef::VK_KHR_ray_tracing_pipeline,VK_NV_ray_tracing[]
[[shader-call-related]]
== Shader Call Related

Shader-call-related is an equivalence relation on invocations defined as the
symmetric and transitive closure of:

  * A is shader-call-related to B if A is created by a
    <<ray-tracing-shader-call,shader call>> instruction executed by B.


[[shader-call-order]]
== Shader Call Order

Shader-call-order is a partial order on dynamic instances of instructions
executed by invocations that are shader-call-related:

  * (Program order): If dynamic instance A is program-ordered before B, then
    A is shader-call-ordered before B.
  * (Shader call entry): If A is a dynamic instance of a
    <<ray-tracing-shader-call,shader call>> instruction and B is a dynamic
    instance executed by an invocation that is created by A, then A is
    shader-call-ordered before B.
  * (Shader call exit): If A is a dynamic instance of a
    <<ray-tracing-shader-call,shader call>> instruction, B is the next
    dynamic instance executed by the same invocation, and C is a dynamic
    instance executed by an invocation that is created by A, then C is
    shader-call-ordered before B.
  * (Transitive closure): If A is shader-call-ordered-before B and B is
    shader-call-ordered-before C, then A is shader-call-ordered-before C.
  * (Complete definition): No other dynamic instances are
    shader-call-ordered.
endif::VK_KHR_ray_tracing_pipeline,VK_NV_ray_tracing[]


[[memory-model-scope]]
== Scope

Atomic and barrier instructions include scopes which identify sets of shader
invocations that must: obey the requested ordering and atomicity rules of
the operation, as defined below.

The various scopes are described in detail in <<shaders-scope, the Shaders
chapter>>.


[[memory-model-atomic-operation]]
== Atomic Operation

An _atomic operation_ on the device is any SPIR-V operation whose name
begins with code:OpAtomic.
An atomic operation on the host is any operation performed with an
std::atomic typed object.

Each atomic operation has a memory <<memory-model-scope,scope>> and a
<<memory-model-memory-semantics,semantics>>.
Informally, the scope determines which other agents it is atomic with
respect to, and the <<memory-model-memory-semantics,semantics>> constrains
its ordering against other memory accesses.
Device atomic operations have explicit scopes and semantics.
Each host atomic operation implicitly uses the code:CrossDevice scope, and
uses a memory semantics equivalent to a C++ std::memory_order value of
relaxed, acquire, release, acq_rel, or seq_cst.

Two atomic operations A and B are _potentially-mutually-ordered_ if and only
if all of the following are true:

  * They access the same set of memory locations.
  * They use the same reference.
  * A is in the instance of B's memory scope.
  * B is in the instance of A's memory scope.
  * A and B are not the same operation (irreflexive).

Two atomic operations A and B are _mutually-ordered_ if and only if they are
potentially-mutually-ordered and any of the following are true:

  * A and B are both device operations.
  * A and B are both host operations.
  * A is a device operation, B is a host operation, and the implementation
    supports concurrent host- and device-atomics.

[NOTE]
.Note
====
If two atomic operations are not mutually-ordered, and if their sets of
memory locations overlap, then each must: be synchronized against the other
as if they were non-atomic operations.
====


[[memory-model-scoped-modification-order]]
== Scoped Modification Order

For a given atomic write A, all atomic writes that are mutually-ordered with
A occur in an order known as A's _scoped modification order_.
A's scoped modification order relates no other operations.

[NOTE]
.Note
====
Invocations outside the instance of A's memory scope may: observe the values
at A's set of memory locations becoming visible to it in an order that
disagrees with the scoped modification order.
====

[NOTE]
.Note
====
It is valid to have non-atomic operations or atomics in a different scope
instance to the same set of memory locations, as long as they are
synchronized against each other as if they were non-atomic (if they are not,
it is treated as a <<memory-model-access-data-race,data race>>).
That means this definition of A's scoped modification order could include
atomic operations that occur much later, after intervening non-atomics.
That is a bit non-intuitive, but it helps to keep this definition simple and
non-circular.
====


[[memory-model-memory-semantics]]
== Memory Semantics

Non-atomic memory operations, by default, may: be observed by one agent in a
different order than they were written by another agent.

Atomics and some synchronization operations include _memory semantics_,
which are flags that constrain the order in which other memory accesses
(including non-atomic memory accesses and
<<memory-model-availability-visibility,availability and visibility
operations>>) performed by the same agent can: be observed by other agents,
or can: observe accesses by other agents.

Device instructions that include semantics are code:OpAtomic*,
code:OpControlBarrier, code:OpMemoryBarrier, and code:OpMemoryNamedBarrier.
Host instructions that include semantics are some std::atomic methods and
memory fences.

SPIR-V supports the following memory semantics:

  * Relaxed: No constraints on order of other memory accesses.
  * Acquire: A memory read with this semantic performs an _acquire
    operation_.
    A memory barrier with this semantic is an _acquire barrier_.
  * Release: A memory write with this semantic performs a _release
    operation_.
    A memory barrier with this semantic is a _release barrier_.
  * AcquireRelease: A memory read-modify-write operation with this semantic
    performs both an acquire operation and a release operation, and inherits
    the limitations on ordering from both of those operations.
    A memory barrier with this semantic is both a release and acquire
    barrier.

[NOTE]
.Note
====
SPIR-V does not support "`consume`" semantics on the device.
====

The memory semantics operand also includes _storage class semantics_ which
indicate which storage classes are constrained by the synchronization.
SPIR-V storage class semantics include:

  * UniformMemory
  * WorkgroupMemory
  * ImageMemory
  * OutputMemory

Each SPIR-V memory operation accesses a single storage class.
Semantics in synchronization operations can include a combination of storage
classes.

The UniformMemory storage class semantic applies to accesses to memory in
the
ifdef::VK_VERSION_1_2,VK_EXT_buffer_device_address,VK_KHR_buffer_device_address[]
PhysicalStorageBuffer,
endif::VK_VERSION_1_2,VK_EXT_buffer_device_address,VK_KHR_buffer_device_address[]
ifdef::VK_KHR_ray_tracing_pipeline,VK_NV_ray_tracing[]
code:ShaderRecordBufferKHR,
endif::VK_KHR_ray_tracing_pipeline,VK_NV_ray_tracing[]
Uniform and StorageBuffer storage classes.
The WorkgroupMemory storage class semantic applies to accesses to memory in
the Workgroup storage class.
The ImageMemory storage class semantic applies to accesses to memory in the
Image storage class.
The OutputMemory storage class semantic applies to accesses to memory in the
Output storage class.

[NOTE]
.Note
====
Informally, these constraints limit how memory operations can be reordered,
and these limits apply not only to the order of accesses as performed in the
agent that executes the instruction, but also to the order the effects of
writes become visible to all other agents within the same instance of the
instruction's memory scope.
====

[NOTE]
.Note
====
Release and acquire operations in different threads can: act as
synchronization operations, to guarantee that writes that happened before
the release are visible after the acquire.
(This is not a formal definition, just an informative forward reference.)
====

[NOTE]
.Note
====
The OutputMemory storage class semantic is only useful in tessellation
control shaders, which is the only execution model where output variables
are shared between invocations.
====

The memory semantics operand can: also include availability and visibility
flags, which apply availability and visibility operations as described in
<<memory-model-availability-visibility,availability and visibility>>.
The availability/visibility flags are:

  * MakeAvailable: Semantics must: be Release or AcquireRelease.
    Performs an availability operation before the release operation or
    barrier.
  * MakeVisible: Semantics must: be Acquire or AcquireRelease.
    Performs a visibility operation after the acquire operation or barrier.
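
The two flag rules above can be checked mechanically.
The following C++ sketch is illustrative only: the enum and the helper `av_vis_flags_valid` are hypothetical names, and the bit values are assumed to match the SPIR-V Memory Semantics enumerants, which should be confirmed against the SPIR-V specification.
It validates only the MakeAvailable and MakeVisible rules, not the full set of SPIR-V validation rules.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: bit values mirror the SPIR-V Memory Semantics enumerants.
// Only the flags discussed in this section are listed.
enum MemorySemantics : std::uint32_t {
    Relaxed         = 0x0,
    Acquire         = 0x2,
    Release         = 0x4,
    AcquireRelease  = 0x8,
    UniformMemory   = 0x40,
    WorkgroupMemory = 0x100,
    ImageMemory     = 0x800,
    OutputMemory    = 0x1000,
    MakeAvailable   = 0x2000,
    MakeVisible     = 0x4000,
};

// Hypothetical helper: returns true if the operand satisfies both rules:
//  - MakeAvailable requires Release or AcquireRelease
//  - MakeVisible requires Acquire or AcquireRelease
constexpr bool av_vis_flags_valid(std::uint32_t semantics) {
    return (!(semantics & MakeAvailable) ||
            (semantics & (Release | AcquireRelease)) != 0) &&
           (!(semantics & MakeVisible) ||
            (semantics & (Acquire | AcquireRelease)) != 0);
}

inline void check_flag_rules() {
    assert(av_vis_flags_valid(Release | UniformMemory | MakeAvailable));
    assert(!av_vis_flags_valid(Acquire | UniformMemory | MakeAvailable));
}
```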

The specifics of these operations are defined in
<<memory-model-availability-visibility-semantics,Availability and Visibility
Semantics>>.

Host atomic operations may: support a different list of memory semantics and
synchronization operations, depending on the host architecture and source
language.


[[memory-model-release-sequence]]
== Release Sequence

After an atomic operation A performs a release operation on a set of memory
locations M, the _release sequence headed by A_ is the longest continuous
subsequence of A's scoped modification order that consists of:

  * the atomic operation A as its first element
  * atomic read-modify-write operations on M by any agent

[NOTE]
.Note
====
The atomics in the last bullet must: be mutually-ordered with A by virtue of
being in A's scoped modification order.
====

[NOTE]
.Note
====
This intentionally omits "`atomic writes to M performed by the same agent
that performed A`", which is present in the corresponding C++ definition.
====


[[memory-model-synchronizes-with]]
== Synchronizes-With

_Synchronizes-with_ is a relation between operations, where each operation
is either an atomic operation or a memory barrier (aka fence on the host).

If A and B are atomic operations, then A synchronizes-with B if and only if
all of the following are true:

  * A performs a release operation
  * B performs an acquire operation
  * A and B are mutually-ordered
  * B reads a value written by A or by an operation in the release sequence
    headed by A

code:OpControlBarrier, code:OpMemoryBarrier, and code:OpMemoryNamedBarrier
are _memory barrier_ instructions in SPIR-V.
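
The atomic-to-atomic rule above has a direct host-side analogue in C++ std::atomic.
The following sketch is a host-only illustration, not device code; the function name `run_release_acquire` and the variables `payload`, `ready`, and `observed` are hypothetical.
A device shader would instead use code:OpAtomic* instructions with explicit scope and semantics operands.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Host-side sketch: returns the payload value observed by the consumer.
inline int run_release_acquire() {
    int payload = 0;                 // non-atomic data being published
    std::atomic<bool> ready{false};  // atomic used for synchronization
    int observed = -1;

    std::thread producer([&] {
        payload = 42;                                  // ordinary write
        ready.store(true, std::memory_order_release);  // A: release operation
    });

    std::thread consumer([&] {
        // B: acquire operation. Loop until B reads the value written by A,
        // at which point A synchronizes-with B.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        observed = payload;  // ordered after the producer's write
    });

    producer.join();
    consumer.join();
    assert(observed == 42);
    return observed;
}
```

If either the store or the load used relaxed ordering with no accompanying fence, the read of `payload` would be a data race in C++.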

If A is a release barrier and B is an atomic operation that performs an
acquire operation, then A synchronizes-with B if and only if all of the
following are true:

  * there exists an atomic write X (with any memory semantics)
  * A is program-ordered before X
  * X and B are mutually-ordered
  * B reads a value written by X or by an operation in the release sequence
    headed by X
  ** If X is relaxed, it is still considered to head a hypothetical release
     sequence for this rule
  * A and B are in the instance of each other's memory scopes
  * X's storage class is in A's semantics.

If A is an atomic operation that performs a release operation and B is an
acquire barrier, then A synchronizes-with B if and only if all of the
following are true:

  * there exists an atomic read X (with any memory semantics)
  * X is program-ordered before B
  * X and A are mutually-ordered
  * X reads a value written by A or by an operation in the release sequence
    headed by A
  * A and B are in the instance of each other's memory scopes
  * X's storage class is in B's semantics.

If A is a release barrier and B is an acquire barrier, then A
synchronizes-with B if all of the following are true:

  * there exists an atomic write X (with any memory semantics)
  * A is program-ordered before X
  * there exists an atomic read Y (with any memory semantics)
  * Y is program-ordered before B
  * X and Y are mutually-ordered
  * Y reads the value written by X or by an operation in the release
    sequence headed by X
  ** If X is relaxed, it is still considered to head a hypothetical release
     sequence for this rule
  * A and B are in the instance of each other's memory scopes
  * X's and Y's storage class is in A's and B's semantics.
  ** NOTE: X and Y must have the same storage class, because they are
     mutually ordered.
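
The barrier-to-barrier rule can be illustrated on the host with C++ fences and relaxed atomics.
This is a sketch under the assumption that std::atomic_thread_fence plays the role of the release barrier A and the acquire barrier B; the function name `run_fence_fence` and the variables are hypothetical, and scopes and storage class semantics have no counterpart in this host-only example.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Host-side sketch: returns the payload value observed by the reader.
inline int run_fence_fence() {
    int payload = 0;           // non-atomic data being published
    std::atomic<int> flag{0};  // accessed only with relaxed atomics
    int observed = -1;

    std::thread writer([&] {
        payload = 7;
        std::atomic_thread_fence(std::memory_order_release);  // A: release barrier
        flag.store(1, std::memory_order_relaxed);              // X: atomic write after A
    });

    std::thread reader([&] {
        // Y: atomic read before B; loop until Y reads the value written by X.
        while (flag.load(std::memory_order_relaxed) != 1) {
            std::this_thread::yield();
        }
        std::atomic_thread_fence(std::memory_order_acquire);  // B: acquire barrier
        observed = payload;  // ordered after the writer's store to payload
    });

    writer.join();
    reader.join();
    assert(observed == 7);
    return observed;
}
```

Here X and Y carry no ordering of their own; the ordering of `payload` comes entirely from the two fences around them.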

If A is a release barrier, B is an acquire barrier, and C is a control
barrier (where A can: equal C, and B can: equal C), then A synchronizes-with
B if all of the following are true:

  * A is program-ordered before (or equals) C
  * C is program-ordered before (or equals) B
  * A and B are in the instance of each other's memory scopes
  * A and B are in the instance of C's execution scope

[NOTE]
.Note
====
This is similar to the barrier-barrier synchronization above, but with a
control barrier filling the role of the relaxed atomics.
====

ifdef::VK_EXT_fragment_shader_interlock[]

Let F be an ordering of fragment shader invocations, such that invocation
F~1~ is ordered before invocation F~2~ if and only if F~1~ and F~2~ overlap
as described in <<shaders-scope-fragment-interlock,Fragment Shader
Interlock>> and F~1~ executes the interlocked code before F~2~.

If A is an code:OpEndInvocationInterlockEXT instruction and B is an
code:OpBeginInvocationInterlockEXT instruction, then A synchronizes-with B
if the agent that executes A is ordered before the agent that executes B in
F.
A and B are both considered to have code:FragmentInterlock memory scope
and semantics of UniformMemory and ImageMemory, and A is considered to have
Release semantics and B is considered to have Acquire semantics.

[NOTE]
.Note
====
code:OpBeginInvocationInterlockEXT and code:OpEndInvocationInterlockEXT do
not perform implicit availability or visibility operations.
Usually, shaders using fragment shader interlock will declare the relevant
resources as `coherent` to get implicit
<<memory-model-instruction-av-vis,per-instruction availability and
visibility operations>>.
====

endif::VK_EXT_fragment_shader_interlock[]

ifdef::VK_KHR_ray_tracing_pipeline,VK_NV_ray_tracing[]
If A is a release barrier and B is an acquire barrier, then A
synchronizes-with B if all of the following are true:

  * A is shader-call-ordered-before B
  * A and B are in the instance of each other's memory scopes

endif::VK_KHR_ray_tracing_pipeline,VK_NV_ray_tracing[]

No other release and acquire barriers synchronize-with each other.


[[memory-model-system-synchronizes-with]]
== System-Synchronizes-With

_System-synchronizes-with_ is a relation between arbitrary operations on the
device or host.
Certain operations system-synchronize-with each other, which informally
means the first operation occurs before the second and that the
synchronization is performed without using application-visible memory
accesses.

If there is an <<synchronization-dependencies-execution,execution
dependency>> between two operations A and B, then the operation in the first
synchronization scope system-synchronizes-with the operation in the second
synchronization scope.

[NOTE]
.Note
====
This covers all Vulkan synchronization primitives, including device
operations executing before a synchronization primitive is signaled, wait
operations happening before subsequent device operations, signal operations
happening before host operations that wait on them, and host operations
happening before flink:vkQueueSubmit.
The list is spread throughout the synchronization chapter, and is not
repeated here.
====

System-synchronizes-with implicitly includes all storage class semantics and
has code:CrossDevice scope.

If A system-synchronizes-with B, we also say A is
_system-synchronized-before_ B and B is _system-synchronized-after_ A.


[[memory-model-non-private]]
== Private vs. Non-Private

By default, non-atomic memory operations are treated as _private_, meaning
such a memory operation is not intended to be used for communication with
other agents.
Memory operations with the NonPrivatePointer/NonPrivateTexel bit set are
treated as _non-private_, and are intended to be used for communication with
other agents.

More precisely, private memory operations can be
<<memory-model-location-ordered,Location-Ordered>> between distinct agents
only by using system-synchronizes-with, not by using shader-based
synchronization.
Private memory operations still obey program-order.

Atomic operations are always considered non-private.


[[memory-model-inter-thread-happens-before]]
== Inter-Thread-Happens-Before

Let SC be a non-empty set of storage class semantics.
Then (using template syntax) operation A _inter-thread-happens-before_<SC>
operation B if and only if any of the following is true:

  * A system-synchronizes-with B
  * A synchronizes-with B, and both A and B have all of SC in their
    semantics
  * A is an operation on memory in a storage class in SC or that has all of
    SC in its semantics, B is a release barrier or release atomic with all
    of SC in its semantics, and A is program-ordered before B
  * A is an acquire barrier or acquire atomic with all of SC in its
    semantics, B is an operation on memory in a storage class in SC or that
    has all of SC in its semantics, and A is program-ordered before B
  * A and B are both host operations and A inter-thread-happens-before B as
    defined in the host language specification
  * A inter-thread-happens-before<SC> some X and X
    inter-thread-happens-before<SC> B


[[memory-model-happens-before]]
== Happens-Before

Operation A _happens-before_ operation B if and only if any of the following
is true:

  * A is program-ordered before B
  * A inter-thread-happens-before<SC> B for some set of storage classes SC

_Happens-after_ is defined similarly.
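
As an informal, non-normative illustration of the definition above, the
following Python sketch treats each relation as a set of ordered pairs.
It assumes a single fixed SC, and the operation names `A`, `B`, `C` and the
helper names `transitive_closure` and `happens_before` are hypothetical;
they are not part of this specification.

```python
from itertools import product


def transitive_closure(edges):
    """Return the transitive closure of a set of (before, after) pairs."""
    closure = set(edges)
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so the set can grow inside the loop.
        for (a, b), (c, d) in product(list(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure


def happens_before(program_order, ithb_base):
    """Union of the two relations, each closed separately.

    No chains are formed across the two relations, so the result
    is not necessarily transitive.
    """
    return transitive_closure(program_order) | transitive_closure(ithb_base)


# Assumed scenario: A is program-ordered before B, and B
# inter-thread-happens-before C, but A does not
# inter-thread-happens-before B (for example, A does not access SC).
hb = happens_before(program_order={("A", "B")}, ithb_base={("B", "C")})
a_before_c = ("A", "C") in hb  # False: the two relations are not chained
```

In this scenario both `("A", "B")` and `("B", "C")` are in `hb`, yet
`("A", "C")` is not, because the sketch never composes a program-order edge
with an inter-thread-happens-before edge.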

[NOTE]
.Note
====
Unlike in C++, happens-before is not always sufficient for a write to be
visible to a read.
Additional <<memory-model-availability-visibility,availability and
visibility>> operations may: be required for writes to be
<<memory-model-visible-to,visible-to>> other memory accesses.
====

[NOTE]
.Note
====
Happens-before is not transitive, but each of program-order and
inter-thread-happens-before<SC> is transitive.
These can be thought of as covering the "`single-threaded`" case and the
"`multi-threaded`" case, and it is not necessary (and not valid) to form
chains between the two.
====


[[memory-model-availability-visibility]]
== Availability and Visibility

_Availability_ and _visibility_ are states of a write operation, which
(informally) track how far the write has permeated the system, i.e. which
agents and references are able to observe the write.
Availability state is per _memory domain_.
Visibility state is per (agent,reference) pair.
Availability and visibility states are per-memory location for each write.

Memory domains are named according to the agents whose memory accesses use
the domain.
Domains used by shader invocations are organized hierarchically into
multiple smaller memory domains which correspond to the different
<<shaders-scope, scopes>>.
Each memory domain is considered the _dual_ of a scope, and vice versa.
The memory domains defined in Vulkan include:

  * _host_ - accessible by host agents
  * _device_ - accessible by all device agents for a particular device
  * _shader_ - accessible by shader agents for a particular device,
    corresponding to the code:Device scope
  * _queue family instance_ - accessible by shader agents in a single queue
    family, corresponding to the code:QueueFamily scope.
ifdef::VK_EXT_fragment_shader_interlock[]
  * _fragment interlock instance_ - accessible by fragment shader agents
    that <<shaders-scope-fragment-interlock,overlap>>, corresponding to the
    code:FragmentInterlock scope.
endif::VK_EXT_fragment_shader_interlock[]
ifdef::VK_KHR_ray_tracing_pipeline[]
  * _shader call instance_ - accessible by shader agents that are
    <<shader-call-related,shader-call-related>>, corresponding to the
    code:ShaderCallKHR scope.
endif::VK_KHR_ray_tracing_pipeline[]
  * _workgroup instance_ - accessible by shader agents in the same
    workgroup, corresponding to the code:Workgroup scope.
  * _subgroup instance_ - accessible by shader agents in the same subgroup,
    corresponding to the code:Subgroup scope.

The memory domains are nested in the order listed above,
ifdef::VK_KHR_ray_tracing_pipeline[]
except for shader call instance domain,
endif::VK_KHR_ray_tracing_pipeline[]
with memory domains later in the list nested in the domains earlier in the
list.
ifdef::VK_KHR_ray_tracing_pipeline[]
The shader call instance domain is at an implementation-dependent location
in the list, and is nested according to that location.
The shader call instance domain is not broader than the queue family
instance domain.
endif::VK_KHR_ray_tracing_pipeline[]

[NOTE]
.Note
====
Memory domains do not correspond to storage classes or device-local and
host-local slink:VkDeviceMemory allocations, rather they indicate whether a
write can be made visible only to agents in the same subgroup, same
workgroup,
ifdef::VK_EXT_fragment_shader_interlock[]
overlapping fragment shader invocation,
endif::VK_EXT_fragment_shader_interlock[]
ifdef::VK_KHR_ray_tracing_pipeline[]
shader-call-related ray tracing invocation,
endif::VK_KHR_ray_tracing_pipeline[]
in any shader invocation, or anywhere on the device, or host.
The shader, queue family instance,
ifdef::VK_EXT_fragment_shader_interlock[]
fragment interlock instance,
endif::VK_EXT_fragment_shader_interlock[]
ifdef::VK_KHR_ray_tracing_pipeline[]
shader call instance,
endif::VK_KHR_ray_tracing_pipeline[]
workgroup instance, and subgroup instance domains are only used for
shader-based availability/visibility operations, in other cases writes can
be made available from/visible to the shader via the device domain.
====

_Availability operations_, _visibility operations_, and _memory domain
operations_ alter the state of the write operations that happen-before them,
and which are included in their _source scope_ to be available or visible to
their _destination scope_.

  * For an availability operation, the source scope is a set of
    (agent,reference,memory location) tuples, and the destination scope is a
    set of memory domains.
  * For a memory domain operation, the source scope is a memory domain and
    the destination scope is a memory domain.
  * For a visibility operation, the source scope is a set of memory domains
    and the destination scope is a set of (agent,reference,memory location)
    tuples.

How the scopes are determined depends on the specific operation.
Availability and memory domain operations expand the set of memory domains
to which the write is available.
Visibility operations expand the set of (agent,reference,memory location)
tuples to which the write is visible.

Recall that availability and visibility states are per-memory location, and
let W be a write operation to one or more locations performed by agent A via
reference R. Let L be one of the locations written.
(W,L) (the write W to L), is initially not available to any memory domain
and only visible to (A,R,L).
An availability operation AV that happens-after W and that includes (A,R,L)
in its source scope makes (W,L) _available_ to the memory domains in its
destination scope.

A memory domain operation DOM that happens-after AV and for which (W,L) is
available in the source scope makes (W,L) available in the destination
memory domain.

A visibility operation VIS that happens-after AV (or DOM) and for which
(W,L) is available in any domain in the source scope makes (W,L) _visible_
to all (agent,reference,L) tuples included in its destination scope.

If write W~2~ happens-after W, and their sets of memory locations overlap,
then W will not be available/visible to all agents/references for those
memory locations that overlap (and future AV/DOM/VIS ops cannot revive W's
write to those locations).

Availability, memory domain, and visibility operations are treated like
other non-atomic memory accesses for the purpose of
<<memory-model-memory-semantics,memory semantics>>, meaning they can be
ordered by release-acquire sequences or memory barriers.

An _availability chain_ is a sequence of availability operations to
increasingly broad memory domains, where element N+1 of the chain is
performed in the dual scope instance of the destination memory domain of
element N and element N happens-before element N+1.
An example is an availability operation with destination scope of the
workgroup instance domain that happens-before an availability operation to
the shader domain performed by an invocation in the same workgroup.
An availability chain AVC that happens-after W and that includes (A,R,L) in
the source scope makes (W,L) _available_ to the memory domains in its final
destination scope.
An availability chain with a single element is just the availability
operation.

Similarly, a _visibility chain_ is a sequence of visibility operations from
increasingly narrow memory domains, where element N of the chain is
performed in the dual scope instance of the source memory domain of element
N+1 and element N happens-before element N+1.
An example is a visibility operation with source scope of the shader domain
that happens-before a visibility operation with source scope of the
workgroup instance domain performed by an invocation in the same workgroup.
A visibility chain VISC that happens-after AVC (or DOM) and for which (W,L)
is available in any domain in the source scope makes (W,L) _visible_ to all
(agent,reference,L) tuples included in its final destination scope.
A visibility chain with a single element is just the visibility operation.
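
The state tracking described in this section can be sketched informally in
Python.
This is a non-normative illustration under simplifying assumptions: the
class `WriteState`, its method names, and the agent and reference labels
are hypothetical, the caller is assumed to apply operations in
happens-before order, and overwriting writes (W~2~) and multi-element
chains are not modelled.

```python
class WriteState:
    """Tracks availability and visibility of one (W, L) pair."""

    def __init__(self, agent, reference, location):
        self.origin = (agent, reference, location)
        self.available = set()        # memory domains (W, L) is available to
        self.visible = {self.origin}  # initially visible only to (A, R, L)

    def availability_op(self, source_scope, dest_domains):
        # AV: applies only if (A, R, L) is in the source scope.
        if self.origin in source_scope:
            self.available |= set(dest_domains)

    def domain_op(self, src_domain, dst_domain):
        # DOM: applies only if (W, L) is already available in the source domain.
        if src_domain in self.available:
            self.available.add(dst_domain)

    def visibility_op(self, source_domains, dest_scope):
        # VIS: applies if (W, L) is available in any source domain, and
        # only affects destination tuples that name the same location L.
        if self.available & set(source_domains):
            location = self.origin[2]
            self.visible |= {t for t in dest_scope if t[2] == location}


# Assumed scenario: a host write to location 0 that a shader later reads.
w = WriteState("host_thread", "mapped_ptr", 0)
reader = ("shader_invocation", "ssbo", 0)

w.visibility_op({"device"}, {reader})  # too early: not yet available
visible_too_early = reader in w.visible

w.availability_op({("host_thread", "mapped_ptr", 0)}, {"host"})
w.domain_op("host", "device")
w.visibility_op({"device"}, {reader, ("shader_invocation", "ssbo", 1)})
visible_after_sync = reader in w.visible
```

The first visibility operation has no effect because the write has not yet
been made available to the device domain; only after the availability and
memory domain operations does the same visibility operation make the write
visible to the reader.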


[[memory-model-vulkan-availability-visibility]]
== Availability, Visibility, and Domain Operations

The following operations generate availability, visibility, and domain
operations.
When multiple availability/visibility/domain operations are described, they
are system-synchronized-with each other in the order listed.

An operation that performs a <<synchronization-dependencies-memory,memory
dependency>> generates:

  * If the source access mask includes ename:VK_ACCESS_HOST_WRITE_BIT, then
    the dependency includes a memory domain operation from host domain to
    device domain.
  * An availability operation with source scope of all writes in the first
    <<synchronization-dependencies-access-scopes,access scope>> of the
    dependency and a destination scope of the device domain.
  * A visibility operation with source scope of the device domain and
    destination scope of the second access scope of the dependency.
  * If the destination access mask includes ename:VK_ACCESS_HOST_READ_BIT or
    ename:VK_ACCESS_HOST_WRITE_BIT, then the dependency includes a memory
    domain operation from device domain to host domain.
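
The list above can be sketched as a small Python function that returns the
generated operations in the order listed.
This is an informal, non-normative illustration: the function name
`memory_dependency_ops` and the tuple encoding of each operation are
hypothetical, and only the two host access bits are taken from the Vulkan
API headers.

```python
# Values of the two host access bits from VkAccessFlagBits.
VK_ACCESS_HOST_READ_BIT = 0x00002000
VK_ACCESS_HOST_WRITE_BIT = 0x00004000


def memory_dependency_ops(src_access_mask, dst_access_mask):
    """List the operations a memory dependency generates, in order.

    Each entry is a hypothetical (kind, source scope, destination scope)
    tuple used only for illustration.
    """
    ops = []
    if src_access_mask & VK_ACCESS_HOST_WRITE_BIT:
        ops.append(("domain", "host", "device"))
    ops.append(("availability", "first access scope", "device"))
    ops.append(("visibility", "device", "second access scope"))
    if dst_access_mask & (VK_ACCESS_HOST_READ_BIT | VK_ACCESS_HOST_WRITE_BIT):
        ops.append(("domain", "device", "host"))
    return ops


# A dependency whose destination access mask includes host reads.
readback_ops = memory_dependency_ops(0, VK_ACCESS_HOST_READ_BIT)
```

For the readback case, the result ends with a memory domain operation from
the device domain to the host domain, while a dependency with neither host
bit set yields only the availability and visibility operations.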

flink:vkFlushMappedMemoryRanges performs an availability operation, with a
source scope of (agents,references) = (all host threads, all mapped memory
ranges passed to the command), and destination scope of the host domain.

flink:vkInvalidateMappedMemoryRanges performs a visibility operation, with a
source scope of the host domain and a destination scope of
(agents,references) = (all host threads, all mapped memory ranges passed to
the command).

flink:vkQueueSubmit performs a memory domain operation from host to device,
and a visibility operation with source scope of the device domain and
destination scope of all agents and references on the device.


[[memory-model-availability-visibility-semantics]]
== Availability and Visibility Semantics

A memory barrier or atomic operation via agent A that includes MakeAvailable
in its semantics performs an availability operation whose source scope
includes agent A and all references in the storage classes in that
instruction's storage class semantics, and all memory locations, and whose
destination scope is a set of memory domains selected as specified below.
The implicit availability operation is program-ordered between the barrier
or atomic and all other operations program-ordered before the barrier or
atomic.

A memory barrier or atomic operation via agent A that includes MakeVisible
in its semantics performs a visibility operation whose source scope is a set
of memory domains selected as specified below, and whose destination scope
includes agent A and all references in the storage classes in that
instruction's storage class semantics, and all memory locations.
The implicit visibility operation is program-ordered between the barrier or
atomic and all other operations program-ordered after the barrier or atomic.

The memory domains are selected based on the memory scope of the instruction
as follows:

  * code:Device scope uses the shader domain
  * code:QueueFamily scope uses the queue family instance domain
ifdef::VK_EXT_fragment_shader_interlock[]
  * code:FragmentInterlock scope uses the fragment interlock instance domain
endif::VK_EXT_fragment_shader_interlock[]
ifdef::VK_KHR_ray_tracing_pipeline[]
  * code:ShaderCallKHR scope uses the shader call instance domain
endif::VK_KHR_ray_tracing_pipeline[]
  * code:Workgroup scope uses the workgroup instance domain
  * code:Subgroup scope uses the subgroup instance domain
  * code:Invocation scope performs no availability/visibility operations.
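
The selection rule above amounts to a lookup from memory scope to memory
domain, which can be sketched informally in Python.
The names `SCOPE_TO_DOMAIN` and `domain_for_scope` are hypothetical, and the
code:FragmentInterlock and code:ShaderCallKHR entries only apply when the
corresponding extensions are supported.

```python
# Memory scope name -> memory domain name, following the list above.
SCOPE_TO_DOMAIN = {
    "Device": "shader",
    "QueueFamily": "queue family instance",
    "FragmentInterlock": "fragment interlock instance",  # extension-dependent
    "ShaderCallKHR": "shader call instance",             # extension-dependent
    "Workgroup": "workgroup instance",
    "Subgroup": "subgroup instance",
}


def domain_for_scope(scope):
    """Return the selected memory domain, or None for Invocation scope.

    None means no availability/visibility operation is performed.
    """
    if scope == "Invocation":
        return None
    return SCOPE_TO_DOMAIN[scope]


device_domain = domain_for_scope("Device")
```

Note that code:Device scope selects the shader domain rather than the device
domain, which is why the sketch keeps scope names and domain names as
separate strings.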

When an availability operation performed by an agent A includes a memory
domain D in its destination scope, where D corresponds to scope instance S,
it also includes the memory domains that correspond to each smaller scope
instance S' that is a subset of S and that includes A. Similarly for
visibility operations.


[[memory-model-instruction-av-vis]]
== Per-Instruction Availability and Visibility Semantics

A memory write instruction that includes MakePointerAvailable, or an image
write instruction that includes MakeTexelAvailable, performs an availability
operation whose source scope includes the agent and reference used to
perform the write and the memory locations written by the instruction, and
whose destination scope is a set of memory domains selected by the Scope
operand specified in <<memory-model-availability-visibility-semantics,
Availability and Visibility Semantics>>.
The implicit availability operation is program-ordered between the write and
all other operations program-ordered after the write.

A memory read instruction that includes MakePointerVisible, or an image read
instruction that includes MakeTexelVisible, performs a visibility operation
whose source scope is a set of memory domains selected by the Scope operand
as specified in <<memory-model-availability-visibility-semantics,
Availability and Visibility Semantics>>, and whose destination scope
includes the agent and reference used to perform the read and the memory
locations read by the instruction.
The implicit visibility operation is program-ordered between the read and
all other operations program-ordered before the read.

[NOTE]
.Note
====
Although reads with per-instruction visibility only perform visibility ops
from the shader or
ifdef::VK_EXT_fragment_shader_interlock[]
fragment interlock instance or
endif::VK_EXT_fragment_shader_interlock[]
ifdef::VK_KHR_ray_tracing_pipeline[]
shader call instance or
endif::VK_KHR_ray_tracing_pipeline[]
workgroup instance or subgroup instance domain, they will also see writes
that were made visible via the device domain, i.e. those writes previously
performed by non-shader agents and made visible via API commands.
====

[NOTE]
.Note
====
It is expected that all invocations in a subgroup execute on the same
processor with the same path to memory, and thus availability and visibility
operations with subgroup scope can be expected to be "`free`".
====


[[memory-model-location-ordered]]
== Location-Ordered

Let X and Y be memory accesses to overlapping sets of memory locations M,
where X != Y. Let (A~X~,R~X~) be the agent and reference used for X, and
(A~Y~,R~Y~) be the agent and reference used for Y. For now, let "`->`"
denote happens-before and "`->^rcpo^`" denote the reflexive closure of
program-ordered before.

If D~1~ and D~2~ are different memory domains, then let DOM(D~1~,D~2~) be a
memory domain operation from D~1~ to D~2~.
Otherwise, let DOM(D,D) be a placeholder such that X->DOM(D,D)->Y if and
only if X->Y.

X is _location-ordered_ before Y for a location L in M if and only if any of
the following is true:

  * A~X~ == A~Y~ and R~X~ == R~Y~ and X->Y
  ** NOTE: this case means no availability/visibility ops are required when
     it is the same (agent,reference).

  * X is a read, both X and Y are non-private, and X->Y
  * X is a read, and X (transitively) system-synchronizes-with Y

  * If R~X~ == R~Y~ and A~X~ and A~Y~ access a common memory domain D (e.g.
    are in the same workgroup instance if D is the workgroup instance
    domain), and both X and Y are non-private:
  ** X is a write, Y is a write, AVC(A~X~,R~X~,D,L) is an availability chain
     making (X,L) available to domain D, and X->^rcpo^AVC(A~X~,R~X~,D,L)->Y
  ** X is a write, Y is a read, AVC(A~X~,R~X~,D,L) is an availability chain
     making (X,L) available to domain D, VISC(A~Y~,R~Y~,D,L) is a visibility
     chain making writes to L available in domain D visible to Y, and
     X->^rcpo^AVC(A~X~,R~X~,D,L)->VISC(A~Y~,R~Y~,D,L)->^rcpo^Y
  ** If
     slink:VkPhysicalDeviceVulkanMemoryModelFeatures::pname:vulkanMemoryModelAvailabilityVisibilityChains
     is ename:VK_FALSE, then AVC and VISC must: each only have a single
     element in the chain, in each sub-bullet above.

  * Let D~X~ and D~Y~ each be either the device domain or the host domain,
    depending on whether A~X~ and A~Y~ execute on the device or host:
  ** X is a write and Y is a write, and
     X->AV(A~X~,R~X~,D~X~,L)->DOM(D~X~,D~Y~)->Y
  ** X is a write and Y is a read, and
     X->AV(A~X~,R~X~,D~X~,L)->DOM(D~X~,D~Y~)->VIS(A~Y~,R~Y~,D~Y~,L)->Y

[NOTE]
.Note
====
The final bullet (synchronization through device/host domain) requires
API-level synchronization operations, since the device/host domains are not
accessible via shader instructions.
And "`device domain`" is not to be confused with "`device scope`", which
synchronizes through the "`shader domain`".
====


[[memory-model-access-data-race]]
== Data Race

Let X and Y be operations that access overlapping sets of memory locations
M, where X != Y, and at least one of X and Y is a write, and X and Y are not
mutually-ordered atomic operations.
If there does not exist a location-ordered relation between X and Y for each
location in M, then there is a _data race_.

Applications must: ensure that no data races occur during the execution of
their application.
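
The data race condition can be sketched informally in Python, assuming the
location-ordered relation has already been computed elsewhere.
This is a non-normative illustration: the function `has_data_race`, the
dictionary layout of an access, and the encoding of the relation as
(first, second, location) triples are all hypothetical.

```python
def has_data_race(x, y, location_ordered, mutually_ordered_atomics=False):
    """Check two accesses for a data race.

    x, y: dicts with 'name', 'is_write', and 'locations' (a set).
    location_ordered: set of (first, second, location) triples.
    """
    overlap = x["locations"] & y["locations"]
    if x["name"] == y["name"] or not overlap:
        return False
    if not (x["is_write"] or y["is_write"]):
        return False
    if mutually_ordered_atomics:
        return False
    for loc in overlap:
        ordered = ((x["name"], y["name"], loc) in location_ordered
                   or (y["name"], x["name"], loc) in location_ordered)
        if not ordered:
            return True  # some overlapping location is unordered
    return False


# Assumed accesses: X writes bytes 0..3, Y reads bytes 2..5.
X = {"name": "X", "is_write": True, "locations": {0, 1, 2, 3}}
Y = {"name": "Y", "is_write": False, "locations": {2, 3, 4, 5}}

partially_ordered = has_data_race(X, Y, {("X", "Y", 2)})
fully_ordered = has_data_race(X, Y, {("X", "Y", 2), ("X", "Y", 3)})
```

In the first call only location 2 is ordered, so location 3 is racy; the
second call orders every overlapping location and reports no race.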

[NOTE]
.Note
====
Data races can only occur due to instructions that are actually executed.
For example, an instruction skipped due to control flow must not contribute
to a data race.
====


[[memory-model-visible-to]]
== Visible-To

Let X be a write and Y be a read whose sets of memory locations overlap, and
let M be the set of memory locations that overlap.
Let M~2~ be a non-empty subset of M. Then X is _visible-to_ Y for memory
locations M~2~ if and only if all of the following are true:

  * X is location-ordered before Y for each location L in M~2~.
  * There does not exist another write Z to any location L in M~2~ such that
    X is location-ordered before Z for location L and Z is location-ordered
    before Y for location L.

If X is visible-to Y, then Y reads the value written by X for locations
M~2~.

[NOTE]
.Note
====
It is possible for there to be a write between X and Y that overwrites a
subset of the memory locations, but the remaining memory locations (M~2~)
will still be visible-to Y.
====


[[memory-model-acyclicity]]
== Acyclicity

_Reads-from_ is a relation between operations, where the first operation is
a write, the second operation is a read, and the second operation reads the
value written by the first operation.
_From-reads_ is a relation between operations, where the first operation is
a read, the second operation is a write, and the first operation reads a
value written earlier than the second operation in the second operation's
scoped modification order (or the first operation reads from the initial
value, and the second operation is any write to the same locations).

Then the implementation must: guarantee that no cycles exist in the union of
the following relations:

  * location-ordered
  * scoped modification order (over all atomic writes)
  * reads-from
  * from-reads

[NOTE]
.Note
====
This is a "`consistency`" axiom, which informally guarantees that sequences
of operations cannot violate causality.
====


[[memory-model-scoped-modification-order-coherence]]
=== Scoped Modification Order Coherence

Let A and B be mutually-ordered atomic operations, where A is
location-ordered before B. Then the following rules are a consequence of
acyclicity:

  * If A and B are both reads and A does not read the initial value, then
    the write that A takes its value from must: be earlier in its own scoped
    modification order than (or the same as) the write that B takes its
    value from (no cycles between location-order, reads-from, and
    from-reads).
  * If A is a read and B is a write and A does not read the initial value,
    then A must: take its value from a write earlier than B in B's scoped
    modification order (no cycles between location-order, scoped
    modification order, and reads-from).
  * If A is a write and B is a read, then B must: take its value from A or a
    write later than A in A's scoped modification order (no cycles between
    location-order, scoped modification order, and from-reads).
  * If A and B are both writes, then A must: be earlier than B in A's scoped
    modification order (no cycles between location-order and scoped
    modification order).
  * If A is a write and B is a read-modify-write and B reads the value
    written by A, then B comes immediately after A in A's scoped
    modification order (no cycles between scoped modification order and
    from-reads).


[[memory-model-shader-io]]
== Shader I/O

If a shader invocation A in a shader stage other than code:Vertex performs a
memory read operation X from an object in storage class
ifdef::VK_KHR_ray_tracing_pipeline,VK_NV_ray_tracing[]
code:CallableDataKHR, code:IncomingCallableDataKHR, code:RayPayloadKHR,
code:HitAttributeKHR, code:IncomingRayPayloadKHR, or
endif::VK_KHR_ray_tracing_pipeline,VK_NV_ray_tracing[]
code:Input, then X is system-synchronized-after all writes to the
corresponding
ifdef::VK_KHR_ray_tracing_pipeline,VK_NV_ray_tracing[]
code:CallableDataKHR, code:IncomingCallableDataKHR, code:RayPayloadKHR,
code:HitAttributeKHR, code:IncomingRayPayloadKHR, or
endif::VK_KHR_ray_tracing_pipeline,VK_NV_ray_tracing[]
code:Output storage variable(s) in the shader invocation(s) that contribute
to generating invocation A, and those writes are all visible-to X.

[NOTE]
.Note
====
It is not necessary for the upstream shader invocations to have completed
execution; they only need to have generated the output that is being read.
====


[[memory-model-deallocation]]
== Deallocation

ifndef::VKSC_VERSION_1_0[]

A call to flink:vkFreeMemory must: happen-after all memory operations on all
memory locations in that slink:VkDeviceMemory object.

[NOTE]
.Note
====
Normally, device memory operations in a given queue are synchronized with
flink:vkFreeMemory by having a host thread wait on a fence signaled by that
queue, and the wait happens-before the call to flink:vkFreeMemory on the
host.
====

endif::VKSC_VERSION_1_0[]

The deallocation of SPIR-V variables is managed by the system and
happens-after all operations on those variables.


[[memory-model-informative-descriptions]]
== Descriptions (Informative)

This subsection offers more easily understandable consequences of the memory
model for app/compiler developers.

Let SC be the storage class(es) specified by a release or acquire operation
or barrier.

  * An atomic write with release semantics must not be reordered against any
    read or write to SC that is program-ordered before it (regardless of the
    storage class the atomic is in).

  * An atomic read with acquire semantics must not be reordered against any
    read or write to SC that is program-ordered after it (regardless of the
    storage class the atomic is in).

  * Any write to SC program-ordered after a release barrier must not be
    reordered against any read or write to SC program-ordered before that
    barrier.

  * Any read from SC program-ordered before an acquire barrier must not be
    reordered against any read or write to SC program-ordered after the
    barrier.

A control barrier (even if it has no memory semantics) must not be reordered
against any memory barriers.

This memory model allows memory accesses with and without availability and
visibility operations, as well as atomic operations, all to be performed on
the same memory location.
This is critical to allow it to reason about memory that is reused in
multiple ways, e.g. across the lifetime of different shader invocations or
draw calls.
While GLSL (and legacy SPIR-V) applies the "`coherent`" decoration to
variables (for historical reasons), this model treats each memory access
instruction as having optional implicit availability/visibility operations.
GLSL to SPIR-V compilers should map all (non-atomic) operations on a
coherent variable to Make{Pointer,Texel}\{Available}\{Visible} flags in this
model.

Atomic operations implicitly have availability/visibility operations, and
the scope of those operations is taken from the atomic operation's scope.


[[memory-model-tessellation-output-ordering]]
== Tessellation Output Ordering

For SPIR-V that uses the Vulkan Memory Model, the code:OutputMemory storage
class is used to synchronize accesses to tessellation control output
variables.
For legacy SPIR-V that does not enable the Vulkan Memory Model via
code:OpMemoryModel, tessellation outputs can be ordered using a control
barrier with no particular memory scope or semantics, as defined below.

Let X and Y be memory operations performed by shader invocations A~X~ and
A~Y~.
Operation X is _tessellation-output-ordered_ before operation Y if and only
if all of the following are true:

  * There is a dynamic instance of an code:OpControlBarrier instruction C
    such that X is program-ordered before C in A~X~ and C is program-ordered
    before Y in A~Y~.
  * A~X~ and A~Y~ are in the same instance of C's execution scope.

If shader invocations A~X~ and A~Y~ in the code:TessellationControl
execution model execute memory operations X and Y, respectively, on the
code:Output storage class, and X is tessellation-output-ordered before Y
with a scope of code:Workgroup, then X is location-ordered before Y, and if
X is a write and Y is a read then X is visible-to Y.


ifdef::VK_NV_cooperative_matrix[]
[[memory-model-cooperative-matrix]]
== Cooperative Matrix Memory Access

For each dynamic instance of a cooperative matrix load or store instruction
(code:OpCooperativeMatrixLoadNV or code:OpCooperativeMatrixStoreNV), a
single implementation-dependent invocation within the instance of the
matrix's scope performs a non-atomic load or store (respectively) to each
memory location that is defined to be accessed by the instruction.
endif::VK_NV_cooperative_matrix[]