New IR, or NIR, is an IR for Mesa intended to sit below GLSL IR and Mesa IR.
Its design inherits from the various IRs that Mesa has used in the past, as
well as Direct3D assembly, and it includes a few new ideas as well. It is a
flat (in terms of using instructions instead of expressions), typeless IR,
similar to TGSI and Mesa IR. It also supports SSA (although it doesn't require
it).

Variables
=========

NIR includes support for source-level GLSL variables through a structure mostly
copied from GLSL IR. These will be used for linking and conversion from GLSL IR
(and later, from an AST), but for the most part, they will be lowered to
registers (see below) and loads/stores.

Registers
=========

Registers are light-weight; they consist of a structure that only contains its
size, its index for liveness analysis, and an optional name for debugging. In
addition, registers can be local to a function or global to the entire shader;
the latter will be used in ARB_shader_subroutine for passing parameters and
getting return values from subroutines. Registers can also be an array, in which
case they can be accessed indirectly. Each ALU instruction (add, subtract, etc.)
works directly with registers or SSA values (see below).
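
As a rough illustration of how small such a register can be, here is a minimal
sketch in C. Every name in it (sketch_register, its fields, and both helpers) is
hypothetical and chosen for this example only; it is not NIR's actual data
structure, and treating the size as a plain unsigned count is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical layout; names are illustrative, not NIR's real API. */
struct sketch_register {
    unsigned size;            /* size of the register (assumed: a count) */
    unsigned index;           /* dense index usable by liveness analysis */
    const char *name;         /* optional debug name, may be NULL */
    bool is_global;           /* shader-wide instead of function-local */
    unsigned num_array_elems; /* 0 for a plain register, >0 for an array */
};

/* Only array registers can be accessed indirectly. */
static bool sketch_register_allows_indirect(const struct sketch_register *reg)
{
    assert(reg != NULL);
    return reg->num_array_elems > 0;
}

/* Give registers consecutive indices, e.g. to use as bitset positions. */
static void sketch_index_registers(struct sketch_register *regs, size_t count)
{
    assert(regs != NULL || count == 0);
    for (size_t i = 0; i < count; i++)
        regs[i].index = (unsigned)i;
}
```

Keeping the index dense is what would let a liveness pass store one bit per
register instead of a pointer-keyed table.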

SSA
===

Everywhere a register can be loaded/stored, an SSA value can be used instead.
The only exception is that arrays/indirect addressing are not supported with
SSA; although research has been done on extensions of SSA to arrays before, it's
usually for the purpose of parallelization (which we're not interested in), and
adds some overhead in the form of adding copies or extra arrays (which is much
more expensive than introducing copies between non-array registers). SSA uses
point directly to their corresponding definition, which in turn points to the
instruction it is part of. This creates an implicit use-def chain and avoids the
need for an external structure for each SSA register.

Functions
=========

Support for function calls is mostly similar to GLSL IR. Each shader contains a
list of functions, and each function has a list of overloads. Each overload
contains a list of parameters, and may contain an implementation which specifies
the variables that correspond to the parameters and return value. Inlining a
function, assuming it has a single return point, is as simple as copying its
instructions, registers, and local variables into the target function and then
inserting copies to and from the new parameters as appropriate.
After functions are inlined and any non-subroutine functions are deleted,
parameters and return variables will be converted to global variables and then
global registers. We don't do this lowering earlier (i.e. the fortranizer idea)
for a few reasons:

- If we want to do optimizations before link time, we need to have the function
signature available during link-time.

- If we do any inlining before link time, then we might wind up with the
inlined function and the non-inlined function using the same global
variables/registers which would preclude optimization.

Intrinsics
==========

Any operation (other than function calls and textures) which touches a variable
or is not referentially transparent is represented by an intrinsic. Intrinsics
are similar to the idea of a "builtin function," i.e. a function declaration
whose implementation is provided by the backend, except they are more powerful
in the following ways:

- They can also load and store registers when appropriate, which limits the
number of variables needed in later stages of the IR while obviating the need
for a separate load/store variable instruction.

- Intrinsics can be marked as side-effect free, which permits them to be
treated like any other instruction when it comes to optimizations.
This allows load intrinsics to be represented as intrinsics while still being
optimized away by dead code elimination, common subexpression elimination, etc.

Intrinsics are used for:

- Atomic operations
- Memory barriers
- Subroutine calls
- Geometry shader emitVertex and endPrimitive
- Loading and storing variables (before lowering)
- Loading and storing uniforms, shader inputs and outputs, etc. (after lowering)
- Copying variables (cases where in GLSL the destination is a structure or
array)
- The kitchen sink
- ...

Textures
========

Unfortunately, there are far too many texture operations to represent each one
of them with an intrinsic, so there's a special texture instruction similar to
the GLSL IR one. The biggest difference is that, while the texture instruction
has a sampler dereference field used just like in GLSL IR, this gets lowered to
a texture unit index (with a possible indirect offset) while the type
information of the original sampler is kept around for backends. Also, all the
non-constant sources are stored in a single array to make it easier for
optimization passes to iterate over all the sources.
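
To show why a single source array helps, here is a minimal sketch in C. The
types and functions (sketch_tex_instr, sketch_tex_src, the three source kinds,
and both helpers) are hypothetical names invented for this example, and an
unsigned value_index stands in for whatever register or SSA value a real source
would reference.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical source kinds; a real instruction would have more. */
enum sketch_tex_src_type {
    SKETCH_TEX_SRC_COORD,
    SKETCH_TEX_SRC_BIAS,
    SKETCH_TEX_SRC_LOD,
};

struct sketch_tex_src {
    enum sketch_tex_src_type type;
    unsigned value_index; /* stand-in for a register or SSA value */
};

struct sketch_tex_instr {
    unsigned texture_unit; /* what the sampler dereference is lowered to */
    unsigned num_srcs;
    struct sketch_tex_src srcs[4]; /* all non-constant sources, one array */
};

/* A pass can visit every source with one loop, whatever its kind. */
static unsigned sketch_tex_count_uses(const struct sketch_tex_instr *tex,
                                      unsigned value_index)
{
    unsigned uses = 0;
    assert(tex != NULL && tex->num_srcs <= 4);
    for (unsigned i = 0; i < tex->num_srcs; i++) {
        if (tex->srcs[i].value_index == value_index)
            uses++;
    }
    return uses;
}

/* Return the position of the first source of the given kind, or -1. */
static int sketch_tex_find_src(const struct sketch_tex_instr *tex,
                               enum sketch_tex_src_type type)
{
    assert(tex != NULL && tex->num_srcs <= 4);
    for (unsigned i = 0; i < tex->num_srcs; i++) {
        if (tex->srcs[i].type == type)
            return (int)i;
    }
    return -1;
}
```

With one field per source kind instead, every pass that walks sources would
need a separate case for each kind.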

Control Flow
============

Like in GLSL IR, control flow consists of a tree of "control flow nodes", which
include if statements and loops, and jump instructions (break, continue, and
return). Unlike GLSL IR, though, the leaves of the tree aren't statements but
basic blocks. Each basic block also keeps track of its successors and
predecessors, and function implementations keep track of the beginning basic
block (the first basic block of the function) and the ending basic block (a fake
basic block that every return statement points to). Together, these elements
make up the control flow graph, in this case a redundant piece of information on
top of the control flow tree that will be used by almost all the optimizations.
There are helper functions to add and remove control flow nodes that also update
the control flow graph, and so usually it doesn't need to be touched by passes
that modify control flow nodes.
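
The tree shape described above, with basic blocks as leaves under if and loop
nodes, can be sketched in C as follows. The node layout and all names
(sketch_cf_node, body, else_body, sketch_count_blocks) are assumptions made for
this example; the sketch leaves out the successor/predecessor links and the
fake ending block.

```c
#include <assert.h>
#include <stddef.h>

enum sketch_cf_type { SKETCH_CF_BLOCK, SKETCH_CF_IF, SKETCH_CF_LOOP };

/* Hypothetical control flow node; not NIR's actual structure. */
struct sketch_cf_node {
    enum sketch_cf_type type;
    struct sketch_cf_node *next;      /* next sibling in the enclosing list */
    struct sketch_cf_node *body;      /* IF: then branch, LOOP: loop body */
    struct sketch_cf_node *else_body; /* IF only, NULL otherwise */
};

/* Count the basic blocks (the leaves) reachable from a sibling list. */
static unsigned sketch_count_blocks(const struct sketch_cf_node *list)
{
    unsigned count = 0;
    for (const struct sketch_cf_node *n = list; n != NULL; n = n->next) {
        if (n->type == SKETCH_CF_BLOCK) {
            count++;
        } else {
            /* If and loop nodes hold no instructions themselves; recurse. */
            assert(n->type == SKETCH_CF_IF || n->else_body == NULL);
            count += sketch_count_blocks(n->body);
            count += sketch_count_blocks(n->else_body);
        }
    }
    return count;
}
```

A pass that only needs instructions can walk the leaves this way, while one
that needs dataflow would follow the successor and predecessor links instead.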