New IR, or NIR, is an IR for Mesa intended to sit below GLSL IR and Mesa IR.
Its design inherits from the various IRs that Mesa has used in the past, as
well as Direct3D assembly, and it includes a few new ideas as well. It is a
flat (in terms of using instructions instead of expressions), typeless IR,
similar to TGSI and Mesa IR.  It also supports SSA (although it doesn't require
it).

Variables
=========

NIR includes support for source-level GLSL variables through a structure mostly
copied from GLSL IR. These will be used for linking and conversion from GLSL IR
(and later, from an AST), but for the most part, they will be lowered to
registers (see below) and loads/stores.

Registers
=========

Registers are light-weight; each one is a structure that contains only its
size, its index for liveness analysis, and an optional name for debugging. In
addition, registers can be local to a function or global to the entire shader;
the latter will be used in ARB_shader_subroutine for passing parameters and
getting return values from subroutines. A register can also be an array, in
which case it can be accessed indirectly. Each ALU instruction (add, subtract,
etc.) works directly with registers or SSA values (see below).
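
As a rough illustration of how small such a register can be, the C sketch below
models the fields listed above. All names in it (sketch_reg, sketch_reg_init,
num_array_elems, is_global, and so on) are hypothetical and were chosen for this
example; they are not taken from NIR's actual definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical register record; every name here is an assumption. */
struct sketch_reg {
    unsigned num_components;  /* size of the register */
    unsigned num_array_elems; /* 0 when the register is not an array */
    unsigned index;           /* dense index, e.g. a slot in a liveness bitset */
    const char *name;         /* optional, for debugging only; may be NULL */
    bool is_global;           /* shader-wide rather than function-local */
};

/* Fill in a register; the caller hands out indices in allocation order. */
void sketch_reg_init(struct sketch_reg *reg, unsigned num_components,
                     unsigned num_array_elems, unsigned index,
                     const char *name, bool is_global)
{
    assert(reg != NULL && num_components > 0);
    reg->num_components = num_components;
    reg->num_array_elems = num_array_elems;
    reg->index = index;
    reg->name = name;
    reg->is_global = is_global;
}

/* Only array registers can be accessed indirectly. */
bool sketch_reg_allows_indirect(const struct sketch_reg *reg)
{
    assert(reg != NULL);
    return reg->num_array_elems > 0;
}
```

Keeping the index dense is what would let a liveness pass use it directly as a
bit position, without a separate lookup table.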

SSA
===

Everywhere a register can be loaded/stored, an SSA value can be used instead.
The only exception is that arrays/indirect addressing are not supported with
SSA; although research has been done on extensions of SSA to arrays before, it's
usually for the purpose of parallelization (which we're not interested in), and
adds some overhead in the form of adding copies or extra arrays (which is much
more expensive than introducing copies between non-array registers). SSA uses
point directly to their corresponding definition, which in turn points to the
instruction it is part of. This creates an implicit use-def chain and avoids the
need for an external structure for each SSA register.
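
The pointer layout described above can be sketched in C as follows. This is a
minimal model under stated assumptions: the type and function names
(sketch_ssa_def, sketch_ssa_src, sketch_instr, sketch_src_producer) are
invented for the example, and a real instruction would carry far more state.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_instr;

/* An SSA definition records the instruction that produces it. */
struct sketch_ssa_def {
    struct sketch_instr *parent_instr;
    unsigned num_components;
};

/* An SSA use is just a pointer to the definition it reads. */
struct sketch_ssa_src {
    struct sketch_ssa_def *def;
};

enum sketch_op { SKETCH_OP_CONST, SKETCH_OP_ADD };

struct sketch_instr {
    enum sketch_op op;
    struct sketch_ssa_def def;     /* the value this instruction defines */
    struct sketch_ssa_src srcs[2]; /* the values it reads (NULL for CONST) */
};

/* Set up an instruction and point its definition back at itself. */
void sketch_instr_init(struct sketch_instr *instr, enum sketch_op op,
                       struct sketch_ssa_def *a, struct sketch_ssa_def *b)
{
    assert(instr != NULL);
    instr->op = op;
    instr->def.parent_instr = instr;
    instr->def.num_components = 1;
    instr->srcs[0].def = a;
    instr->srcs[1].def = b;
}

/* Walk from a use to its producer: two pointer hops, no side table. */
struct sketch_instr *sketch_src_producer(const struct sketch_ssa_src *src)
{
    assert(src != NULL && src->def != NULL);
    return src->def->parent_instr;
}
```

Because the definition is embedded in the instruction, the use-def chain exists
as soon as the instruction does, and nothing has to be rebuilt after a pass
inserts or removes code.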

Functions
=========

Support for function calls is mostly similar to GLSL IR. Each shader contains a
list of functions, and each function has a list of overloads. Each overload
contains a list of parameters, and may contain an implementation which specifies
the variables that correspond to the parameters and return value. Inlining a
function, assuming it has a single return point, is as simple as copying its
instructions, registers, and local variables into the target function and then
inserting copies to and from the new parameters as appropriate. After functions
are inlined and any non-subroutine functions are deleted, parameters and return
variables will be converted to global variables and then global registers. We
don't do this lowering earlier (i.e. the fortranizer idea) for a few reasons:

- If we want to do optimizations before link time, we need to have the function
signature available at link time.

- If we do any inlining before link time, then we might wind up with the
inlined function and the non-inlined function using the same global
variables/registers, which would preclude optimization.

Intrinsics
==========

Any operation (other than function calls and textures) which touches a variable
or is not referentially transparent is represented by an intrinsic. Intrinsics
are similar to the idea of a "builtin function," i.e. a function declaration
whose implementation is provided by the backend, except they are more powerful
in the following ways:

- They can also load and store registers when appropriate, which limits the
number of variables needed in later stages of the IR while obviating the need
for a separate load/store variable instruction.

- Intrinsics can be marked as side-effect free, which permits them to be
treated like any other instruction when it comes to optimizations. This allows
load intrinsics to be represented as intrinsics while still being optimized
away by dead code elimination, common subexpression elimination, etc.
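
One way to picture the side-effect-free marking is a per-intrinsic info table
that optimization passes consult. The sketch below is an assumption-laden
illustration: the enum values, the table layout, and the
sketch_intrinsic_is_removable helper are all made up for this example and do
not describe NIR's real tables.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical intrinsic opcodes, chosen only for this example. */
enum sketch_intrinsic {
    SKETCH_INTRINSIC_LOAD_UNIFORM,
    SKETCH_INTRINSIC_STORE_OUTPUT,
    SKETCH_INTRINSIC_MEMORY_BARRIER,
    SKETCH_NUM_INTRINSICS
};

struct sketch_intrinsic_info {
    const char *name;
    bool has_dest;         /* does it produce a value? */
    bool side_effect_free; /* may it be treated like an ALU instruction? */
};

static const struct sketch_intrinsic_info
sketch_intrinsic_infos[SKETCH_NUM_INTRINSICS] = {
    [SKETCH_INTRINSIC_LOAD_UNIFORM]   = { "load_uniform",   true,  true  },
    [SKETCH_INTRINSIC_STORE_OUTPUT]   = { "store_output",   false, false },
    [SKETCH_INTRINSIC_MEMORY_BARRIER] = { "memory_barrier", false, false },
};

/*
 * Dead code elimination may drop an intrinsic only when it is marked
 * side-effect free and the value it produces has no remaining uses.
 */
bool sketch_intrinsic_is_removable(enum sketch_intrinsic op, unsigned num_uses)
{
    assert((unsigned)op < (unsigned)SKETCH_NUM_INTRINSICS);
    const struct sketch_intrinsic_info *info = &sketch_intrinsic_infos[op];
    return info->side_effect_free && info->has_dest && num_uses == 0;
}
```

With a table like this, a generic pass needs no per-intrinsic special cases: an
unused load is removable, while a store or barrier never is.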

Intrinsics are used for:

- Atomic operations
- Memory barriers
- Subroutine calls
- Geometry shader emitVertex and endPrimitive
- Loading and storing variables (before lowering)
- Loading and storing uniforms, shader inputs and outputs, etc. (after lowering)
- Copying variables (cases where in GLSL the destination is a structure or
array)
- The kitchen sink
- ...

Textures
========

Unfortunately, there are far too many texture operations to represent each one
of them with an intrinsic, so there's a special texture instruction similar to
the GLSL IR one. The biggest difference is that, while the texture instruction
has a sampler dereference field used just like in GLSL IR, this gets lowered to
a texture unit index (with a possible indirect offset) while the type
information of the original sampler is kept around for backends. Also, all the
non-constant sources are stored in a single array to make it easier for
optimization passes to iterate over all the sources.
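
To show why a single source array helps, here is a small C sketch in which each
source is tagged with its role and a generic pass rewrites sources with one
loop. The names (sketch_tex_instr, sketch_tex_src_type, value_id, and the
helper functions) are hypothetical, and value_id is only a stand-in for a
reference to a register or SSA value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical roles a texture source can play. */
enum sketch_tex_src_type {
    SKETCH_TEX_SRC_COORD,
    SKETCH_TEX_SRC_LOD,
    SKETCH_TEX_SRC_BIAS
};

struct sketch_tex_src {
    enum sketch_tex_src_type type;
    int value_id; /* stand-in for a register or SSA value reference */
};

#define SKETCH_MAX_TEX_SRCS 4

struct sketch_tex_instr {
    unsigned texture_index; /* unit index left after lowering the sampler */
    unsigned num_srcs;
    struct sketch_tex_src srcs[SKETCH_MAX_TEX_SRCS];
};

void sketch_tex_init(struct sketch_tex_instr *tex, unsigned texture_index)
{
    assert(tex != NULL);
    tex->texture_index = texture_index;
    tex->num_srcs = 0;
}

void sketch_tex_add_src(struct sketch_tex_instr *tex,
                        enum sketch_tex_src_type type, int value_id)
{
    assert(tex != NULL && tex->num_srcs < SKETCH_MAX_TEX_SRCS);
    tex->srcs[tex->num_srcs].type = type;
    tex->srcs[tex->num_srcs].value_id = value_id;
    tex->num_srcs++;
}

/* Generic pass helper: one loop covers every source, whatever its role. */
unsigned sketch_tex_rewrite_srcs(struct sketch_tex_instr *tex,
                                 int old_id, int new_id)
{
    unsigned rewritten = 0;
    for (unsigned i = 0; i < tex->num_srcs; i++) {
        if (tex->srcs[i].value_id == old_id) {
            tex->srcs[i].value_id = new_id;
            rewritten++;
        }
    }
    return rewritten;
}

/* Backends that care about a specific role look it up by tag; -1 if absent. */
int sketch_tex_src_index(const struct sketch_tex_instr *tex,
                         enum sketch_tex_src_type type)
{
    for (unsigned i = 0; i < tex->num_srcs; i++) {
        if (tex->srcs[i].type == type)
            return (int)i;
    }
    return -1;
}
```

Had each role been a separate named field instead, every pass that visits
sources would need to know about every field.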

Control Flow
============

As in GLSL IR, control flow consists of a tree of "control flow nodes", which
include if statements and loops, and jump instructions (break, continue, and
return). Unlike GLSL IR, though, the leaves of the tree aren't statements but
basic blocks. Each basic block also keeps track of its successors and
predecessors, and function implementations keep track of the beginning basic
block (the first basic block of the function) and the ending basic block (a fake
basic block that every return statement points to). Together, these elements
make up the control flow graph, in this case a redundant piece of information on
top of the control flow tree that will be used by almost all the optimizations.
There are helper functions to add and remove control flow nodes that also update
the control flow graph, and so usually it doesn't need to be touched by passes
that modify control flow nodes.
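
The relationship between the tree and the graph can be sketched in C as below.
This is only an illustration under assumptions: the names (sketch_cf_node,
sketch_block, sketch_block_link) and the fixed-size predecessor array are
invented for the example. The point it tries to show is that a single helper
updates both directions of a graph edge, so the two views cannot drift apart.

```c
#include <assert.h>
#include <stddef.h>

enum sketch_cf_type { SKETCH_CF_BLOCK, SKETCH_CF_IF, SKETCH_CF_LOOP };

/* Common header for every node in the control flow tree. */
struct sketch_cf_node {
    enum sketch_cf_type type;
    struct sketch_cf_node *parent; /* NULL at the top level of a function */
};

#define SKETCH_MAX_PREDS 8

/* Basic blocks are the leaves of the tree and the vertices of the graph. */
struct sketch_block {
    struct sketch_cf_node cf_node;
    struct sketch_block *successors[2]; /* at most two outgoing edges */
    struct sketch_block *predecessors[SKETCH_MAX_PREDS];
    unsigned num_predecessors;
};

void sketch_block_init(struct sketch_block *block,
                       struct sketch_cf_node *parent)
{
    assert(block != NULL);
    block->cf_node.type = SKETCH_CF_BLOCK;
    block->cf_node.parent = parent;
    block->successors[0] = NULL;
    block->successors[1] = NULL;
    block->num_predecessors = 0;
}

/* Add the edge pred -> succ, keeping both directions in sync. */
void sketch_block_link(struct sketch_block *pred, struct sketch_block *succ)
{
    assert(pred != NULL && succ != NULL);
    assert(succ->num_predecessors < SKETCH_MAX_PREDS);
    if (pred->successors[0] == NULL) {
        pred->successors[0] = succ;
    } else {
        assert(pred->successors[1] == NULL);
        pred->successors[1] = succ;
    }
    succ->predecessors[succ->num_predecessors++] = pred;
}
```

For an if statement, the block before it would be linked to the first block of
each branch, and the last block of each branch to the block that follows,
giving the usual diamond shape in the graph.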