1bf215546Sopenharmony_cir600-sb
2bf215546Sopenharmony_ci=======
3bf215546Sopenharmony_ci
4bf215546Sopenharmony_ci* * * * *
5bf215546Sopenharmony_ci
6bf215546Sopenharmony_ciDebugging
7bf215546Sopenharmony_ci---------
8bf215546Sopenharmony_ci
9bf215546Sopenharmony_ci### Environment variables
10bf215546Sopenharmony_ci
11bf215546Sopenharmony_ci-   **R600\_DEBUG**
12bf215546Sopenharmony_ci
13bf215546Sopenharmony_ci    There are new flags:
14bf215546Sopenharmony_ci
15bf215546Sopenharmony_ci    -   **nosb** - Disable sb backend for graphics shaders
16bf215546Sopenharmony_ci    -   **sbdry** - Dry run, optimize but use source bytecode -
17bf215546Sopenharmony_ci        useful if you only want to check shader dumps
18bf215546Sopenharmony_ci        without the risk of lockups and other problems
19bf215546Sopenharmony_ci    -   **sbstat** - Print optimization statistics (only time so far)
20bf215546Sopenharmony_ci    -   **sbdump** - Print IR after some passes.
21bf215546Sopenharmony_ci    -   **sbnofallback** - Abort on errors instead of fallback
22bf215546Sopenharmony_ci    -   **sbdisasm** - Use sb disassembler for shader dumps
23bf215546Sopenharmony_ci    -   **sbsafemath** - Disable unsafe math optimizations
24bf215546Sopenharmony_ci    -   **use_tgsi** - Take in TGSI from the frontend instead of asking for NIR
25bf215546Sopenharmony_ci
26bf215546Sopenharmony_ci### Regression debugging
27bf215546Sopenharmony_ci
28bf215546Sopenharmony_ciIf there are any regressions as compared to the default backend
29bf215546Sopenharmony_ci(R600\_SB=0), it's possible to use the following environment variables
30bf215546Sopenharmony_cito find the incorrectly optimized shader that causes the regression.
31bf215546Sopenharmony_ci
32bf215546Sopenharmony_ci-   **R600\_SB\_DSKIP\_MODE** - allows to skip optimization for some
33bf215546Sopenharmony_ci    shaders
34bf215546Sopenharmony_ci    -   0 - disabled (default)
35bf215546Sopenharmony_ci    -   1 - skip optimization for the shaders in the range
36bf215546Sopenharmony_ci        [R600\_SB\_DSKIP\_START; R600\_SB\_DSKIP\_END], that is,
37bf215546Sopenharmony_ci        optimize only the shaders that are not in this range
38bf215546Sopenharmony_ci    -   2 - optimize only the shaders in the range
39bf215546Sopenharmony_ci        [R600\_SB\_DSKIP\_START; R600\_SB\_DSKIP\_END]
40bf215546Sopenharmony_ci
41bf215546Sopenharmony_ci-   **R600\_SB\_DSKIP\_START** - start of the range (1-based)
42bf215546Sopenharmony_ci
43bf215546Sopenharmony_ci-   **R600\_SB\_DSKIP\_END** - end of the range (1-based)
44bf215546Sopenharmony_ci
45bf215546Sopenharmony_ciExample - optimize only the shaders 5, 6, and 7:
46bf215546Sopenharmony_ci
47bf215546Sopenharmony_ci    R600_SB_DSKIP_START=5 R600_SB_DSKIP_END=7 R600_SB_DSKIP_MODE=2
48bf215546Sopenharmony_ci
49bf215546Sopenharmony_ciAll shaders compiled by the application are numbered starting from 1,
50bf215546Sopenharmony_cithe number of shaders used by the application may be obtained by running
51bf215546Sopenharmony_ciit with "R600_DEBUG=sb,sbstat" - it will print "sb: shader \#index\#"
52bf215546Sopenharmony_cifor each compiled shader.
53bf215546Sopenharmony_ci
54bf215546Sopenharmony_ciAfter figuring out the total number of shaders used by the application,
55bf215546Sopenharmony_cithe variables above allow to use bisection to find the shader that is
56bf215546Sopenharmony_cithe cause of regression. E.g. if the application uses 100 shaders, we
57bf215546Sopenharmony_cican divide the range [1; 100] and run the application with the
58bf215546Sopenharmony_cioptimization enabled only for the first half of the shaders:
59bf215546Sopenharmony_ci
60bf215546Sopenharmony_ci    R600_SB_DSKIP_START=1 R600_SB_DSKIP_END=50 R600_SB_DSKIP_MODE=2 <app>
61bf215546Sopenharmony_ci
62bf215546Sopenharmony_ciIf the regression is reproduced with these parameters, then the failing
63bf215546Sopenharmony_cishader is in the range [1; 50], if it's not reproduced - then it's in
64bf215546Sopenharmony_cithe range [51; 100]. Then we can divide the new range again and repeat
65bf215546Sopenharmony_cithe testing, until we'll reduce the range to a single failing shader.
66bf215546Sopenharmony_ci
67bf215546Sopenharmony_ci*NOTE: This method relies on the assumption that the application
68bf215546Sopenharmony_ciproduces the same sequence of the shaders on each run. It's not always
69bf215546Sopenharmony_citrue - some applications may produce different sequences of the shaders,
70bf215546Sopenharmony_ciin such cases the tools like apitrace may be used to record the trace
71bf215546Sopenharmony_ciwith the application, then this method may be applied when replaying the
72bf215546Sopenharmony_citrace - also this may be faster and/or more convenient than testing the
73bf215546Sopenharmony_ciapplication itself.*
74bf215546Sopenharmony_ci
75bf215546Sopenharmony_ci* * * * *
76bf215546Sopenharmony_ci
77bf215546Sopenharmony_ciIntermediate Representation
78bf215546Sopenharmony_ci---------------------------
79bf215546Sopenharmony_ci
80bf215546Sopenharmony_ci### Values
81bf215546Sopenharmony_ci
82bf215546Sopenharmony_ciAll kinds of the operands (literal constants, references to kcache
83bf215546Sopenharmony_ciconstants, references to GPRs, etc) are currently represented by the
84bf215546Sopenharmony_ci**value** class (possibly it makes sense to switch to hierarchy of
85bf215546Sopenharmony_ciclasses derived from **value** instead, to save some memory).
86bf215546Sopenharmony_ci
87bf215546Sopenharmony_ciAll values (except some pseudo values like the exec\_mask or predicate
88bf215546Sopenharmony_ciregister) represent 32bit scalar values - there are no vector values,
89bf215546Sopenharmony_ciCF/FETCH instructions use groups of 4 values for src and dst operands.
90bf215546Sopenharmony_ci
91bf215546Sopenharmony_ci### Nodes
92bf215546Sopenharmony_ci
93bf215546Sopenharmony_ciShader programs are represented using the tree data structure, some
94bf215546Sopenharmony_cinodes contain a list of subnodes.
95bf215546Sopenharmony_ci
96bf215546Sopenharmony_ci#### Control flow nodes
97bf215546Sopenharmony_ci
98bf215546Sopenharmony_ciControl flow information is represented using four special node types
99bf215546Sopenharmony_ci(based on the ideas from [[1]](#references) )
100bf215546Sopenharmony_ci
101bf215546Sopenharmony_ci-   **region\_node** - single-entry, single-exit region.
102bf215546Sopenharmony_ci
103bf215546Sopenharmony_ci    All loops and if's in the program are enclosed in region nodes.
104bf215546Sopenharmony_ci    Region nodes have two containers for phi nodes -
105bf215546Sopenharmony_ci    region\_node::loop\_phi contains the phi expressions to be executed
106bf215546Sopenharmony_ci    at the region entry, region\_node::phi contains the phi expressions
107bf215546Sopenharmony_ci    to be executed at the region exit. It's the only type of the node
108bf215546Sopenharmony_ci    that contains associated phi expressions.
109bf215546Sopenharmony_ci
110bf215546Sopenharmony_ci-   **depart\_node** - "depart region \$id after { ... }"
111bf215546Sopenharmony_ci
112bf215546Sopenharmony_ci    Depart target region (jump to exit point) after executing contained
113bf215546Sopenharmony_ci    code.
114bf215546Sopenharmony_ci
115bf215546Sopenharmony_ci-   **repeat\_node** - "repeat region \$id after { ... }"
116bf215546Sopenharmony_ci
117bf215546Sopenharmony_ci    Repeat target region (jump to entry point) after executing contained
118bf215546Sopenharmony_ci    code.
119bf215546Sopenharmony_ci
120bf215546Sopenharmony_ci-   **if\_node** - "if (cond) { ... }"
121bf215546Sopenharmony_ci
122bf215546Sopenharmony_ci    Execute contained code if condition is true. The difference from
123bf215546Sopenharmony_ci    [[1]](#references) is that we don't have associated phi expressions
124bf215546Sopenharmony_ci    for the **if\_node**, we enclose **if\_node** in the
125bf215546Sopenharmony_ci    **region\_node** and store corresponding phi's in the
126bf215546Sopenharmony_ci    **region\_node**, this allows more uniform handling.
127bf215546Sopenharmony_ci
128bf215546Sopenharmony_ciThe target region of depart and repeat nodes is always the region where
129bf215546Sopenharmony_cithey are located (possibly in the nested region), there are no arbitrary
130bf215546Sopenharmony_cijumps/goto's - control flow in the program is always structured.
131bf215546Sopenharmony_ci
132bf215546Sopenharmony_ciTypical control flow constructs can be represented as in the following
133bf215546Sopenharmony_ciexamples:
134bf215546Sopenharmony_ci
135bf215546Sopenharmony_ciGLSL:
136bf215546Sopenharmony_ci
137bf215546Sopenharmony_ci    if (cond) {
138bf215546Sopenharmony_ci        < 1 >
139bf215546Sopenharmony_ci    } else {
140bf215546Sopenharmony_ci        < 2 >
141bf215546Sopenharmony_ci    }
142bf215546Sopenharmony_ci
143bf215546Sopenharmony_ciIR:
144bf215546Sopenharmony_ci
145bf215546Sopenharmony_ci    region #0 {
146bf215546Sopenharmony_ci        depart region #0 after {
147bf215546Sopenharmony_ci            if (cond) {
148bf215546Sopenharmony_ci                depart region #0 after {
149bf215546Sopenharmony_ci                    < 1 >
150bf215546Sopenharmony_ci                }
151bf215546Sopenharmony_ci            }
152bf215546Sopenharmony_ci            < 2 >
153bf215546Sopenharmony_ci        }
154bf215546Sopenharmony_ci        <region #0 phi nodes >
155bf215546Sopenharmony_ci    }
156bf215546Sopenharmony_ci
157bf215546Sopenharmony_ciGLSL:
158bf215546Sopenharmony_ci
159bf215546Sopenharmony_ci    while (cond) {
160bf215546Sopenharmony_ci        < 1 >
161bf215546Sopenharmony_ci    }
162bf215546Sopenharmony_ci
163bf215546Sopenharmony_ciIR:
164bf215546Sopenharmony_ci
165bf215546Sopenharmony_ci    region #0 {
166bf215546Sopenharmony_ci        <region #0 loop_phi nodes>
167bf215546Sopenharmony_ci        repeat region #0 after {
168bf215546Sopenharmony_ci            region #1 {
169bf215546Sopenharmony_ci                depart region #1 after {
170bf215546Sopenharmony_ci                    if (!cond) {
171bf215546Sopenharmony_ci                        depart region #0
172bf215546Sopenharmony_ci                    }
173bf215546Sopenharmony_ci                }
174bf215546Sopenharmony_ci            }
175bf215546Sopenharmony_ci            < 1 >
176bf215546Sopenharmony_ci        }
177bf215546Sopenharmony_ci        <region #0 phi nodes>
178bf215546Sopenharmony_ci    }
179bf215546Sopenharmony_ci
180bf215546Sopenharmony_ci'Break' and 'continue' inside the loops are directly translated to the
181bf215546Sopenharmony_cidepart and repeat nodes for the corresponding loop region.
182bf215546Sopenharmony_ci
183bf215546Sopenharmony_ciThis may look a bit too complicated, but in fact this allows more simple
184bf215546Sopenharmony_ciand uniform handling of the control flow.
185bf215546Sopenharmony_ci
186bf215546Sopenharmony_ciAll loop\_phi and phi nodes for some region always have the same number
187bf215546Sopenharmony_ciof source operands. The number of source operands for
188bf215546Sopenharmony_ciregion\_node::loop\_phi nodes is 1 + number of repeat nodes that
189bf215546Sopenharmony_cireference this region as a target. The number of source operands for
190bf215546Sopenharmony_ciregion\_node::phi nodes is equal to the number of depart nodes that
191bf215546Sopenharmony_cireference this region as a target. All depart/repeat nodes for the
192bf215546Sopenharmony_ciregion have unique indices equal to the index of source operand for
193bf215546Sopenharmony_ciphi/loop\_phi nodes.
194bf215546Sopenharmony_ci
195bf215546Sopenharmony_ciFirst source operand for region\_node::loop\_phi nodes (src[0]) is an
196bf215546Sopenharmony_ciincoming value that enters the region from the outside. Each remaining
197bf215546Sopenharmony_cisource operand comes from the corresponding repeat node.
198bf215546Sopenharmony_ci
199bf215546Sopenharmony_ciMore complex example:
200bf215546Sopenharmony_ci
201bf215546Sopenharmony_ciGLSL:
202bf215546Sopenharmony_ci
203bf215546Sopenharmony_ci    a = 1;
204bf215546Sopenharmony_ci    while (a < 5) {
205bf215546Sopenharmony_ci        a = a * 2;
206bf215546Sopenharmony_ci        if (b == 3) {
207bf215546Sopenharmony_ci            continue;
208bf215546Sopenharmony_ci        } else {
209bf215546Sopenharmony_ci            a = 6;
210bf215546Sopenharmony_ci        }
211bf215546Sopenharmony_ci        if (c == 4)
212bf215546Sopenharmony_ci            break;
213bf215546Sopenharmony_ci        a = a + 1;
214bf215546Sopenharmony_ci    }
215bf215546Sopenharmony_ci
216bf215546Sopenharmony_ciIR with SSA form:
217bf215546Sopenharmony_ci
218bf215546Sopenharmony_ci    a.1 = 1;
219bf215546Sopenharmony_ci    region #0 {
220bf215546Sopenharmony_ci        // loop phi values: src[0] - incoming, src[1] - from repeat_1, src[2] - from repeat_2
221bf215546Sopenharmony_ci        region#0 loop_phi: a.2 = phi a.1, a.6, a.3
222bf215546Sopenharmony_ci
223bf215546Sopenharmony_ci        repeat_1 region #0 after {
224bf215546Sopenharmony_ci            a.3 = a.2 * 2;
225bf215546Sopenharmony_ci            cond1 = (b == 3);
226bf215546Sopenharmony_ci            region #1 {
227bf215546Sopenharmony_ci                depart_0 region #1 after {
228bf215546Sopenharmony_ci                    if (cond1) {
229bf215546Sopenharmony_ci                        repeat_2 region #0;
230bf215546Sopenharmony_ci                    }
231bf215546Sopenharmony_ci                }
232bf215546Sopenharmony_ci                a.4 = 6;
233bf215546Sopenharmony_ci
234bf215546Sopenharmony_ci                region #1 phi: a.5 = phi a.4; // src[0] - from depart_0
235bf215546Sopenharmony_ci            }
236bf215546Sopenharmony_ci            cond2 = (c == 4);
237bf215546Sopenharmony_ci            region #2 {
238bf215546Sopenharmony_ci                depart_0 region #2 after {
239bf215546Sopenharmony_ci                    if (cond2) {
240bf215546Sopenharmony_ci                        depart_0 region #0;
241bf215546Sopenharmony_ci                    }
242bf215546Sopenharmony_ci                }
243bf215546Sopenharmony_ci            }
244bf215546Sopenharmony_ci            a.6 = a.5 + 1;
245bf215546Sopenharmony_ci        }
246bf215546Sopenharmony_ci
247bf215546Sopenharmony_ci        region #0 phi: a.7 = phi a.5 // src[0] from depart_0
248bf215546Sopenharmony_ci    }
249bf215546Sopenharmony_ci
250bf215546Sopenharmony_ciPhi nodes with single source operand are just copies, they are not
251bf215546Sopenharmony_cireally necessary, but this allows to handle all **depart\_node**s in the
252bf215546Sopenharmony_ciuniform way.
253bf215546Sopenharmony_ci
254bf215546Sopenharmony_ci#### Instruction nodes
255bf215546Sopenharmony_ci
256bf215546Sopenharmony_ciInstruction nodes represent different kinds of instructions -
257bf215546Sopenharmony_ci**alu\_node**, **cf\_node**, **fetch\_node**, etc. Each of them contains
258bf215546Sopenharmony_cithe "bc" structure where all fields of the bytecode are stored (the type
259bf215546Sopenharmony_ciis **bc\_alu** for **alu\_node**, etc). The operands are represented
260bf215546Sopenharmony_ciusing the vectors of pointers to **value** class (node::src, node::dst)
261bf215546Sopenharmony_ci
262bf215546Sopenharmony_ci#### SSA-specific nodes
263bf215546Sopenharmony_ci
264bf215546Sopenharmony_ciPhi nodes currently don't have special node class, they are stored as
265bf215546Sopenharmony_ci**node**. Destination vector contains a single destination value, source
266bf215546Sopenharmony_civector contains 1 or more source values.
267bf215546Sopenharmony_ci
268bf215546Sopenharmony_ciPsi nodes [[5], [6]](#references) also don't have a special node class
269bf215546Sopenharmony_ciand stored as **node**. Source vector contains 3 values for each source
270bf215546Sopenharmony_cioperand - the **value** of predicate, **value** of corresponding
271bf215546Sopenharmony_ciPRED\_SEL field, and the source **value** itself.
272bf215546Sopenharmony_ci
273bf215546Sopenharmony_ci### Indirect addressing
274bf215546Sopenharmony_ci
275bf215546Sopenharmony_ciSpecial kind of values (VLK\_RELREG) is used to represent indirect
276bf215546Sopenharmony_cioperands. These values don't have SSA versions. The representation is
277bf215546Sopenharmony_cimostly based on the [[2]](#references). Indirect operand contains the
278bf215546Sopenharmony_ci"offset/address" value (value::rel), (e.g. some SSA version of the AR
279bf215546Sopenharmony_ciregister value, though after some passes it may be any value - constant,
280bf215546Sopenharmony_ciregister, etc), also it contains the maydef and mayuse vectors of
281bf215546Sopenharmony_cipointers to **value**s (similar to dst/src vectors in the **node**) to
282bf215546Sopenharmony_cirepresent the effects of aliasing in the SSA form.
283bf215546Sopenharmony_ci
284bf215546Sopenharmony_ciE.g. if we have the array R5.x ... R8.x and the following instruction :
285bf215546Sopenharmony_ci
286bf215546Sopenharmony_ci    MOV R0.x, R[5 + AR].x
287bf215546Sopenharmony_ci
288bf215546Sopenharmony_cithen source indirect operand is represented with the VLK\_RELREG value,
289bf215546Sopenharmony_civalue::rel is AR, value::maydef is empty (in fact it always contain the
290bf215546Sopenharmony_cisame number of elements as mayuse to simplify the handling, but they are
291bf215546Sopenharmony_ciNULLs), value::mayuse contains [R5.x, R6.x, R7.x, R8.x] (or the
292bf215546Sopenharmony_cicorresponding SSA versions after ssa\_rename).
293bf215546Sopenharmony_ci
294bf215546Sopenharmony_ciAdditional "virtual variables" as in [HSSA [2]](#references) are not
295bf215546Sopenharmony_ciused, also there is no special handling for "zero versions". Typical
296bf215546Sopenharmony_ciprograms in our case are small, indirect addressing is rare, array sizes
297bf215546Sopenharmony_ciare limited by max gpr number, so we don't really need to use special
298bf215546Sopenharmony_citricks to avoid the explosion of value versions. Also this allows more
299bf215546Sopenharmony_ciprecise liveness computation for array elements without modifications to
300bf215546Sopenharmony_cithe algorithms.
301bf215546Sopenharmony_ci
302bf215546Sopenharmony_ciWith the following instruction:
303bf215546Sopenharmony_ci
304bf215546Sopenharmony_ci    MOV R[5+AR].x, R0.x
305bf215546Sopenharmony_ci
306bf215546Sopenharmony_ciwe'll have both maydef and mayuse vectors for dst operand filled with
307bf215546Sopenharmony_ciarray values initially: [R5.x, R6.x, R7.x, R8.x]. After the ssa\_rename
308bf215546Sopenharmony_cipass mayuse will contain previous versions, maydef will contain new
309bf215546Sopenharmony_cipotentially-defined versions.
310bf215546Sopenharmony_ci
311bf215546Sopenharmony_ci* * * * *
312bf215546Sopenharmony_ci
313bf215546Sopenharmony_ciPasses
314bf215546Sopenharmony_ci------
315bf215546Sopenharmony_ci
316bf215546Sopenharmony_ci-   **bc\_parser** - creates the IR from the source bytecode,
317bf215546Sopenharmony_ci    initializes src and dst value vectors for instruction nodes. Most
318bf215546Sopenharmony_ci    ALU nodes have one dst operand and the number of source operands is
319bf215546Sopenharmony_ci    equal to the number of source operands for the ISA instruction.
320bf215546Sopenharmony_ci    Nodes for PREDSETxx instructions have 3 dst operands - dst[0] is dst
321bf215546Sopenharmony_ci    gpr as in the original instruction, other two are pseudo-operands
322bf215546Sopenharmony_ci    that represent possibly updated predicate and exec\_mask. Predicate
323bf215546Sopenharmony_ci    values are used in the predicated alu instructions (node::pred),
324bf215546Sopenharmony_ci    exec\_mask values are used in the if\_nodes (if\_node::cond). Each
325bf215546Sopenharmony_ci    vector operand in the CF/TEX/VTX instructions is represented with 4
326bf215546Sopenharmony_ci    values - components of the vector.
327bf215546Sopenharmony_ci
328bf215546Sopenharmony_ci-   **ssa\_prepare** - creates phi expressions.
329bf215546Sopenharmony_ci
330bf215546Sopenharmony_ci-   **ssa\_rename** - renames the values (assigns versions).
331bf215546Sopenharmony_ci
332bf215546Sopenharmony_ci-   **liveness** - liveness computation, sets 'dead' flag for unused
333bf215546Sopenharmony_ci    nodes and values, optionally computes interference information for
334bf215546Sopenharmony_ci    the values.
335bf215546Sopenharmony_ci
336bf215546Sopenharmony_ci-   **dce\_cleanup** - eliminates 'dead' nodes, also removes some
337bf215546Sopenharmony_ci    unnecessary nodes created by bc\_parser, e.g. the nodes for the JUMP
338bf215546Sopenharmony_ci    instructions in the source, containers for ALU groups (they were
339bf215546Sopenharmony_ci    only needed for the ssa\_rename pass)
340bf215546Sopenharmony_ci
341bf215546Sopenharmony_ci-   **if\_conversion** - converts control flow with if\_nodes to the
342bf215546Sopenharmony_ci    data flow in cases where it can improve performance (small alu-only
343bf215546Sopenharmony_ci    branches). Both branches are executed speculatively and the phi
344bf215546Sopenharmony_ci    expressions are replaced with conditional moves (CNDxx) to select
345bf215546Sopenharmony_ci    the final value using the same condition predicate as was used by
346bf215546Sopenharmony_ci    the original if\_node. E.g. **if\_node** used dst[2] from PREDSETxx
347bf215546Sopenharmony_ci    instruction, CNDxx now uses dst[0] from the same PREDSETxx
348bf215546Sopenharmony_ci    instruction.
349bf215546Sopenharmony_ci
350bf215546Sopenharmony_ci-   **peephole** - peephole optimizations
351bf215546Sopenharmony_ci
352bf215546Sopenharmony_ci-   **gvn** - Global Value Numbering [[2]](#references),
353bf215546Sopenharmony_ci    [[3]](#references)
354bf215546Sopenharmony_ci
355bf215546Sopenharmony_ci-   **gcm** - Global Code Motion [[3]](#references). Also performs
356bf215546Sopenharmony_ci    grouping of the instructions of the same kind (CF/FETCH/ALU).
357bf215546Sopenharmony_ci
358bf215546Sopenharmony_ci-   register allocation passes, some ideas are used from
359bf215546Sopenharmony_ci    [[4]](#references), but implementation is simplified to make it more
360bf215546Sopenharmony_ci    efficient in terms of the compilation speed (e.g. no recursive
361bf215546Sopenharmony_ci    recoloring) while achieving good enough results.
362bf215546Sopenharmony_ci
363bf215546Sopenharmony_ci    -   **ra\_split** - prepares the program to register allocation.
364bf215546Sopenharmony_ci        Splits live ranges for constrained values by inserting the
365bf215546Sopenharmony_ci        copies to/from temporary values, so that the live range of the
366bf215546Sopenharmony_ci        constrained values becomes minimal.
367bf215546Sopenharmony_ci
368bf215546Sopenharmony_ci    -   **ra\_coalesce** - performs global allocation on registers used
369bf215546Sopenharmony_ci        in CF/FETCH instructions. It's performed first to make sure they
370bf215546Sopenharmony_ci        end up in the same GPR. Also tries to allocate all values
371bf215546Sopenharmony_ci        involved in copies (inserted by the ra\_split pass) to the same
372bf215546Sopenharmony_ci        register, so that the copies may be eliminated.
373bf215546Sopenharmony_ci
374bf215546Sopenharmony_ci    -   **ra\_init** - allocates gpr arrays (if indirect addressing is
375bf215546Sopenharmony_ci        used), and remaining values.
376bf215546Sopenharmony_ci
377bf215546Sopenharmony_ci-   **post\_scheduler** - ALU scheduler, handles VLIW packing and
378bf215546Sopenharmony_ci    performs the final register allocation for local values inside ALU
379bf215546Sopenharmony_ci    clauses. Eliminates all coalesced copies (if src and dst of the copy
380bf215546Sopenharmony_ci    are allocated to the same register).
381bf215546Sopenharmony_ci
382bf215546Sopenharmony_ci-   **ra\_checker** - optional debugging pass that tries to catch basic
383bf215546Sopenharmony_ci    errors of the scheduler or regalloc,
384bf215546Sopenharmony_ci
385bf215546Sopenharmony_ci-   **bc\_finalize** - propagates the regalloc information from values
386bf215546Sopenharmony_ci    in node::src and node::dst vectors to the bytecode fields, converts
387bf215546Sopenharmony_ci    control flow structure (region/depart/repeat) to the target
388bf215546Sopenharmony_ci    instructions (JUMP/ELSE/POP,
389bf215546Sopenharmony_ci    LOOP\_START/LOOP\_END/LOOP\_CONTINUE/LOOP\_BREAK).
390bf215546Sopenharmony_ci
391bf215546Sopenharmony_ci-   **bc\_builder** - builds final bytecode,
392bf215546Sopenharmony_ci
393bf215546Sopenharmony_ci* * * * *
394bf215546Sopenharmony_ci
395bf215546Sopenharmony_ciReferences
396bf215546Sopenharmony_ci----------
397bf215546Sopenharmony_ci
398bf215546Sopenharmony_ci[1] ["Tree-Based Code Optimization. A Thesis Proposal", Carl
399bf215546Sopenharmony_ciMcConnell](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.38.4210&rep=rep1&type=pdf)
400bf215546Sopenharmony_ci
401bf215546Sopenharmony_ci[2] ["Effective Representation of Aliases and Indirect Memory Operations
402bf215546Sopenharmony_ciin SSA Form", Fred Chow, Sun Chan, Shin-Ming Liu, Raymond Lo, Mark
403bf215546Sopenharmony_ciStreich](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.33.6974&rep=rep1&type=pdf)
404bf215546Sopenharmony_ci
405bf215546Sopenharmony_ci[3] ["Global Code Motion. Global Value Numbering.", Cliff
406bf215546Sopenharmony_ciClick](http://www.cs.washington.edu/education/courses/cse501/06wi/reading/click-pldi95.pdf)
407bf215546Sopenharmony_ci
408bf215546Sopenharmony_ci[4] ["Register Allocation for Programs in SSA Form", Sebastian
409bf215546Sopenharmony_ciHack](http://digbib.ubka.uni-karlsruhe.de/volltexte/documents/6532)
410bf215546Sopenharmony_ci
411bf215546Sopenharmony_ci[5] ["An extension to the SSA representation for predicated code",
412bf215546Sopenharmony_ciFrancois de
413bf215546Sopenharmony_ciFerriere](http://www.cdl.uni-saarland.de/ssasem/talks/Francois.de.Ferriere.pdf)
414bf215546Sopenharmony_ci
415bf215546Sopenharmony_ci[6] ["Improvements to the Psi-SSA Representation", F. de
416bf215546Sopenharmony_ciFerriere](http://www.scopesconf.org/scopes-07/presentations/3_Presentation.pdf)
417