1bf215546Sopenharmony_cir600-sb 2bf215546Sopenharmony_ci======= 3bf215546Sopenharmony_ci 4bf215546Sopenharmony_ci* * * * * 5bf215546Sopenharmony_ci 6bf215546Sopenharmony_ciDebugging 7bf215546Sopenharmony_ci--------- 8bf215546Sopenharmony_ci 9bf215546Sopenharmony_ci### Environment variables 10bf215546Sopenharmony_ci 11bf215546Sopenharmony_ci- **R600\_DEBUG** 12bf215546Sopenharmony_ci 13bf215546Sopenharmony_ci There are new flags: 14bf215546Sopenharmony_ci 15bf215546Sopenharmony_ci - **nosb** - Disable sb backend for graphics shaders 16bf215546Sopenharmony_ci - **sbdry** - Dry run, optimize but use source bytecode - 17bf215546Sopenharmony_ci useful if you only want to check shader dumps 18bf215546Sopenharmony_ci without the risk of lockups and other problems 19bf215546Sopenharmony_ci - **sbstat** - Print optimization statistics (only time so far) 20bf215546Sopenharmony_ci - **sbdump** - Print IR after some passes. 21bf215546Sopenharmony_ci - **sbnofallback** - Abort on errors instead of fallback 22bf215546Sopenharmony_ci - **sbdisasm** - Use sb disassembler for shader dumps 23bf215546Sopenharmony_ci - **sbsafemath** - Disable unsafe math optimizations 24bf215546Sopenharmony_ci - **use_tgsi** - Take in TGSI from the frontend instead of asking for NIR 25bf215546Sopenharmony_ci 26bf215546Sopenharmony_ci### Regression debugging 27bf215546Sopenharmony_ci 28bf215546Sopenharmony_ciIf there are any regressions as compared to the default backend 29bf215546Sopenharmony_ci(R600\_SB=0), it's possible to use the following environment variables 30bf215546Sopenharmony_cito find the incorrectly optimized shader that causes the regression. 31bf215546Sopenharmony_ci 32bf215546Sopenharmony_ci- **R600\_SB\_DSKIP\_MODE** - allows to skip optimization for some 33bf215546Sopenharmony_ci shaders 34bf215546Sopenharmony_ci - 0 - disabled (default) 35bf215546Sopenharmony_ci - 1 - skip optimization for the shaders in the range 36bf215546Sopenharmony_ci [R600\_SB\_DSKIP\_START; R600\_SB\_DSKIP\_END], that is, 37bf215546Sopenharmony_ci optimize only the shaders that are not in this range 38bf215546Sopenharmony_ci - 2 - optimize only the shaders in the range 39bf215546Sopenharmony_ci [R600\_SB\_DSKIP\_START; R600\_SB\_DSKIP\_END] 40bf215546Sopenharmony_ci 41bf215546Sopenharmony_ci- **R600\_SB\_DSKIP\_START** - start of the range (1-based) 42bf215546Sopenharmony_ci 43bf215546Sopenharmony_ci- **R600\_SB\_DSKIP\_END** - end of the range (1-based) 44bf215546Sopenharmony_ci 45bf215546Sopenharmony_ciExample - optimize only the shaders 5, 6, and 7: 46bf215546Sopenharmony_ci 47bf215546Sopenharmony_ci R600_SB_DSKIP_START=5 R600_SB_DSKIP_END=7 R600_SB_DSKIP_MODE=2 48bf215546Sopenharmony_ci 49bf215546Sopenharmony_ciAll shaders compiled by the application are numbered starting from 1, 50bf215546Sopenharmony_cithe number of shaders used by the application may be obtained by running 51bf215546Sopenharmony_ciit with "R600_DEBUG=sb,sbstat" - it will print "sb: shader \#index\#" 52bf215546Sopenharmony_cifor each compiled shader. 53bf215546Sopenharmony_ci 54bf215546Sopenharmony_ciAfter figuring out the total number of shaders used by the application, 55bf215546Sopenharmony_cithe variables above allow to use bisection to find the shader that is 56bf215546Sopenharmony_cithe cause of regression. E.g. if the application uses 100 shaders, we 57bf215546Sopenharmony_cican divide the range [1; 100] and run the application with the 58bf215546Sopenharmony_cioptimization enabled only for the first half of the shaders: 59bf215546Sopenharmony_ci 60bf215546Sopenharmony_ci R600_SB_DSKIP_START=1 R600_SB_DSKIP_END=50 R600_SB_DSKIP_MODE=2 <app> 61bf215546Sopenharmony_ci 62bf215546Sopenharmony_ciIf the regression is reproduced with these parameters, then the failing 63bf215546Sopenharmony_cishader is in the range [1; 50], if it's not reproduced - then it's in 64bf215546Sopenharmony_cithe range [51; 100]. Then we can divide the new range again and repeat 65bf215546Sopenharmony_cithe testing, until we'll reduce the range to a single failing shader. 66bf215546Sopenharmony_ci 67bf215546Sopenharmony_ci*NOTE: This method relies on the assumption that the application 68bf215546Sopenharmony_ciproduces the same sequence of the shaders on each run. It's not always 69bf215546Sopenharmony_citrue - some applications may produce different sequences of the shaders, 70bf215546Sopenharmony_ciin such cases the tools like apitrace may be used to record the trace 71bf215546Sopenharmony_ciwith the application, then this method may be applied when replaying the 72bf215546Sopenharmony_citrace - also this may be faster and/or more convenient than testing the 73bf215546Sopenharmony_ciapplication itself.* 74bf215546Sopenharmony_ci 75bf215546Sopenharmony_ci* * * * * 76bf215546Sopenharmony_ci 77bf215546Sopenharmony_ciIntermediate Representation 78bf215546Sopenharmony_ci--------------------------- 79bf215546Sopenharmony_ci 80bf215546Sopenharmony_ci### Values 81bf215546Sopenharmony_ci 82bf215546Sopenharmony_ciAll kinds of the operands (literal constants, references to kcache 83bf215546Sopenharmony_ciconstants, references to GPRs, etc) are currently represented by the 84bf215546Sopenharmony_ci**value** class (possibly it makes sense to switch to hierarchy of 85bf215546Sopenharmony_ciclasses derived from **value** instead, to save some memory). 86bf215546Sopenharmony_ci 87bf215546Sopenharmony_ciAll values (except some pseudo values like the exec\_mask or predicate 88bf215546Sopenharmony_ciregister) represent 32bit scalar values - there are no vector values, 89bf215546Sopenharmony_ciCF/FETCH instructions use groups of 4 values for src and dst operands. 90bf215546Sopenharmony_ci 91bf215546Sopenharmony_ci### Nodes 92bf215546Sopenharmony_ci 93bf215546Sopenharmony_ciShader programs are represented using the tree data structure, some 94bf215546Sopenharmony_cinodes contain a list of subnodes. 95bf215546Sopenharmony_ci 96bf215546Sopenharmony_ci#### Control flow nodes 97bf215546Sopenharmony_ci 98bf215546Sopenharmony_ciControl flow information is represented using four special node types 99bf215546Sopenharmony_ci(based on the ideas from [[1]](#references) ) 100bf215546Sopenharmony_ci 101bf215546Sopenharmony_ci- **region\_node** - single-entry, single-exit region. 102bf215546Sopenharmony_ci 103bf215546Sopenharmony_ci All loops and if's in the program are enclosed in region nodes. 104bf215546Sopenharmony_ci Region nodes have two containers for phi nodes - 105bf215546Sopenharmony_ci region\_node::loop\_phi contains the phi expressions to be executed 106bf215546Sopenharmony_ci at the region entry, region\_node::phi contains the phi expressions 107bf215546Sopenharmony_ci to be executed at the region exit. It's the only type of the node 108bf215546Sopenharmony_ci that contains associated phi expressions. 109bf215546Sopenharmony_ci 110bf215546Sopenharmony_ci- **depart\_node** - "depart region \$id after { ... }" 111bf215546Sopenharmony_ci 112bf215546Sopenharmony_ci Depart target region (jump to exit point) after executing contained 113bf215546Sopenharmony_ci code. 114bf215546Sopenharmony_ci 115bf215546Sopenharmony_ci- **repeat\_node** - "repeat region \$id after { ... }" 116bf215546Sopenharmony_ci 117bf215546Sopenharmony_ci Repeat target region (jump to entry point) after executing contained 118bf215546Sopenharmony_ci code. 119bf215546Sopenharmony_ci 120bf215546Sopenharmony_ci- **if\_node** - "if (cond) { ... }" 121bf215546Sopenharmony_ci 122bf215546Sopenharmony_ci Execute contained code if condition is true. The difference from 123bf215546Sopenharmony_ci [[1]](#references) is that we don't have associated phi expressions 124bf215546Sopenharmony_ci for the **if\_node**, we enclose **if\_node** in the 125bf215546Sopenharmony_ci **region\_node** and store corresponding phi's in the 126bf215546Sopenharmony_ci **region\_node**, this allows more uniform handling. 127bf215546Sopenharmony_ci 128bf215546Sopenharmony_ciThe target region of depart and repeat nodes is always the region where 129bf215546Sopenharmony_cithey are located (possibly in the nested region), there are no arbitrary 130bf215546Sopenharmony_cijumps/goto's - control flow in the program is always structured. 131bf215546Sopenharmony_ci 132bf215546Sopenharmony_ciTypical control flow constructs can be represented as in the following 133bf215546Sopenharmony_ciexamples: 134bf215546Sopenharmony_ci 135bf215546Sopenharmony_ciGLSL: 136bf215546Sopenharmony_ci 137bf215546Sopenharmony_ci if (cond) { 138bf215546Sopenharmony_ci < 1 > 139bf215546Sopenharmony_ci } else { 140bf215546Sopenharmony_ci < 2 > 141bf215546Sopenharmony_ci } 142bf215546Sopenharmony_ci 143bf215546Sopenharmony_ciIR: 144bf215546Sopenharmony_ci 145bf215546Sopenharmony_ci region #0 { 146bf215546Sopenharmony_ci depart region #0 after { 147bf215546Sopenharmony_ci if (cond) { 148bf215546Sopenharmony_ci depart region #0 after { 149bf215546Sopenharmony_ci < 1 > 150bf215546Sopenharmony_ci } 151bf215546Sopenharmony_ci } 152bf215546Sopenharmony_ci < 2 > 153bf215546Sopenharmony_ci } 154bf215546Sopenharmony_ci <region #0 phi nodes > 155bf215546Sopenharmony_ci } 156bf215546Sopenharmony_ci 157bf215546Sopenharmony_ciGLSL: 158bf215546Sopenharmony_ci 159bf215546Sopenharmony_ci while (cond) { 160bf215546Sopenharmony_ci < 1 > 161bf215546Sopenharmony_ci } 162bf215546Sopenharmony_ci 163bf215546Sopenharmony_ciIR: 164bf215546Sopenharmony_ci 165bf215546Sopenharmony_ci region #0 { 166bf215546Sopenharmony_ci <region #0 loop_phi nodes> 167bf215546Sopenharmony_ci repeat region #0 after { 168bf215546Sopenharmony_ci region #1 { 169bf215546Sopenharmony_ci depart region #1 after { 170bf215546Sopenharmony_ci if (!cond) { 171bf215546Sopenharmony_ci depart region #0 172bf215546Sopenharmony_ci } 173bf215546Sopenharmony_ci } 174bf215546Sopenharmony_ci } 175bf215546Sopenharmony_ci < 1 > 176bf215546Sopenharmony_ci } 177bf215546Sopenharmony_ci <region #0 phi nodes> 178bf215546Sopenharmony_ci } 179bf215546Sopenharmony_ci 180bf215546Sopenharmony_ci'Break' and 'continue' inside the loops are directly translated to the 181bf215546Sopenharmony_cidepart and repeat nodes for the corresponding loop region. 182bf215546Sopenharmony_ci 183bf215546Sopenharmony_ciThis may look a bit too complicated, but in fact this allows more simple 184bf215546Sopenharmony_ciand uniform handling of the control flow. 185bf215546Sopenharmony_ci 186bf215546Sopenharmony_ciAll loop\_phi and phi nodes for some region always have the same number 187bf215546Sopenharmony_ciof source operands. The number of source operands for 188bf215546Sopenharmony_ciregion\_node::loop\_phi nodes is 1 + number of repeat nodes that 189bf215546Sopenharmony_cireference this region as a target. The number of source operands for 190bf215546Sopenharmony_ciregion\_node::phi nodes is equal to the number of depart nodes that 191bf215546Sopenharmony_cireference this region as a target. All depart/repeat nodes for the 192bf215546Sopenharmony_ciregion have unique indices equal to the index of source operand for 193bf215546Sopenharmony_ciphi/loop\_phi nodes. 194bf215546Sopenharmony_ci 195bf215546Sopenharmony_ciFirst source operand for region\_node::loop\_phi nodes (src[0]) is an 196bf215546Sopenharmony_ciincoming value that enters the region from the outside. Each remaining 197bf215546Sopenharmony_cisource operand comes from the corresponding repeat node. 198bf215546Sopenharmony_ci 199bf215546Sopenharmony_ciMore complex example: 200bf215546Sopenharmony_ci 201bf215546Sopenharmony_ciGLSL: 202bf215546Sopenharmony_ci 203bf215546Sopenharmony_ci a = 1; 204bf215546Sopenharmony_ci while (a < 5) { 205bf215546Sopenharmony_ci a = a * 2; 206bf215546Sopenharmony_ci if (b == 3) { 207bf215546Sopenharmony_ci continue; 208bf215546Sopenharmony_ci } else { 209bf215546Sopenharmony_ci a = 6; 210bf215546Sopenharmony_ci } 211bf215546Sopenharmony_ci if (c == 4) 212bf215546Sopenharmony_ci break; 213bf215546Sopenharmony_ci a = a + 1; 214bf215546Sopenharmony_ci } 215bf215546Sopenharmony_ci 216bf215546Sopenharmony_ciIR with SSA form: 217bf215546Sopenharmony_ci 218bf215546Sopenharmony_ci a.1 = 1; 219bf215546Sopenharmony_ci region #0 { 220bf215546Sopenharmony_ci // loop phi values: src[0] - incoming, src[1] - from repeat_1, src[2] - from repeat_2 221bf215546Sopenharmony_ci region#0 loop_phi: a.2 = phi a.1, a.6, a.3 222bf215546Sopenharmony_ci 223bf215546Sopenharmony_ci repeat_1 region #0 after { 224bf215546Sopenharmony_ci a.3 = a.2 * 2; 225bf215546Sopenharmony_ci cond1 = (b == 3); 226bf215546Sopenharmony_ci region #1 { 227bf215546Sopenharmony_ci depart_0 region #1 after { 228bf215546Sopenharmony_ci if (cond1) { 229bf215546Sopenharmony_ci repeat_2 region #0; 230bf215546Sopenharmony_ci } 231bf215546Sopenharmony_ci } 232bf215546Sopenharmony_ci a.4 = 6; 233bf215546Sopenharmony_ci 234bf215546Sopenharmony_ci region #1 phi: a.5 = phi a.4; // src[0] - from depart_0 235bf215546Sopenharmony_ci } 236bf215546Sopenharmony_ci cond2 = (c == 4); 237bf215546Sopenharmony_ci region #2 { 238bf215546Sopenharmony_ci depart_0 region #2 after { 239bf215546Sopenharmony_ci if (cond2) { 240bf215546Sopenharmony_ci depart_0 region #0; 241bf215546Sopenharmony_ci } 242bf215546Sopenharmony_ci } 243bf215546Sopenharmony_ci } 244bf215546Sopenharmony_ci a.6 = a.5 + 1; 245bf215546Sopenharmony_ci } 246bf215546Sopenharmony_ci 247bf215546Sopenharmony_ci region #0 phi: a.7 = phi a.5 // src[0] from depart_0 248bf215546Sopenharmony_ci } 249bf215546Sopenharmony_ci 250bf215546Sopenharmony_ciPhi nodes with single source operand are just copies, they are not 251bf215546Sopenharmony_cireally necessary, but this allows to handle all **depart\_node**s in the 252bf215546Sopenharmony_ciuniform way. 253bf215546Sopenharmony_ci 254bf215546Sopenharmony_ci#### Instruction nodes 255bf215546Sopenharmony_ci 256bf215546Sopenharmony_ciInstruction nodes represent different kinds of instructions - 257bf215546Sopenharmony_ci**alu\_node**, **cf\_node**, **fetch\_node**, etc. Each of them contains 258bf215546Sopenharmony_cithe "bc" structure where all fields of the bytecode are stored (the type 259bf215546Sopenharmony_ciis **bc\_alu** for **alu\_node**, etc). The operands are represented 260bf215546Sopenharmony_ciusing the vectors of pointers to **value** class (node::src, node::dst) 261bf215546Sopenharmony_ci 262bf215546Sopenharmony_ci#### SSA-specific nodes 263bf215546Sopenharmony_ci 264bf215546Sopenharmony_ciPhi nodes currently don't have special node class, they are stored as 265bf215546Sopenharmony_ci**node**. Destination vector contains a single destination value, source 266bf215546Sopenharmony_civector contains 1 or more source values. 267bf215546Sopenharmony_ci 268bf215546Sopenharmony_ciPsi nodes [[5], [6]](#references) also don't have a special node class 269bf215546Sopenharmony_ciand stored as **node**. Source vector contains 3 values for each source 270bf215546Sopenharmony_cioperand - the **value** of predicate, **value** of corresponding 271bf215546Sopenharmony_ciPRED\_SEL field, and the source **value** itself. 272bf215546Sopenharmony_ci 273bf215546Sopenharmony_ci### Indirect addressing 274bf215546Sopenharmony_ci 275bf215546Sopenharmony_ciSpecial kind of values (VLK\_RELREG) is used to represent indirect 276bf215546Sopenharmony_cioperands. These values don't have SSA versions. The representation is 277bf215546Sopenharmony_cimostly based on the [[2]](#references). Indirect operand contains the 278bf215546Sopenharmony_ci"offset/address" value (value::rel), (e.g. some SSA version of the AR 279bf215546Sopenharmony_ciregister value, though after some passes it may be any value - constant, 280bf215546Sopenharmony_ciregister, etc), also it contains the maydef and mayuse vectors of 281bf215546Sopenharmony_cipointers to **value**s (similar to dst/src vectors in the **node**) to 282bf215546Sopenharmony_cirepresent the effects of aliasing in the SSA form. 283bf215546Sopenharmony_ci 284bf215546Sopenharmony_ciE.g. if we have the array R5.x ... R8.x and the following instruction : 285bf215546Sopenharmony_ci 286bf215546Sopenharmony_ci MOV R0.x, R[5 + AR].x 287bf215546Sopenharmony_ci 288bf215546Sopenharmony_cithen source indirect operand is represented with the VLK\_RELREG value, 289bf215546Sopenharmony_civalue::rel is AR, value::maydef is empty (in fact it always contain the 290bf215546Sopenharmony_cisame number of elements as mayuse to simplify the handling, but they are 291bf215546Sopenharmony_ciNULLs), value::mayuse contains [R5.x, R6.x, R7.x, R8.x] (or the 292bf215546Sopenharmony_cicorresponding SSA versions after ssa\_rename). 293bf215546Sopenharmony_ci 294bf215546Sopenharmony_ciAdditional "virtual variables" as in [HSSA [2]](#references) are not 295bf215546Sopenharmony_ciused, also there is no special handling for "zero versions". Typical 296bf215546Sopenharmony_ciprograms in our case are small, indirect addressing is rare, array sizes 297bf215546Sopenharmony_ciare limited by max gpr number, so we don't really need to use special 298bf215546Sopenharmony_citricks to avoid the explosion of value versions. Also this allows more 299bf215546Sopenharmony_ciprecise liveness computation for array elements without modifications to 300bf215546Sopenharmony_cithe algorithms. 301bf215546Sopenharmony_ci 302bf215546Sopenharmony_ciWith the following instruction: 303bf215546Sopenharmony_ci 304bf215546Sopenharmony_ci MOV R[5+AR].x, R0.x 305bf215546Sopenharmony_ci 306bf215546Sopenharmony_ciwe'll have both maydef and mayuse vectors for dst operand filled with 307bf215546Sopenharmony_ciarray values initially: [R5.x, R6.x, R7.x, R8.x]. After the ssa\_rename 308bf215546Sopenharmony_cipass mayuse will contain previous versions, maydef will contain new 309bf215546Sopenharmony_cipotentially-defined versions. 310bf215546Sopenharmony_ci 311bf215546Sopenharmony_ci* * * * * 312bf215546Sopenharmony_ci 313bf215546Sopenharmony_ciPasses 314bf215546Sopenharmony_ci------ 315bf215546Sopenharmony_ci 316bf215546Sopenharmony_ci- **bc\_parser** - creates the IR from the source bytecode, 317bf215546Sopenharmony_ci initializes src and dst value vectors for instruction nodes. Most 318bf215546Sopenharmony_ci ALU nodes have one dst operand and the number of source operands is 319bf215546Sopenharmony_ci equal to the number of source operands for the ISA instruction. 320bf215546Sopenharmony_ci Nodes for PREDSETxx instructions have 3 dst operands - dst[0] is dst 321bf215546Sopenharmony_ci gpr as in the original instruction, other two are pseudo-operands 322bf215546Sopenharmony_ci that represent possibly updated predicate and exec\_mask. Predicate 323bf215546Sopenharmony_ci values are used in the predicated alu instructions (node::pred), 324bf215546Sopenharmony_ci exec\_mask values are used in the if\_nodes (if\_node::cond). Each 325bf215546Sopenharmony_ci vector operand in the CF/TEX/VTX instructions is represented with 4 326bf215546Sopenharmony_ci values - components of the vector. 327bf215546Sopenharmony_ci 328bf215546Sopenharmony_ci- **ssa\_prepare** - creates phi expressions. 329bf215546Sopenharmony_ci 330bf215546Sopenharmony_ci- **ssa\_rename** - renames the values (assigns versions). 331bf215546Sopenharmony_ci 332bf215546Sopenharmony_ci- **liveness** - liveness computation, sets 'dead' flag for unused 333bf215546Sopenharmony_ci nodes and values, optionally computes interference information for 334bf215546Sopenharmony_ci the values. 335bf215546Sopenharmony_ci 336bf215546Sopenharmony_ci- **dce\_cleanup** - eliminates 'dead' nodes, also removes some 337bf215546Sopenharmony_ci unnecessary nodes created by bc\_parser, e.g. the nodes for the JUMP 338bf215546Sopenharmony_ci instructions in the source, containers for ALU groups (they were 339bf215546Sopenharmony_ci only needed for the ssa\_rename pass) 340bf215546Sopenharmony_ci 341bf215546Sopenharmony_ci- **if\_conversion** - converts control flow with if\_nodes to the 342bf215546Sopenharmony_ci data flow in cases where it can improve performance (small alu-only 343bf215546Sopenharmony_ci branches). Both branches are executed speculatively and the phi 344bf215546Sopenharmony_ci expressions are replaced with conditional moves (CNDxx) to select 345bf215546Sopenharmony_ci the final value using the same condition predicate as was used by 346bf215546Sopenharmony_ci the original if\_node. E.g. **if\_node** used dst[2] from PREDSETxx 347bf215546Sopenharmony_ci instruction, CNDxx now uses dst[0] from the same PREDSETxx 348bf215546Sopenharmony_ci instruction. 349bf215546Sopenharmony_ci 350bf215546Sopenharmony_ci- **peephole** - peephole optimizations 351bf215546Sopenharmony_ci 352bf215546Sopenharmony_ci- **gvn** - Global Value Numbering [[2]](#references), 353bf215546Sopenharmony_ci [[3]](#references) 354bf215546Sopenharmony_ci 355bf215546Sopenharmony_ci- **gcm** - Global Code Motion [[3]](#references). Also performs 356bf215546Sopenharmony_ci grouping of the instructions of the same kind (CF/FETCH/ALU). 357bf215546Sopenharmony_ci 358bf215546Sopenharmony_ci- register allocation passes, some ideas are used from 359bf215546Sopenharmony_ci [[4]](#references), but implementation is simplified to make it more 360bf215546Sopenharmony_ci efficient in terms of the compilation speed (e.g. no recursive 361bf215546Sopenharmony_ci recoloring) while achieving good enough results. 362bf215546Sopenharmony_ci 363bf215546Sopenharmony_ci - **ra\_split** - prepares the program to register allocation. 364bf215546Sopenharmony_ci Splits live ranges for constrained values by inserting the 365bf215546Sopenharmony_ci copies to/from temporary values, so that the live range of the 366bf215546Sopenharmony_ci constrained values becomes minimal. 367bf215546Sopenharmony_ci 368bf215546Sopenharmony_ci - **ra\_coalesce** - performs global allocation on registers used 369bf215546Sopenharmony_ci in CF/FETCH instructions. It's performed first to make sure they 370bf215546Sopenharmony_ci end up in the same GPR. Also tries to allocate all values 371bf215546Sopenharmony_ci involved in copies (inserted by the ra\_split pass) to the same 372bf215546Sopenharmony_ci register, so that the copies may be eliminated. 373bf215546Sopenharmony_ci 374bf215546Sopenharmony_ci - **ra\_init** - allocates gpr arrays (if indirect addressing is 375bf215546Sopenharmony_ci used), and remaining values. 376bf215546Sopenharmony_ci 377bf215546Sopenharmony_ci- **post\_scheduler** - ALU scheduler, handles VLIW packing and 378bf215546Sopenharmony_ci performs the final register allocation for local values inside ALU 379bf215546Sopenharmony_ci clauses. Eliminates all coalesced copies (if src and dst of the copy 380bf215546Sopenharmony_ci are allocated to the same register). 381bf215546Sopenharmony_ci 382bf215546Sopenharmony_ci- **ra\_checker** - optional debugging pass that tries to catch basic 383bf215546Sopenharmony_ci errors of the scheduler or regalloc, 384bf215546Sopenharmony_ci 385bf215546Sopenharmony_ci- **bc\_finalize** - propagates the regalloc information from values 386bf215546Sopenharmony_ci in node::src and node::dst vectors to the bytecode fields, converts 387bf215546Sopenharmony_ci control flow structure (region/depart/repeat) to the target 388bf215546Sopenharmony_ci instructions (JUMP/ELSE/POP, 389bf215546Sopenharmony_ci LOOP\_START/LOOP\_END/LOOP\_CONTINUE/LOOP\_BREAK). 390bf215546Sopenharmony_ci 391bf215546Sopenharmony_ci- **bc\_builder** - builds final bytecode, 392bf215546Sopenharmony_ci 393bf215546Sopenharmony_ci* * * * * 394bf215546Sopenharmony_ci 395bf215546Sopenharmony_ciReferences 396bf215546Sopenharmony_ci---------- 397bf215546Sopenharmony_ci 398bf215546Sopenharmony_ci[1] ["Tree-Based Code Optimization. A Thesis Proposal", Carl 399bf215546Sopenharmony_ciMcConnell](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.38.4210&rep=rep1&type=pdf) 400bf215546Sopenharmony_ci 401bf215546Sopenharmony_ci[2] ["Effective Representation of Aliases and Indirect Memory Operations 402bf215546Sopenharmony_ciin SSA Form", Fred Chow, Sun Chan, Shin-Ming Liu, Raymond Lo, Mark 403bf215546Sopenharmony_ciStreich](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.33.6974&rep=rep1&type=pdf) 404bf215546Sopenharmony_ci 405bf215546Sopenharmony_ci[3] ["Global Code Motion. Global Value Numbering.", Cliff 406bf215546Sopenharmony_ciClick](http://www.cs.washington.edu/education/courses/cse501/06wi/reading/click-pldi95.pdf) 407bf215546Sopenharmony_ci 408bf215546Sopenharmony_ci[4] ["Register Allocation for Programs in SSA Form", Sebastian 409bf215546Sopenharmony_ciHack](http://digbib.ubka.uni-karlsruhe.de/volltexte/documents/6532) 410bf215546Sopenharmony_ci 411bf215546Sopenharmony_ci[5] ["An extension to the SSA representation for predicated code", 412bf215546Sopenharmony_ciFrancois de 413bf215546Sopenharmony_ciFerriere](http://www.cdl.uni-saarland.de/ssasem/talks/Francois.de.Ferriere.pdf) 414bf215546Sopenharmony_ci 415bf215546Sopenharmony_ci[6] ["Improvements to the Psi-SSA Representation", F. de 416bf215546Sopenharmony_ciFerriere](http://www.scopesconf.org/scopes-07/presentations/3_Presentation.pdf) 417