1# IR Builder 2 3## Overview 4 5The IR builder pass constructs the Intermediate Representation (IR) from the Panda bytecode. 6 7Panda compiler's IR employs SSA form, which should be constructed before first pass is run. 8 9There are various algorithms to construct SSA form, most known is a Cytron et al. It produce most clear and pruned SSA 10form, without dead phi instructions, but it has various drawbacks, such as significant overhead, since it requires to 11build dominance tree and dominance frontier. 12 13We choose a simple algorithm, that is used in many Virtual Machines. It may produce dead phis, but it is much simplier 14and faster. 15 16Due to specifics of the Panda bytecode, IR builder has responsibility to handle various situation that is not common for 17the IR. Such as: 18- some instructions don't specify its type, f.e. `mov` instruction may produce int32 as well as float, final type can be 19 determined only by consumers. 20- constant hasn't type as well and if one constant is used in integer and float operations, it must be split into two 21 IR ConstInst instructions. 22- if constant is `0` and it is used in instruction that expects object(f.e. `mov.obj`), we need to create a special 23 constant instruction `NullPtr` to handle this situation. 24 25Resolving these things requires addition actions in the builder, that, in turn, can require additional analyses, 26for example, Dominators Tree Analysis. 27 28## Dependencies 29 30* DominatorsTree 31* LoopAnalysis 32* RPO 33 34## Algorithm 35 36There are three main phases: 371. Building the Control Flow Graph. 382. Building the Data Flow Graph (in SSA form). 393. Fixing the type uncertainties of the instructions. 40 41**Building the Control Flow Graph** 42 431. Iterate over all bytecode instructions and make basic block for all target instructions, i.e. instructions that can 44 be entries to the basic blocks. 452. Create try catch blocks. 463. Connect the basic blocks, hereby making a CFG. 47 48**Building the Data Flow Graph** 49 501. Get next basic block from the graph. 512. If basic block is a loop header, create SafePoint and OsrSaveState instructions. 523. Create phi instructions for the live registers. 534. Get next bytecode instruction from the current basic block. 545. Build the Panda IR instruction from the bytecode instruction: 55 - create auxiliary instructions (SaveState, NullCheck, etc) if needed 56 - set inputs from the virtual register map 57 - if has destination, update virtual register definition in the vreg map 586. If instruction is a terminator, goto 1, else goto 4. 59 60`Virtual register map` is a map, where key is virtual register, value is an IR instruction that currently defines value 61of this register. 62 63**Fixing the type uncertainties of the instructions** 64 65This final phase contains following parts: 661. Split constants: for all constants that are used in instructions with different types, split constant instruction and 67 set proper types for them. 682. Fix instructions that can't be fully completed in building process: 69 - Remove dead Phi and set types to phi which have no type. Phi may not have type if all it users are pseudo 70 instructions, like SaveState. 71 - Check all instructions that have no type and fix it. Type is got from instructions with known input types. 72 - Resolve dead and inconsistent phi instructions. 73 74## Pseudocode 75 76```python 77 78def Run(graph, bytecode): 79 BuildBasicBlocks(graph, bytecode) 80 81 for bb in graph.basic_blocks_in_rpo(): 82 83 for bc_inst in bytecode.instructions[bb.first_pc..bb.last_pc]: 84 inst = BuildInstruction(bc_inst) 85 bb.add_instruction(bb, inst) 86 87 SplitConstants() 88 FixInstructions() 89 90def BuildBasicBlocks(graph, bytecode): 91 blocks = [] 92 # each pc holds the reference to the basic block 93 blocks.resize(bytecode.instructions.size()) 94 95 for inst in bytecode.instructions: 96 if inst.is_branch(): 97 blocks[inst.pc] = graph.new_bb() 98 if inst.is_conditional(): 99 # Next instruction starts new basic block 100 blocks[inst.next.target_pc] = graph.new_bb() 101 102 # Add control flow edges between created basic blocks 103 ConnectBasicBlocks() 104 105 graph.remove_unreacheable_blocks() 106 107# This is autogenerated function from erb templates 108def BuildInstruction(bb, bc_inst): 109 # Big switch for each opcode 110 switch bc_inst.opcode: 111 case SOME_OPCODE: 112 bb.create_inst(bc_inst.opcode) 113 bb.set_input(0, self.get_definition(bc_inst.get_vreg(0)) 114 bb.set_input(1, self.get_definition(bc_inst.get_vreg(1)) 115 116def SplitConstants(): 117 for const in graph.constants: 118 if not all(x == const.inputs[0] for x in const.inputs): 119 const.split() 120 121def FixInstructions(): 122 # Fix instructions types 123``` 124 125Example of the case for the `ADD` bytecode instruction in autogenerated `BuildInstruction` function. 126```c++ 127. . . 128case BytecodeInstruction::Opcode::ADDI_IMM8: { 129 // Create IR instruction. 130 auto inst = graph_->CreateInstAdd(DataType::INT32, GetPc(instruction->GetAddress())); 131 // Get current definition of the accumulator and set it as the first input to the instruction/ 132 inst->SetInput(0, GetDefinitionAcc()); 133 // Get immediate from the bytecode and try to find if it is already exists in the IR, if not - create new one. 134 // Then set it as the second input to the instruction. 135 inst->SetInput(1, FindOrCreateConstant(instruction->GetImm<BytecodeInstruction::Format::IMM8, 0>())); 136 // ADD bytecode write resutl to the accumulator, so update definition of the accumulator by created instruction. 137 UpdateDefinitionAcc(inst); 138 // Append instruction to the IR 139 AddInstruction(inst); 140 break; 141. . . 142``` 143 144## Links 145 146[Panda IR builder source code](../optimizer/ir_builder/) 147 148[Static Single Assignment Book](https://pfalcon.github.io/ssabook/latest/book-v1.pdf) 149 150[Simple and Efficient Construction of Static Single Assignment Form](https://pp.info.uni-karlsruhe.de/uploads/publikationen/braun13cc.pdf) 151 152[Efficiently Computing Static Single Assignment Form and the Control Dependence Graph, Ron Cytron](https://www.cs.utexas.edu/~pingali/CS380C/2010/papers/ssaCytron.pdf) 153