1# IR Builder 
2
3## Overview
4
5The IR builder pass constructs the Intermediate Representation (IR) from the Panda bytecode.
6
7Panda compiler's IR employs SSA form, which should be constructed before first pass is run.
8
9There are various algorithms to construct SSA form, most known is a Cytron et al. It produce most clear and pruned SSA
10form, without dead phi instructions, but it has various drawbacks, such as significant overhead, since it requires to
11build dominance tree and dominance frontier.
12
13We choose a simple algorithm, that is used in many Virtual Machines. It may produce dead phis, but it is much simplier
14and faster.
15
16Due to specifics of the Panda bytecode, IR builder has responsibility to handle various situation that is not common for
17the IR. Such as:
18- some instructions don't specify its type, f.e. `mov` instruction may produce int32 as well as float, final type can be
19  determined only by consumers.
20- constant hasn't type as well and if one constant is used in integer and float operations, it must be split into two
21  IR ConstInst instructions.
22- if constant is `0` and it is used in instruction that expects object(f.e. `mov.obj`), we need to create a special
23  constant instruction `NullPtr` to handle this situation.
24
25Resolving these things requires addition actions in the builder, that, in turn, can require additional analyses,
26for example, Dominators Tree Analysis.
27
28## Dependencies
29
30* DominatorsTree
31* LoopAnalysis
32* RPO
33
34## Algorithm
35
36There are three main phases:
371. Building the Control Flow Graph.
382. Building the Data Flow Graph (in SSA form).
393. Fixing the type uncertainties of the instructions.
40
41**Building the Control Flow Graph**
42
431. Iterate over all bytecode instructions and make basic block for all target instructions, i.e. instructions that can
44   be entries to the basic blocks.
452. Create try catch blocks.
463. Connect the basic blocks, hereby making a CFG.
47
48**Building the Data Flow Graph**
49
501. Get next basic block from the graph.
512. If basic block is a loop header, create SafePoint and OsrSaveState instructions.
523. Create phi instructions for the live registers.
534. Get next bytecode instruction from the current basic block.
545. Build the Panda IR instruction from the bytecode instruction:
55    - create auxiliary instructions (SaveState, NullCheck, etc) if needed
56    - set inputs from the virtual register map
57    - if has destination, update virtual register definition in the vreg map
586. If instruction is a terminator, goto 1, else goto 4.
59
60`Virtual register map` is a map, where key is virtual register, value is an IR instruction that currently defines value
61of this register.
62
63**Fixing the type uncertainties of the instructions**
64
65This final phase contains following parts:
661. Split constants: for all constants that are used in instructions with different types, split constant instruction and
67   set proper types for them.
682. Fix instructions that can't be fully completed in building process:
69    - Remove dead Phi and set types to phi which have no type. Phi may not have type if all it users are pseudo
70      instructions, like SaveState.
71    - Check all instructions that have no type and fix it. Type is got from instructions with known input types.
72    - Resolve dead and inconsistent phi instructions.
73
74## Pseudocode
75
76```python
77
78def Run(graph, bytecode):
79    BuildBasicBlocks(graph, bytecode)
80
81    for bb in graph.basic_blocks_in_rpo():
82
83        for bc_inst in bytecode.instructions[bb.first_pc..bb.last_pc]:
84            inst = BuildInstruction(bc_inst)
85            bb.add_instruction(bb, inst)
86
87    SplitConstants()
88    FixInstructions()
89
90def BuildBasicBlocks(graph, bytecode):
91    blocks = []
92    # each pc holds the reference to the basic block
93    blocks.resize(bytecode.instructions.size())
94
95    for inst in bytecode.instructions:
96        if inst.is_branch():
97            blocks[inst.pc] = graph.new_bb()
98            if inst.is_conditional():
99                # Next instruction starts new basic block
100                blocks[inst.next.target_pc] = graph.new_bb()
101
102    # Add control flow edges between created basic blocks
103    ConnectBasicBlocks()
104
105    graph.remove_unreacheable_blocks()
106
107# This is autogenerated function from erb templates
108def BuildInstruction(bb, bc_inst):
109    # Big switch for each opcode
110    switch bc_inst.opcode:
111    case SOME_OPCODE:
112        bb.create_inst(bc_inst.opcode)
113        bb.set_input(0, self.get_definition(bc_inst.get_vreg(0))
114        bb.set_input(1, self.get_definition(bc_inst.get_vreg(1))
115
116def SplitConstants():
117    for const in graph.constants:
118        if not all(x == const.inputs[0] for x in const.inputs):
119            const.split()
120
121def FixInstructions():
122    # Fix instructions types
123```
124
125Example of the case for the `ADD` bytecode instruction in autogenerated `BuildInstruction` function.
126```c++
127. . .
128case BytecodeInstruction::Opcode::ADDI_IMM8: {
129    // Create IR instruction.
130    auto inst = graph_->CreateInstAdd(DataType::INT32, GetPc(instruction->GetAddress()));
131    // Get current definition of the accumulator and set it as the first input to the instruction/
132    inst->SetInput(0, GetDefinitionAcc());
133    // Get immediate from the bytecode and try to find if it is already exists in the IR, if not - create new one.
134    // Then set it as the second input to the instruction.
135    inst->SetInput(1, FindOrCreateConstant(instruction->GetImm<BytecodeInstruction::Format::IMM8, 0>()));
136    // ADD bytecode write resutl to the accumulator, so update definition of the accumulator by created instruction.
137    UpdateDefinitionAcc(inst);
138    // Append instruction to the IR
139    AddInstruction(inst);
140    break;
141. . .
142```
143
144## Links
145
146[Panda IR builder source code](../optimizer/ir_builder/)    
147
148[Static Single Assignment Book](https://pfalcon.github.io/ssabook/latest/book-v1.pdf)
149
150[Simple and Efficient Construction of Static Single Assignment Form](https://pp.info.uni-karlsruhe.de/uploads/publikationen/braun13cc.pdf)
151
152[Efficiently Computing Static Single Assignment Form and the Control Dependence Graph, Ron Cytron](https://www.cs.utexas.edu/~pingali/CS380C/2010/papers/ssaCytron.pdf)
153