1# Cleanup optimization pass
2## Overview
3
4`Cleanup` pass works like dead code elimination (DCE) and removes code which does not affect the program results.
5It also removes empty basic blocks when it is possible and merges a linear basic block sequence to one bigger basic block,
6thus simplifying control flow graph.
7
8Picture of merging a linear basic block sequence:
9
10**Before:**
11
12![before](images/merge_blocks_before.png)
13
14**After:**
15
16![after](images/merge_blocks_after.png)
17
18## Rationality
19
20Removing dead code reduces the number of instructions. Merging linear path of blocks in one big block gives an opportunity for more optimizations, because **data locality** is improved.
21
22After any optimization which deletes instructions some basic blocks may become empty or may contain only Phi instructions.
23In order not to bother about processing such cases in other transformation passes it is better to remove such blocks immediately.
24There are two exceptions, where empty basic block should stay in control flow graph.
25First case is loop pre-header, which may legally be an empty basic block.
26
27Second case is a so-called "special triangle" situation, when there are actually two edges between same blocks, but our way to store CFG edges don't support that. So, we have to insert an empty basic block on one of edges, see below. Certainly, that block is necessary only when there are Phi instruction in `BB2` which have different inputs on those incoming edges. If it is not the case, second edge on left example is not needed and empty basic block is also not needed.
28
29```
30      WRONG        RIGHT
31
32      [BB1]        [BB1]
33       |  \        |  \
34       |  |        |  [EBB]
35       |  /        |  /
36      [BB2]        [BB2]
37```
38
39## Dependence
40
41* RPO analysis
42* Loop Analysis
43* Dominators Tree
44
45## Algorithm
46
47Instructions, which we can't remove from Graph, has the attribute `NO_DCE`
48(Control Flow instructions, Return, Stores, Call instructions e.t.c.).
49
50### Remove redundant Phi(s)
51
52On first step redundant `Phi` instructions are removed - in short, if `phi` has only one real input, we can move users
53from the `phi` to the input and remove `phi`.
54
55### Remove empty blocks
56
57Walk through all basic blocks and check whether they are suitable to remove. If basic block contains only Phis
58instructions and is not an loop pre-header we still have to check each predecessor block for "special triangle"
59situation (see Rationality subsection).
60
61Some `If` or `IfImm` instructions may be marked `dead` after this stage.
62
63### Remove dead instructions
64
65During first pass through all instructions (in PRO order) we mark `live` all instructions which have `NO_DCE` attribute
66(and was not marked `dead` during previous stage) and recursively mark `live` all their inputs.
67During second pass we remove all not `live`-marked instructions.
68
69### Loop until unchanged Graph
70
71When removing EmptyBlocks when "special triangle" case is resolved conditional branch instruction (`If` or `IfImm`) become dead.
72At the same time removing dead instruction may create new empty basic blocks. That's why there are a loop to repeat two previous
73stages until IR become stable.
74
75### Merge linear blocks
76
77Finally, we visits basic blocks once again, and if:
781. basic block is not start or end block
792. basic block has 1 successor
803. the successor has 1 predecessor is not a try begin block
81
82Then we can join the block with the successor block. After that, the new block tries to join next block.
83
84## Pseudocode
85
86```c++
87bool Cleanup::RunImpl() {
88    // Two vectors to store basic blocks lists
89    auto empty_blocks = &empty1;
90    auto new_empty_blocks = &empty2;
91    bool modified = PhiChecker(empty_blocks);
92    for (auto bb : GetGraph()->GetVectorBlocks()) {
93        if (!SkipBasicBlock(bb) && bb->IsEmpty()) {
94            empty_blocks->insert(bb);
95        }
96    }
97    // Main loop
98    do {
99        modified |= RunOnce(empty_blocks, new_empty_blocks);
100        // Swap vectors pointers
101        auto temp = empty_blocks;
102        empty_blocks = new_empty_blocks;
103        new_empty_blocks = temp;
104        // Clean the "new" list
105        new_empty_blocks->clear();
106    } while (!empty_blocks->empty());
107    // Merge linear sectors.
108    for (auto bb : GetGraph()->GetVectorBlocks()) {
109        if (!SkipBasicBlock(bb)) {
110            while (bb and successor are suitable) {
111                bb->JoinSuccessorBlock();
112                modified = true;
113            }
114        }
115    }
116    return modified;
117}
118
119bool Cleanup::RunOnce(empty_blocks, new_empty_blocks)
120{
121    bool modified = false;
122    for (auto bb : *empty_blocks) {
123        modified |= ProcessBB(bb, dead_mrk, new_empty_blocks);
124    }
125    modified |= Dce(dead_mrk, new_empty_blocks);
126    return modified;
127}
128
129bool Cleanup::ProcessBB(bb, dead_mrk, new_empty_blocks)
130{
131    auto succ = bb->GetSuccessor(0);
132    if (CheckSpecialTriangle(bb, &saved_preds)) {
133        return false;
134    }
135    // Remove dead Phi(s) in bb
136    ...
137    // Process saved predecessors
138    ...
139    GetGraph()->RemoveEmptyBlockWithPhis(bb);
140    return true;
141}
142
143// DCE recursive helper
144void Cleanup::MarkLiveRec(Marker live_mrk, Inst *inst)
145
146bool Cleanup::Dce(Marker dead_mrk, new_empty_blocks)
147{
148    bool modified = false;
149    // Mark live instructions
150    for (auto bb : GetGraph()->GetBlocksRPO()) {
151        for (auto inst : bb->AllInsts()) {
152            if (inst->IsNotRemovable() && !inst->IsMarked(dead_mrk)) {
153                MarkLiveRec(live_mrk, inst);
154            }
155        }
156    }
157    // Remove non-live instructions
158    for (auto bb : GetGraph()->GetBlocksRPO()) {
159        for (auto inst : bb->AllInstsSafe()) {
160            if (inst->IsMarked(live_mrk)) {
161                continue;
162            }
163            bool is_phi = inst->IsPhi();
164            bb->RemoveInst(inst);
165            modified = true;
166
167            if (is_phi) {
168                for (auto pred : bb->GetPredsBlocks()) {
169                    if (pred->IsEmpty() && !SkipBasicBlock(pred)) {
170                        new_empty_blocks->insert(pred);
171                    }
172                }
173            } else if (bb->IsEmpty() && !SkipBasicBlock(bb)) {
174                new_empty_blocks->insert(bb);
175            }
176        }
177    }
178    return modified;
179}
180```
181
182## Examples
183
184**Removing dead instruction**
185
186Before:
187
188```
189BB 0
190prop: start
191    0.i64  Constant                   0xc -> (v3)
192    1.i64  Constant                   0xd -> (v2)
193succs: [bb 2]
194
195BB 2  preds: [bb 0]
196    2.u64  Mov                        v1 -> (v3, v4)
197    3.u64  Add                        v0, v2
198    4.u64  Return                     v2
199succs: [bb 1]
200
201BB 1  preds: [bb 2]
202prop: end
203```
204
205After:
206
207```
208BB 0
209prop: start
210    1.i64  Constant                   0xd -> (v2)
211succs: [bb 2]
212
213BB 2  preds: [bb 0]
214    2.u64  Mov                        v1 -> (v4)
215    4.u64  Return                     v2
216succs: [bb 1]
217```
218
219`Add` result was not used.
220
221
222**Remove empty (Phi-only) block**
223
224Before:
225
226```
227BB 0
228prop: start
229    0.i64  Parameter                  arg 0 -> (v6p, v3, v2, v4)
230    1.i64  Parameter                  arg 1 -> (v5p, v3, v2, v4)
231succs: [bb 2]
232
233BB 2  preds: [bb 0]
234    2.     If LE i64                  v0, v1
235succs: [bb 3, bb 6]
236
237BB 3  preds: [bb 2]
238    3.     If EQ i64                  v0, v1
239succs: [bb 4, bb 5]
240
241BB 4  preds: [bb 3]
242    4.i64  Add                        v0, v1 -> (v5p)
243succs: [bb 5]
244
245BB 5  preds: [bb 3, bb 4]
246   5p.i64  Phi                        v1(bb3), v4(bb4) -> (v6p)
247succs: [bb 6]
248
249BB 6  preds: [bb 2, bb 5]
250   6p.i64  Phi                        v0(bb2), v5p(bb5) -> (v7)
251    7.i64  Return                     v6p
252succs: [bb 1]
253
254BB 1  preds: [bb 6]
255prop: end
256```
257
258After:
259
260```
261BB 0
262prop: start
263    0.i64  Parameter                  arg 0 -> (v6p, v3, v2, v4)
264    1.i64  Parameter                  arg 1 -> (v6p, v3, v2, v4)
265succs: [bb 2]
266
267BB 2  preds: [bb 0]
268    2.     If LE i64                  v0, v1
269succs: [bb 3, bb 6]
270
271BB 3  preds: [bb 2]
272    3.     If EQ i64                  v0, v1
273succs: [bb 4, bb 6]
274
275BB 4  preds: [bb 3]
276    4.i64  Add                        v0, v1 -> (v6p)
277succs: [bb 6]
278
279BB 6  preds: [bb 2, bb 3, bb 4]
280   6p.i64  Phi                        v0(bb2), v1(bb3), v4(bb4) -> (v7)
281    7.i64  Return                     v6p
282succs: [bb 1]
283
284BB 1  preds: [bb 6]
285prop: end
286```
287
288Basic block 5 was removed and its Phi was "merged" into block 6.
289
290
291**Three blocks merge**
292
293Before:
294
295```
296BB 0
297prop: start
298    0.i64  Constant                   0xc -> (v3, v5, v6)
299    1.i64  Constant                   0xd -> (v2, v4, v6)
300succs: [bb 2]
301
302BB 2  preds: [bb 0]
303    2.u64  Mov                        v1 -> (v3, v5)
304    3.u64  Add                        v0, v2
305succs: [bb 3]
306
307BB 3  preds: [bb 2]
308    4.u64  Mov                        v1
309    5.u64  Add                        v0, v2
310succs: [bb 4]
311
312BB 4  preds: [bb 3]
313    6.u64  Add                        v0, v1
314    7.     ReturnVoid
315succs: [bb 1]
316
317BB 1  preds: [bb 4]
318prop: end
319```
320
321After:
322
323```
324BB 0
325prop: start
326    0.i64  Constant                   0xc -> (v3, v5, v6)
327    1.i64  Constant                   0xd -> (v2, v4, v6)
328succs: [bb 2]
329
330BB 2  preds: [bb 0]
331    2.u64  Mov                        v1 -> (v3, v5)
332    3.u64  Add                        v0, v2
333    4.u64  Mov                        v1
334    5.u64  Add                        v0, v2
335    6.u64  Add                        v0, v1
336    7.     ReturnVoid
337succs: [bb 1]
338
339BB 1  preds: [bb 2]
340prop: end
341```
342
343
344**Two blocks merged in a loop**
345
346Before:
347
348```
349BB 0
350prop: start
351    0.i64  Constant                   0xc -> (v8, v4, v2, v15, v16, v17)
352    1.i64  Constant                   0xd -> (v7, v5, v2, v15, v17)
353succs: [bb 2]
354
355BB 2  preds: [bb 0]
356    2.b    Compare LT i64             v0, v1 -> (v3)
357    3.     IfImm NE b                 v2, 0x0
358succs: [bb 3, bb 4]
359
360BB 4  preds: [bb 2]
361    7.u64  Mov                        v1 -> (v19p)
362    8.u64  Mov                        v0 -> (v20p)
363succs: [bb 14]
364
365BB 3  preds: [bb 2]
366    4.u64  Mov                        v0 -> (v19p)
367    5.u64  Mov                        v1 -> (v20p)
368succs: [bb 14]
369
370BB 14  preds: [bb 4, bb 3]
371  19p.u64  Phi                        v7(bb4), v4(bb3) -> (v10p)
372  20p.u64  Phi                        v8(bb4), v5(bb3) -> (v11p)
373succs: [bb 5]
374
375BB 5  preds: [bb 6, bb 14]
376prop: head, loop 1
377  10p.u64  Phi                        v15(bb6), v19p(bb14) -> (v12, v13)
378  11p.u64  Phi                        v16(bb6), v20p(bb14) -> (v13)
379   12.u64  Mov                        v10p
380   13.u64  Add                        v10p, v11p
381succs: [bb 6]
382
383BB 6  preds: [bb 5]
384prop: loop 1
385   15.u64  Add                        v0, v1 -> (v10p)
386   16.u64  Mov                        v0 -> (v11p)
387   17.b    Compare LT i64             v1, v0 -> (v18)
388   18.     IfImm NE b                 v17, 0x0
389succs: [bb 5, bb 1]
390
391BB 1  preds: [bb 6]
392prop: end
393```
394
395After:
396
397```
398BB 0
399prop: start
400    0.i64  Constant                   0xc -> (v8, v4, v2, v15, v16, v17)
401    1.i64  Constant                   0xd -> (v7, v5, v2, v15, v17)
402succs: [bb 2]
403
404BB 2  preds: [bb 0]
405    2.b    Compare LT i64             v0, v1 -> (v3)
406    3.     IfImm NE b                 v2, 0x0
407succs: [bb 3, bb 4]
408
409BB 4  preds: [bb 2]
410    7.u64  Mov                        v1 -> (v19p)
411    8.u64  Mov                        v0 -> (v20p)
412succs: [bb 14]
413
414BB 3  preds: [bb 2]
415    4.u64  Mov                        v0 -> (v19p)
416    5.u64  Mov                        v1 -> (v20p)
417succs: [bb 14]
418
419BB 14  preds: [bb 4, bb 3]
420  19p.u64  Phi                        v7(bb4), v4(bb3) -> (v10p)
421  20p.u64  Phi                        v8(bb4), v5(bb3) -> (v11p)
422succs: [bb 5]
423
424BB 5  preds: [bb 5, bb 14]
425prop: head, loop 1
426  10p.u64  Phi                        v15(bb5), v19p(bb14) -> (v12, v13)
427  11p.u64  Phi                        v16(bb5), v20p(bb14) -> (v13)
428   12.u64  Mov                        v10p
429   13.u64  Add                        v10p, v11p
430   15.u64  Add                        v0, v1 -> (v10p)
431   16.u64  Mov                        v0 -> (v11p)
432   17.b    Compare LT i64             v1, v0 -> (v18)
433   18.     IfImm NE b                 v17, 0x0
434succs: [bb 5, bb 1]
435
436BB 1  preds: [bb 5]
437prop: end
438```
439
440## Links
441
442Source code:
443[cleanup.cpp](../optimizer/optimizations/cleanup.cpp)
444[cleanup.h](../optimizer/optimizations/cleanup.h)
445
446Tests:
447[cleanup_test.cpp](../tests/cleanup_test.cpp)
448