1# Cleanup optimization pass 2## Overview 3 4`Cleanup` pass works like dead code elimination (DCE) and removes code which does not affect the program results. 5It also removes empty basic blocks when it is possible and merges a linear basic block sequence to one bigger basic block, 6thus simplifying control flow graph. 7 8Picture of merging a linear basic block sequence: 9 10**Before:** 11 12 13 14**After:** 15 16 17 18## Rationality 19 20Removing dead code reduces the number of instructions. Merging linear path of blocks in one big block gives an opportunity for more optimizations, because **data locality** is improved. 21 22After any optimization which deletes instructions some basic blocks may become empty or may contain only Phi instructions. 23In order not to bother about processing such cases in other transformation passes it is better to remove such blocks immediately. 24There are two exceptions, where empty basic block should stay in control flow graph. 25First case is loop pre-header, which may legally be an empty basic block. 26 27Second case is a so-called "special triangle" situation, when there are actually two edges between same blocks, but our way to store CFG edges don't support that. So, we have to insert an empty basic block on one of edges, see below. Certainly, that block is necessary only when there are Phi instruction in `BB2` which have different inputs on those incoming edges. If it is not the case, second edge on left example is not needed and empty basic block is also not needed. 28 29``` 30 WRONG RIGHT 31 32 [BB1] [BB1] 33 | \ | \ 34 | | | [EBB] 35 | / | / 36 [BB2] [BB2] 37``` 38 39## Dependence 40 41* RPO analysis 42* Loop Analysis 43* Dominators Tree 44 45## Algorithm 46 47Instructions, which we can't remove from Graph, has the attribute `NO_DCE` 48(Control Flow instructions, Return, Stores, Call instructions e.t.c.). 49 50### Remove redundant Phi(s) 51 52On first step redundant `Phi` instructions are removed - in short, if `phi` has only one real input, we can move users 53from the `phi` to the input and remove `phi`. 54 55### Remove empty blocks 56 57Walk through all basic blocks and check whether they are suitable to remove. If basic block contains only Phis 58instructions and is not an loop pre-header we still have to check each predecessor block for "special triangle" 59situation (see Rationality subsection). 60 61Some `If` or `IfImm` instructions may be marked `dead` after this stage. 62 63### Remove dead instructions 64 65During first pass through all instructions (in PRO order) we mark `live` all instructions which have `NO_DCE` attribute 66(and was not marked `dead` during previous stage) and recursively mark `live` all their inputs. 67During second pass we remove all not `live`-marked instructions. 68 69### Loop until unchanged Graph 70 71When removing EmptyBlocks when "special triangle" case is resolved conditional branch instruction (`If` or `IfImm`) become dead. 72At the same time removing dead instruction may create new empty basic blocks. That's why there are a loop to repeat two previous 73stages until IR become stable. 74 75### Merge linear blocks 76 77Finally, we visits basic blocks once again, and if: 781. basic block is not start or end block 792. basic block has 1 successor 803. the successor has 1 predecessor is not a try begin block 81 82Then we can join the block with the successor block. After that, the new block tries to join next block. 83 84## Pseudocode 85 86```c++ 87bool Cleanup::RunImpl() { 88 // Two vectors to store basic blocks lists 89 auto empty_blocks = &empty1; 90 auto new_empty_blocks = &empty2; 91 bool modified = PhiChecker(empty_blocks); 92 for (auto bb : GetGraph()->GetVectorBlocks()) { 93 if (!SkipBasicBlock(bb) && bb->IsEmpty()) { 94 empty_blocks->insert(bb); 95 } 96 } 97 // Main loop 98 do { 99 modified |= RunOnce(empty_blocks, new_empty_blocks); 100 // Swap vectors pointers 101 auto temp = empty_blocks; 102 empty_blocks = new_empty_blocks; 103 new_empty_blocks = temp; 104 // Clean the "new" list 105 new_empty_blocks->clear(); 106 } while (!empty_blocks->empty()); 107 // Merge linear sectors. 108 for (auto bb : GetGraph()->GetVectorBlocks()) { 109 if (!SkipBasicBlock(bb)) { 110 while (bb and successor are suitable) { 111 bb->JoinSuccessorBlock(); 112 modified = true; 113 } 114 } 115 } 116 return modified; 117} 118 119bool Cleanup::RunOnce(empty_blocks, new_empty_blocks) 120{ 121 bool modified = false; 122 for (auto bb : *empty_blocks) { 123 modified |= ProcessBB(bb, dead_mrk, new_empty_blocks); 124 } 125 modified |= Dce(dead_mrk, new_empty_blocks); 126 return modified; 127} 128 129bool Cleanup::ProcessBB(bb, dead_mrk, new_empty_blocks) 130{ 131 auto succ = bb->GetSuccessor(0); 132 if (CheckSpecialTriangle(bb, &saved_preds)) { 133 return false; 134 } 135 // Remove dead Phi(s) in bb 136 ... 137 // Process saved predecessors 138 ... 139 GetGraph()->RemoveEmptyBlockWithPhis(bb); 140 return true; 141} 142 143// DCE recursive helper 144void Cleanup::MarkLiveRec(Marker live_mrk, Inst *inst) 145 146bool Cleanup::Dce(Marker dead_mrk, new_empty_blocks) 147{ 148 bool modified = false; 149 // Mark live instructions 150 for (auto bb : GetGraph()->GetBlocksRPO()) { 151 for (auto inst : bb->AllInsts()) { 152 if (inst->IsNotRemovable() && !inst->IsMarked(dead_mrk)) { 153 MarkLiveRec(live_mrk, inst); 154 } 155 } 156 } 157 // Remove non-live instructions 158 for (auto bb : GetGraph()->GetBlocksRPO()) { 159 for (auto inst : bb->AllInstsSafe()) { 160 if (inst->IsMarked(live_mrk)) { 161 continue; 162 } 163 bool is_phi = inst->IsPhi(); 164 bb->RemoveInst(inst); 165 modified = true; 166 167 if (is_phi) { 168 for (auto pred : bb->GetPredsBlocks()) { 169 if (pred->IsEmpty() && !SkipBasicBlock(pred)) { 170 new_empty_blocks->insert(pred); 171 } 172 } 173 } else if (bb->IsEmpty() && !SkipBasicBlock(bb)) { 174 new_empty_blocks->insert(bb); 175 } 176 } 177 } 178 return modified; 179} 180``` 181 182## Examples 183 184**Removing dead instruction** 185 186Before: 187 188``` 189BB 0 190prop: start 191 0.i64 Constant 0xc -> (v3) 192 1.i64 Constant 0xd -> (v2) 193succs: [bb 2] 194 195BB 2 preds: [bb 0] 196 2.u64 Mov v1 -> (v3, v4) 197 3.u64 Add v0, v2 198 4.u64 Return v2 199succs: [bb 1] 200 201BB 1 preds: [bb 2] 202prop: end 203``` 204 205After: 206 207``` 208BB 0 209prop: start 210 1.i64 Constant 0xd -> (v2) 211succs: [bb 2] 212 213BB 2 preds: [bb 0] 214 2.u64 Mov v1 -> (v4) 215 4.u64 Return v2 216succs: [bb 1] 217``` 218 219`Add` result was not used. 220 221 222**Remove empty (Phi-only) block** 223 224Before: 225 226``` 227BB 0 228prop: start 229 0.i64 Parameter arg 0 -> (v6p, v3, v2, v4) 230 1.i64 Parameter arg 1 -> (v5p, v3, v2, v4) 231succs: [bb 2] 232 233BB 2 preds: [bb 0] 234 2. If LE i64 v0, v1 235succs: [bb 3, bb 6] 236 237BB 3 preds: [bb 2] 238 3. If EQ i64 v0, v1 239succs: [bb 4, bb 5] 240 241BB 4 preds: [bb 3] 242 4.i64 Add v0, v1 -> (v5p) 243succs: [bb 5] 244 245BB 5 preds: [bb 3, bb 4] 246 5p.i64 Phi v1(bb3), v4(bb4) -> (v6p) 247succs: [bb 6] 248 249BB 6 preds: [bb 2, bb 5] 250 6p.i64 Phi v0(bb2), v5p(bb5) -> (v7) 251 7.i64 Return v6p 252succs: [bb 1] 253 254BB 1 preds: [bb 6] 255prop: end 256``` 257 258After: 259 260``` 261BB 0 262prop: start 263 0.i64 Parameter arg 0 -> (v6p, v3, v2, v4) 264 1.i64 Parameter arg 1 -> (v6p, v3, v2, v4) 265succs: [bb 2] 266 267BB 2 preds: [bb 0] 268 2. If LE i64 v0, v1 269succs: [bb 3, bb 6] 270 271BB 3 preds: [bb 2] 272 3. If EQ i64 v0, v1 273succs: [bb 4, bb 6] 274 275BB 4 preds: [bb 3] 276 4.i64 Add v0, v1 -> (v6p) 277succs: [bb 6] 278 279BB 6 preds: [bb 2, bb 3, bb 4] 280 6p.i64 Phi v0(bb2), v1(bb3), v4(bb4) -> (v7) 281 7.i64 Return v6p 282succs: [bb 1] 283 284BB 1 preds: [bb 6] 285prop: end 286``` 287 288Basic block 5 was removed and its Phi was "merged" into block 6. 289 290 291**Three blocks merge** 292 293Before: 294 295``` 296BB 0 297prop: start 298 0.i64 Constant 0xc -> (v3, v5, v6) 299 1.i64 Constant 0xd -> (v2, v4, v6) 300succs: [bb 2] 301 302BB 2 preds: [bb 0] 303 2.u64 Mov v1 -> (v3, v5) 304 3.u64 Add v0, v2 305succs: [bb 3] 306 307BB 3 preds: [bb 2] 308 4.u64 Mov v1 309 5.u64 Add v0, v2 310succs: [bb 4] 311 312BB 4 preds: [bb 3] 313 6.u64 Add v0, v1 314 7. ReturnVoid 315succs: [bb 1] 316 317BB 1 preds: [bb 4] 318prop: end 319``` 320 321After: 322 323``` 324BB 0 325prop: start 326 0.i64 Constant 0xc -> (v3, v5, v6) 327 1.i64 Constant 0xd -> (v2, v4, v6) 328succs: [bb 2] 329 330BB 2 preds: [bb 0] 331 2.u64 Mov v1 -> (v3, v5) 332 3.u64 Add v0, v2 333 4.u64 Mov v1 334 5.u64 Add v0, v2 335 6.u64 Add v0, v1 336 7. ReturnVoid 337succs: [bb 1] 338 339BB 1 preds: [bb 2] 340prop: end 341``` 342 343 344**Two blocks merged in a loop** 345 346Before: 347 348``` 349BB 0 350prop: start 351 0.i64 Constant 0xc -> (v8, v4, v2, v15, v16, v17) 352 1.i64 Constant 0xd -> (v7, v5, v2, v15, v17) 353succs: [bb 2] 354 355BB 2 preds: [bb 0] 356 2.b Compare LT i64 v0, v1 -> (v3) 357 3. IfImm NE b v2, 0x0 358succs: [bb 3, bb 4] 359 360BB 4 preds: [bb 2] 361 7.u64 Mov v1 -> (v19p) 362 8.u64 Mov v0 -> (v20p) 363succs: [bb 14] 364 365BB 3 preds: [bb 2] 366 4.u64 Mov v0 -> (v19p) 367 5.u64 Mov v1 -> (v20p) 368succs: [bb 14] 369 370BB 14 preds: [bb 4, bb 3] 371 19p.u64 Phi v7(bb4), v4(bb3) -> (v10p) 372 20p.u64 Phi v8(bb4), v5(bb3) -> (v11p) 373succs: [bb 5] 374 375BB 5 preds: [bb 6, bb 14] 376prop: head, loop 1 377 10p.u64 Phi v15(bb6), v19p(bb14) -> (v12, v13) 378 11p.u64 Phi v16(bb6), v20p(bb14) -> (v13) 379 12.u64 Mov v10p 380 13.u64 Add v10p, v11p 381succs: [bb 6] 382 383BB 6 preds: [bb 5] 384prop: loop 1 385 15.u64 Add v0, v1 -> (v10p) 386 16.u64 Mov v0 -> (v11p) 387 17.b Compare LT i64 v1, v0 -> (v18) 388 18. IfImm NE b v17, 0x0 389succs: [bb 5, bb 1] 390 391BB 1 preds: [bb 6] 392prop: end 393``` 394 395After: 396 397``` 398BB 0 399prop: start 400 0.i64 Constant 0xc -> (v8, v4, v2, v15, v16, v17) 401 1.i64 Constant 0xd -> (v7, v5, v2, v15, v17) 402succs: [bb 2] 403 404BB 2 preds: [bb 0] 405 2.b Compare LT i64 v0, v1 -> (v3) 406 3. IfImm NE b v2, 0x0 407succs: [bb 3, bb 4] 408 409BB 4 preds: [bb 2] 410 7.u64 Mov v1 -> (v19p) 411 8.u64 Mov v0 -> (v20p) 412succs: [bb 14] 413 414BB 3 preds: [bb 2] 415 4.u64 Mov v0 -> (v19p) 416 5.u64 Mov v1 -> (v20p) 417succs: [bb 14] 418 419BB 14 preds: [bb 4, bb 3] 420 19p.u64 Phi v7(bb4), v4(bb3) -> (v10p) 421 20p.u64 Phi v8(bb4), v5(bb3) -> (v11p) 422succs: [bb 5] 423 424BB 5 preds: [bb 5, bb 14] 425prop: head, loop 1 426 10p.u64 Phi v15(bb5), v19p(bb14) -> (v12, v13) 427 11p.u64 Phi v16(bb5), v20p(bb14) -> (v13) 428 12.u64 Mov v10p 429 13.u64 Add v10p, v11p 430 15.u64 Add v0, v1 -> (v10p) 431 16.u64 Mov v0 -> (v11p) 432 17.b Compare LT i64 v1, v0 -> (v18) 433 18. IfImm NE b v17, 0x0 434succs: [bb 5, bb 1] 435 436BB 1 preds: [bb 5] 437prop: end 438``` 439 440## Links 441 442Source code: 443[cleanup.cpp](../optimizer/optimizations/cleanup.cpp) 444[cleanup.h](../optimizer/optimizations/cleanup.h) 445 446Tests: 447[cleanup_test.cpp](../tests/cleanup_test.cpp) 448