Implement post-register-allocation optimization #267

jserv · 2025-08-26T15:57:59Z

This pull request transforms shecc's peephole optimizer from basic instruction fusion to a comprehensive post-register-allocation optimization framework, providing performance improvements while maintaining educational clarity and bootstrap capability.

It also keeps the optimization passes lean by eliminating redundant work between them.

Key Optimizations

Algebraic: x-x→0, x^x→0, x|x→x, x&x→x
Strength Reduction: x/8→x>>3, x%16→x&15, x*4→x<<2
Comparisons: x==x→1, x!=x→0, x<x→0
Bitwise: x&-1→x, x|0→x, x^0→x, x&0→0
Triple Patterns: 3-instruction sequences
Enhanced: Load/store elimination, dead code elimination, move elimination

This commit implements redundant move elimination, which removes move operations whose result is immediately overwritten, targeting common inefficiencies in compiler-generated code. Added 5 optimization patterns:
- Consecutive assignments to same destination: {mov rd,rs1; mov rd,rs2} → {mov rd,rs2}
- Load immediately overwritten: {load rd,offset; mov rd,rs} → {mov rd,rs}
- Constant load immediately overwritten: {li rd,imm; mov rd,rs} → {mov rd,rs}
- Consecutive loads to same register: {load rd,off1; load rd,off2} → {load rd,off2}
- Consecutive constant loads: {li rd,imm1; li rd,imm2} → {li rd,imm2}
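As an illustration only, the five patterns above share one shape: a side-effect-free definition of `rd` followed directly by another definition of `rd` that does not read `rd`. The sketch below uses a hypothetical `insn_t` list and a hypothetical `eliminate_overwritten_defs()` helper; neither is shecc's actual IR or API, and loads are assumed to have no side effects.

```c
#include <stddef.h>

/* Hypothetical instruction model for illustration; not shecc's actual IR.
 * Every opcode here only writes rd and has no other side effect. */
typedef enum { OP_MOV, OP_LI, OP_LOAD } opcode_t;

typedef struct insn {
    opcode_t op;
    int rd;  /* destination register */
    int rs;  /* source register (OP_MOV only) */
    int imm; /* immediate (OP_LI) or stack offset (OP_LOAD) */
    struct insn *next;
} insn_t;

/* Unlink every instruction whose result in rd is overwritten by the very
 * next instruction. Nodes are only unlinked, never freed. Returns the
 * number of instructions removed. */
int eliminate_overwritten_defs(insn_t **head)
{
    int removed = 0;
    insn_t **link = head;

    while (*link && (*link)->next) {
        insn_t *first = *link;
        insn_t *second = first->next;
        /* {li rd, imm; mov rd, rd} must be kept: the mov reads rd. */
        int second_reads_rd = second->op == OP_MOV && second->rs == first->rd;

        if (second->rd == first->rd && !second_reads_rd) {
            *link = second; /* drop first, then re-check from second */
            removed++;
        } else {
            link = &first->next;
        }
    }
    return removed;
}
```

Walking the list through a pointer-to-pointer means the head node needs no special case when it is the one being unlinked. Because the check does not care which kind of definition comes first, the same loop would also accept orderings beyond the five listed, provided the second instruction never reads `rd`.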

This commit implements dead code elimination that works in conjunction with SCCP to remove unreachable code after constant propagation and branch folding. These optimizations target code that becomes dead after constant propagation, such as:
- Branches with constant conditions (if(1), if(0))
- Instructions that are immediately overwritten
- Unreachable code blocks after branch folding

This extends load/store elimination with more aggressive patterns, reducing memory traffic by eliminating redundant memory operations.

Local memory optimizations:
- Dead store elimination: Consecutive stores to same location
- Redundant load elimination: Consecutive loads from same location
- Store-to-load forwarding: Replace load with stored value
- Load-store redundancy: Remove store of just-loaded value

Global memory optimizations:
- Global dead store elimination
- Global redundant load elimination
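A minimal sketch of store-to-load forwarding, one of the local patterns listed above, might look like the following. The `insn_t` layout and the `forward_stores()` name are assumptions made for illustration, and the sketch assumes both accesses use the same base, the same width, and non-volatile memory.

```c
#include <stddef.h>

/* Hypothetical instruction model for illustration; not shecc's actual IR. */
typedef enum { OP_MOV, OP_LOAD, OP_STORE } opcode_t;

typedef struct insn {
    opcode_t op;
    int reg;    /* OP_LOAD/OP_MOV: destination; OP_STORE: register stored */
    int src;    /* OP_MOV: source register; unused otherwise */
    int offset; /* OP_LOAD/OP_STORE: stack slot offset */
    struct insn *next;
} insn_t;

/* Rewrite {store rs, off; load rd, off} as {store rs, off; mov rd, rs}.
 * The store is kept because later code may still read the slot.
 * Returns the number of loads replaced. */
int forward_stores(insn_t *head)
{
    int replaced = 0;

    for (insn_t *i = head; i && i->next; i = i->next) {
        insn_t *n = i->next;

        if (i->op == OP_STORE && n->op == OP_LOAD && i->offset == n->offset) {
            n->op = OP_MOV;  /* n->reg already holds the destination */
            n->src = i->reg; /* take the value straight from the register */
            n->offset = 0;
            replaced++;
        }
    }
    return replaced;
}
```

The rewrite trades a memory access for a register move, which a later move-elimination pass may remove entirely when `rd` and `rs` turn out to be the same register.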

This implements mathematical identity patterns on register operands:
- Self-subtraction: x - x → 0
- Self-XOR: x ^ x → 0
- Self-OR: x | x → x (identity)
- Self-AND: x & x → x (identity)

These patterns emerge after register allocation when different variables are assigned to the same register. SSA handles constant folding; the peephole pass handles register-based patterns.
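The four identities above reduce to a single check on the two source registers. The following is a sketch under assumed names (`insn_t`, `simplify_self_operand()`), not the code from this PR:

```c
/* Hypothetical three-address instruction; not shecc's actual IR. */
typedef enum { OP_SUB, OP_XOR, OP_OR, OP_AND, OP_MOV, OP_LI } opcode_t;

typedef struct {
    opcode_t op;
    int rd;  /* destination register */
    int rs1; /* first source; the only source for OP_MOV */
    int rs2; /* second source for binary operations */
    int imm; /* constant for OP_LI */
} insn_t;

/* Simplify a binary operation whose two sources are the same register.
 * Returns 1 if the instruction was rewritten in place, 0 otherwise. */
int simplify_self_operand(insn_t *i)
{
    if (i->rs1 != i->rs2)
        return 0;

    switch (i->op) {
    case OP_SUB: /* x - x → 0 */
    case OP_XOR: /* x ^ x → 0 */
        i->op = OP_LI;
        i->imm = 0;
        return 1;
    case OP_OR:  /* x | x → x */
    case OP_AND: /* x & x → x */
        i->op = OP_MOV; /* rs1 already names the value to copy */
        return 1;
    default:
        return 0;
    }
}
```

Rewriting to `li` or `mov` rather than deleting the instruction keeps `rd` defined, so later passes can decide whether the resulting move is itself redundant.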

This implements power-of-2 strength reduction patterns:
- Division by 2^n → right shift by n
- Modulo by 2^n → bitwise AND with (2^n - 1)
- Multiplication by 2^n → left shift by n

This optimization is unique to the peephole optimizer since SSA works on virtual registers before actual constants are loaded.
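To make the power-of-2 test concrete, here is a small sketch. `log2_exact()`, `binop_imm_t`, and `reduce_strength()` are hypothetical names, and the sketch deliberately works on unsigned operands: for negative signed values an arithmetic right shift rounds toward negative infinity while C division truncates toward zero, so the signed case would need extra care.

```c
#include <stdint.h>

/* Returns n if value == 2^n for some n in [0, 31], otherwise -1. */
int log2_exact(uint32_t value)
{
    if (value == 0 || (value & (value - 1)) != 0)
        return -1;

    int n = 0;
    while ((value >>= 1) != 0)
        n++;
    return n;
}

/* Hypothetical "register op immediate" instruction; not shecc's IR. */
typedef enum { OP_MUL, OP_DIV, OP_MOD, OP_SHL, OP_SHR, OP_AND } opcode_t;

typedef struct {
    opcode_t op;
    uint32_t imm; /* constant right-hand operand */
} binop_imm_t;

/* Rewrite unsigned mul/div/mod by 2^n into a shift or a mask.
 * Returns 1 if the instruction was rewritten in place, 0 otherwise. */
int reduce_strength(binop_imm_t *i)
{
    int n = log2_exact(i->imm);
    if (n < 0)
        return 0;

    switch (i->op) {
    case OP_MUL: /* x * 2^n → x << n */
        i->op = OP_SHL;
        i->imm = (uint32_t) n;
        return 1;
    case OP_DIV: /* x / 2^n → x >> n */
        i->op = OP_SHR;
        i->imm = (uint32_t) n;
        return 1;
    case OP_MOD: /* x % 2^n → x & (2^n - 1) */
        i->op = OP_AND;
        i->imm -= 1;
        return 1;
    default:
        return 0;
    }
}
```

The `value & (value - 1)` test clears the lowest set bit, so it yields zero exactly when a single bit is set.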

This implements self-comparison optimizations:
- x != x → 0 (always false)
- x == x → 1 (always true)
- x < x → 0 (always false)
- x > x → 0 (always false)
- x <= x → 1 (always true)
- x >= x → 1 (always true)

These register-based patterns appear after register allocation when different variables are assigned to the same register. This complements SSA's SCCP constant comparison folding.
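The table above can be expressed as a tiny folding helper. This is a sketch for integer registers only; `cmp_op_t` and `fold_self_compare()` are names invented for the example.

```c
/* Hypothetical comparison opcodes; not shecc's actual IR. */
typedef enum { CMP_EQ, CMP_NE, CMP_LT, CMP_GT, CMP_LE, CMP_GE } cmp_op_t;

/* If both operands are the same register, store the constant outcome of the
 * comparison in *result and return 1. Otherwise leave *result untouched and
 * return 0. */
int fold_self_compare(cmp_op_t op, int rs1, int rs2, int *result)
{
    if (rs1 != rs2)
        return 0;

    switch (op) {
    case CMP_EQ:
    case CMP_LE:
    case CMP_GE:
        *result = 1; /* reflexive comparisons are always true */
        return 1;
    case CMP_NE:
    case CMP_LT:
    case CMP_GT:
        *result = 0; /* strict comparisons are always false */
        return 1;
    }
    return 0;
}
```

A caller would typically replace the comparison with a constant load of `*result` into the destination register.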

This implements bitwise identity and absorption patterns:
- Double complement: ~(~x) → x
- AND with all-ones: x & -1 → x
- OR with zero: x | 0 → x
- XOR with zero: x ^ 0 → x
- AND with zero: x & 0 → 0 (absorption)
- OR with all-ones: x | -1 → -1 (absorption)
- Shift by zero: x << 0 → x, x >> 0 → x

These patterns are not handled by the SSA optimizer and provide significant optimization opportunities for bitwise operations.
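The constant-operand cases above fall into two groups: identities, where the result is the other operand, and absorptions, where the result is a fixed constant. A sketch of that classification follows; the names `bit_op_t`, `outcome_t`, and `simplify_bitwise_imm()` are assumptions for illustration, and the double-complement pattern is left out because it spans two instructions.

```c
/* Hypothetical opcodes for "x op imm"; not shecc's actual IR. */
typedef enum { OP_AND, OP_OR, OP_XOR, OP_SHL, OP_SHR } bit_op_t;

typedef enum {
    KEEP,         /* no simplification applies */
    BECOMES_X,    /* identity: the result is x itself */
    BECOMES_CONST /* absorption: the result is *constant */
} outcome_t;

/* Classify "x op imm". For BECOMES_CONST the folded value is written to
 * *constant; in every other case *constant is left untouched. */
outcome_t simplify_bitwise_imm(bit_op_t op, int imm, int *constant)
{
    switch (op) {
    case OP_AND:
        if (imm == -1)
            return BECOMES_X; /* x & -1 → x */
        if (imm == 0) {
            *constant = 0;    /* x & 0 → 0 */
            return BECOMES_CONST;
        }
        return KEEP;
    case OP_OR:
        if (imm == 0)
            return BECOMES_X; /* x | 0 → x */
        if (imm == -1) {
            *constant = -1;   /* x | -1 → -1 */
            return BECOMES_CONST;
        }
        return KEEP;
    case OP_XOR:
    case OP_SHL:
    case OP_SHR:
        return imm == 0 ? BECOMES_X : KEEP; /* x ^ 0, x << 0, x >> 0 → x */
    }
    return KEEP;
}
```

Returning a classification instead of rewriting directly leaves it to the caller to emit either a move from `x` or a constant load.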

This implements 3-instruction sequence optimizations:
- Store-load-store elimination: removes unused intermediate loads
- Consecutive stores: only last store to same location matters

Integrates all safe and working peephole optimizations:
- Instruction fusion for eliminating redundant moves
- Comparison optimization for self-comparisons
- Strength reduction for power-of-2 operations
- Algebraic simplification for register patterns
- Bitwise operation optimizations
- Redundant move elimination
- Load/store pair elimination
- Triple pattern optimization

Removed eliminate_dead_instructions() and fold_constant_branches() as they were causing bootstrap failures due to linked list corruption.

fennecJ · 2025-08-30T13:30:39Z

src/peephole.c

+    }
+
+    /* Pattern 2: Redundant load immediately overwritten
+     * {load rd, offset; mov rd, rs} → {mov rd, rs}


Would the reverse pattern {mov rd, rs; load rd, offset} → {load rd, offset} also apply here? Perhaps we can add a FIXME in the comment for future work?

fennecJ · 2025-08-30T13:31:28Z

src/peephole.c

+    }
+
+    /* Pattern 3: Load constant immediately overwritten
+     * {li rd, imm; mov rd, rs} → {mov rd, rs}


Ditto, {mov rd, rs; li rd, imm} → {li rd, imm}

jserv requested review from ChAoSUnItY, DrXiao, fennecJ, nosba0957 and vacantron August 26, 2025 15:57

jserv added 8 commits August 28, 2025 19:08

Add triple pattern optimization

1b69892


jserv force-pushed the improve-peephole branch 3 times, most recently from c093ffc to 635a774 Compare August 28, 2025 11:30

jserv force-pushed the improve-peephole branch from 635a774 to 0f828a0 Compare August 28, 2025 15:49

fennecJ reviewed Aug 30, 2025
