Skip to content

Conversation

dpaoliello
Copy link
Contributor

I was trying to understand the cases where emitSPUpdate may use a push to adjust the stack pointer and realized that the code was more complex than it needed to be.

  • Chunk is redundant as it is set to MaxSPChunk and never modified.
  • The only time we use the push optimization is if Offset is equal to SlotSize (as SlotSize is never as large as MaxSPChunk, so will never equal if std::min returned that value).
  • If we use the push optimization, then we've finished adjusting the stack and can return early instead of continuing the loop.

@llvmbot
Copy link
Member

llvmbot commented Sep 4, 2025

@llvm/pr-subscribers-backend-x86

Author: Daniel Paoliello (dpaoliello)

Changes

I was trying to understand the cases where emitSPUpdate may use a push to adjust the stack pointer and realized that the code was more complex than it needed to be.

  • Chunk is redundant as it is set to MaxSPChunk and never modified.
  • The only time we use the push optimization is if Offset is equal to SlotSize (as SlotSize is never as large as MaxSPChunk, so will never equal if std::min returned that value).
  • If we use the push optimization, then we've finished adjusting the stack and can return early instead of continuing the loop.

Full diff: https://github.com/llvm/llvm-project/pull/156948.diff

1 Files Affected:

  • (modified) llvm/lib/Target/X86/X86FrameLowering.cpp (+8-9)
diff --git a/llvm/lib/Target/X86/X86FrameLowering.cpp b/llvm/lib/Target/X86/X86FrameLowering.cpp
index cba7843d53e3f..39e0b31433393 100644
--- a/llvm/lib/Target/X86/X86FrameLowering.cpp
+++ b/llvm/lib/Target/X86/X86FrameLowering.cpp
@@ -53,6 +53,7 @@ X86FrameLowering::X86FrameLowering(const X86Subtarget &STI,
       STI(STI), TII(*STI.getInstrInfo()), TRI(STI.getRegisterInfo()) {
   // Cache a bunch of frame-related predicates for this subtarget.
   SlotSize = TRI->getSlotSize();
+  assert(SlotSize == 4 || SlotSize == 8);
   Is64Bit = STI.is64Bit();
   IsLP64 = STI.isTarget64BitLP64();
   // standard x86_64 uses 64-bit frame/stack pointers, x32 - 32-bit.
@@ -224,7 +225,7 @@ flagsNeedToBePreservedBeforeTheTerminators(const MachineBasicBlock &MBB) {
   return false;
 }
 
-constexpr int64_t MaxSPChunk = (1LL << 31) - 1;
+constexpr uint64_t MaxSPChunk = (1ULL << 31) - 1;
 
 /// emitSPUpdate - Emit a series of instructions to increment / decrement the
 /// stack pointer by a constant value.
@@ -245,8 +246,6 @@ void X86FrameLowering::emitSPUpdate(MachineBasicBlock &MBB,
     return;
   }
 
-  uint64_t Chunk = MaxSPChunk;
-
   MachineFunction &MF = *MBB.getParent();
   const X86Subtarget &STI = MF.getSubtarget<X86Subtarget>();
   const X86TargetLowering &TLI = *STI.getTargetLowering();
@@ -260,7 +259,7 @@ void X86FrameLowering::emitSPUpdate(MachineBasicBlock &MBB,
     // loop, by inlineStackProbe().
     BuildMI(MBB, MBBI, DL, TII.get(X86::STACKALLOC_W_PROBING)).addImm(Offset);
     return;
-  } else if (Offset > Chunk) {
+  } else if (Offset > MaxSPChunk) {
     // Rather than emit a long series of instructions for large offsets,
     // load the offset into a register and do one sub/add
     unsigned Reg = 0;
@@ -284,7 +283,7 @@ void X86FrameLowering::emitSPUpdate(MachineBasicBlock &MBB,
                              .addReg(Reg);
       MI->getOperand(3).setIsDead(); // The EFLAGS implicit def is dead.
       return;
-    } else if (Offset > 8 * Chunk) {
+    } else if (Offset > 8 * MaxSPChunk) {
       // If we would need more than 8 add or sub instructions (a >16GB stack
       // frame), it's worth spilling RAX to materialize this immediate.
       //   pushq %rax
@@ -322,8 +321,7 @@ void X86FrameLowering::emitSPUpdate(MachineBasicBlock &MBB,
   }
 
   while (Offset) {
-    uint64_t ThisVal = std::min(Offset, Chunk);
-    if (ThisVal == SlotSize) {
+    if (Offset == SlotSize) {
       // Use push / pop for slot sized adjustments as a size optimization. We
       // need to find a dead register when using pop.
       unsigned Reg = isSub ? (unsigned)(Is64Bit ? X86::RAX : X86::EAX)
@@ -334,11 +332,12 @@ void X86FrameLowering::emitSPUpdate(MachineBasicBlock &MBB,
         BuildMI(MBB, MBBI, DL, TII.get(Opc))
             .addReg(Reg, getDefRegState(!isSub) | getUndefRegState(isSub))
             .setMIFlag(Flag);
-        Offset -= ThisVal;
-        continue;
+        return;
       }
     }
 
+    uint64_t ThisVal = std::min(Offset, MaxSPChunk);
+
     BuildStackAdjustment(MBB, MBBI, DL, isSub ? -ThisVal : ThisVal, InEpilogue)
         .setMIFlag(Flag);
 

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants