Skip to content

[X86] Allow "sibcall" optimization on call sites marked musttail #150157

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

brandtbucher
Copy link
Contributor

Fixes #147813.

Use the same code path to detect sibcall-eligibility for both musttail and normal tail calls.

@llvmbot
Copy link
Member

llvmbot commented Jul 23, 2025

@llvm/pr-subscribers-backend-x86

Author: Brandt Bucher (brandtbucher)

Changes

Fixes #147813.

Use the same code path to detect sibcall-eligibility for both musttail and normal tail calls.


Full diff: https://github.com/llvm/llvm-project/pull/150157.diff

2 Files Affected:

  • (modified) llvm/lib/Target/X86/X86ISelLoweringCall.cpp (+9-4)
  • (added) llvm/test/CodeGen/X86/musttail-sibling.ll (+30)
diff --git a/llvm/lib/Target/X86/X86ISelLoweringCall.cpp b/llvm/lib/Target/X86/X86ISelLoweringCall.cpp
index b4639ac2577e8..9835586407042 100644
--- a/llvm/lib/Target/X86/X86ISelLoweringCall.cpp
+++ b/llvm/lib/Target/X86/X86ISelLoweringCall.cpp
@@ -2095,7 +2095,12 @@ X86TargetLowering::LowerCall(TargetLowering::CallLoweringInfo &CLI,
       isTailCall = false;
   }
 
-  if (isTailCall && !IsMustTail) {
+  if (IsMustTail && !isTailCall) {
+    report_fatal_error("failed to perform tail call elimination on a call site "
+                       "marked musttail");
+  }
+
+  if (isTailCall) {
     // Check if it's really possible to do a tail call.
     isTailCall = IsEligibleForTailCallOptimization(CLI, CCInfo, ArgLocs,
                                                    IsCalleePopSRet);
@@ -2109,9 +2114,9 @@ X86TargetLowering::LowerCall(TargetLowering::CallLoweringInfo &CLI,
       ++NumTailCalls;
   }
 
-  if (IsMustTail && !isTailCall)
-    report_fatal_error("failed to perform tail call elimination on a call "
-                       "site marked musttail");
+  if (IsMustTail) {
+    isTailCall = true;
+  }
 
   assert(!(isVarArg && canGuaranteeTCO(CallConv)) &&
          "Var args not supported with calling convention fastcc, ghc or hipe");
diff --git a/llvm/test/CodeGen/X86/musttail-sibling.ll b/llvm/test/CodeGen/X86/musttail-sibling.ll
new file mode 100644
index 0000000000000..91a0ae059b01a
--- /dev/null
+++ b/llvm/test/CodeGen/X86/musttail-sibling.ll
@@ -0,0 +1,30 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc -mtriple=x86_64 < %s | FileCheck %s --check-prefix=X64
+; RUN: llc -mtriple=i686 < %s | FileCheck %s --check-prefix=X86
+
+declare void @callee1(ptr)
+define void @caller1(ptr %s) {
+; X64-LABEL: caller1:
+; X64:       # %bb.0:
+; X64-NEXT:    jmp callee1@PLT # TAILCALL
+;
+; X86-LABEL: caller1:
+; X86:       # %bb.0:
+; X86-NEXT:    jmp callee1@PLT # TAILCALL
+  musttail call void @callee1(ptr %s)
+  ret void
+}
+
+declare void @callee7(ptr)
+define void @caller7(ptr %r0, ptr %r1, ptr %r2, ptr %r3, ptr %r4, ptr %r5, ptr %s) {
+; X64-LABEL: caller7:
+; X64:       # %bb.0:
+; X64-NEXT:    jmp callee7@PLT # TAILCALL
+;
+; X86-LABEL: caller7:
+; X86:       # %bb.0:
+; X86-NEXT:    jmp callee7@PLT # TAILCALL
+  musttail call void @callee7(ptr %r0, ptr %r1, ptr %r2, ptr %r3, ptr %r4, ptr %r5, ptr %s)
+  ret void
+}
+

@rnk
Copy link
Collaborator

rnk commented Jul 30, 2025

I think this code to load and store all parameters in memory exists to handle cases where the parameters are changed or swapped, consider:
https://godbolt.org/z/fjc8xc1GP

The incoming parameters in memory and the outgoing arguments compete for the same slots in memory, and they have to be loaded and stored in some cases, and it's hard to figure out which ones require it and which do not.

We even have correctness bugs when musttail combines with byval: #46402 Triggering this case requires passing marginally complicated structs in memory (>=7 params for x86_64 or ia32).

@brandtbucher
Copy link
Contributor Author

@rnk, maybe I'm missing something, but this PR emits the same code as the example you linked. Would you like me to add a test for it?

define void @caller_flip7(ptr %r0, ptr %r1, ptr %r2, ptr %r3, ptr %r4, ptr %r5, ptr %x) {
; X64-LABEL: caller_flip7:
; X64:       # %bb.0:
; X64-NEXT:    movq {{[0-9]+}}(%rsp), %rax
; X64-NEXT:    movq %r9, {{[0-9]+}}(%rsp)
; X64-NEXT:    movq %rax, %r9
; X64-NEXT:    jmp callee7@PLT # TAILCALL
  musttail call void @callee7(ptr %r0, ptr %r1, ptr %r2, ptr %r3, ptr %r4, ptr %x, ptr %r5)
  ret void
}

@rnk
Copy link
Collaborator

rnk commented Aug 1, 2025

Would you like me to add a test for it?

Yep, if it already works, my concerns would be addressed.

@rnk rnk requested review from rnk and lhames August 1, 2025 21:37
@rnk
Copy link
Collaborator

rnk commented Aug 1, 2025

Actually, I recall that Swift uses musttail in a creative way, where it doesn't require prototype mismatch, but still guarantees the tail call. There are roughly three cases:

  • Register-only tail calls: Historically, these were sibcalls. These are the easy cases.
  • GuaranteedTCO, which allows prototype mismatch: In the general case, this requires allocating or deallocating parameter stack memory and potentially moving the return address around in memory (yuck). Swift does this sometimes for some coroutine functionality.
  • Musttail with matching prototypes: The original idea of musttail was that if we require matching prototypes, it should always be feasible to do the tail call, since the argument memory requirements are the same.

We have a confusing tangle of booleans in this code to try to differentiate between these cases, but I think that's the true purpose of the IsSibcall boolean that you're touching, and why we might need to think a bit more about swiftcc here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

musttail doesn't optimize sibling calls
3 participants