-
Notifications
You must be signed in to change notification settings - Fork 14.9k
[AArch64][SME] Resume streaming-mode on entry to exception handlers #156638
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This patch adds a new `TargetLowering` hook `lowerEHPadEntry()` that is called at the start of lowering EH pads in SelectionDAG. This allows the insertion of target-specific actions on entry to exception handlers. This is used on AArch64 to insert SME streaming-mode switches at landing pads. This is needed as exception handlers are always entered with PSTATE.SM off, and the function needs to resume the streaming mode of the function body.
@llvm/pr-subscribers-llvm-selectiondag @llvm/pr-subscribers-backend-aarch64 Author: Benjamin Maxwell (MacDue) ChangesThis patch adds a new This is used on AArch64 to insert SME streaming-mode switches at landing pads. This is needed as exception handlers are always entered with PSTATE.SM off, and the function needs to resume the streaming mode of the function body. Full diff: https://github.com/llvm/llvm-project/pull/156638.diff 5 Files Affected:
diff --git a/llvm/include/llvm/CodeGen/TargetLowering.h b/llvm/include/llvm/CodeGen/TargetLowering.h
index 438b6ff55c85f..385ee96bfe0b5 100644
--- a/llvm/include/llvm/CodeGen/TargetLowering.h
+++ b/llvm/include/llvm/CodeGen/TargetLowering.h
@@ -4686,6 +4686,13 @@ class LLVM_ABI TargetLowering : public TargetLoweringBase {
llvm_unreachable("Not Implemented");
}
+ /// Optional target hook at add target-specific actions when entering EH pad
+ /// blocks. The implementation should return the resulting token chain value.
+ virtual SDValue lowerEHPadEntry(SDValue Chain, const SDLoc &DL,
+ SelectionDAG &DAG) const {
+ return SDValue();
+ }
+
virtual void markLibCallAttributes(MachineFunction *MF, unsigned CC,
ArgListTy &Args) const {}
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
index ece50ed95fc49..edcc390ba7c12 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
@@ -1729,10 +1729,19 @@ void SelectionDAGISel::SelectAllBasicBlocks(const Function &Fn) {
// Setup an EH landing-pad block.
FuncInfo->ExceptionPointerVirtReg = Register();
FuncInfo->ExceptionSelectorVirtReg = Register();
- if (LLVMBB->isEHPad())
+ if (LLVMBB->isEHPad()) {
if (!PrepareEHLandingPad())
continue;
+ if (!FastIS) {
+ if (SDValue NewRoot = TLI->lowerEHPadEntry(CurDAG->getRoot(),
+ SDB->getCurSDLoc(), *CurDAG);
+ NewRoot && NewRoot != CurDAG->getRoot()) {
+ CurDAG->setRoot(NewRoot);
+ }
+ }
+ }
+
// Before doing SelectionDAG ISel, see if FastISel has been requested.
if (FastIS) {
if (LLVMBB != &Fn.getEntryBlock())
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index 23328ed57fb36..753d617f22f2a 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -7908,6 +7908,39 @@ static bool isPassedInFPR(EVT VT) {
(VT.isFloatingPoint() && !VT.isScalableVector());
}
+SDValue AArch64TargetLowering::lowerEHPadEntry(SDValue Chain, SDLoc const &DL,
+ SelectionDAG &DAG) const {
+ assert(Chain.getOpcode() == ISD::EntryToken && "Unexpected Chain value");
+ SDValue Glue = Chain.getValue(1);
+
+ MachineFunction &MF = DAG.getMachineFunction();
+ SMEAttrs SMEFnAttrs = MF.getInfo<AArch64FunctionInfo>()->getSMEFnAttrs();
+
+ // Exception handlers are entered with:
+ // - PSTATE.SM is 0.
+ // - PSTATE.ZA is 0.
+ // - TPIDR2_EL0 is null.
+ // See:
+ // https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst#exceptions
+ //
+ // Therefore, if the function that contains this exception handler is a
+ // streaming[-compatible] function, we must re-enable streaming mode.
+ //
+ // These mode changes are usually optimized away in catch blocks (as they)
+ // occur before the __cxa_begin_catch (which is a non-streaming function),
+ // but are necessary in some cases (such as for cleanups).
+
+ if (SMEFnAttrs.hasStreamingInterfaceOrBody())
+ return changeStreamingMode(DAG, DL, /*Enable=*/true, Chain,
+ /*Glue*/ Glue, AArch64SME::Always);
+
+ if (SMEFnAttrs.hasStreamingCompatibleInterface())
+ return changeStreamingMode(DAG, DL, /*Enable=*/true, Chain, Glue,
+ AArch64SME::IfCallerIsStreaming);
+
+ return Chain;
+}
+
SDValue AArch64TargetLowering::LowerFormalArguments(
SDValue Chain, CallingConv::ID CallConv, bool isVarArg,
const SmallVectorImpl<ISD::InputArg> &Ins, const SDLoc &DL,
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.h b/llvm/lib/Target/AArch64/AArch64ISelLowering.h
index 46738365080f9..77f2a98c8d53f 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.h
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.h
@@ -575,6 +575,9 @@ class AArch64TargetLowering : public TargetLowering {
bool shouldExpandBuildVectorWithShuffles(EVT, unsigned) const override;
+ SDValue lowerEHPadEntry(SDValue Chain, SDLoc const &DL,
+ SelectionDAG &DAG) const override;
+
SDValue LowerFormalArguments(SDValue Chain, CallingConv::ID CallConv,
bool isVarArg,
const SmallVectorImpl<ISD::InputArg> &Ins,
diff --git a/llvm/test/CodeGen/AArch64/sme-streaming-mode-landingpads.ll b/llvm/test/CodeGen/AArch64/sme-streaming-mode-landingpads.ll
new file mode 100644
index 0000000000000..b583479b21e4b
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/sme-streaming-mode-landingpads.ll
@@ -0,0 +1,198 @@
+; NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+; RUN: llc -mtriple=aarch64 -aarch64-streaming-hazard-size=0 -mattr=+sme,+sve -stop-before=finalize-isel -verify-machineinstrs < %s | FileCheck %s
+
+target triple = "aarch64-unknown-linux-gnu"
+
+declare void @"StreamingCleanup::~StreamingCleanup"(ptr %this) nounwind "aarch64_pstate_sm_enabled"
+declare void @"StreamingCompatCleanup::~StreamingCompatCleanup"(ptr %this) nounwind "aarch64_pstate_sm_compatible"
+
+declare void @may_throw() "aarch64_pstate_sm_compatible"
+
+; This test models the kind of IR clang would emit for the following C++:
+;
+; struct StreamingCleanup {
+; ~StreamingCleanup() __arm_streaming
+; };
+;
+; void may_throw() __arm_streaming_compatible;
+;
+; void streaming_with_cleanup() __arm_streaming {
+; StreamingCleanup cleanup;
+; may_throw();
+; }
+;
+; This is a streaming function and all callees of this function are streaming[-compatible]
+; functions (including the StreamingCleanup destructor). This means call lowering will not
+; insert any streaming mode switches. However, if "may_throw" throws an exception, the
+; unwinder can re-enter this function (in %unwind_cleanup) to run the "StreamingCleanup"
+; destructor. The unwinder will always re-enter functions with streaming-mode disabled, so
+; we must ensure streaming-mode is enabled on entry to exception handlers.
+define void @streaming_with_cleanup() "aarch64_pstate_sm_enabled" personality ptr @__gxx_personality_v0 {
+ ; CHECK-LABEL: name: streaming_with_cleanup
+ ; CHECK: bb.0 (%ir-block.0):
+ ; CHECK-NEXT: successors: %bb.1(0x7ffff800), %bb.2(0x00000800)
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: EH_LABEL <mcsymbol >
+ ; CHECK-NEXT: ADJCALLSTACKDOWN 0, 0, implicit-def dead $sp, implicit $sp
+ ; CHECK-NEXT: BL @may_throw, csr_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit-def $sp
+ ; CHECK-NEXT: ADJCALLSTACKUP 0, 0, implicit-def dead $sp, implicit $sp
+ ; CHECK-NEXT: EH_LABEL <mcsymbol >
+ ; CHECK-NEXT: B %bb.1
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: bb.1.normal_return:
+ ; CHECK-NEXT: ADJCALLSTACKDOWN 0, 0, implicit-def dead $sp, implicit $sp
+ ; CHECK-NEXT: [[ADDXri:%[0-9]+]]:gpr64sp = ADDXri %stack.0.cleanup, 0, 0
+ ; CHECK-NEXT: $x0 = COPY [[ADDXri]]
+ ; CHECK-NEXT: BL @"StreamingCleanup::~StreamingCleanup", csr_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit $x0, implicit-def $sp
+ ; CHECK-NEXT: ADJCALLSTACKUP 0, 0, implicit-def dead $sp, implicit $sp
+ ; CHECK-NEXT: RET_ReallyLR
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: bb.2.unwind_cleanup (landing-pad):
+ ; CHECK-NEXT: liveins: $x0, $x1
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: EH_LABEL <mcsymbol >
+ ; CHECK-NEXT: [[COPY:%[0-9]+]]:gpr64all = COPY killed $x1
+ ; CHECK-NEXT: [[COPY1:%[0-9]+]]:gpr64all = COPY killed $x0
+ ; CHECK-NEXT: MSRpstatesvcrImm1 1, 1, csr_aarch64_smstartstop, implicit-def dead $nzcv, implicit $vg, implicit-def $vg, implicit-def $fpmr
+ ; CHECK-NEXT: ADJCALLSTACKDOWN 0, 0, implicit-def dead $sp, implicit $sp
+ ; CHECK-NEXT: [[ADDXri1:%[0-9]+]]:gpr64sp = ADDXri %stack.0.cleanup, 0, 0
+ ; CHECK-NEXT: $x0 = COPY [[ADDXri1]]
+ ; CHECK-NEXT: BL @"StreamingCleanup::~StreamingCleanup", csr_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit $x0, implicit-def $sp
+ ; CHECK-NEXT: ADJCALLSTACKUP 0, 0, implicit-def dead $sp, implicit $sp
+ ; CHECK-NEXT: ADJCALLSTACKDOWN 0, 0, implicit-def dead $sp, implicit $sp
+ ; CHECK-NEXT: MSRpstatesvcrImm1 1, 0, csr_aarch64_smstartstop, implicit-def dead $nzcv, implicit-def $sp, implicit $vg, implicit-def $vg, implicit-def $fpmr
+ ; CHECK-NEXT: $x0 = COPY [[COPY1]]
+ ; CHECK-NEXT: BL @_Unwind_Resume, csr_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit $x0, implicit-def $sp
+ ; CHECK-NEXT: ADJCALLSTACKUP 0, 0, implicit-def dead $sp, implicit $sp
+ ; CHECK-NEXT: MSRpstatesvcrImm1 1, 1, csr_aarch64_smstartstop, implicit-def dead $nzcv, implicit $vg, implicit-def $vg, implicit-def $fpmr
+ %cleanup = alloca i8, align 1
+ invoke void @may_throw()
+ to label %normal_return unwind label %unwind_cleanup
+
+normal_return:
+ call void @"StreamingCleanup::~StreamingCleanup"(ptr %cleanup)
+ ret void
+
+unwind_cleanup:
+ %eh_info = landingpad { ptr, i32 }
+ cleanup
+ call void @"StreamingCleanup::~StreamingCleanup"(ptr %cleanup)
+ resume { ptr, i32 } %eh_info
+}
+
+; This test is the same as "streaming_with_cleanup", but now the function and destructor
+; are streaming-compatible functions. In this case, when we enter the exception handler,
+; we must switch to streaming-mode "streaming_compatible_with_cleanup" was entered with
+; during normal execution (i.e., EntryPStateSM).
+define void @streaming_compatible_with_cleanup() "aarch64_pstate_sm_compatible" personality ptr @__gxx_personality_v0 {
+ ; CHECK-LABEL: name: streaming_compatible_with_cleanup
+ ; CHECK: bb.0 (%ir-block.0):
+ ; CHECK-NEXT: successors: %bb.1(0x7ffff800), %bb.2(0x00000800)
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: [[EntryPStateSM:%[0-9]+]]:gpr64 = EntryPStateSM
+ ; CHECK-NEXT: EH_LABEL <mcsymbol >
+ ; CHECK-NEXT: ADJCALLSTACKDOWN 0, 0, implicit-def dead $sp, implicit $sp
+ ; CHECK-NEXT: BL @may_throw, csr_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit-def $sp
+ ; CHECK-NEXT: ADJCALLSTACKUP 0, 0, implicit-def dead $sp, implicit $sp
+ ; CHECK-NEXT: EH_LABEL <mcsymbol >
+ ; CHECK-NEXT: B %bb.1
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: bb.1.normal_return:
+ ; CHECK-NEXT: ADJCALLSTACKDOWN 0, 0, implicit-def dead $sp, implicit $sp
+ ; CHECK-NEXT: [[ADDXri:%[0-9]+]]:gpr64sp = ADDXri %stack.0.cleanup, 0, 0
+ ; CHECK-NEXT: $x0 = COPY [[ADDXri]]
+ ; CHECK-NEXT: BL @"StreamingCompatCleanup::~StreamingCompatCleanup", csr_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit $x0, implicit-def $sp
+ ; CHECK-NEXT: ADJCALLSTACKUP 0, 0, implicit-def dead $sp, implicit $sp
+ ; CHECK-NEXT: RET_ReallyLR
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: bb.2.unwind_cleanup (landing-pad):
+ ; CHECK-NEXT: liveins: $x0, $x1
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: EH_LABEL <mcsymbol >
+ ; CHECK-NEXT: [[COPY:%[0-9]+]]:gpr64all = COPY killed $x1
+ ; CHECK-NEXT: [[COPY1:%[0-9]+]]:gpr64all = COPY killed $x0
+ ; CHECK-NEXT: MSRpstatePseudo 1, 1, 1, [[EntryPStateSM]], csr_aarch64_smstartstop, implicit-def dead $vg, implicit $vg, implicit $vg, implicit-def $vg, implicit-def $fpmr
+ ; CHECK-NEXT: ADJCALLSTACKDOWN 0, 0, implicit-def dead $sp, implicit $sp
+ ; CHECK-NEXT: [[ADDXri1:%[0-9]+]]:gpr64sp = ADDXri %stack.0.cleanup, 0, 0
+ ; CHECK-NEXT: $x0 = COPY [[ADDXri1]]
+ ; CHECK-NEXT: BL @"StreamingCompatCleanup::~StreamingCompatCleanup", csr_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit $x0, implicit-def $sp
+ ; CHECK-NEXT: ADJCALLSTACKUP 0, 0, implicit-def dead $sp, implicit $sp
+ ; CHECK-NEXT: ADJCALLSTACKDOWN 0, 0, implicit-def dead $sp, implicit $sp
+ ; CHECK-NEXT: MSRpstatePseudo 1, 0, 1, [[EntryPStateSM]], csr_aarch64_smstartstop, implicit-def $vg, implicit $vg, implicit-def $sp, implicit $vg, implicit-def $vg, implicit-def $fpmr
+ ; CHECK-NEXT: $x0 = COPY [[COPY1]]
+ ; CHECK-NEXT: BL @_Unwind_Resume, csr_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit $x0, implicit-def $sp, implicit-def $vg
+ ; CHECK-NEXT: ADJCALLSTACKUP 0, 0, implicit-def dead $sp, implicit $sp
+ ; CHECK-NEXT: MSRpstatePseudo 1, 1, 1, [[EntryPStateSM]], csr_aarch64_smstartstop, implicit-def dead $vg, implicit $vg, implicit $vg, implicit-def $vg, implicit-def $fpmr
+ %cleanup = alloca i8, align 1
+ invoke void @may_throw()
+ to label %normal_return unwind label %unwind_cleanup
+
+normal_return:
+ call void @"StreamingCompatCleanup::~StreamingCompatCleanup"(ptr %cleanup)
+ ret void
+
+unwind_cleanup:
+ %eh_info = landingpad { ptr, i32 }
+ cleanup
+ call void @"StreamingCompatCleanup::~StreamingCompatCleanup"(ptr %cleanup)
+ resume { ptr, i32 } %eh_info
+}
+
+; This is the same as "streaming_with_cleanup" but for a locally streaming function.
+; The lowering of "unwind_cleanup" is expected to match "streaming_with_cleanup".
+define void @locally_streaming_with_cleanup() "aarch64_pstate_sm_body" personality ptr @__gxx_personality_v0 {
+ ; CHECK-LABEL: name: locally_streaming_with_cleanup
+ ; CHECK: bb.0 (%ir-block.0):
+ ; CHECK-NEXT: successors: %bb.1(0x7ffff800), %bb.2(0x00000800)
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: MSRpstatesvcrImm1 1, 1, csr_aarch64_smstartstop, implicit-def dead $nzcv, implicit $vg, implicit-def $vg, implicit-def $fpmr
+ ; CHECK-NEXT: EH_LABEL <mcsymbol >
+ ; CHECK-NEXT: ADJCALLSTACKDOWN 0, 0, implicit-def dead $sp, implicit $sp
+ ; CHECK-NEXT: BL @may_throw, csr_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit-def $sp
+ ; CHECK-NEXT: ADJCALLSTACKUP 0, 0, implicit-def dead $sp, implicit $sp
+ ; CHECK-NEXT: EH_LABEL <mcsymbol >
+ ; CHECK-NEXT: B %bb.1
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: bb.1.normal_return:
+ ; CHECK-NEXT: ADJCALLSTACKDOWN 0, 0, implicit-def dead $sp, implicit $sp
+ ; CHECK-NEXT: [[ADDXri:%[0-9]+]]:gpr64sp = ADDXri %stack.0.cleanup, 0, 0
+ ; CHECK-NEXT: $x0 = COPY [[ADDXri]]
+ ; CHECK-NEXT: BL @"StreamingCleanup::~StreamingCleanup", csr_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit $x0, implicit-def $sp
+ ; CHECK-NEXT: ADJCALLSTACKUP 0, 0, implicit-def dead $sp, implicit $sp
+ ; CHECK-NEXT: MSRpstatesvcrImm1 1, 0, csr_aarch64_smstartstop, implicit-def dead $nzcv, implicit $vg, implicit-def $vg, implicit-def $fpmr
+ ; CHECK-NEXT: RET_ReallyLR
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: bb.2.unwind_cleanup (landing-pad):
+ ; CHECK-NEXT: liveins: $x0, $x1
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: EH_LABEL <mcsymbol >
+ ; CHECK-NEXT: [[COPY:%[0-9]+]]:gpr64all = COPY killed $x1
+ ; CHECK-NEXT: [[COPY1:%[0-9]+]]:gpr64all = COPY killed $x0
+ ; CHECK-NEXT: MSRpstatesvcrImm1 1, 1, csr_aarch64_smstartstop, implicit-def dead $nzcv, implicit $vg, implicit-def $vg, implicit-def $fpmr
+ ; CHECK-NEXT: ADJCALLSTACKDOWN 0, 0, implicit-def dead $sp, implicit $sp
+ ; CHECK-NEXT: [[ADDXri1:%[0-9]+]]:gpr64sp = ADDXri %stack.0.cleanup, 0, 0
+ ; CHECK-NEXT: $x0 = COPY [[ADDXri1]]
+ ; CHECK-NEXT: BL @"StreamingCleanup::~StreamingCleanup", csr_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit $x0, implicit-def $sp
+ ; CHECK-NEXT: ADJCALLSTACKUP 0, 0, implicit-def dead $sp, implicit $sp
+ ; CHECK-NEXT: ADJCALLSTACKDOWN 0, 0, implicit-def dead $sp, implicit $sp
+ ; CHECK-NEXT: MSRpstatesvcrImm1 1, 0, csr_aarch64_smstartstop, implicit-def dead $nzcv, implicit-def $sp, implicit $vg, implicit-def $vg, implicit-def $fpmr
+ ; CHECK-NEXT: $x0 = COPY [[COPY1]]
+ ; CHECK-NEXT: BL @_Unwind_Resume, csr_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit $x0, implicit-def $sp
+ ; CHECK-NEXT: ADJCALLSTACKUP 0, 0, implicit-def dead $sp, implicit $sp
+ ; CHECK-NEXT: MSRpstatesvcrImm1 1, 1, csr_aarch64_smstartstop, implicit-def dead $nzcv, implicit $vg, implicit-def $vg, implicit-def $fpmr
+ %cleanup = alloca i8, align 1
+ invoke void @may_throw()
+ to label %normal_return unwind label %unwind_cleanup
+
+normal_return:
+ call void @"StreamingCleanup::~StreamingCleanup"(ptr %cleanup)
+ ret void
+
+unwind_cleanup:
+ %eh_info = landingpad { ptr, i32 }
+ cleanup
+ call void @"StreamingCleanup::~StreamingCleanup"(ptr %cleanup)
+ resume { ptr, i32 } %eh_info
+}
+
+declare i32 @__gxx_personality_v0(...)
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Few minor nits, but otherwise LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
global isel version?
There's no global ISEL support for SME |
This patch adds a new
TargetLowering
hooklowerEHPadEntry()
that is called at the start of lowering EH pads in SelectionDAG. This allows the insertion of target-specific actions on entry to exception handlers.This is used on AArch64 to insert SME streaming-mode switches at landing pads. This is needed as exception handlers are always entered with PSTATE.SM off, and the function needs to resume the streaming mode of the function body.