[AMDGPU] Set MaxAtomicSizeInBitsSupported. #75185
Conversation
This will result in larger atomic operations getting expanded to `__atomic_*` libcalls via AtomicExpandPass, which matches what Clang already does in the frontend. While AMDGPU currently disables the use of all libcalls, I've changed it to instead disable all of them _except_ the atomic ones. Those are already emitted by the Clang frontend, and enabling them in the backend allows the same behavior there.
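As a rough sketch of the expansion (illustrative IR, not part of the patch): with the supported atomic width capped at 64 bits, AtomicExpandPass rewrites an oversized atomic operation, like the ones in the new test below, into a sized `__atomic_*` libcall, along these lines:

```llvm
; Before expansion: 128-bit atomics, wider than the 64-bit maximum
; set by setMaxAtomicSizeInBitsSupported(64).
define void @test(ptr %a) {
  %v = load atomic i128, ptr %a seq_cst, align 16
  store atomic i128 %v, ptr %a seq_cst, align 16
  ret void
}

; After AtomicExpandPass (approximate): sized libcalls, where 5 is the
; integer encoding of seq_cst memory ordering in the libcall ABI.
define void @test.expanded(ptr %a) {
  %v = call i128 @__atomic_load_16(ptr %a, i32 5)
  call void @__atomic_store_16(ptr %a, i128 %v, i32 5)
  ret void
}
declare i128 @__atomic_load_16(ptr, i32)
declare void @__atomic_store_16(ptr, i128, i32)
```

The sized `__atomic_load_16`/`__atomic_store_16` forms are chosen here because i128 is a power-of-two size with sufficient alignment; otherwise the expansion falls back to the generic, size-parameterized `__atomic_load`/`__atomic_store` calls.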
@llvm/pr-subscribers-llvm-transforms @llvm/pr-subscribers-backend-amdgpu Author: James Y Knight (jyknight). Full diff: https://github.com/llvm/llvm-project/pull/75185.diff (2 Files Affected):
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp b/llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
index fcbdf51b03c1f..78092675057df 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
@@ -506,9 +506,11 @@ AMDGPUTargetLowering::AMDGPUTargetLowering(const TargetMachine &TM,
setOperationAction(ISD::SELECT, MVT::v12f32, Promote);
AddPromotedToType(ISD::SELECT, MVT::v12f32, MVT::v12i32);
- // There are no libcalls of any kind.
- for (int I = 0; I < RTLIB::UNKNOWN_LIBCALL; ++I)
- setLibcallName(static_cast<RTLIB::Libcall>(I), nullptr);
+ // Disable most libcalls.
+ for (int I = 0; I < RTLIB::UNKNOWN_LIBCALL; ++I) {
+ if (I < RTLIB::ATOMIC_LOAD || I > RTLIB::ATOMIC_FETCH_NAND_16)
+ setLibcallName(static_cast<RTLIB::Libcall>(I), nullptr);
+ }
setSchedulingPreference(Sched::RegPressure);
setJumpIsExpensive(true);
@@ -556,6 +558,8 @@ AMDGPUTargetLowering::AMDGPUTargetLowering(const TargetMachine &TM,
ISD::FSUB, ISD::FNEG,
ISD::FABS, ISD::AssertZext,
ISD::AssertSext, ISD::INTRINSIC_WO_CHAIN});
+
+ setMaxAtomicSizeInBitsSupported(64);
}
bool AMDGPUTargetLowering::mayIgnoreSignedZero(SDValue Op) const {
diff --git a/llvm/test/CodeGen/AMDGPU/atomic-oversize.ll b/llvm/test/CodeGen/AMDGPU/atomic-oversize.ll
new file mode 100644
index 0000000000000..f62a93f523365
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/atomic-oversize.ll
@@ -0,0 +1,10 @@
+; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx900 -verify-machineinstrs < %s | FileCheck %s
+
+define void @test(ptr %a) nounwind {
+; CHECK-LABEL: test:
+; CHECK: __atomic_load_16
+; CHECK: __atomic_store_16
+ %1 = load atomic i128, ptr %a seq_cst, align 16
+ store atomic i128 %1, ptr %a seq_cst, align 16
+ ret void
+}
What's the point of doing this? There aren't any libcalls, and the libcall handling is unaware of the address space anyway. This is just changing a hard error into producing nonworking code.
The purpose is to match what's currently produced by Clang, with the eventual goal of deleting parts of the Clang atomic lowering, leaning on the backend instead.
I bet this breaks differently with 32-bit address space pointers
I don't know. I guess we can deal with that if it comes up in the future.