[llvm-exegesis] Implement the loop repetition mode for AArch64 #154751
@llvm/pr-subscribers-tools-llvm-exegesis
Author: Sjoerd Meijer (sjoerdmeijer)
Changes
Subject says it all: implement the loop iterator decrement-and-jump function, and reserve X19 for the loop counter.
Full diff: https://github.com/llvm/llvm-project/pull/154751.diff
2 Files Affected:
diff --git a/llvm/test/tools/llvm-exegesis/AArch64/loop-register.s b/llvm/test/tools/llvm-exegesis/AArch64/loop-register.s
new file mode 100644
index 0000000000000..2e67937ad0ef6
--- /dev/null
+++ b/llvm/test/tools/llvm-exegesis/AArch64/loop-register.s
@@ -0,0 +1,17 @@
+REQUIRES: aarch64-registered-target, asserts
+
+RUN: llvm-exegesis -mcpu=neoverse-v2 --use-dummy-perf-counters --mode=latency --debug-only=print-gen-assembly --opcode-name=ADDVv4i16v -repetition-mode=loop 2>&1 | FileCheck %s
+
+CHECK: 0: {{.*}} str x19, [sp, #-16]!
+CHECK-NEXT: 4: {{.*}} movi d[[REG:[0-9]+]], #0000000000000000
+CHECK-NEXT: 8: {{.*}} mov x19, #10000
+CHECK-NEXT: c: {{.*}} nop
+CHECK-NEXT: 10: {{.*}} nop
+CHECK-NEXT: 14: {{.*}} nop
+CHECK-NEXT: 18: {{.*}} nop
+CHECK-NEXT: 1c: {{.*}} nop
+CHECK-NEXT: 20: {{.*}} addv h[[REG]], v[[REG]].4h
+CHECK-NEXT: 24: {{.*}} subs x19, x19, #1
+CHECK-NEXT: 28: {{.*}} cbnz x19, #-8
+CHECK-NEXT: 2c: {{.*}} ldr x19, [sp], #16
+CHECK-NEXT: 30: {{.*}} ret
diff --git a/llvm/tools/llvm-exegesis/lib/AArch64/Target.cpp b/llvm/tools/llvm-exegesis/lib/AArch64/Target.cpp
index 3a0021e3c132d..d59dd1688dfa4 100644
--- a/llvm/tools/llvm-exegesis/lib/AArch64/Target.cpp
+++ b/llvm/tools/llvm-exegesis/lib/AArch64/Target.cpp
@@ -28,6 +28,8 @@
#define GET_AVAILABLE_OPCODE_CHECKER
#include "AArch64GenInstrInfo.inc"
+#include "llvm/CodeGen/MachineInstrBuilder.h"
+
namespace llvm {
namespace exegesis {
@@ -109,6 +111,10 @@ static MCInst loadFPImmediate(MCRegister Reg, unsigned RegBitWidth,
namespace {
+// Use X19 as the loop counter register since it's a callee-saved register
+// that's available for temporary use.
+constexpr const MCPhysReg kDefaultLoopCounterReg = AArch64::X19;
+
class ExegesisAArch64Target : public ExegesisTarget {
public:
ExegesisAArch64Target()
@@ -141,6 +147,31 @@ class ExegesisAArch64Target : public ExegesisTarget {
errs() << "setRegTo is not implemented, results will be unreliable\n";
return {};
}
+ MCRegister getDefaultLoopCounterRegister(const Triple &) const override {
+ return kDefaultLoopCounterReg;
+ }
+
+ void decrementLoopCounterAndJump(
+ MachineBasicBlock &MBB, MachineBasicBlock &TargetMBB,
+ const MCInstrInfo &MII, MCRegister LoopRegister) const override {
+ // subs LoopRegister, LoopRegister, #1
+ BuildMI(&MBB, DebugLoc(), MII.get(AArch64::SUBSXri))
+ .addDef(LoopRegister)
+ .addUse(LoopRegister)
+ .addImm(1) // Subtract 1
+ .addImm(0); // No shift amount
+ // cbnz LoopRegister, TargetMBB
+ BuildMI(&MBB, DebugLoc(), MII.get(AArch64::CBNZX))
+ .addUse(LoopRegister)
+ .addMBB(&TargetMBB);
+ }
+
+
+ // Registers that should not be selected for use in snippets.
+ const MCPhysReg UnavailableRegisters[1] = {kDefaultLoopCounterReg};
+ ArrayRef<MCPhysReg> getUnavailableRegisters() const override {
+ return UnavailableRegisters;
+ }
bool matchesArch(Triple::ArchType Arch) const override {
return Arch == Triple::aarch64 || Arch == Triple::aarch64_be;
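The reason X19 is listed in UnavailableRegisters can be illustrated outside LLVM: if the snippet generator were free to pick X19 as an operand, the benchmarked instruction could overwrite the loop counter. The following is a minimal standalone sketch of that filtering idea, not the llvm-exegesis implementation; `Reg` and `filterAvailable` are hypothetical names introduced only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a physical register; the real code uses MCPhysReg.
using Reg = std::string;

// Returns the candidate registers that remain selectable for a snippet after
// dropping every register the target has marked as unavailable.
std::vector<Reg> filterAvailable(const std::vector<Reg> &Candidates,
                                 const std::vector<Reg> &Unavailable) {
  std::vector<Reg> Result;
  for (const Reg &R : Candidates) {
    bool Reserved = std::find(Unavailable.begin(), Unavailable.end(), R) !=
                    Unavailable.end();
    if (!Reserved)
      Result.push_back(R);
  }
  return Result;
}
```

With candidates X0, X19 and X20 and X19 marked unavailable, only X0 and X20 remain, so the loop counter is never handed to the snippet.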
✅ With the latest revision this PR passed the C/C++ code formatter.
LGTM modulo one nit and code formatting fixes.
@@ -28,6 +28,8 @@
#define GET_AVAILABLE_OPCODE_CHECKER
#include "AArch64GenInstrInfo.inc"

#include "llvm/CodeGen/MachineInstrBuilder.h"
Nit: This can probably go with the group of includes at the top of the file?
Thanks for the review @boomanaiden154! I will fix that before merging this. One semi-related question to this: do you know if
I haven't touched the code in a while, but I'm pretty sure I believe you're looking for the
Oh wow, I have somehow missed that. Yes, that looks like it. Thanks for the info!
.addImm(1) // Subtract 1
.addImm(0); // No shift amount
// cbnz LoopRegister, TargetMBB
BuildMI(&MBB, DebugLoc(), MII.get(AArch64::CBNZX))
Could this be either SUB+CBNZ or SUBS+Bcc? I don't know if there is a lot of difference between the two if LoopRegister needs to get updated. (There might be a chance of fusing the SUBS+Bcc; I'm not sure when that does and doesn't happen.)
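To make the two alternatives concrete: CBNZ tests the register value directly and does not read the condition flags, so pairing it with a flag-setting SUBS computes flags that nothing consumes. The sketch below is a toy model of the two suggested loop tails, not LLVM code; `ToyState`, `runSubCbnz` and `runSubsBne` are hypothetical names, and the model assumes the counter starts at 1 or more. It only shows that both pairings run the body the same number of times; it says nothing about fusion or performance.

```cpp
#include <cassert>
#include <cstdint>

// Toy model of the state the loop tail touches: the counter register and the
// Z (zero) flag. Illustration only; not how llvm-exegesis represents this.
struct ToyState {
  uint64_t Counter;
  bool ZFlag;
};

// SUB + CBNZ: plain subtract, then branch on the register value itself.
// Assumes Iterations >= 1 (a start value of 0 would wrap around).
uint64_t runSubCbnz(uint64_t Iterations) {
  ToyState S{Iterations, false};
  uint64_t BodyRuns = 0;
  do {
    ++BodyRuns;             // benchmarked snippet would go here
    S.Counter -= 1;         // sub  x19, x19, #1  (flags untouched)
  } while (S.Counter != 0); // cbnz x19, loop
  return BodyRuns;
}

// SUBS + B.NE: flag-setting subtract, then branch on the Z flag.
// Assumes Iterations >= 1, as above.
uint64_t runSubsBne(uint64_t Iterations) {
  ToyState S{Iterations, false};
  uint64_t BodyRuns = 0;
  do {
    ++BodyRuns;                 // benchmarked snippet would go here
    S.Counter -= 1;             // subs x19, x19, #1
    S.ZFlag = (S.Counter == 0); // ...which also updates Z
  } while (!S.ZFlag);           // b.ne loop
  return BodyRuns;
}
```

In this model both variants execute the body exactly as many times as the initial counter value, so the choice between them comes down to encoding and microarchitectural behaviour rather than loop semantics.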
Other than my comment about which instruction to use, this LGTM.
85cfa6e to 727c271
Subject says it all: implement the loop iterator decrement-and-jump function, and reserve X19 for the loop counter.
727c271 to 0143d1d
…4" (#155423)
I see some build bot failures:
- https://lab.llvm.org/buildbot/#/builders/76/builds/12434/
- https://lab.llvm.org/buildbot/#/builders/55/builds/16251/
Revert #154751 while I investigate this.
Subject says it all: implement the loop iterator decrement-and-jump function, and reserve X19 for the loop counter.