[AMDGPU] High VGPR lowering on gfx1250 #156965
Merged

rampitec merged 1 commit into main from users/rampitec/09-04-_amdgpu_high_vgpr_lowering_on_gfx1250 on Sep 4, 2025
+1,378
−4
Conversation
This stack of pull requests is managed by Graphite. Learn more about stacking.

@llvm/pr-subscribers-backend-amdgpu

Author: Stanislav Mekhanoshin (rampitec)

Changes

Patch is 65.88 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/156965.diff

12 Files Affected:
diff --git a/llvm/lib/Target/AMDGPU/AMDGPU.h b/llvm/lib/Target/AMDGPU/AMDGPU.h
index ebe38de1636be..4ca1011ea1312 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPU.h
+++ b/llvm/lib/Target/AMDGPU/AMDGPU.h
@@ -501,6 +501,9 @@ extern char &SIModeRegisterID;
void initializeAMDGPUInsertDelayAluLegacyPass(PassRegistry &);
extern char &AMDGPUInsertDelayAluID;
+void initializeAMDGPULowerVGPREncodingPass(PassRegistry &);
+extern char &AMDGPULowerVGPREncodingID;
+
void initializeSIInsertHardClausesLegacyPass(PassRegistry &);
extern char &SIInsertHardClausesID;
diff --git a/llvm/lib/Target/AMDGPU/AMDGPULowerVGPREncoding.cpp b/llvm/lib/Target/AMDGPU/AMDGPULowerVGPREncoding.cpp
new file mode 100644
index 0000000000000..ca06c316c2bfc
--- /dev/null
+++ b/llvm/lib/Target/AMDGPU/AMDGPULowerVGPREncoding.cpp
@@ -0,0 +1,354 @@
+//===- AMDGPULowerVGPREncoding.cpp - lower VGPRs above v255 ---------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+/// \file
+/// Lower VGPRs above the first 256 on gfx1250.
+///
+/// The pass scans used VGPRs and inserts S_SET_VGPR_MSB instructions to switch
+/// the VGPR addressing mode. The mode change is effective until the next
+/// change. This instruction provides the high bits of a VGPR address for four
+/// of the operands: vdst, src0, src1, and src2, or another four operands
+/// depending on the instruction encoding. If bits are set, they are added as
+/// MSBs to the corresponding operand's VGPR number.
+///
+/// There is no need to replace the actual register operands because the
+/// encoding of high and low VGPRs is the same: v0 has the encoding 0x100 and
+/// so does v256; v1 has the encoding 0x101 and v257 has the same encoding.
+/// High VGPRs therefore survive until the actual encoding and result in the
+/// same bit encoding.
+///
+/// As a result, the pass only inserts S_SET_VGPR_MSB to provide the actual
+/// offset to the VGPR addresses of subsequent instructions. The InstPrinter
+/// takes care of printing a low VGPR instead of a high one. In principle it
+/// would be viable to print the actual high VGPR numbers, but that would
+/// disagree with the disassembler's printing and create a situation where the
+/// asm text is not deterministic.
+///
+/// The pass establishes a convention where non-fallthrough basic blocks start
+/// with all 4 MSBs zero; otherwise a disassembly would not be readable. An
+/// optimization here is possible but deemed undesirable because of these
+/// readability concerns.
+///
+/// Consequently, the ABI expects all 4 MSBs to be zero on entry. The pass must
+/// run very late in the pipeline to ensure that no changes to VGPR operands
+/// are made after it.
+//
+//===----------------------------------------------------------------------===//
+
+#include "AMDGPU.h"
+#include "GCNSubtarget.h"
+#include "MCTargetDesc/AMDGPUMCTargetDesc.h"
+#include "SIInstrInfo.h"
+#include "llvm/ADT/PackedVector.h"
+
+using namespace llvm;
+
+#define DEBUG_TYPE "amdgpu-lower-vgpr-encoding"
+
+namespace {
+
+class AMDGPULowerVGPREncoding : public MachineFunctionPass {
+ static constexpr unsigned OpNum = 4;
+ static constexpr unsigned BitsPerField = 2;
+ static constexpr unsigned NumFields = 4;
+ static constexpr unsigned FieldMask = (1 << BitsPerField) - 1;
+ using ModeType = PackedVector<unsigned, BitsPerField,
+ std::bitset<BitsPerField * NumFields>>;
+
+ class ModeTy : public ModeType {
+ public:
+ // bitset constructor will set all bits to zero
+ ModeTy() : ModeType(0) {}
+
+ operator int64_t() const { return raw_bits().to_ulong(); }
+
+ static ModeTy fullMask() {
+ ModeTy M;
+ M.raw_bits().flip();
+ return M;
+ }
+ };
+
+public:
+ static char ID;
+
+ AMDGPULowerVGPREncoding() : MachineFunctionPass(ID) {}
+
+ void getAnalysisUsage(AnalysisUsage &AU) const override {
+ AU.setPreservesCFG();
+ MachineFunctionPass::getAnalysisUsage(AU);
+ }
+
+ bool runOnMachineFunction(MachineFunction &MF) override;
+
+private:
+ const SIInstrInfo *TII;
+ const SIRegisterInfo *TRI;
+
+ /// Most recent s_set_* instruction.
+ MachineInstr *MostRecentModeSet;
+
+ /// Whether the current mode is known.
+ bool CurrentModeKnown;
+
+ /// Current mode bits.
+ ModeTy CurrentMode;
+
+ /// Current mask of mode bits that instructions since MostRecentModeSet care
+ /// about.
+ ModeTy CurrentMask;
+
+ /// Number of current hard clause instructions.
+ unsigned ClauseLen;
+
+ /// Number of hard clause instructions remaining.
+ unsigned ClauseRemaining;
+
+ /// Clause group breaks.
+ unsigned ClauseBreaks;
+
+ /// Last hard clause instruction.
+ MachineInstr *Clause;
+
+ /// Insert mode change before \p I. \returns true if mode was changed.
+ bool setMode(ModeTy NewMode, ModeTy Mask, MachineInstr *I);
+
+ /// Reset mode to default.
+ void resetMode(MachineInstr *I) { setMode(ModeTy(), ModeTy::fullMask(), I); }
+
+ /// If \p MO references VGPRs, return the MSBs. Otherwise, return nullopt.
+ std::optional<unsigned> getMSBs(const MachineOperand &MO) const;
+
+ /// Handle single \p MI. \return true if changed.
+ bool runOnMachineInstr(MachineInstr &MI);
+
+  /// Compute the mode and mode mask for a single \p MI given the \p Ops
+  /// operand-to-bit mapping. Optionally takes a second array \p Ops2 for
+  /// VOPD; if provided and an operand from \p Ops is not a VGPR, then \p Ops2
+  /// is checked.
+ void computeMode(ModeTy &NewMode, ModeTy &Mask, MachineInstr &MI,
+ const AMDGPU::OpName Ops[OpNum],
+ const AMDGPU::OpName *Ops2 = nullptr);
+
+  /// Check if instruction \p I is within a clause and return a suitable
+  /// iterator at which to insert a mode change. May also modify the S_CLAUSE
+  /// instruction to extend it, or drop the clause if it cannot be adjusted.
+ MachineInstr *handleClause(MachineInstr *I);
+};
+
+bool AMDGPULowerVGPREncoding::setMode(ModeTy NewMode, ModeTy Mask,
+ MachineInstr *I) {
+ assert((NewMode.raw_bits() & ~Mask.raw_bits()).none());
+
+ if (CurrentModeKnown) {
+ auto Delta = NewMode.raw_bits() ^ CurrentMode.raw_bits();
+
+ if ((Delta & Mask.raw_bits()).none()) {
+ CurrentMask |= Mask;
+ return false;
+ }
+
+ if (MostRecentModeSet && (Delta & CurrentMask.raw_bits()).none()) {
+ CurrentMode |= NewMode;
+ CurrentMask |= Mask;
+
+ MostRecentModeSet->getOperand(0).setImm(CurrentMode);
+ return true;
+ }
+ }
+
+ I = handleClause(I);
+ MostRecentModeSet =
+ BuildMI(*I->getParent(), I, {}, TII->get(AMDGPU::S_SET_VGPR_MSB))
+ .addImm(NewMode);
+
+ CurrentMode = NewMode;
+ CurrentMask = Mask;
+ CurrentModeKnown = true;
+ return true;
+}
+
+std::optional<unsigned>
+AMDGPULowerVGPREncoding::getMSBs(const MachineOperand &MO) const {
+ if (!MO.isReg())
+ return std::nullopt;
+
+ MCRegister Reg = MO.getReg();
+ const TargetRegisterClass *RC = TRI->getPhysRegBaseClass(Reg);
+ if (!RC || !TRI->isVGPRClass(RC))
+ return std::nullopt;
+
+ unsigned Idx = TRI->getHWRegIndex(Reg);
+ return Idx >> 8;
+}
+
+void AMDGPULowerVGPREncoding::computeMode(ModeTy &NewMode, ModeTy &Mask,
+ MachineInstr &MI,
+ const AMDGPU::OpName Ops[OpNum],
+ const AMDGPU::OpName *Ops2) {
+ NewMode = {};
+ Mask = {};
+
+ for (unsigned I = 0; I < OpNum; ++I) {
+ MachineOperand *Op = TII->getNamedOperand(MI, Ops[I]);
+
+ std::optional<unsigned> MSBits;
+ if (Op)
+ MSBits = getMSBs(*Op);
+
+#if !defined(NDEBUG)
+ if (MSBits.has_value() && Ops2) {
+ auto Op2 = TII->getNamedOperand(MI, Ops2[I]);
+ if (Op2) {
+ std::optional<unsigned> MSBits2;
+ MSBits2 = getMSBs(*Op2);
+ if (MSBits2.has_value() && MSBits != MSBits2)
+ llvm_unreachable("Invalid VOPD pair was created");
+ }
+ }
+#endif
+
+ if (!MSBits.has_value() && Ops2) {
+ Op = TII->getNamedOperand(MI, Ops2[I]);
+ if (Op)
+ MSBits = getMSBs(*Op);
+ }
+
+ if (!MSBits.has_value())
+ continue;
+
+    // Skip tied uses of src2 of VOP2; these will be handled along with defs,
+    // and only the vdst bit affects these operands. We cannot skip tied uses
+    // of VOP3; these uses are real even if they must match the vdst.
+ if (Ops[I] == AMDGPU::OpName::src2 && !Op->isDef() && Op->isTied() &&
+ (SIInstrInfo::isVOP2(MI) ||
+ (SIInstrInfo::isVOP3(MI) &&
+ TII->hasVALU32BitEncoding(MI.getOpcode()))))
+ continue;
+
+ NewMode[I] = MSBits.value();
+ Mask[I] = FieldMask;
+ }
+}
+
+bool AMDGPULowerVGPREncoding::runOnMachineInstr(MachineInstr &MI) {
+ auto Ops = AMDGPU::getVGPRLoweringOperandTables(MI.getDesc());
+ if (Ops.first) {
+ ModeTy NewMode, Mask;
+ computeMode(NewMode, Mask, MI, Ops.first, Ops.second);
+ return setMode(NewMode, Mask, &MI);
+ }
+ assert(!TII->hasVGPRUses(MI) || MI.isMetaInstruction() || MI.isPseudo());
+
+ return false;
+}
+
+MachineInstr *AMDGPULowerVGPREncoding::handleClause(MachineInstr *I) {
+ if (!ClauseRemaining)
+ return I;
+
+  // A clause cannot start with a special instruction; place the mode change
+  // right before the clause.
+ if (ClauseRemaining == ClauseLen) {
+ I = Clause->getPrevNode();
+ assert(I->isBundle());
+ return I;
+ }
+
+  // If a clause defines breaks, each group cannot start with a mode change;
+  // just drop the clause.
+ if (ClauseBreaks) {
+ Clause->eraseFromBundle();
+ ClauseRemaining = 0;
+ return I;
+ }
+
+  // Otherwise adjust the number of instructions in the clause if it fits.
+  // If it does not fit, the clause will just become shorter. Since the length
+  // recorded in the clause is one less, increment the length after the
+  // update. Note that SIMM16[5:0] must be 1-62, not 0 or 63.
+ if (ClauseLen < 63)
+ Clause->getOperand(0).setImm(ClauseLen | (ClauseBreaks << 8));
+
+ ++ClauseLen;
+
+ return I;
+}
+
+bool AMDGPULowerVGPREncoding::runOnMachineFunction(MachineFunction &MF) {
+ const GCNSubtarget &ST = MF.getSubtarget<GCNSubtarget>();
+ if (!ST.has1024AddressableVGPRs())
+ return false;
+
+ TII = ST.getInstrInfo();
+ TRI = ST.getRegisterInfo();
+
+ bool Changed = false;
+ ClauseLen = ClauseRemaining = 0;
+ CurrentMode.reset();
+ CurrentMask.reset();
+ CurrentModeKnown = true;
+ for (auto &MBB : MF) {
+ MostRecentModeSet = nullptr;
+
+ for (auto &MI : llvm::make_early_inc_range(MBB.instrs())) {
+ if (MI.isMetaInstruction())
+ continue;
+
+ if (MI.isTerminator() || MI.isCall()) {
+ if (MI.getOpcode() == AMDGPU::S_ENDPGM ||
+ MI.getOpcode() == AMDGPU::S_ENDPGM_SAVED) {
+ CurrentMode.reset();
+ CurrentModeKnown = true;
+ } else
+ resetMode(&MI);
+ continue;
+ }
+
+ if (MI.isInlineAsm()) {
+ if (TII->hasVGPRUses(MI))
+ resetMode(&MI);
+ continue;
+ }
+
+ if (MI.getOpcode() == AMDGPU::S_CLAUSE) {
+ assert(!ClauseRemaining && "Nested clauses are not supported");
+ ClauseLen = MI.getOperand(0).getImm();
+ ClauseBreaks = (ClauseLen >> 8) & 15;
+ ClauseLen = ClauseRemaining = (ClauseLen & 63) + 1;
+ Clause = &MI;
+ continue;
+ }
+
+ Changed |= runOnMachineInstr(MI);
+
+ if (ClauseRemaining)
+ --ClauseRemaining;
+ }
+
+ // If we're falling through to a block that has at least one other
+ // predecessor, we no longer know the mode.
+ MachineBasicBlock *Next = MBB.getNextNode();
+ if (Next && Next->pred_size() >= 2 &&
+ llvm::is_contained(Next->predecessors(), &MBB)) {
+ if (CurrentMode.raw_bits().any())
+ CurrentModeKnown = false;
+ }
+ }
+
+ return Changed;
+}
+
+} // namespace
+
+char AMDGPULowerVGPREncoding::ID = 0;
+
+char &llvm::AMDGPULowerVGPREncodingID = AMDGPULowerVGPREncoding::ID;
+
+INITIALIZE_PASS(AMDGPULowerVGPREncoding, DEBUG_TYPE,
+ "AMDGPU Lower VGPR Encoding", false, false)
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUMCInstLower.cpp b/llvm/lib/Target/AMDGPU/AMDGPUMCInstLower.cpp
index c84a0f6e31384..6acbf52b97de5 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUMCInstLower.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUMCInstLower.cpp
@@ -373,6 +373,13 @@ void AMDGPUAsmPrinter::emitInstruction(const MachineInstr *MI) {
MF->getInfo<SIMachineFunctionInfo>(),
*OutStreamer);
+ if (isVerbose() && MI->getOpcode() == AMDGPU::S_SET_VGPR_MSB) {
+ unsigned V = MI->getOperand(0).getImm();
+ OutStreamer->AddComment(
+ " msbs: dst=" + Twine(V >> 6) + " src0=" + Twine(V & 3) +
+ " src1=" + Twine((V >> 2) & 3) + " src2=" + Twine((V >> 4) & 3));
+ }
+
MCInst TmpInst;
MCInstLowering.lower(MI, TmpInst);
EmitToStreamer(*OutStreamer, TmpInst);
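
The verbose-asm comment emitted above implies the layout of the S_SET_VGPR_MSB immediate: src0 in bits [1:0], src1 in [3:2], src2 in [5:4], and dst in [7:6]. A minimal sketch of packing and decoding such an immediate, assuming that layout (packMSBs is an illustrative helper, not an LLVM API):

```cpp
// Standalone sketch, not LLVM code: pack per-operand MSBs into a single
// 8-bit immediate and read them back the way the comment above decodes them.
#include <cassert>
#include <cstdint>

static uint8_t packMSBs(unsigned Src0, unsigned Src1, unsigned Src2, unsigned Dst) {
  return uint8_t((Src0 & 3) | ((Src1 & 3) << 2) | ((Src2 & 3) << 4) | ((Dst & 3) << 6));
}

int main() {
  uint8_t V = packMSBs(/*Src0=*/1, /*Src1=*/0, /*Src2=*/2, /*Dst=*/3);
  assert((V & 3) == 1);        // src0 MSBs
  assert(((V >> 2) & 3) == 0); // src1 MSBs
  assert(((V >> 4) & 3) == 2); // src2 MSBs
  assert((V >> 6) == 3);       // dst MSBs
  return 0;
}
```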
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
index 4a2f0a13b1325..072becb9a2ad5 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
@@ -584,6 +584,7 @@ extern "C" LLVM_ABI LLVM_EXTERNAL_VISIBILITY void LLVMInitializeAMDGPUTarget() {
initializeAMDGPURewriteUndefForPHILegacyPass(*PR);
initializeSIAnnotateControlFlowLegacyPass(*PR);
initializeAMDGPUInsertDelayAluLegacyPass(*PR);
+ initializeAMDGPULowerVGPREncodingPass(*PR);
initializeSIInsertHardClausesLegacyPass(*PR);
initializeSIInsertWaitcntsLegacyPass(*PR);
initializeSIModeRegisterLegacyPass(*PR);
@@ -1799,6 +1800,8 @@ void GCNPassConfig::addPreEmitPass() {
addPass(&AMDGPUWaitSGPRHazardsLegacyID);
+ addPass(&AMDGPULowerVGPREncodingID);
+
if (isPassEnabled(EnableInsertDelayAlu, CodeGenOptLevel::Less))
addPass(&AMDGPUInsertDelayAluID);
diff --git a/llvm/lib/Target/AMDGPU/CMakeLists.txt b/llvm/lib/Target/AMDGPU/CMakeLists.txt
index a915c4076ca2a..aae56eef73edd 100644
--- a/llvm/lib/Target/AMDGPU/CMakeLists.txt
+++ b/llvm/lib/Target/AMDGPU/CMakeLists.txt
@@ -86,6 +86,7 @@ add_llvm_target(AMDGPUCodeGen
AMDGPUMCInstLower.cpp
AMDGPUMemoryUtils.cpp
AMDGPUIGroupLP.cpp
+ AMDGPULowerVGPREncoding.cpp
AMDGPUMCResourceInfo.cpp
AMDGPUMarkLastScratchLoad.cpp
AMDGPUMIRFormatter.cpp
diff --git a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp
index ad122390e1f03..d1e8b7e4bad0d 100644
--- a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp
+++ b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp
@@ -324,6 +324,18 @@ void AMDGPUInstPrinter::printSymbolicFormat(const MCInst *MI,
}
}
+// \returns the low-256 VGPR representing a high VGPR \p Reg [v256..v1023],
+// or \p Reg itself otherwise.
+static MCPhysReg getRegForPrinting(MCPhysReg Reg, const MCRegisterInfo &MRI) {
+ unsigned Enc = MRI.getEncodingValue(Reg);
+ unsigned Idx = Enc & AMDGPU::HWEncoding::REG_IDX_MASK;
+ if (Idx < 0x100)
+ return Reg;
+
+ const MCRegisterClass *RC = getVGPRPhysRegClass(Reg, MRI);
+ return RC->getRegister(Idx % 0x100);
+}
+
void AMDGPUInstPrinter::printRegOperand(MCRegister Reg, raw_ostream &O,
const MCRegisterInfo &MRI) {
#if !defined(NDEBUG)
@@ -337,7 +349,17 @@ void AMDGPUInstPrinter::printRegOperand(MCRegister Reg, raw_ostream &O,
}
#endif
- O << getRegisterName(Reg);
+ unsigned PrintReg = getRegForPrinting(Reg, MRI);
+ O << getRegisterName(PrintReg);
+
+ if (PrintReg != Reg.id())
+ O << " /*" << getRegisterName(Reg) << "*/";
+}
+
+void AMDGPUInstPrinter::printRegOperand(MCRegister Reg, unsigned Opc,
+ unsigned OpNo, raw_ostream &O,
+ const MCRegisterInfo &MRI) {
+ printRegOperand(Reg, O, MRI);
}
void AMDGPUInstPrinter::printVOPDst(const MCInst *MI, unsigned OpNo,
@@ -722,7 +744,7 @@ void AMDGPUInstPrinter::printRegularOperand(const MCInst *MI, unsigned OpNo,
const MCOperand &Op = MI->getOperand(OpNo);
if (Op.isReg()) {
- printRegOperand(Op.getReg(), O, MRI);
+ printRegOperand(Op.getReg(), MI->getOpcode(), OpNo, O, MRI);
// Check if operand register class contains register used.
// Intention: print disassembler message when invalid code is decoded,
@@ -1133,7 +1155,7 @@ void AMDGPUInstPrinter::printExpSrcN(const MCInst *MI, unsigned OpNo,
OpNo = OpNo - N + N / 2;
if (En & (1 << N))
- printRegOperand(MI->getOperand(OpNo).getReg(), O, MRI);
+ printRegOperand(MI->getOperand(OpNo).getReg(), Opc, OpNo, O, MRI);
else
O << "off";
}
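
The printing convention introduced here is that a high VGPR is printed via its low-256 alias, with the real register in a trailing comment, e.g. `v0 /*v256*/`. A standalone sketch of that behavior (printVGPR is an illustrative stand-in, not the actual printer API):

```cpp
// Standalone sketch, not LLVM code: mirror the printed form produced by
// getRegForPrinting + printRegOperand for high VGPRs.
#include <cstdio>

static void printVGPR(unsigned HwIdx) {
  unsigned Low = HwIdx % 0x100;
  if (Low == HwIdx)
    std::printf("v%u", HwIdx);
  else
    std::printf("v%u /*v%u*/", Low, HwIdx); // e.g. "v0 /*v256*/"
}

int main() {
  printVGPR(256); std::printf("\n"); // prints: v0 /*v256*/
  printVGPR(5);   std::printf("\n"); // prints: v5
  return 0;
}
```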
diff --git a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.h b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.h
index a92f99c3c0e4b..21cc2f229de91 100644
--- a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.h
+++ b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.h
@@ -35,6 +35,8 @@ class AMDGPUInstPrinter : public MCInstPrinter {
const MCSubtargetInfo &STI, raw_ostream &O) override;
static void printRegOperand(MCRegister Reg, raw_ostream &O,
const MCRegisterInfo &MRI);
+ void printRegOperand(MCRegister Reg, unsigned Opc, unsigned OpNo,
+ raw_ostream &O, const MCRegisterInfo &MRI);
private:
void printU16ImmOperand(const MCInst *MI, unsigned OpNo,
@@ -70,7 +72,7 @@ class AMDGPUInstPrinter : public MCInstPrinter {
void printSymbolicFormat(const MCInst *MI,
const MCSubtargetInfo &STI, raw_ostream &O);
- void printRegOperand(unsigned RegNo, raw_ostream &O);
+ void printRegOperand(MCRegister Reg, raw_ostream &O);
void printVOPDst(const MCInst *MI, unsigned OpNo, const MCSubtargetInfo &STI,
raw_ostream &O);
void printVINTRPDst(const MCInst *MI, unsigned OpNo, const MCSubtargetInfo &STI,
diff --git a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
index fe849cafb65d1..643c664e39f1e 100644
--- a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
@@ -9270,6 +9270,9 @@ Register SIInstrInfo::findUsedSGPR(const MachineInstr &MI,
MachineOperand *SIInstrInfo::getNamedOperand(MachineInstr &MI,
AMDGPU::OpName OperandName) const {
+ if (OperandName == AMDGPU::OpName::NUM_OPERAND_NAMES)
+ return nullptr;
+
int Idx = AMDGPU::getNamedOperandIdx(MI.getOpcode(), OperandName);
if (Idx == -1)
return nullptr;
diff --git a/llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp b/llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
index ff5cbd55484cf..6348d3607878e 100644
--- a/llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
@@ -3338,6 +3338,112 @@ const GcnBufferFormatInfo *getGcnBufferFormatInfo(uint8_t Format,
: getGfx9BufferFormatInfo(Format);
}
+const MCRegisterClass *getVGPRPhysRegClass(MCPhysReg Reg,
+ const MCRegisterInfo &MRI) {
+ const unsigned VGPRClasses[] = {
+ AMDGPU::VGPR_16RegClassID, AMDGPU::VGPR_32RegClassID,
+ AMDGPU::VReg_64RegClassID, AMDGPU::VReg_96RegClassID,
+ AMDGPU::VReg_128RegClassID, AMDGPU::VReg_160RegClassID,
+ AMDGPU::VReg_192RegClassID, AMDGPU::VReg_224RegClassID,
+ AMDGPU::VReg_256RegClassID, AMDGPU::VReg_288RegClassID,
+ AMDGPU::VReg_320RegClassID, AMDGPU::VReg_352RegClassID,
+ AMDGPU::VReg_384RegClassID, AMDGPU::VReg_512RegClassID,
+ AMDGPU::VReg_1024RegClassID};
+
+ for (unsigned RCID : VGPRClasses) {
+ const MCRegisterClass &RC = MRI.getRegClass(RCID);
+ if (RC.contains(Reg))
+ return &RC;
+ }
+
+ return nullptr;
+}
+
+unsigned getVGPREncodingMSBs(MCPhysReg Reg, const MCRegisterInfo &MRI) {
+ unsigned Enc = MRI.getEncodingValue(Reg);
+ unsigned Idx = Enc & AMDGPU::HWEncoding::REG_IDX_MASK;
+ return Idx >> 8;
+}
+
+MCPhysReg getVGPRWithMSBs(MCPhysReg Reg, unsigned MSBs,
+ const MCRegisterInfo &MRI) {
+ unsigned Enc = MRI.getEncodingValue(Reg);
+ unsigned Idx = Enc & AMDGPU::HWEn...
[truncated]
9d4c83f to 12b93d6
shiltian approved these changes Sep 4, 2025