
Conversation

rampitec (Collaborator) commented Sep 4, 2025

No description provided.

rampitec marked this pull request as ready for review on September 4, 2025 20:26
llvmbot (Member) commented Sep 4, 2025

@llvm/pr-subscribers-backend-amdgpu

Author: Stanislav Mekhanoshin (rampitec)

Changes

Patch is 65.88 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/156965.diff

12 Files Affected:

  • (modified) llvm/lib/Target/AMDGPU/AMDGPU.h (+3)
  • (added) llvm/lib/Target/AMDGPU/AMDGPULowerVGPREncoding.cpp (+354)
  • (modified) llvm/lib/Target/AMDGPU/AMDGPUMCInstLower.cpp (+7)
  • (modified) llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp (+3)
  • (modified) llvm/lib/Target/AMDGPU/CMakeLists.txt (+1)
  • (modified) llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp (+25-3)
  • (modified) llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.h (+3-1)
  • (modified) llvm/lib/Target/AMDGPU/SIInstrInfo.cpp (+3)
  • (modified) llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp (+106)
  • (modified) llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.h (+19)
  • (modified) llvm/test/CodeGen/AMDGPU/llc-pipeline.ll (+5)
  • (added) llvm/test/CodeGen/AMDGPU/vgpr-lowering-gfx1250.mir (+848)
diff --git a/llvm/lib/Target/AMDGPU/AMDGPU.h b/llvm/lib/Target/AMDGPU/AMDGPU.h
index ebe38de1636be..4ca1011ea1312 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPU.h
+++ b/llvm/lib/Target/AMDGPU/AMDGPU.h
@@ -501,6 +501,9 @@ extern char &SIModeRegisterID;
 void initializeAMDGPUInsertDelayAluLegacyPass(PassRegistry &);
 extern char &AMDGPUInsertDelayAluID;
 
+void initializeAMDGPULowerVGPREncodingPass(PassRegistry &);
+extern char &AMDGPULowerVGPREncodingID;
+
 void initializeSIInsertHardClausesLegacyPass(PassRegistry &);
 extern char &SIInsertHardClausesID;
 
diff --git a/llvm/lib/Target/AMDGPU/AMDGPULowerVGPREncoding.cpp b/llvm/lib/Target/AMDGPU/AMDGPULowerVGPREncoding.cpp
new file mode 100644
index 0000000000000..ca06c316c2bfc
--- /dev/null
+++ b/llvm/lib/Target/AMDGPU/AMDGPULowerVGPREncoding.cpp
@@ -0,0 +1,354 @@
+//===- AMDGPULowerVGPREncoding.cpp - lower VGPRs above v255 ---------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+/// \file
+/// Lower VGPRs above the first 256 on gfx1250.
+///
+/// The pass scans used VGPRs and inserts S_SET_VGPR_MSB instructions to switch
+/// VGPR addressing mode. The mode change is effective until the next change.
+/// This instruction provides the high bits of a VGPR address for four of the
+/// operands: vdst, src0, src1, and src2, or another four operands depending
+/// on the instruction encoding. If bits are set, they are added as MSBs to
+/// the corresponding operand's VGPR number.
+///
+/// There is no need to replace the actual register operands because the
+/// encoding of high and low VGPRs is the same. I.e. v0 has the encoding 0x100
+/// and so does v256; v1 has the encoding 0x101 and v257 has the same encoding.
+/// So high VGPRs survive until actual encoding and result in the same actual
+/// bit encoding.
+///
+/// As a result the pass only inserts S_SET_VGPR_MSB to provide an actual
+/// offset to the VGPR addresses of the subsequent instructions. The
+/// InstPrinter takes care of printing a low VGPR instead of a high one. In
+/// principle it would be viable to print the actual high VGPR numbers, but
+/// that would disagree with the disassembler output and make the asm text
+/// non-deterministic.
+///
+/// This pass establishes a convention where non-fall-through basic blocks
+/// start with all 4 MSBs zero; otherwise the disassembly would not be
+/// readable. An optimization here is possible but deemed undesirable because
+/// of readability concerns.
+///
+/// Consequently the ABI expects all 4 MSBs to be zero on entry.
+/// The pass must run very late in the pipeline to make sure no changes to VGPR
+/// operands will be made after it.
+//
+//===----------------------------------------------------------------------===//
+
+#include "AMDGPU.h"
+#include "GCNSubtarget.h"
+#include "MCTargetDesc/AMDGPUMCTargetDesc.h"
+#include "SIInstrInfo.h"
+#include "llvm/ADT/PackedVector.h"
+
+using namespace llvm;
+
+#define DEBUG_TYPE "amdgpu-lower-vgpr-encoding"
+
+namespace {
+
+class AMDGPULowerVGPREncoding : public MachineFunctionPass {
+  static constexpr unsigned OpNum = 4;
+  static constexpr unsigned BitsPerField = 2;
+  static constexpr unsigned NumFields = 4;
+  static constexpr unsigned FieldMask = (1 << BitsPerField) - 1;
+  using ModeType = PackedVector<unsigned, BitsPerField,
+                                std::bitset<BitsPerField * NumFields>>;
+
+  class ModeTy : public ModeType {
+  public:
+    // bitset constructor will set all bits to zero
+    ModeTy() : ModeType(0) {}
+
+    operator int64_t() const { return raw_bits().to_ulong(); }
+
+    static ModeTy fullMask() {
+      ModeTy M;
+      M.raw_bits().flip();
+      return M;
+    }
+  };
+
+public:
+  static char ID;
+
+  AMDGPULowerVGPREncoding() : MachineFunctionPass(ID) {}
+
+  void getAnalysisUsage(AnalysisUsage &AU) const override {
+    AU.setPreservesCFG();
+    MachineFunctionPass::getAnalysisUsage(AU);
+  }
+
+  bool runOnMachineFunction(MachineFunction &MF) override;
+
+private:
+  const SIInstrInfo *TII;
+  const SIRegisterInfo *TRI;
+
+  /// Most recent s_set_* instruction.
+  MachineInstr *MostRecentModeSet;
+
+  /// Whether the current mode is known.
+  bool CurrentModeKnown;
+
+  /// Current mode bits.
+  ModeTy CurrentMode;
+
+  /// Current mask of mode bits that instructions since MostRecentModeSet care
+  /// about.
+  ModeTy CurrentMask;
+
+  /// Number of current hard clause instructions.
+  unsigned ClauseLen;
+
+  /// Number of hard clause instructions remaining.
+  unsigned ClauseRemaining;
+
+  /// Clause group breaks.
+  unsigned ClauseBreaks;
+
+  /// Last hard clause instruction.
+  MachineInstr *Clause;
+
+  /// Insert mode change before \p I. \returns true if mode was changed.
+  bool setMode(ModeTy NewMode, ModeTy Mask, MachineInstr *I);
+
+  /// Reset mode to default.
+  void resetMode(MachineInstr *I) { setMode(ModeTy(), ModeTy::fullMask(), I); }
+
+  /// If \p MO references VGPRs, return the MSBs. Otherwise, return nullopt.
+  std::optional<unsigned> getMSBs(const MachineOperand &MO) const;
+
+  /// Handle a single \p MI. \returns true if changed.
+  bool runOnMachineInstr(MachineInstr &MI);
+
+  /// Compute the mode and mode mask for a single \p MI given the \p Ops
+  /// operand-to-field mapping. Optionally takes a second array \p Ops2 for
+  /// VOPD. If provided and an operand from \p Ops is not a VGPR, then \p Ops2
+  /// is checked.
+  void computeMode(ModeTy &NewMode, ModeTy &Mask, MachineInstr &MI,
+                   const AMDGPU::OpName Ops[OpNum],
+                   const AMDGPU::OpName *Ops2 = nullptr);
+
+  /// Check whether instruction \p I is within a clause and return a suitable
+  /// insertion point for a mode change. It may also modify the S_CLAUSE
+  /// instruction to extend it, or drop the clause if it cannot be adjusted.
+  MachineInstr *handleClause(MachineInstr *I);
+};
+
+bool AMDGPULowerVGPREncoding::setMode(ModeTy NewMode, ModeTy Mask,
+                                      MachineInstr *I) {
+  assert((NewMode.raw_bits() & ~Mask.raw_bits()).none());
+
+  if (CurrentModeKnown) {
+    auto Delta = NewMode.raw_bits() ^ CurrentMode.raw_bits();
+
+    if ((Delta & Mask.raw_bits()).none()) {
+      CurrentMask |= Mask;
+      return false;
+    }
+
+    if (MostRecentModeSet && (Delta & CurrentMask.raw_bits()).none()) {
+      CurrentMode |= NewMode;
+      CurrentMask |= Mask;
+
+      MostRecentModeSet->getOperand(0).setImm(CurrentMode);
+      return true;
+    }
+  }
+
+  I = handleClause(I);
+  MostRecentModeSet =
+      BuildMI(*I->getParent(), I, {}, TII->get(AMDGPU::S_SET_VGPR_MSB))
+          .addImm(NewMode);
+
+  CurrentMode = NewMode;
+  CurrentMask = Mask;
+  CurrentModeKnown = true;
+  return true;
+}
+
+std::optional<unsigned>
+AMDGPULowerVGPREncoding::getMSBs(const MachineOperand &MO) const {
+  if (!MO.isReg())
+    return std::nullopt;
+
+  MCRegister Reg = MO.getReg();
+  const TargetRegisterClass *RC = TRI->getPhysRegBaseClass(Reg);
+  if (!RC || !TRI->isVGPRClass(RC))
+    return std::nullopt;
+
+  unsigned Idx = TRI->getHWRegIndex(Reg);
+  return Idx >> 8;
+}
+
+void AMDGPULowerVGPREncoding::computeMode(ModeTy &NewMode, ModeTy &Mask,
+                                          MachineInstr &MI,
+                                          const AMDGPU::OpName Ops[OpNum],
+                                          const AMDGPU::OpName *Ops2) {
+  NewMode = {};
+  Mask = {};
+
+  for (unsigned I = 0; I < OpNum; ++I) {
+    MachineOperand *Op = TII->getNamedOperand(MI, Ops[I]);
+
+    std::optional<unsigned> MSBits;
+    if (Op)
+      MSBits = getMSBs(*Op);
+
+#if !defined(NDEBUG)
+    if (MSBits.has_value() && Ops2) {
+      auto Op2 = TII->getNamedOperand(MI, Ops2[I]);
+      if (Op2) {
+        std::optional<unsigned> MSBits2;
+        MSBits2 = getMSBs(*Op2);
+        if (MSBits2.has_value() && MSBits != MSBits2)
+          llvm_unreachable("Invalid VOPD pair was created");
+      }
+    }
+#endif
+
+    if (!MSBits.has_value() && Ops2) {
+      Op = TII->getNamedOperand(MI, Ops2[I]);
+      if (Op)
+        MSBits = getMSBs(*Op);
+    }
+
+    if (!MSBits.has_value())
+      continue;
+
+    // Skip tied uses of src2 of VOP2; these will be handled along with defs,
+    // and only the vdst bit affects these operands. We cannot skip tied uses
+    // of VOP3, since those uses are real even though they must match the vdst.
+    if (Ops[I] == AMDGPU::OpName::src2 && !Op->isDef() && Op->isTied() &&
+        (SIInstrInfo::isVOP2(MI) ||
+         (SIInstrInfo::isVOP3(MI) &&
+          TII->hasVALU32BitEncoding(MI.getOpcode()))))
+      continue;
+
+    NewMode[I] = MSBits.value();
+    Mask[I] = FieldMask;
+  }
+}
+
+bool AMDGPULowerVGPREncoding::runOnMachineInstr(MachineInstr &MI) {
+  auto Ops = AMDGPU::getVGPRLoweringOperandTables(MI.getDesc());
+  if (Ops.first) {
+    ModeTy NewMode, Mask;
+    computeMode(NewMode, Mask, MI, Ops.first, Ops.second);
+    return setMode(NewMode, Mask, &MI);
+  }
+  assert(!TII->hasVGPRUses(MI) || MI.isMetaInstruction() || MI.isPseudo());
+
+  return false;
+}
+
+MachineInstr *AMDGPULowerVGPREncoding::handleClause(MachineInstr *I) {
+  if (!ClauseRemaining)
+    return I;
+
+  // A clause cannot start with a special instruction, so place the mode
+  // change right before the clause.
+  if (ClauseRemaining == ClauseLen) {
+    I = Clause->getPrevNode();
+    assert(I->isBundle());
+    return I;
+  }
+
+  // If a clause defines breaks, each group cannot start with a mode change,
+  // so just drop the clause.
+  if (ClauseBreaks) {
+    Clause->eraseFromBundle();
+    ClauseRemaining = 0;
+    return I;
+  }
+
+  // Otherwise adjust the number of instructions in the clause if it fits.
+  // If it does not fit, the clause will just become shorter. Since the length
+  // recorded in the clause is one less than the actual length, increment the
+  // length after the update. Note that SIMM16[5:0] must be 1-62, not 0 or 63.
+  if (ClauseLen < 63)
+    Clause->getOperand(0).setImm(ClauseLen | (ClauseBreaks << 8));
+
+  ++ClauseLen;
+
+  return I;
+}
+
+bool AMDGPULowerVGPREncoding::runOnMachineFunction(MachineFunction &MF) {
+  const GCNSubtarget &ST = MF.getSubtarget<GCNSubtarget>();
+  if (!ST.has1024AddressableVGPRs())
+    return false;
+
+  TII = ST.getInstrInfo();
+  TRI = ST.getRegisterInfo();
+
+  bool Changed = false;
+  ClauseLen = ClauseRemaining = 0;
+  CurrentMode.reset();
+  CurrentMask.reset();
+  CurrentModeKnown = true;
+  for (auto &MBB : MF) {
+    MostRecentModeSet = nullptr;
+
+    for (auto &MI : llvm::make_early_inc_range(MBB.instrs())) {
+      if (MI.isMetaInstruction())
+        continue;
+
+      if (MI.isTerminator() || MI.isCall()) {
+        if (MI.getOpcode() == AMDGPU::S_ENDPGM ||
+            MI.getOpcode() == AMDGPU::S_ENDPGM_SAVED) {
+          CurrentMode.reset();
+          CurrentModeKnown = true;
+        } else
+          resetMode(&MI);
+        continue;
+      }
+
+      if (MI.isInlineAsm()) {
+        if (TII->hasVGPRUses(MI))
+          resetMode(&MI);
+        continue;
+      }
+
+      if (MI.getOpcode() == AMDGPU::S_CLAUSE) {
+        assert(!ClauseRemaining && "Nested clauses are not supported");
+        ClauseLen = MI.getOperand(0).getImm();
+        ClauseBreaks = (ClauseLen >> 8) & 15;
+        ClauseLen = ClauseRemaining = (ClauseLen & 63) + 1;
+        Clause = &MI;
+        continue;
+      }
+
+      Changed |= runOnMachineInstr(MI);
+
+      if (ClauseRemaining)
+        --ClauseRemaining;
+    }
+
+    // If we're falling through to a block that has at least one other
+    // predecessor, we no longer know the mode.
+    MachineBasicBlock *Next = MBB.getNextNode();
+    if (Next && Next->pred_size() >= 2 &&
+        llvm::is_contained(Next->predecessors(), &MBB)) {
+      if (CurrentMode.raw_bits().any())
+        CurrentModeKnown = false;
+    }
+  }
+
+  return Changed;
+}
+
+} // namespace
+
+char AMDGPULowerVGPREncoding::ID = 0;
+
+char &llvm::AMDGPULowerVGPREncodingID = AMDGPULowerVGPREncoding::ID;
+
+INITIALIZE_PASS(AMDGPULowerVGPREncoding, DEBUG_TYPE,
+                "AMDGPU Lower VGPR Encoding", false, false)
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUMCInstLower.cpp b/llvm/lib/Target/AMDGPU/AMDGPUMCInstLower.cpp
index c84a0f6e31384..6acbf52b97de5 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUMCInstLower.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUMCInstLower.cpp
@@ -373,6 +373,13 @@ void AMDGPUAsmPrinter::emitInstruction(const MachineInstr *MI) {
                              MF->getInfo<SIMachineFunctionInfo>(),
                              *OutStreamer);
 
+    if (isVerbose() && MI->getOpcode() == AMDGPU::S_SET_VGPR_MSB) {
+      unsigned V = MI->getOperand(0).getImm();
+      OutStreamer->AddComment(
+          " msbs: dst=" + Twine(V >> 6) + " src0=" + Twine(V & 3) +
+          " src1=" + Twine((V >> 2) & 3) + " src2=" + Twine((V >> 4) & 3));
+    }
+
     MCInst TmpInst;
     MCInstLowering.lower(MI, TmpInst);
     EmitToStreamer(*OutStreamer, TmpInst);
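
The comment emission above also documents the layout of the S_SET_VGPR_MSB immediate the pass builds: src0 MSBs in bits [1:0], src1 in [3:2], src2 in [5:4], and vdst in [7:6]. A minimal standalone sketch of packing that byte follows; the helper names are assumptions for illustration, not part of the patch.

#include <cassert>
#include <cstdint>

// Pack the per-operand VGPR MSBs (2 bits each) into the 8-bit S_SET_VGPR_MSB
// immediate, matching the field positions decoded by the AddComment call above.
static uint8_t packVgprMsbs(unsigned Src0, unsigned Src1, unsigned Src2,
                            unsigned Dst) {
  return static_cast<uint8_t>((Src0 & 3) | ((Src1 & 3) << 2) |
                              ((Src2 & 3) << 4) | ((Dst & 3) << 6));
}

int main() {
  // Using only a high vdst (e.g. v256..v511): dst field = 1, all others 0.
  uint8_t V = packVgprMsbs(/*Src0=*/0, /*Src1=*/0, /*Src2=*/0, /*Dst=*/1);
  assert(V == 0x40);
  assert((V >> 6) == 1 && (V & 3) == 0);
  return 0;
}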
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
index 4a2f0a13b1325..072becb9a2ad5 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
@@ -584,6 +584,7 @@ extern "C" LLVM_ABI LLVM_EXTERNAL_VISIBILITY void LLVMInitializeAMDGPUTarget() {
   initializeAMDGPURewriteUndefForPHILegacyPass(*PR);
   initializeSIAnnotateControlFlowLegacyPass(*PR);
   initializeAMDGPUInsertDelayAluLegacyPass(*PR);
+  initializeAMDGPULowerVGPREncodingPass(*PR);
   initializeSIInsertHardClausesLegacyPass(*PR);
   initializeSIInsertWaitcntsLegacyPass(*PR);
   initializeSIModeRegisterLegacyPass(*PR);
@@ -1799,6 +1800,8 @@ void GCNPassConfig::addPreEmitPass() {
 
   addPass(&AMDGPUWaitSGPRHazardsLegacyID);
 
+  addPass(&AMDGPULowerVGPREncodingID);
+
   if (isPassEnabled(EnableInsertDelayAlu, CodeGenOptLevel::Less))
     addPass(&AMDGPUInsertDelayAluID);
 
diff --git a/llvm/lib/Target/AMDGPU/CMakeLists.txt b/llvm/lib/Target/AMDGPU/CMakeLists.txt
index a915c4076ca2a..aae56eef73edd 100644
--- a/llvm/lib/Target/AMDGPU/CMakeLists.txt
+++ b/llvm/lib/Target/AMDGPU/CMakeLists.txt
@@ -86,6 +86,7 @@ add_llvm_target(AMDGPUCodeGen
   AMDGPUMCInstLower.cpp
   AMDGPUMemoryUtils.cpp
   AMDGPUIGroupLP.cpp
+  AMDGPULowerVGPREncoding.cpp
   AMDGPUMCResourceInfo.cpp
   AMDGPUMarkLastScratchLoad.cpp
   AMDGPUMIRFormatter.cpp
diff --git a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp
index ad122390e1f03..d1e8b7e4bad0d 100644
--- a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp
+++ b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp
@@ -324,6 +324,18 @@ void AMDGPUInstPrinter::printSymbolicFormat(const MCInst *MI,
   }
 }
 
+// \returns the low VGPR (v0..v255) aliasing a high VGPR \p Reg [v256..v1023],
+// or \p Reg itself otherwise.
+static MCPhysReg getRegForPrinting(MCPhysReg Reg, const MCRegisterInfo &MRI) {
+  unsigned Enc = MRI.getEncodingValue(Reg);
+  unsigned Idx = Enc & AMDGPU::HWEncoding::REG_IDX_MASK;
+  if (Idx < 0x100)
+    return Reg;
+
+  const MCRegisterClass *RC = getVGPRPhysRegClass(Reg, MRI);
+  return RC->getRegister(Idx % 0x100);
+}
+
 void AMDGPUInstPrinter::printRegOperand(MCRegister Reg, raw_ostream &O,
                                         const MCRegisterInfo &MRI) {
 #if !defined(NDEBUG)
@@ -337,7 +349,17 @@ void AMDGPUInstPrinter::printRegOperand(MCRegister Reg, raw_ostream &O,
   }
 #endif
 
-  O << getRegisterName(Reg);
+  unsigned PrintReg = getRegForPrinting(Reg, MRI);
+  O << getRegisterName(PrintReg);
+
+  if (PrintReg != Reg.id())
+    O << " /*" << getRegisterName(Reg) << "*/";
+}
+
+void AMDGPUInstPrinter::printRegOperand(MCRegister Reg, unsigned Opc,
+                                        unsigned OpNo, raw_ostream &O,
+                                        const MCRegisterInfo &MRI) {
+  printRegOperand(Reg, O, MRI);
 }
 
 void AMDGPUInstPrinter::printVOPDst(const MCInst *MI, unsigned OpNo,
@@ -722,7 +744,7 @@ void AMDGPUInstPrinter::printRegularOperand(const MCInst *MI, unsigned OpNo,
 
   const MCOperand &Op = MI->getOperand(OpNo);
   if (Op.isReg()) {
-    printRegOperand(Op.getReg(), O, MRI);
+    printRegOperand(Op.getReg(), MI->getOpcode(), OpNo, O, MRI);
 
     // Check if operand register class contains register used.
     // Intention: print disassembler message when invalid code is decoded,
@@ -1133,7 +1155,7 @@ void AMDGPUInstPrinter::printExpSrcN(const MCInst *MI, unsigned OpNo,
     OpNo = OpNo - N + N / 2;
 
   if (En & (1 << N))
-    printRegOperand(MI->getOperand(OpNo).getReg(), O, MRI);
+    printRegOperand(MI->getOperand(OpNo).getReg(), Opc, OpNo, O, MRI);
   else
     O << "off";
 }
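
With this change the printer only ever emits the names of the first 256 VGPRs; a high register appears as its low alias plus a comment, e.g. v257 prints as v1 /*v257*/. Below is a standalone sketch of the index arithmetic behind getRegForPrinting and getVGPREncodingMSBs, assuming a plain 0-based hardware index rather than MCRegisterInfo encodings; the helper name is illustrative only.

#include <cassert>
#include <utility>

// Given a 0-based VGPR hardware index (0..1023), return the index of the low
// alias that gets printed and the MSB value S_SET_VGPR_MSB must supply for it.
// Mirrors the Idx % 0x100 and Idx >> 8 computations in the patch.
static std::pair<unsigned, unsigned> lowAliasAndMsbs(unsigned Idx) {
  return {Idx % 0x100, Idx >> 8};
}

int main() {
  auto [Low, Msbs] = lowAliasAndMsbs(257); // v257
  assert(Low == 1 && Msbs == 1);           // printed as v1 /*v257*/
  assert(lowAliasAndMsbs(42).second == 0); // v0..v255 need no mode change
  return 0;
}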
diff --git a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.h b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.h
index a92f99c3c0e4b..21cc2f229de91 100644
--- a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.h
+++ b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.h
@@ -35,6 +35,8 @@ class AMDGPUInstPrinter : public MCInstPrinter {
                  const MCSubtargetInfo &STI, raw_ostream &O) override;
   static void printRegOperand(MCRegister Reg, raw_ostream &O,
                               const MCRegisterInfo &MRI);
+  void printRegOperand(MCRegister Reg, unsigned Opc, unsigned OpNo,
+                       raw_ostream &O, const MCRegisterInfo &MRI);
 
 private:
   void printU16ImmOperand(const MCInst *MI, unsigned OpNo,
@@ -70,7 +72,7 @@ class AMDGPUInstPrinter : public MCInstPrinter {
   void printSymbolicFormat(const MCInst *MI,
                            const MCSubtargetInfo &STI, raw_ostream &O);
 
-  void printRegOperand(unsigned RegNo, raw_ostream &O);
+  void printRegOperand(MCRegister Reg, raw_ostream &O);
   void printVOPDst(const MCInst *MI, unsigned OpNo, const MCSubtargetInfo &STI,
                    raw_ostream &O);
   void printVINTRPDst(const MCInst *MI, unsigned OpNo, const MCSubtargetInfo &STI,
diff --git a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
index fe849cafb65d1..643c664e39f1e 100644
--- a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
@@ -9270,6 +9270,9 @@ Register SIInstrInfo::findUsedSGPR(const MachineInstr &MI,
 
 MachineOperand *SIInstrInfo::getNamedOperand(MachineInstr &MI,
                                              AMDGPU::OpName OperandName) const {
+  if (OperandName == AMDGPU::OpName::NUM_OPERAND_NAMES)
+    return nullptr;
+
   int Idx = AMDGPU::getNamedOperandIdx(MI.getOpcode(), OperandName);
   if (Idx == -1)
     return nullptr;
diff --git a/llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp b/llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
index ff5cbd55484cf..6348d3607878e 100644
--- a/llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
@@ -3338,6 +3338,112 @@ const GcnBufferFormatInfo *getGcnBufferFormatInfo(uint8_t Format,
                           : getGfx9BufferFormatInfo(Format);
 }
 
+const MCRegisterClass *getVGPRPhysRegClass(MCPhysReg Reg,
+                                           const MCRegisterInfo &MRI) {
+  const unsigned VGPRClasses[] = {
+      AMDGPU::VGPR_16RegClassID,  AMDGPU::VGPR_32RegClassID,
+      AMDGPU::VReg_64RegClassID,  AMDGPU::VReg_96RegClassID,
+      AMDGPU::VReg_128RegClassID, AMDGPU::VReg_160RegClassID,
+      AMDGPU::VReg_192RegClassID, AMDGPU::VReg_224RegClassID,
+      AMDGPU::VReg_256RegClassID, AMDGPU::VReg_288RegClassID,
+      AMDGPU::VReg_320RegClassID, AMDGPU::VReg_352RegClassID,
+      AMDGPU::VReg_384RegClassID, AMDGPU::VReg_512RegClassID,
+      AMDGPU::VReg_1024RegClassID};
+
+  for (unsigned RCID : VGPRClasses) {
+    const MCRegisterClass &RC = MRI.getRegClass(RCID);
+    if (RC.contains(Reg))
+      return &RC;
+  }
+
+  return nullptr;
+}
+
+unsigned getVGPREncodingMSBs(MCPhysReg Reg, const MCRegisterInfo &MRI) {
+  unsigned Enc = MRI.getEncodingValue(Reg);
+  unsigned Idx = Enc & AMDGPU::HWEncoding::REG_IDX_MASK;
+  return Idx >> 8;
+}
+
+MCPhysReg getVGPRWithMSBs(MCPhysReg Reg, unsigned MSBs,
+                          const MCRegisterInfo &MRI) {
+  unsigned Enc = MRI.getEncodingValue(Reg);
+  unsigned Idx = Enc & AMDGPU::HWEn...
[truncated]
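
Finally, for reference, the clause handling in AMDGPULowerVGPREncoding.cpp relies on the S_CLAUSE immediate layout the pass decodes and rewrites: clause length minus one in SIMM16[5:0] (which must stay in 1-62) and the break count in bits [11:8]. A minimal standalone sketch of that packing follows; the helper names are assumptions for illustration, not part of the patch.

#include <cassert>
#include <cstdint>

// Illustrative helpers mirroring how the pass reads and rewrites the S_CLAUSE
// immediate: clause length minus one in SIMM16[5:0], break count in [11:8].
static uint16_t encodeClauseImm(unsigned Len, unsigned Breaks) {
  assert(Len >= 2 && Len <= 63 && "SIMM16[5:0] must be in 1-62");
  return static_cast<uint16_t>((Len - 1) | (Breaks << 8));
}

static void decodeClauseImm(uint16_t Imm, unsigned &Len, unsigned &Breaks) {
  Breaks = (Imm >> 8) & 15;
  Len = (Imm & 63) + 1;
}

int main() {
  unsigned Len = 0, Breaks = 0;
  decodeClauseImm(encodeClauseImm(4, 0), Len, Breaks);
  assert(Len == 4 && Breaks == 0);
  return 0;
}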

rampitec force-pushed the users/rampitec/09-04-_amdgpu_high_vgpr_lowering_on_gfx1250 branch from 9d4c83f to 12b93d6 on September 4, 2025 21:48
rampitec merged commit 1f0f347 into main on Sep 4, 2025
9 checks passed
rampitec deleted the users/rampitec/09-04-_amdgpu_high_vgpr_lowering_on_gfx1250 branch on September 4, 2025 23:20