Conversation

@lukel97 (Contributor) commented Sep 3, 2025

If a VL operand is > 31, it will be materialized into an ADDI $x0, imm. We can reason about it by peeking at the virtual register definition, which allows RISCVVectorPeephole and RISCVVLOptimizer to catch more cases.

There's a separate issue with RISCVVLOptimizer where the materialized immediate may not always dominate the instruction we want to reduce the VL of, but this is left to another patch.
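
For example (a hand-written MIR sketch; the register names and exact pseudo operand layout are illustrative, not taken from the patch):

    %vl:gprnox0 = ADDI $x0, 128
    %res:vrm8 = PseudoVADD_VI_M8 %passthru, %vec, -1, %vl, 3 /* e8 */, 0

The VL of 128 doesn't fit in a uimm5, so it is materialized into %vl. Previously both passes treated a register VL as opaque; with this change they can look through %vl to its defining ADDI and treat it as the constant 128.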

@wangpc-pp (Contributor) left a comment

LGTM.

    if (MO.isImm())
      return MO.getImm();
    MachineInstr *Def = MRI->getVRegDef(MO.getReg());
    if (Def->getOpcode() == RISCV::ADDI &&
A Contributor commented:

We should make this a predicate; we have a lot of code making the same judgement, like isLoadImm.

lukel97 (Author) replied:

I've just reused isLoadImm instead in a6c215c

      return MO.getImm();
    MachineInstr *Def = MRI->getVRegDef(MO.getReg());
    if (Def->getOpcode() == RISCV::ADDI &&
        Def->getOperand(1).getReg() == RISCV::X0)
A Collaborator commented:
I believe you need to check that operand 1 is a register; I think it can also be a FrameIndex here. isLoadImm is nearby and gets this correct.
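
For reference, the existing helper gets this right by checking the operand kind before reading the register. Roughly (a paraphrase of isLoadImm, not the verbatim source):

    // Matches a load-immediate, i.e. ADDI rd, x0, imm.
    // The isReg() check matters: ADDI's operand 1 can also be a
    // FrameIndex, and getReg() is only valid on register operands.
    static bool isLoadImm(const MachineInstr *MI, int64_t &Imm) {
      if (MI->getOpcode() == RISCV::ADDI && MI->getOperand(1).isReg() &&
          MI->getOperand(1).getReg() == RISCV::X0) {
        Imm = MI->getOperand(2).getImm();
        return true;
      }
      return false;
    }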

lukel97 (Author) replied:

Updated to just reuse isLoadImm in a6c215c

@@ -4807,9 +4820,11 @@ bool RISCV::isVLKnownLE(const MachineOperand &LHS, const MachineOperand &RHS) {
     return true;
   if (LHS.isImm() && LHS.getImm() == RISCV::VLMaxSentinel)
     return false;
-  if (!LHS.isImm() || !RHS.isImm())
+  std::optional<int64_t> LHSImm = getEffectiveImm(LHS, MRI),
+                         RHSImm = getEffectiveImm(RHS, MRI);
A Collaborator commented:
Are you sure the new logic works properly with VLMaxSentinel? In particular, what if you get an ADDI which happens to encode -1?

I think this is correct, just flagging it for extra consideration.

lukel97 (Author) replied Sep 3, 2025:

I originally checked for VLMaxSentinel in ADDIs, but undid this in 8072205: in the places where we check for VLMaxSentinel we only consider immediate operands, and don't seem to look through ADDIs. E.g. in RISCVInsertVSETVLI:

    if (VLOp.isImm()) {
      int64_t Imm = VLOp.getImm();
      // Convert the VLMax sentintel to X0 register.
      if (Imm == RISCV::VLMaxSentinel) { ...

A Member commented:

In particular, what if you get an ADDI which happens to encode -1?

I guess the question here is whether we ever generate an ADDI with -1, because if we do, then I think even places that do not check for VLMaxSentinel might behave incorrectly. For instance, VLOpt aggregates user VLs and picks the largest one. Without checking whether the effective immediate is VLMaxSentinel, an ADDI with -1 will never be picked when it should be.

A Collaborator replied:

In particular, what if you get an ADDI which happens to encode -1?

I think we're okay here, and it would in fact be wrong to special-case ADDI xN, x0, -1. The current immediate field is essentially a union of two states: actual immediates, which are in the range 0-31, and VLMaxSentinel, which represents the symbolic VLMAX value. We just happen to use the value -1 (which doesn't correspond to a valid immediate) for this purpose.

Now, if we did special-case ADDI -1 as VLMaxSentinel, we'd probably never notice. -1 interpreted as an unsigned XLEN-bit value is going to be way larger than any possible VLMAX, and given the rules of vsetvli it would result in VL=VLMAX anyway. So the difference is probably not visible.

(In case it's not clear, I'm agreeing with Luke, just explaining my reasoning on how I got there.)

@@ -365,7 +365,8 @@ unsigned getDestLog2EEW(const MCInstrDesc &Desc, unsigned Log2SEW);
 static constexpr int64_t VLMaxSentinel = -1LL;
 
 /// Given two VL operands, do we know that LHS <= RHS?
-bool isVLKnownLE(const MachineOperand &LHS, const MachineOperand &RHS);
+bool isVLKnownLE(const MachineOperand &LHS, const MachineOperand &RHS,
+                 const MachineRegisterInfo *MRI);
A Collaborator commented:
This change adds the assumption that this routine is only called in SSA form. Is that true? If so, update the comment.

@llvmbot (Member) commented Sep 3, 2025:

@llvm/pr-subscribers-backend-risc-v

Author: Luke Lau (lukel97)

Changes

If a VL operand is > 31, it will be materialized into an ADDI $x0, imm. We can reason about it by peeking at the virtual register definition, which allows RISCVVectorPeephole and RISCVVLOptimizer to catch more cases.

There's a separate issue with RISCVVLOptimizer where the materialized immediate may not always dominate the instruction we want to reduce the VL of, but this is left to another patch.


Patch is 20.00 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/156639.diff

15 Files Affected:

  • (modified) llvm/lib/Target/RISCV/RISCVInstrInfo.cpp (+20-4)
  • (modified) llvm/lib/Target/RISCV/RISCVInstrInfo.h (+2-1)
  • (modified) llvm/lib/Target/RISCV/RISCVVLOptimizer.cpp (+5-5)
  • (modified) llvm/lib/Target/RISCV/RISCVVectorPeephole.cpp (+8-8)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vadd-vp.ll (+3-2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vmax-vp.ll (+3-2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vmaxu-vp.ll (+3-2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vmin-vp.ll (+3-2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vminu-vp.ll (+3-2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vsadd-vp.ll (+3-2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vsaddu-vp.ll (+3-2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vselect-vp.ll (+7-24)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vssub-vp.ll (+6-5)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vssubu-vp.ll (+6-5)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vmv.v.v-peephole.ll (+12)
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
index 7b4a1de167695..872f2cff67e58 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
@@ -4796,8 +4796,22 @@ unsigned RISCV::getDestLog2EEW(const MCInstrDesc &Desc, unsigned Log2SEW) {
   return Scaled;
 }
 
-/// Given two VL operands, do we know that LHS <= RHS?
-bool RISCV::isVLKnownLE(const MachineOperand &LHS, const MachineOperand &RHS) {
+static std::optional<int64_t> getEffectiveImm(const MachineOperand &MO,
+                                              const MachineRegisterInfo *MRI) {
+  assert(MO.isImm() || MO.getReg().isVirtual());
+  if (MO.isImm())
+    return MO.getImm();
+  const MachineInstr *Def = MRI->getVRegDef(MO.getReg());
+  int64_t Imm;
+  if (isLoadImm(Def, Imm))
+    return Imm;
+  return std::nullopt;
+}
+
+/// Given two VL operands, do we know that LHS <= RHS? Must be used in SSA form.
+bool RISCV::isVLKnownLE(const MachineOperand &LHS, const MachineOperand &RHS,
+                        const MachineRegisterInfo *MRI) {
+  assert(MRI->isSSA());
   if (LHS.isReg() && RHS.isReg() && LHS.getReg().isVirtual() &&
       LHS.getReg() == RHS.getReg())
     return true;
@@ -4807,9 +4821,11 @@ bool RISCV::isVLKnownLE(const MachineOperand &LHS, const MachineOperand &RHS) {
     return true;
   if (LHS.isImm() && LHS.getImm() == RISCV::VLMaxSentinel)
     return false;
-  if (!LHS.isImm() || !RHS.isImm())
+  std::optional<int64_t> LHSImm = getEffectiveImm(LHS, MRI),
+                         RHSImm = getEffectiveImm(RHS, MRI);
+  if (!LHSImm || !RHSImm)
     return false;
-  return LHS.getImm() <= RHS.getImm();
+  return LHSImm <= RHSImm;
 }
 
 namespace {
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfo.h b/llvm/lib/Target/RISCV/RISCVInstrInfo.h
index 785c8352d4a5e..0defb184fba6d 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfo.h
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfo.h
@@ -365,7 +365,8 @@ unsigned getDestLog2EEW(const MCInstrDesc &Desc, unsigned Log2SEW);
 static constexpr int64_t VLMaxSentinel = -1LL;
 
 /// Given two VL operands, do we know that LHS <= RHS?
-bool isVLKnownLE(const MachineOperand &LHS, const MachineOperand &RHS);
+bool isVLKnownLE(const MachineOperand &LHS, const MachineOperand &RHS,
+                 const MachineRegisterInfo *MRI);
 
 // Mask assignments for floating-point
 static constexpr unsigned FPMASK_Negative_Infinity = 0x001;
diff --git a/llvm/lib/Target/RISCV/RISCVVLOptimizer.cpp b/llvm/lib/Target/RISCV/RISCVVLOptimizer.cpp
index 9b70eb6c25b12..9b4592011eb4c 100644
--- a/llvm/lib/Target/RISCV/RISCVVLOptimizer.cpp
+++ b/llvm/lib/Target/RISCV/RISCVVLOptimizer.cpp
@@ -1347,7 +1347,7 @@ RISCVVLOptimizer::getMinimumVLForUser(const MachineOperand &UserOp) const {
     assert(UserOp.getOperandNo() == UserMI.getNumExplicitDefs() &&
            RISCVII::isFirstDefTiedToFirstUse(UserMI.getDesc()));
     auto DemandedVL = DemandedVLs.lookup(&UserMI);
-    if (!DemandedVL || !RISCV::isVLKnownLE(*DemandedVL, VLOp)) {
+    if (!DemandedVL || !RISCV::isVLKnownLE(*DemandedVL, VLOp, MRI)) {
       LLVM_DEBUG(dbgs() << "  Abort because user is passthru in "
                            "instruction with demanded tail\n");
       return std::nullopt;
@@ -1365,7 +1365,7 @@ RISCVVLOptimizer::getMinimumVLForUser(const MachineOperand &UserOp) const {
   // requires.
   if (auto DemandedVL = DemandedVLs.lookup(&UserMI)) {
     assert(isCandidate(UserMI));
-    if (RISCV::isVLKnownLE(*DemandedVL, VLOp))
+    if (RISCV::isVLKnownLE(*DemandedVL, VLOp, MRI))
       return DemandedVL;
   }
 
@@ -1408,10 +1408,10 @@ RISCVVLOptimizer::checkUsers(const MachineInstr &MI) const {
 
     // Use the largest VL among all the users. If we cannot determine this
     // statically, then we cannot optimize the VL.
-    if (!CommonVL || RISCV::isVLKnownLE(*CommonVL, *VLOp)) {
+    if (!CommonVL || RISCV::isVLKnownLE(*CommonVL, *VLOp, MRI)) {
       CommonVL = *VLOp;
       LLVM_DEBUG(dbgs() << "    User VL is: " << VLOp << "\n");
-    } else if (!RISCV::isVLKnownLE(*VLOp, *CommonVL)) {
+    } else if (!RISCV::isVLKnownLE(*VLOp, *CommonVL, MRI)) {
       LLVM_DEBUG(dbgs() << "    Abort because cannot determine a common VL\n");
       return std::nullopt;
     }
@@ -1464,7 +1464,7 @@ bool RISCVVLOptimizer::tryReduceVL(MachineInstr &MI) const {
   assert((CommonVL->isImm() || CommonVL->getReg().isVirtual()) &&
          "Expected VL to be an Imm or virtual Reg");
 
-  if (!RISCV::isVLKnownLE(*CommonVL, VLOp)) {
+  if (!RISCV::isVLKnownLE(*CommonVL, VLOp, MRI)) {
     LLVM_DEBUG(dbgs() << "  Abort due to CommonVL not <= VLOp.\n");
     return false;
   }
diff --git a/llvm/lib/Target/RISCV/RISCVVectorPeephole.cpp b/llvm/lib/Target/RISCV/RISCVVectorPeephole.cpp
index 62651185137c9..6ea010ebb73d7 100644
--- a/llvm/lib/Target/RISCV/RISCVVectorPeephole.cpp
+++ b/llvm/lib/Target/RISCV/RISCVVectorPeephole.cpp
@@ -177,7 +177,7 @@ bool RISCVVectorPeephole::tryToReduceVL(MachineInstr &MI) const {
 
     MachineOperand &SrcVL =
         Src->getOperand(RISCVII::getVLOpNum(Src->getDesc()));
-    if (VL.isIdenticalTo(SrcVL) || !RISCV::isVLKnownLE(VL, SrcVL))
+    if (VL.isIdenticalTo(SrcVL) || !RISCV::isVLKnownLE(VL, SrcVL, MRI))
       continue;
 
     if (!ensureDominates(VL, *Src))
@@ -440,7 +440,7 @@ bool RISCVVectorPeephole::convertSameMaskVMergeToVMv(MachineInstr &MI) {
   const MachineOperand &MIVL = MI.getOperand(RISCVII::getVLOpNum(MI.getDesc()));
   const MachineOperand &TrueVL =
       True->getOperand(RISCVII::getVLOpNum(True->getDesc()));
-  if (!RISCV::isVLKnownLE(MIVL, TrueVL))
+  if (!RISCV::isVLKnownLE(MIVL, TrueVL, MRI))
     return false;
 
   // True's passthru needs to be equivalent to False
@@ -611,7 +611,7 @@ bool RISCVVectorPeephole::foldUndefPassthruVMV_V_V(MachineInstr &MI) {
     MachineOperand &SrcPolicy =
         Src->getOperand(RISCVII::getVecPolicyOpNum(Src->getDesc()));
 
-    if (RISCV::isVLKnownLE(MIVL, SrcVL))
+    if (RISCV::isVLKnownLE(MIVL, SrcVL, MRI))
       SrcPolicy.setImm(SrcPolicy.getImm() | RISCVVType::TAIL_AGNOSTIC);
   }
 
@@ -663,7 +663,7 @@ bool RISCVVectorPeephole::foldVMV_V_V(MachineInstr &MI) {
   // so we don't need to handle a smaller source VL here.  However, the
   // user's VL may be larger
   MachineOperand &SrcVL = Src->getOperand(RISCVII::getVLOpNum(Src->getDesc()));
-  if (!RISCV::isVLKnownLE(SrcVL, MI.getOperand(3)))
+  if (!RISCV::isVLKnownLE(SrcVL, MI.getOperand(3), MRI))
     return false;
 
   // If the new passthru doesn't dominate Src, try to move Src so it does.
@@ -684,7 +684,7 @@ bool RISCVVectorPeephole::foldVMV_V_V(MachineInstr &MI) {
     // If MI was tail agnostic and the VL didn't increase, preserve it.
     int64_t Policy = RISCVVType::TAIL_UNDISTURBED_MASK_UNDISTURBED;
     if ((MI.getOperand(5).getImm() & RISCVVType::TAIL_AGNOSTIC) &&
-        RISCV::isVLKnownLE(MI.getOperand(3), SrcVL))
+        RISCV::isVLKnownLE(MI.getOperand(3), SrcVL, MRI))
       Policy |= RISCVVType::TAIL_AGNOSTIC;
     Src->getOperand(RISCVII::getVecPolicyOpNum(Src->getDesc())).setImm(Policy);
   }
@@ -775,9 +775,9 @@ bool RISCVVectorPeephole::foldVMergeToMask(MachineInstr &MI) const {
       True.getOperand(RISCVII::getVLOpNum(True.getDesc()));
 
   MachineOperand MinVL = MachineOperand::CreateImm(0);
-  if (RISCV::isVLKnownLE(TrueVL, VMergeVL))
+  if (RISCV::isVLKnownLE(TrueVL, VMergeVL, MRI))
     MinVL = TrueVL;
-  else if (RISCV::isVLKnownLE(VMergeVL, TrueVL))
+  else if (RISCV::isVLKnownLE(VMergeVL, TrueVL, MRI))
     MinVL = VMergeVL;
   else
     return false;
@@ -797,7 +797,7 @@ bool RISCVVectorPeephole::foldVMergeToMask(MachineInstr &MI) const {
   // to the tail. In that case we always need to use tail undisturbed to
   // preserve them.
   uint64_t Policy = RISCVVType::TAIL_UNDISTURBED_MASK_UNDISTURBED;
-  if (!PassthruReg && RISCV::isVLKnownLE(VMergeVL, MinVL))
+  if (!PassthruReg && RISCV::isVLKnownLE(VMergeVL, MinVL, MRI))
     Policy |= RISCVVType::TAIL_AGNOSTIC;
 
   assert(RISCVII::hasVecPolicyOp(True.getDesc().TSFlags) &&
diff --git a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vadd-vp.ll b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vadd-vp.ll
index a66370f5ccc0a..22c629088bacd 100644
--- a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vadd-vp.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vadd-vp.ll
@@ -413,9 +413,10 @@ define <256 x i8> @vadd_vi_v258i8_unmasked(<256 x i8> %va, i32 zeroext %evl) {
 define <256 x i8> @vadd_vi_v258i8_evl129(<256 x i8> %va, <256 x i1> %m) {
 ; CHECK-LABEL: vadd_vi_v258i8_evl129:
 ; CHECK:       # %bb.0:
-; CHECK-NEXT:    li a1, 128
-; CHECK-NEXT:    vsetvli zero, a1, e8, m8, ta, ma
+; CHECK-NEXT:    vsetivli zero, 1, e8, m8, ta, ma
 ; CHECK-NEXT:    vlm.v v24, (a0)
+; CHECK-NEXT:    li a0, 128
+; CHECK-NEXT:    vsetvli zero, a0, e8, m8, ta, ma
 ; CHECK-NEXT:    vadd.vi v8, v8, -1, v0.t
 ; CHECK-NEXT:    vmv1r.v v0, v24
 ; CHECK-NEXT:    vsetivli zero, 1, e8, m8, ta, ma
diff --git a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vmax-vp.ll b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vmax-vp.ll
index fec54b36042fa..ec5845752c29c 100644
--- a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vmax-vp.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vmax-vp.ll
@@ -321,9 +321,10 @@ define <256 x i8> @vmax_vx_v258i8_unmasked(<256 x i8> %va, i8 %b, i32 zeroext %e
 define <256 x i8> @vmax_vx_v258i8_evl129(<256 x i8> %va, i8 %b, <256 x i1> %m) {
 ; CHECK-LABEL: vmax_vx_v258i8_evl129:
 ; CHECK:       # %bb.0:
-; CHECK-NEXT:    li a2, 128
-; CHECK-NEXT:    vsetvli zero, a2, e8, m8, ta, ma
+; CHECK-NEXT:    vsetivli zero, 1, e8, m8, ta, ma
 ; CHECK-NEXT:    vlm.v v24, (a1)
+; CHECK-NEXT:    li a1, 128
+; CHECK-NEXT:    vsetvli zero, a1, e8, m8, ta, ma
 ; CHECK-NEXT:    vmax.vx v8, v8, a0, v0.t
 ; CHECK-NEXT:    vmv1r.v v0, v24
 ; CHECK-NEXT:    vsetivli zero, 1, e8, m8, ta, ma
diff --git a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vmaxu-vp.ll b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vmaxu-vp.ll
index 7ca0dbd9adffc..2ffd3318d8759 100644
--- a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vmaxu-vp.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vmaxu-vp.ll
@@ -320,9 +320,10 @@ define <256 x i8> @vmaxu_vx_v258i8_unmasked(<256 x i8> %va, i8 %b, i32 zeroext %
 define <256 x i8> @vmaxu_vx_v258i8_evl129(<256 x i8> %va, i8 %b, <256 x i1> %m) {
 ; CHECK-LABEL: vmaxu_vx_v258i8_evl129:
 ; CHECK:       # %bb.0:
-; CHECK-NEXT:    li a2, 128
-; CHECK-NEXT:    vsetvli zero, a2, e8, m8, ta, ma
+; CHECK-NEXT:    vsetivli zero, 1, e8, m8, ta, ma
 ; CHECK-NEXT:    vlm.v v24, (a1)
+; CHECK-NEXT:    li a1, 128
+; CHECK-NEXT:    vsetvli zero, a1, e8, m8, ta, ma
 ; CHECK-NEXT:    vmaxu.vx v8, v8, a0, v0.t
 ; CHECK-NEXT:    vmv1r.v v0, v24
 ; CHECK-NEXT:    vsetivli zero, 1, e8, m8, ta, ma
diff --git a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vmin-vp.ll b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vmin-vp.ll
index ea75742ca6e43..53649c77098f2 100644
--- a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vmin-vp.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vmin-vp.ll
@@ -321,9 +321,10 @@ define <256 x i8> @vmin_vx_v258i8_unmasked(<256 x i8> %va, i8 %b, i32 zeroext %e
 define <256 x i8> @vmin_vx_v258i8_evl129(<256 x i8> %va, i8 %b, <256 x i1> %m) {
 ; CHECK-LABEL: vmin_vx_v258i8_evl129:
 ; CHECK:       # %bb.0:
-; CHECK-NEXT:    li a2, 128
-; CHECK-NEXT:    vsetvli zero, a2, e8, m8, ta, ma
+; CHECK-NEXT:    vsetivli zero, 1, e8, m8, ta, ma
 ; CHECK-NEXT:    vlm.v v24, (a1)
+; CHECK-NEXT:    li a1, 128
+; CHECK-NEXT:    vsetvli zero, a1, e8, m8, ta, ma
 ; CHECK-NEXT:    vmin.vx v8, v8, a0, v0.t
 ; CHECK-NEXT:    vmv1r.v v0, v24
 ; CHECK-NEXT:    vsetivli zero, 1, e8, m8, ta, ma
diff --git a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vminu-vp.ll b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vminu-vp.ll
index f4f54db64018d..76b5be39f2d93 100644
--- a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vminu-vp.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vminu-vp.ll
@@ -320,9 +320,10 @@ define <256 x i8> @vminu_vx_v258i8_unmasked(<256 x i8> %va, i8 %b, i32 zeroext %
 define <256 x i8> @vminu_vx_v258i8_evl129(<256 x i8> %va, i8 %b, <256 x i1> %m) {
 ; CHECK-LABEL: vminu_vx_v258i8_evl129:
 ; CHECK:       # %bb.0:
-; CHECK-NEXT:    li a2, 128
-; CHECK-NEXT:    vsetvli zero, a2, e8, m8, ta, ma
+; CHECK-NEXT:    vsetivli zero, 1, e8, m8, ta, ma
 ; CHECK-NEXT:    vlm.v v24, (a1)
+; CHECK-NEXT:    li a1, 128
+; CHECK-NEXT:    vsetvli zero, a1, e8, m8, ta, ma
 ; CHECK-NEXT:    vminu.vx v8, v8, a0, v0.t
 ; CHECK-NEXT:    vmv1r.v v0, v24
 ; CHECK-NEXT:    vsetivli zero, 1, e8, m8, ta, ma
diff --git a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vsadd-vp.ll b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vsadd-vp.ll
index 902001a376d6a..6ceb03c765fd7 100644
--- a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vsadd-vp.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vsadd-vp.ll
@@ -422,9 +422,10 @@ define <256 x i8> @vsadd_vi_v258i8_unmasked(<256 x i8> %va, i32 zeroext %evl) {
 define <256 x i8> @vsadd_vi_v258i8_evl129(<256 x i8> %va, <256 x i1> %m) {
 ; CHECK-LABEL: vsadd_vi_v258i8_evl129:
 ; CHECK:       # %bb.0:
-; CHECK-NEXT:    li a1, 128
-; CHECK-NEXT:    vsetvli zero, a1, e8, m8, ta, ma
+; CHECK-NEXT:    vsetivli zero, 1, e8, m8, ta, ma
 ; CHECK-NEXT:    vlm.v v24, (a0)
+; CHECK-NEXT:    li a0, 128
+; CHECK-NEXT:    vsetvli zero, a0, e8, m8, ta, ma
 ; CHECK-NEXT:    vsadd.vi v8, v8, -1, v0.t
 ; CHECK-NEXT:    vmv1r.v v0, v24
 ; CHECK-NEXT:    vsetivli zero, 1, e8, m8, ta, ma
diff --git a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vsaddu-vp.ll b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vsaddu-vp.ll
index 57292147a0140..2839efd40305b 100644
--- a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vsaddu-vp.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vsaddu-vp.ll
@@ -418,9 +418,10 @@ define <256 x i8> @vsaddu_vi_v258i8_unmasked(<256 x i8> %va, i32 zeroext %evl) {
 define <256 x i8> @vsaddu_vi_v258i8_evl129(<256 x i8> %va, <256 x i1> %m) {
 ; CHECK-LABEL: vsaddu_vi_v258i8_evl129:
 ; CHECK:       # %bb.0:
-; CHECK-NEXT:    li a1, 128
-; CHECK-NEXT:    vsetvli zero, a1, e8, m8, ta, ma
+; CHECK-NEXT:    vsetivli zero, 1, e8, m8, ta, ma
 ; CHECK-NEXT:    vlm.v v24, (a0)
+; CHECK-NEXT:    li a0, 128
+; CHECK-NEXT:    vsetvli zero, a0, e8, m8, ta, ma
 ; CHECK-NEXT:    vsaddu.vi v8, v8, -1, v0.t
 ; CHECK-NEXT:    vmv1r.v v0, v24
 ; CHECK-NEXT:    vsetivli zero, 1, e8, m8, ta, ma
diff --git a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vselect-vp.ll b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vselect-vp.ll
index 7b2dcbb025f8f..93f024c2b77a5 100644
--- a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vselect-vp.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vselect-vp.ll
@@ -203,38 +203,21 @@ define <256 x i8> @select_v256i8(<256 x i1> %a, <256 x i8> %b, <256 x i8> %c, i3
 define <256 x i8> @select_evl_v256i8(<256 x i1> %a, <256 x i8> %b, <256 x i8> %c) {
 ; CHECK-LABEL: select_evl_v256i8:
 ; CHECK:       # %bb.0:
-; CHECK-NEXT:    addi sp, sp, -16
-; CHECK-NEXT:    .cfi_def_cfa_offset 16
-; CHECK-NEXT:    csrr a2, vlenb
-; CHECK-NEXT:    slli a2, a2, 3
-; CHECK-NEXT:    sub sp, sp, a2
-; CHECK-NEXT:    .cfi_escape 0x0f, 0x0d, 0x72, 0x00, 0x11, 0x10, 0x22, 0x11, 0x08, 0x92, 0xa2, 0x38, 0x00, 0x1e, 0x22 # sp + 16 + 8 * vlenb
-; CHECK-NEXT:    addi a2, sp, 16
-; CHECK-NEXT:    vs8r.v v16, (a2) # vscale x 64-byte Folded Spill
 ; CHECK-NEXT:    vsetivli zero, 1, e8, m1, ta, ma
 ; CHECK-NEXT:    vmv1r.v v7, v8
+; CHECK-NEXT:    vmv1r.v v6, v0
 ; CHECK-NEXT:    li a2, 128
+; CHECK-NEXT:    addi a3, a1, 128
 ; CHECK-NEXT:    vsetvli zero, a2, e8, m8, ta, ma
-; CHECK-NEXT:    vle8.v v24, (a0)
-; CHECK-NEXT:    addi a0, a1, 128
-; CHECK-NEXT:    vle8.v v8, (a0)
-; CHECK-NEXT:    vle8.v v16, (a1)
-; CHECK-NEXT:    vmv1r.v v6, v0
+; CHECK-NEXT:    vle8.v v24, (a3)
+; CHECK-NEXT:    vle8.v v8, (a1)
 ; CHECK-NEXT:    vmv1r.v v0, v7
-; CHECK-NEXT:    vsetivli zero, 1, e8, m8, ta, ma
-; CHECK-NEXT:    vmerge.vvm v24, v8, v24, v0
+; CHECK-NEXT:    vsetivli zero, 1, e8, m8, ta, mu
+; CHECK-NEXT:    vle8.v v24, (a0), v0.t
 ; CHECK-NEXT:    vmv1r.v v0, v6
-; CHECK-NEXT:    addi a0, sp, 16
-; CHECK-NEXT:    vl8r.v v8, (a0) # vscale x 64-byte Folded Reload
 ; CHECK-NEXT:    vsetvli zero, a2, e8, m8, ta, ma
-; CHECK-NEXT:    vmerge.vvm v8, v16, v8, v0
+; CHECK-NEXT:    vmerge.vvm v8, v8, v16, v0
 ; CHECK-NEXT:    vmv8r.v v16, v24
-; CHECK-NEXT:    csrr a0, vlenb
-; CHECK-NEXT:    slli a0, a0, 3
-; CHECK-NEXT:    add sp, sp, a0
-; CHECK-NEXT:    .cfi_def_cfa sp, 16
-; CHECK-NEXT:    addi sp, sp, 16
-; CHECK-NEXT:    .cfi_def_cfa_offset 0
 ; CHECK-NEXT:    ret
   %v = call <256 x i8> @llvm.vp.select.v256i8(<256 x i1> %a, <256 x i8> %b, <256 x i8> %c, i32 129)
   ret <256 x i8> %v
diff --git a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vssub-vp.ll b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vssub-vp.ll
index 353042fc889e5..79856de033060 100644
--- a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vssub-vp.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vssub-vp.ll
@@ -436,14 +436,15 @@ define <256 x i8> @vssub_vi_v258i8_unmasked(<256 x i8> %va, i32 zeroext %evl) {
 define <256 x i8> @vssub_vi_v258i8_evl129(<256 x i8> %va, <256 x i1> %m) {
 ; CHECK-LABEL: vssub_vi_v258i8_evl129:
 ; CHECK:       # %bb.0:
-; CHECK-NEXT:    li a1, 128
-; CHECK-NEXT:    vsetvli zero, a1, e8, m8, ta, ma
+; CHECK-NEXT:    vsetivli zero, 1, e8, m8, ta, ma
 ; CHECK-NEXT:    vlm.v v24, (a0)
-; CHECK-NEXT:    li a0, -1
-; CHECK-NEXT:    vssub.vx v8, v8, a0, v0.t
+; CHECK-NEXT:    li a0, 128
+; CHECK-NEXT:    li a1, -1
+; CHECK-NEXT:    vsetvli zero, a0, e8, m8, ta, ma
+; CHECK-NEXT:    vssub.vx v8, v8, a1, v0.t
 ; CHECK-NEXT:    vmv1r.v v0, v24
 ; CHECK-NEXT:    vsetivli zero, 1, e8, m8, ta, ma
-; CHECK-NEXT:    vssub.vx v16, v16, a0, v0.t
+; CHECK-NEXT:    vssub.vx v16, v16, a1, v0.t
 ; CHECK-NEXT:    ret
   %v = call <256 x i8> @llvm.vp.ssub.sat.v258i8(<256 x i8> %va, <256 x i8> splat (i8 -1), <256 x i1> %m, i32 129)
   ret <256 x i8> %v
diff --git a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vssubu-vp.ll b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vssubu-vp.ll
index c00fb329b2f0c..7a9bef49c994d 100644
--- a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vssubu-vp.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vssubu-vp.ll
@@ -431,14 +431,15 @@ define <256 x i8> @vssubu_vi_v258i8_unmasked(<256 x i8> %va, i32 zeroext %evl) {
 define <256 x i8> @vssubu_vi_v258i8_evl129(<256 x i8> %va, <256 x i1> %m) {
 ; CHECK-LABEL: vssubu_vi_v258i8_evl129:
 ; CHECK:       # %bb.0:
-; CHECK-NEXT:    li a1, 128
-; CHECK-NEXT:    vsetvli zero, a1, e8, m8, ta, ma
+; CHECK-NEXT:    vsetivli zero, 1, e8, m8, ta, ma
 ; CHECK-NEXT:    vlm.v v24, (a0)
-; CHECK-NEXT:    li a0, -1
-; CHECK-NEXT:    vssubu.vx v8, v8, a0, v0.t
+; CHECK-NEXT:    li a0, 128
+; CHECK-NEXT:    li a1, -1
+; CHECK-NEXT:    vsetvli zero, a0, e8, m8, ta, ma
+; CHECK-NEXT:    vssubu.vx v8, v8, a1, v0.t
 ; CHECK-NEXT:    vmv1r.v v0, v24
 ; CHECK-NEXT:    vsetivli zero, 1, e8, m8, ta, ma
-; CHECK-NEXT:    vssubu.vx v16, v16, a0, v0.t
+; CHECK-NEXT:    vssubu.vx v16, v16, a1, v0.t
 ; CHECK-NEXT:    ret
   %v = call <256 x i8> @llvm.vp.usub.sat.v258i8(<256 x i8> %va, <256 x i8> splat (i8 -1), <256 x i1> %m, i32 129)
   ret <256 x i8> %v
diff --git a/llvm/test/CodeGen/RISCV/rvv/vmv.v.v-peephole.ll b/llvm/test/CodeGen/RISCV/rvv/vmv.v.v-peephole.ll
index 2f2035b02b552..89cd53977312c 100644
--- a/llvm/test/CodeGen/RISCV/rvv/vmv.v.v-peephole.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/vmv.v.v-peephole.ll
@@ -82,6 +82,18 @@ define <vscale x 4 x i32> @diff_avl_vlmax(<vscale x 4 x i32> %passthru, <vscale
   ret <vscale x 4 x i32> %w
 }
 
+define <vscale x 4 x i32> @diff_avl_non_uimm5(<vscale x 4 x i32> %passthru, <vscale x 4 x i32> %a, <vscale x 4 x i32> %b) {
+; CHECK-LABEL: diff_avl_non_uimm5:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    li a0, 42
+; CHECK-NEXT:    vsetvli zero, a0, e32, m2, tu, ma
+; CHECK-NEXT:    vadd.vv v8, v10, v12
+; CHECK-NEXT:    ret
+  %v = call <vscale x 4 x i32> @llvm.riscv.vadd.nxv4i32.nxv4i32(<vscale x 4 x i32> %passthru, <vscale x 4 x i32> %a, <vscale x 4 x i32> %b, iXLen 42)
+  %w = call <vscale x 4 x i32> @llvm.riscv.vmv.v.v.nxv4i32(<vscale x 4 x i32> %passthru, <vscale x 4 x i32> %v, iXLen 123)
+  ret <vscale x 4 x i32> %w
+}
+
 define <vscale x 4 x i32> @vadd_mask_ma(<vscale x 4 x i32> %passthru, <vscale x 4 x i32> %a, <vscale x 4 x i32> %b, <vscale x 4 x i1> %mask, iXLen %vl) {
 ; CHECK-LABEL: vadd_mask_ma:
 ; CHECK:       # %bb....
[truncated]

@mshockwave (Member) left a comment

LGTM

lukel97 enabled auto-merge (squash) September 4, 2025 23:53
lukel97 merged commit dc2ed00 into llvm:main Sep 5, 2025
9 checks passed
@llvm-ci (Collaborator) commented Sep 5, 2025:

LLVM Buildbot has detected a new failure on builder arc-builder running on arc-worker while building llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/3/builds/21512

Here is the relevant piece of the build log for reference:
Step 6 (test-build-unified-tree-check-all) failure: 1200 seconds without output running [b'ninja', b'check-all'], attempting to kill
...
4.885 [27/18/15] Linking CXX executable unittests/CodeGen/GlobalISel/GlobalISelTests
4.922 [26/18/16] Linking CXX executable unittests/Debuginfod/DebuginfodTests
5.608 [25/18/17] Linking CXX executable unittests/ExecutionEngine/ExecutionEngineTests
5.661 [24/18/18] Linking CXX executable unittests/CodeGen/CGPluginTest/CGPluginTest
5.759 [23/18/19] Linking CXX executable unittests/ExecutionEngine/JITLink/JITLinkTests
6.015 [22/18/20] Linking CXX executable unittests/InterfaceStub/InterfaceStubTests
8.698 [21/18/21] Linking CXX executable unittests/FuzzMutate/FuzzMutateTests
8.781 [20/18/22] Linking CXX executable unittests/Linker/LinkerTests
16.230 [19/18/23] Linking CXX executable unittests/MC/MCTests
17.166 [18/18/24] Linking CXX executable unittests/Passes/Plugins/PluginsTests
command timed out: 1200 seconds without output running [b'ninja', b'check-all'], attempting to kill
process killed by signal 9
program finished with exit code -1
elapsedTime=1217.977089
