[LV] Add initial legality checks for loops with unbound loads. #152422

arcbbb · 2025-08-07T01:13:36Z

This patch splits out the legality checks from PR #151300, following the landing of PR #128593.

It is a step toward supporting vectorization of early-exit loops that contain potentially faulting loads.
In this commit, a loop is considered legal for vectorization if it satisfies the following criteria:

The target supports first-faulting load intrinsics (e.g., vp.load.ff).
Unbounded loads are unit-stride, which is the only type currently supported by vp.load.ff.
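
For context, the kind of loop this work targets looks roughly like the scalar search below. This is an illustrative sketch and not code from the patch; `findValue` is a hypothetical name. The loop has a counted exit and a data-dependent early exit, and the caller is only assumed to guarantee that `P` is readable up to the first match, so a vectorized version that loads several elements ahead could touch memory the scalar loop never reads.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical early-exit loop: returns the index of the first element equal
// to Key, or -1 if none of the first N elements match.
std::ptrdiff_t findValue(const int *P, std::size_t N, int Key) {
  assert(P && "caller must pass a non-null pointer");
  for (std::size_t I = 0; I < N; ++I) {
    // Unit-stride load in the loop header. Nothing here proves that P[I] is
    // dereferenceable for every I < N; only the elements up to the first
    // match are ever read by the scalar loop.
    if (P[I] == Key)
      return static_cast<std::ptrdiff_t>(I); // early exit
  }
  return -1;
}
```

Calling this with a three-element array, `N = 1000`, and a key stored at index 2 is fine for the scalar loop, because it stops before reading past the array. A plain wide vector load starting at index 0 would not have that guarantee.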

llvmbot · 2025-08-07T01:14:07Z

@llvm/pr-subscribers-llvm-analysis
@llvm/pr-subscribers-vectorizers
@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-backend-risc-v

Author: Shih-Po Hung (arcbbb)

Changes

This patch splits out the legality checks from PR #151300, following the landing of PR #128593.

It is a step toward supporting vectorization of early-exit loops that contain potentially faulting loads.
In this commit, a loop is considered legal for vectorization if it satisfies the following criteria:

The target supports first-faulting load intrinsics (e.g., vp.load.ff).
All unbounded loads are unit-stride, which is the only type currently supported by vp.load.ff.
All unbounded loads are located in the loop header, ensuring that the header mask change dominates the loop.
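
The three criteria above can be read as a small decision procedure. The following is a minimal sketch written without LLVM; `LoadInfo` and `checkUnboundLoads` are hypothetical names, and the reason strings are only loosely based on the patch's remarks. It assumes each load that could not be proven dereferenceable has already been summarized by its block position and stride.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy summary of one load that could not be proven dereferenceable.
struct LoadInfo {
  bool InLoopHeader; // true if the load sits in the loop header block
  int Stride;        // element stride of the address; 1 means consecutive
};

// Returns an empty string if the loop passes the three checks, otherwise a
// short reason. Loops with no unproven loads are trivially fine here.
std::string checkUnboundLoads(bool TargetHasFirstFaultingLoads,
                              const std::vector<LoadInfo> &Loads) {
  if (Loads.empty())
    return "";
  if (!TargetHasFirstFaultingLoads)
    return "target lacks first-faulting loads";
  for (const LoadInfo &L : Loads) {
    if (!L.InLoopHeader)
      return "load needs predication";
    if (L.Stride != 1)
      return "non-unit-stride load";
  }
  return "";
}
```

The header check comes before the stride check, matching the order used in the legality code in the diff.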

Full diff: https://github.com/llvm/llvm-project/pull/152422.diff

9 Files Affected:

(modified) llvm/include/llvm/Analysis/Loads.h (+8)
(modified) llvm/include/llvm/Analysis/TargetTransformInfo.h (+3)
(modified) llvm/include/llvm/Analysis/TargetTransformInfoImpl.h (+2)
(modified) llvm/include/llvm/Transforms/Vectorize/LoopVectorizationLegality.h (+8)
(modified) llvm/lib/Analysis/Loads.cpp (+16)
(modified) llvm/lib/Analysis/TargetTransformInfo.cpp (+4)
(modified) llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h (+3)
(modified) llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp (+28-3)
(modified) llvm/lib/Transforms/Vectorize/LoopVectorize.cpp (+7)

diff --git a/llvm/include/llvm/Analysis/Loads.h b/llvm/include/llvm/Analysis/Loads.h
index 84564563de8e3..080757b6d1fe0 100644
--- a/llvm/include/llvm/Analysis/Loads.h
+++ b/llvm/include/llvm/Analysis/Loads.h
@@ -91,6 +91,14 @@ LLVM_ABI bool isDereferenceableReadOnlyLoop(
     Loop *L, ScalarEvolution *SE, DominatorTree *DT, AssumptionCache *AC,
     SmallVectorImpl<const SCEVPredicate *> *Predicates = nullptr);
 
+/// Return true if the loop \p L cannot fault on any iteration and only
+/// contains read-only memory accesses. Also collect loads that are not
+/// guaranteed to be dereferenceable.
+LLVM_ABI bool isReadOnlyLoopWithSafeOrSpeculativeLoads(
+    Loop *L, ScalarEvolution *SE, DominatorTree *DT, AssumptionCache *AC,
+    SmallVectorImpl<LoadInst *> *SpeculativeLoads,
+    SmallVectorImpl<const SCEVPredicate *> *Predicates = nullptr);
+
 /// Return true if we know that executing a load from this value cannot trap.
 ///
 /// If DT and ScanFrom are specified this method performs context-sensitive
diff --git a/llvm/include/llvm/Analysis/TargetTransformInfo.h b/llvm/include/llvm/Analysis/TargetTransformInfo.h
index aa4550de455e0..0671ec4f4db01 100644
--- a/llvm/include/llvm/Analysis/TargetTransformInfo.h
+++ b/llvm/include/llvm/Analysis/TargetTransformInfo.h
@@ -1857,6 +1857,9 @@ class TargetTransformInfo {
   /// \returns True if the target supports scalable vectors.
   LLVM_ABI bool supportsScalableVectors() const;
 
+  /// \returns True if the target supports speculative load intrinsics (e.g., vp.load.ff).
+  LLVM_ABI bool supportsSpeculativeLoads() const;
+
   /// \return true when scalable vectorization is preferred.
   LLVM_ABI bool enableScalableVectorization() const;
 
diff --git a/llvm/include/llvm/Analysis/TargetTransformInfoImpl.h b/llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
index abdbca04488db..1df93ecc7ec16 100644
--- a/llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
+++ b/llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
@@ -1106,6 +1106,8 @@ class TargetTransformInfoImplBase {
 
   virtual bool supportsScalableVectors() const { return false; }
 
+  virtual bool supportsSpeculativeLoads() const { return false; }
+
   virtual bool enableScalableVectorization() const { return false; }
 
   virtual bool hasActiveVectorLength() const { return false; }
diff --git a/llvm/include/llvm/Transforms/Vectorize/LoopVectorizationLegality.h b/llvm/include/llvm/Transforms/Vectorize/LoopVectorizationLegality.h
index 43ff084816d18..3b5638f3f570a 100644
--- a/llvm/include/llvm/Transforms/Vectorize/LoopVectorizationLegality.h
+++ b/llvm/include/llvm/Transforms/Vectorize/LoopVectorizationLegality.h
@@ -445,6 +445,11 @@ class LoopVectorizationLegality {
   /// Returns a list of all known histogram operations in the loop.
   bool hasHistograms() const { return !Histograms.empty(); }
 
+  /// Returns the loads that may fault and need to be speculative.
+  const SmallPtrSetImpl<const Instruction *> &getSpeculativeLoads() const {
+    return SpeculativeLoads;
+  }
+
   PredicatedScalarEvolution *getPredicatedScalarEvolution() const {
     return &PSE;
   }
@@ -630,6 +635,9 @@ class LoopVectorizationLegality {
   /// may work on the same memory location.
   SmallVector<HistogramInfo, 1> Histograms;
 
+  /// Hold all loads that need to be speculative.
+  SmallPtrSet<const Instruction *, 4> SpeculativeLoads;
+
   /// BFI and PSI are used to check for profile guided size optimizations.
   BlockFrequencyInfo *BFI;
   ProfileSummaryInfo *PSI;
diff --git a/llvm/lib/Analysis/Loads.cpp b/llvm/lib/Analysis/Loads.cpp
index 78d0887d5d87e..c5a55e9903d41 100644
--- a/llvm/lib/Analysis/Loads.cpp
+++ b/llvm/lib/Analysis/Loads.cpp
@@ -870,3 +870,19 @@ bool llvm::isDereferenceableReadOnlyLoop(
   }
   return true;
 }
+
+bool llvm::isReadOnlyLoopWithSafeOrSpeculativeLoads(
+    Loop *L, ScalarEvolution *SE, DominatorTree *DT, AssumptionCache *AC,
+    SmallVectorImpl<LoadInst *> *SpeculativeLoads,
+    SmallVectorImpl<const SCEVPredicate *> *Predicates) {
+  for (BasicBlock *BB : L->blocks()) {
+    for (Instruction &I : *BB) {
+      if (auto *LI = dyn_cast<LoadInst>(&I)) {
+        if (!isDereferenceableAndAlignedInLoop(LI, L, *SE, *DT, AC, Predicates))
+          SpeculativeLoads->push_back(LI);
+      } else if (I.mayReadFromMemory() || I.mayWriteToMemory() || I.mayThrow())
+        return false;
+    }
+  }
+  return true;
+}
diff --git a/llvm/lib/Analysis/TargetTransformInfo.cpp b/llvm/lib/Analysis/TargetTransformInfo.cpp
index c7eb2ec18c679..9f05e01d34781 100644
--- a/llvm/lib/Analysis/TargetTransformInfo.cpp
+++ b/llvm/lib/Analysis/TargetTransformInfo.cpp
@@ -1457,6 +1457,10 @@ bool TargetTransformInfo::supportsScalableVectors() const {
   return TTIImpl->supportsScalableVectors();
 }
 
+bool TargetTransformInfo::supportsSpeculativeLoads() const {
+  return TTIImpl->supportsSpeculativeLoads();
+}
+
 bool TargetTransformInfo::enableScalableVectorization() const {
   return TTIImpl->enableScalableVectorization();
 }
diff --git a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
index 05d504cbcb6bb..54e9c8346b6e2 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
+++ b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
@@ -110,6 +110,9 @@ class RISCVTTIImpl final : public BasicTTIImplBase<RISCVTTIImpl> {
   bool supportsScalableVectors() const override {
     return ST->hasVInstructions();
   }
+  bool supportsSpeculativeLoads() const override {
+    return ST->hasVInstructions();
+  }
   bool enableOrderedReductions() const override { return true; }
   bool enableScalableVectorization() const override {
     return ST->hasVInstructions();
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp
index c47fd9421fddd..46660866741ea 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp
@@ -1760,16 +1760,41 @@ bool LoopVectorizationLegality::isVectorizableEarlyExitLoop() {
   assert(LatchBB->getUniquePredecessor() == SingleUncountableExitingBlock &&
          "Expected latch predecessor to be the early exiting block");
 
-  // TODO: Handle loops that may fault.
   Predicates.clear();
-  if (!isDereferenceableReadOnlyLoop(TheLoop, PSE.getSE(), DT, AC,
-                                     &Predicates)) {
+  SmallVector<LoadInst *, 4> NonDerefLoads;
+  bool HasSafeAccess =
+      TTI->supportsSpeculativeLoads()
+          ? isReadOnlyLoopWithSafeOrSpeculativeLoads(
+                TheLoop, PSE.getSE(), DT, AC, &NonDerefLoads, &Predicates)
+          : isDereferenceableReadOnlyLoop(TheLoop, PSE.getSE(), DT, AC,
+                                          &Predicates);
+  if (!HasSafeAccess) {
     reportVectorizationFailure(
         "Loop may fault",
         "Cannot vectorize potentially faulting early exit loop",
         "PotentiallyFaultingEarlyExitLoop", ORE, TheLoop);
     return false;
   }
+  // Speculative loads need to be unit-stride.
+  for (LoadInst *LI : NonDerefLoads) {
+    if (LI->getParent() != TheLoop->getHeader()) {
+      reportVectorizationFailure("Cannot vectorize predicated speculative load",
+                                 "SpeculativeLoadNeedsPredication", ORE,
+                                 TheLoop);
+      return false;
+    }
+    int Stride = isConsecutivePtr(LI->getType(), LI->getPointerOperand());
+    if (Stride != 1) {
+      reportVectorizationFailure("Loop contains non-unit-stride load",
+                                 "Cannot vectorize early exit loop with "
+                                 "speculative non-unit-stride load",
+                                 "SpeculativeNonUnitStrideLoadEarlyExitLoop",
+                                 ORE, TheLoop);
+      return false;
+    }
+    SpeculativeLoads.insert(LI);
+    LLVM_DEBUG(dbgs() << "LV: Found speculative load: " << *LI << "\n");
+  }
 
   [[maybe_unused]] const SCEV *SymbolicMaxBTC =
       PSE.getSymbolicMaxBackedgeTakenCount();
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index 9667b506e594f..790a5236d4f04 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -10041,6 +10041,13 @@ bool LoopVectorizePass::processLoop(Loop *L) {
     return false;
   }
 
+  if (!LVL.getSpeculativeLoads().empty()) {
+    reportVectorizationFailure("Auto-vectorization of loops with speculative "
+                               "load is not supported",
+                               "SpeculativeLoadsNotSupported", ORE, L);
+    return false;
+  }
+
   // Entrance to the VPlan-native vectorization path. Outer loops are processed
   // here. They may require CFG and instruction level transformations before
   // even evaluating whether vectorization is profitable. Since we cannot modify
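
The scan added to Loads.cpp in the diff above can be modeled without LLVM to make its return value easier to reason about. This is a simplified sketch under stated assumptions: `Kind`, `Inst`, and `scanReadOnlyLoop` are hypothetical, and every store or call is treated as a memory access that disqualifies the loop, which is coarser than the `mayReadFromMemory`/`mayWriteToMemory`/`mayThrow` queries in the real code.

```cpp
#include <cassert>
#include <vector>

enum class Kind { Arithmetic, Load, Store, Call };

struct Inst {
  Kind K;
  bool ProvenDereferenceable; // only meaningful when K == Kind::Load
};

// Every load is either proven safe or recorded in Unproven. Any other
// memory-touching instruction makes the loop ineligible.
bool scanReadOnlyLoop(const std::vector<Inst> &Body,
                      std::vector<const Inst *> &Unproven) {
  for (const Inst &I : Body) {
    if (I.K == Kind::Load) {
      if (!I.ProvenDereferenceable)
        Unproven.push_back(&I);
    } else if (I.K != Kind::Arithmetic) {
      return false;
    }
  }
  return true;
}
```

Note that a `true` result does not mean the loop cannot fault. It means the only possible faults come from the loads collected in `Unproven`, which the caller must then check.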

alexey-bataev

Tests?

alexey-bataev · 2025-08-07T11:56:36Z

llvm/lib/Analysis/Loads.cpp

+      } else if (I.mayReadFromMemory() || I.mayWriteToMemory() || I.mayThrow())
+        return false;


Fixed. Thanks!

arcbbb · 2025-08-09T22:50:51Z

Tests?

Added, and updated the criteria to limit the number of unbound accesses to one.

llvm/lib/Analysis/Loads.cpp

llvm/include/llvm/Analysis/TargetTransformInfo.h

llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp

llvm/test/Transforms/LoopVectorize/RISCV/unbound-access-legality.ll

david-arm · 2025-08-11T13:37:24Z

If you're interested I tried to solve this in a different way (#120603) using loop versioning based on alignment of the loads in the loop. However, this was rejected due to an undocumented feature of normal IR loads. Using different intrinsics with their own semantics certainly solves this problem!
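
The comment does not spell out the mechanism used in #120603. One common rationale for alignment-based versioning, offered here only as an assumption, is that memory protection works at page granularity: a vector access that stays within one page faults either on all of its bytes or on none. The hypothetical helper below checks whether an access straddles a page boundary; it ignores address overflow for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Returns true if the byte range [Addr, Addr + Size) touches more than one
// page of PageSize bytes. Hypothetical helper, not part of LLVM.
bool crossesPage(std::uintptr_t Addr, std::size_t Size, std::size_t PageSize) {
  assert(Size > 0 && PageSize > 0 && "sizes must be non-zero");
  return Addr / PageSize != (Addr + Size - 1) / PageSize;
}
```

For example, with 4096-byte pages a 16-byte access at address 4080 stays inside the first page, while the same access at 4088 spills into the second.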

arcbbb · 2025-08-19T07:07:09Z

ping

llvm/lib/Analysis/Loads.cpp

llvm/lib/Target/RISCV/RISCVISelLowering.cpp

llvm/unittests/Analysis/LoadsTest.cpp

llvm/lib/Analysis/Loads.cpp

llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp

llvm/lib/Target/RISCV/RISCVISelLowering.cpp

isLoopSafeWithLoadOnlyFaults

llvm/include/llvm/Analysis/TargetTransformInfo.h

arcbbb · 2025-09-02T01:16:43Z

gentle ping

llvm/unittests/Analysis/LoadsTest.cpp

llvm/include/llvm/Analysis/Loads.h

llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp

lukel97

LGTM

llvm/unittests/Analysis/LoadsTest.cpp

lukel97 · 2025-09-02T08:42:03Z

llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp

    return false;
  }
+  // Check non-dereferenceable loads if any.
+  for (LoadInst *LI : NonDerefLoads) {
+    // Only support unit-stride access for now.


Just a question about potential AArch64 support: SVE has first-fault gathers, IIUC, which could be used for strided accesses. I presume we don't need to check for alignment there? From quickly scanning the docs, it looks like an unaligned access on a non-first element will be handled the same as any other fault on a non-first element.

Yes, SVE's first faulting gathers have the same alignment rules as normal gathers, so no extra checks required.

llvm/include/llvm/Analysis/Loads.h

fhahn · 2025-09-02T08:57:12Z

llvm/include/llvm/Transforms/Vectorize/LoopVectorizationLegality.h

+  /// Returns the loads that need to be fault-only-first.
+  const SmallPtrSetImpl<const Instruction *> &getFaultOnlyFirstLoads() const {
+    return FaultOnlyFirstLoads;
+  }


It's not clear to me what fault-only-first means here. Are those just unit-strided loads that may not be dereferenceable and may not be aligned?

Would getFaultingLoads be an accurate description? These are loads that may fault because they are not known to be dereferenceable or sufficiently aligned, and so need to be handled via fault-only-first instructions. They may be non-unit-strided in the future.

I've renamed it to PotentiallyFaultingLoads. Hope this makes it clearer.
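
To make the term concrete, here is a toy model of a unit-stride fault-only-first load. It is a sketch under stated assumptions, not the semantics of any specific intrinsic: `ToyMemory`, `FFLoadResult`, and `loadFaultOnlyFirst` are hypothetical, a trap is represented by a flag, and the model ignores that real hardware may return fewer lanes for reasons other than a fault.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy memory: only indices [0, Readable) can be read without faulting.
struct ToyMemory {
  std::vector<int> Data;
  std::size_t Readable;
};

struct FFLoadResult {
  bool Trapped;           // true if lane 0 itself was unreadable
  std::vector<int> Lanes; // values read before the first faulting lane
};

// Loads up to VL consecutive elements starting at Base. A fault on lane 0 is
// reported as a trap; a fault on any later lane just shortens the result.
FFLoadResult loadFaultOnlyFirst(const ToyMemory &M, std::size_t Base,
                                std::size_t VL) {
  assert(M.Readable <= M.Data.size() && "readable range must fit the data");
  FFLoadResult R{false, {}};
  if (VL == 0)
    return R;
  if (Base >= M.Readable) {
    R.Trapped = true;
    return R;
  }
  for (std::size_t I = 0; I < VL && Base + I < M.Readable; ++I)
    R.Lanes.push_back(M.Data[Base + I]);
  return R;
}
```

With three readable elements, a four-lane load starting at index 1 returns two lanes and does not trap. A vectorized early-exit loop can then process only the lanes that were actually loaded.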

fhahn · 2025-09-02T08:58:18Z

llvm/include/llvm/Transforms/Vectorize/LoopVectorizationLegality.h

+  /// Hold all loads that need to be fault-only-first.
+  SmallPtrSet<const Instruction *, 4> FaultOnlyFirstLoads;


See above regarding naming

llvm/unittests/Analysis/LoadsTest.cpp

arcbbb requested review from fhahn, huntergr-arm and david-arm August 7, 2025 01:13

llvmbot added the backend:RISC-V, vectorizers, llvm:analysis, and llvm:transforms labels Aug 7, 2025

[LV] Add initial legality checks for loops with non-dereferenceable load.

dc5b7c1

arcbbb force-pushed the legality-checks branch from ac2dab8 to dc5b7c1 August 7, 2025 01:14

alexey-bataev reviewed Aug 7, 2025

arcbbb added 5 commits August 8, 2025 00:28

Fix braces

78f05df

Limit to single unbound access. Add tests

f5315f8

clang-formatted

d3246bd

trivial fix: use lower case

eb316ee

trivial fix

faa6112

david-arm reviewed Aug 11, 2025

Address comment and refine TTI to check data type

b206c9f

lukel97 reviewed Aug 20, 2025

llvm/lib/Analysis/Loads.cpp (outdated)

Query subtarget feature for misaligned access support

72383dd

lukel97 reviewed Aug 21, 2025

llvm/lib/Target/RISCV/RISCVISelLowering.cpp (outdated)

Add align 1 case

0c1d007

lukel97 reviewed Aug 26, 2025

arcbbb added 4 commits August 26, 2025 20:09

Rename isLegalSpeculativeLoad to isLegalFaultOnlyFirstLoad

e22c486

Rename isReadOnlyLoopWithSafeOrSpeculativeLoads to isLoopSafeWithLoadOnlyFaults

2a37216

Rename SpeculativeLoads to FaultOnlyFirstLoads

abb0120

Refine comments with ff loads

6b920f4

arcbbb added 2 commits August 27, 2025 00:11

Update unittest to check the returned instructions

22dee5b

Refine report messages

e88d5d4

fhahn reviewed Aug 27, 2025

llvm/include/llvm/Analysis/TargetTransformInfo.h (outdated)

arcbbb added 2 commits August 27, 2025 20:19

Remove TTI isLegalFaultOnlyFirstLoad

12e6dc9

Update tests

3284452

lukel97 reviewed Sep 2, 2025

arcbbb added 6 commits September 2, 2025 00:38

Refine description

10156ca

Rename isLoopSafeWithLoadOnlyFaults to isReadOnlyLoop

16b5376

Refine error message

357ff34

Refine LoadTest

ead7a78

clang-format

9e98d12

Update tests

f18dae7

lukel97 approved these changes Sep 2, 2025

Rename IsLoadOnlyFaultingLoop to IsReadOnlyLoop

9dc5a7a

fhahn reviewed Sep 3, 2025

arcbbb added 5 commits September 2, 2025 23:41

Assert the size of NonDerefLoads

380e064

pass NonDereferenceableAndAlignedLoads as reference

93db706

Rename FaultOnlyFirstLoads to PotentiallyFaultingLoads

43976f7

Update report message in LoopVectorize

51a8329

Update descriptions

7a99472
