artagnon
Contributor

@artagnon artagnon commented Jul 9, 2025

Require that all Instructions in the Loop are visited by ValueEvolution, as any stray instructions would complicate life for the optimization.
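As a rough illustration of the requirement, the sketch below walks a use-def chain, records everything it reaches, and then reports any loop-body instruction that was never reached. It is a simplification rather than the actual HashRecognize code: `Instr`, `InstrSet`, `markVisited`, and `hasStrayInstructions` are hypothetical names, and the real ValueEvolution records visits while computing KnownBits instead of doing a plain graph walk.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for llvm::Instruction: a name plus operand links.
struct Instr {
  std::string Name;
  std::vector<const Instr *> Operands;
};

using InstrSet = std::unordered_set<const Instr *>;

// Record I and everything reachable through its operands. The membership
// check also stops the walk when it comes back around a PHI cycle.
void markVisited(const Instr *I, InstrSet &Visited) {
  if (!Visited.insert(I).second)
    return; // already recorded
  for (const Instr *Op : I->Operands)
    markVisited(Op, Visited);
}

// True if some instruction of the loop body was never reached.
bool hasStrayInstructions(const std::vector<const Instr *> &Body,
                          const InstrSet &Visited) {
  for (const Instr *I : Body)
    if (Visited.count(I) == 0)
      return true;
  return false;
}
```

An instruction that uses the recurrence without feeding back into it, such as a call with side effects, never enters the set and is therefore flagged.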

@artagnon artagnon requested review from nikic and pfusik July 9, 2025 19:45
@llvmbot llvmbot added the llvm:analysis Includes value tracking, cost tables and constant folding label Jul 9, 2025
@llvmbot
Member

llvmbot commented Jul 9, 2025

@llvm/pr-subscribers-llvm-analysis

Author: Ramkumar Ramachandra (artagnon)

Changes

Require that all Instructions in the Loop are visited by ValueEvolution, as any stray instructions would complicate life for the optimization.


Full diff: https://github.com/llvm/llvm-project/pull/147812.diff

2 Files Affected:

  • (modified) llvm/lib/Analysis/HashRecognize.cpp (+43-9)
  • (modified) llvm/test/Analysis/HashRecognize/cyclic-redundancy-check.ll (+29-3)
diff --git a/llvm/lib/Analysis/HashRecognize.cpp b/llvm/lib/Analysis/HashRecognize.cpp
index 2cc3ad5f18482..f032593492287 100644
--- a/llvm/lib/Analysis/HashRecognize.cpp
+++ b/llvm/lib/Analysis/HashRecognize.cpp
@@ -91,6 +91,10 @@ class ValueEvolution {
   APInt GenPoly;
   StringRef ErrStr;
 
+  // A set of instructions visited by ValueEvolution. Anything that's not in the
+  // use-def chain of the PHIs' evolution will be reported as unvisited.
+  SmallPtrSet<const Instruction *, 16> Visited;
+
   // Compute the KnownBits of a BinaryOperator.
   KnownBits computeBinOp(const BinaryOperator *I);
 
@@ -102,15 +106,19 @@ class ValueEvolution {
 
 public:
   // ValueEvolution is meant to be constructed with the TripCount of the loop,
-  // and whether the polynomial algorithm is big-endian, for the significant-bit
-  // check.
-  ValueEvolution(unsigned TripCount, bool ByteOrderSwapped);
+  // whether the polynomial algorithm is big-endian for the significant-bit
+  // check, and an initial value for the Visited set.
+  ValueEvolution(unsigned TripCount, bool ByteOrderSwapped,
+                 ArrayRef<const Instruction *> InitVisited);
 
   // Given a list of PHI nodes along with their incoming value from within the
   // loop, computeEvolutions computes the KnownBits of each of the PHI nodes on
   // the final iteration. Returns true on success and false on error.
   bool computeEvolutions(ArrayRef<PhiStepPair> PhiEvolutions);
 
+  // Query the Visited set.
+  bool isVisited(const Instruction *I) const { return Visited.contains(I); }
+
   // In case ValueEvolution encounters an error, this is meant to be used for a
   // precise error message.
   StringRef getError() const { return ErrStr; }
@@ -120,8 +128,11 @@ class ValueEvolution {
   KnownPhiMap KnownPhis;
 };
 
-ValueEvolution::ValueEvolution(unsigned TripCount, bool ByteOrderSwapped)
-    : TripCount(TripCount), ByteOrderSwapped(ByteOrderSwapped) {}
+ValueEvolution::ValueEvolution(unsigned TripCount, bool ByteOrderSwapped,
+                               ArrayRef<const Instruction *> InitVisited)
+    : TripCount(TripCount), ByteOrderSwapped(ByteOrderSwapped) {
+  Visited.insert_range(InitVisited);
+}
 
 KnownBits ValueEvolution::computeBinOp(const BinaryOperator *I) {
   KnownBits KnownL(compute(I->getOperand(0)));
@@ -177,6 +188,9 @@ KnownBits ValueEvolution::computeBinOp(const BinaryOperator *I) {
 KnownBits ValueEvolution::computeInstr(const Instruction *I) {
   unsigned BitWidth = I->getType()->getScalarSizeInBits();
 
+  // computeInstr is the only entry-point that needs to update the Visited set.
+  Visited.insert(I);
+
   // We look up in the map that contains the KnownBits of the PHI from the
   // previous iteration.
   if (const PHINode *P = dyn_cast<PHINode>(I))
@@ -185,9 +199,14 @@ KnownBits ValueEvolution::computeInstr(const Instruction *I) {
   // Compute the KnownBits for a Select(Cmp()), forcing it to take the branch
   // that is predicated on the (least|most)-significant-bit check.
   CmpPredicate Pred;
-  Value *L, *R, *TV, *FV;
-  if (match(I, m_Select(m_ICmp(Pred, m_Value(L), m_Value(R)), m_Value(TV),
-                        m_Value(FV)))) {
+  Value *L, *R;
+  Instruction *TV, *FV;
+  if (match(I, m_Select(m_ICmp(Pred, m_Value(L), m_Value(R)), m_Instruction(TV),
+                        m_Instruction(FV)))) {
+    Visited.insert(cast<Instruction>(I->getOperand(0)));
+    Visited.insert(TV);
+    Visited.insert(FV);
+
     // We need to check LCR against [0, 2) in the little-endian case, because
     // the RCR check is insufficient: it is simply [0, 1).
     if (!ByteOrderSwapped) {
@@ -209,6 +228,9 @@ KnownBits ValueEvolution::computeInstr(const Instruction *I) {
     ConstantRange CheckRCR(APInt::getZero(ICmpBW),
                            ByteOrderSwapped ? APInt::getSignedMinValue(ICmpBW)
                                             : APInt(ICmpBW, 1));
+
+    // We only compute KnownBits of either TV or FV, as the other value would
+    // just be a bit-shift as checked by isBigEndianBitShift.
     if (AllowedR == CheckRCR)
       return compute(TV);
     if (AllowedR.inverse() == CheckRCR)
@@ -629,11 +651,23 @@ HashRecognize::recognizeCRC() const {
   if (SimpleRecurrence)
     PhiEvolutions.emplace_back(SimpleRecurrence.Phi, SimpleRecurrence.BO);
 
-  ValueEvolution VE(TC, *ByteOrderSwapped);
+  // Initialize the Visited set in ValueEvolution with the IndVar-related
+  // instructions.
+  std::initializer_list<const Instruction *> InitVisited = {
+      IndVar, Latch->getTerminator(), L.getLatchCmpInst(),
+      cast<Instruction>(IndVar->getIncomingValueForBlock(Latch))};
+
+  ValueEvolution VE(TC, *ByteOrderSwapped, InitVisited);
   if (!VE.computeEvolutions(PhiEvolutions))
     return VE.getError();
   KnownBits ResultBits = VE.KnownPhis.at(ConditionalRecurrence.Phi);
 
+  // Any unvisited instructions from the KnownBits propagation can complicate
+  // the optimization, which would just replace the entire loop with the
+  // table-lookup version of the hash algorithm.
+  if (any_of(*Latch, [VE](const Instruction &I) { return !VE.isVisited(&I); }))
+    return "Found stray unvisited instructions";
+
   unsigned N = std::min(TC, ResultBits.getBitWidth());
   auto IsZero = [](const KnownBits &K) { return K.isZero(); };
   if (!checkExtractBits(ResultBits, N, IsZero, *ByteOrderSwapped))
diff --git a/llvm/test/Analysis/HashRecognize/cyclic-redundancy-check.ll b/llvm/test/Analysis/HashRecognize/cyclic-redundancy-check.ll
index 247a105940e6e..3926c467375ed 100644
--- a/llvm/test/Analysis/HashRecognize/cyclic-redundancy-check.ll
+++ b/llvm/test/Analysis/HashRecognize/cyclic-redundancy-check.ll
@@ -909,10 +909,10 @@ exit:                                              ; preds = %loop
   ret i16 %crc.next
 }
 
-define i16 @not.crc.bad.cast(i8 %msg, i16 %checksum) {
-; CHECK-LABEL: 'not.crc.bad.cast'
+define i16 @not.crc.bad.endian.swapped.sb.check(i8 %msg, i16 %checksum) {
+; CHECK-LABEL: 'not.crc.bad.endian.swapped.sb.check'
 ; CHECK-NEXT:  Did not find a hash algorithm
-; CHECK-NEXT:  Reason: Expected bottom 8 bits zero (????????00001011)
+; CHECK-NEXT:  Reason: Found stray unvisited instructions
 ;
 entry:
   br label %loop
@@ -1189,3 +1189,29 @@ loop:                                              ; preds = %loop, %entry
 exit:                                              ; preds = %loop
   ret i16 %crc.next
 }
+
+define i16 @not.crc.stray.unvisited.call(i16 %crc.init) {
+; CHECK-LABEL: 'not.crc.stray.unvisited.call'
+; CHECK-NEXT:  Did not find a hash algorithm
+; CHECK-NEXT:  Reason: Found stray unvisited instructions
+;
+entry:
+  br label %loop
+
+loop:                                              ; preds = %loop, %entry
+  %iv = phi i32 [ 0, %entry ], [ %iv.next, %loop ]
+  %crc = phi i16 [ %crc.init, %entry ], [ %crc.next, %loop ]
+  %crc.shl = shl i16 %crc, 1
+  %crc.xor = xor i16 %crc.shl, 4129
+  %check.sb = icmp slt i16 %crc, 0
+  %crc.next = select i1 %check.sb, i16 %crc.xor, i16 %crc.shl
+  call void @print(i16 %crc.next)
+  %iv.next = add nuw nsw i32 %iv, 1
+  %exit.cond = icmp samesign ult i32 %iv, 7
+  br i1 %exit.cond, label %loop, label %exit
+
+exit:                                              ; preds = %loop
+  ret i16 %crc.next
+}
+
+declare void @print(i16)
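
For readers less used to LLVM IR, the loop in `not.crc.stray.unvisited.call` corresponds roughly to the C++ below. This is a hand-written approximation, not something generated from the test: `crcStep` and `crcLoop` are made-up names, and the `Observed` vector stands in for the external `@print` call. The per-iteration observation is what a table-lookup replacement of the whole loop would lose.

```cpp
#include <cstdint>
#include <vector>

// One iteration of the loop body: shift left, and XOR in the constant 4129
// if the most significant bit of the old value was set.
uint16_t crcStep(uint16_t Crc) {
  uint16_t Shl = static_cast<uint16_t>(Crc << 1); // %crc.shl
  bool MsbSet = (Crc & 0x8000u) != 0;             // icmp slt i16 %crc, 0
  return MsbSet ? static_cast<uint16_t>(Shl ^ 4129u) : Shl; // %crc.next
}

// Eight iterations, matching %iv running from 0 through 7. Every
// intermediate value is observed, mirroring the call to @print.
uint16_t crcLoop(uint16_t CrcInit, std::vector<uint16_t> &Observed) {
  uint16_t Crc = CrcInit;
  for (unsigned Iv = 0; Iv < 8; ++Iv) {
    Crc = crcStep(Crc);
    Observed.push_back(Crc); // the stray side effect
  }
  return Crc;
}
```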

Contributor

@pfusik pfusik left a comment

How about simpler approaches:

  1. Instead of the Visited set, have a ToVisit set initialized with all the loop instructions, erase while visiting and at the end just check if the set is empty.
  2. Instead of a set, only track the number of instructions. This would work if the instructions are visited once - is that the case?
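
The first suggestion can be sketched as follows. This is only an illustration: instructions are modelled as integer IDs and `allErased` is a hypothetical helper, whereas the real code would hold `const Instruction *` in an LLVM set type.

```cpp
#include <unordered_set>
#include <vector>

// Start from the full set of loop instructions, erase each one as it is
// visited, and succeed only if nothing is left at the end.
bool allErased(std::unordered_set<int> ToVisit,
               const std::vector<int> &VisitSequence) {
  // Erasing is idempotent, so repeated visits of the same ID are harmless.
  for (int Id : VisitSequence)
    ToVisit.erase(Id);
  return ToVisit.empty();
}
```

Because erasing an absent element does nothing, this variant gives the same answer as a Visited set no matter how many times each instruction is visited.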

Contributor Author

@artagnon artagnon left a comment

> Instead of the Visited set, have a ToVisit set initialized with all the loop instructions, erase while visiting and at the end just check if the set is empty.

I tried it, and it's exactly equivalent, with a minor regression in the elegance of the code (due to missing erase_range, SmallPtrSetImpl stuff), and the same computational complexity. I've simplified the check at the end.

> Instead of a set, only track the number of instructions. This would work if the instructions are visited once - is that the case?

No, the instructions are visited trip-count times.

artagnon added 4 commits July 10, 2025 23:50
Require that all Instructions in the Loop are visited by ValueEvolution,
as any stray instructions would complicate life for the optimization.
@pfusik
Contributor

pfusik commented Jul 14, 2025

> > Instead of the Visited set, have a ToVisit set initialized with all the loop instructions, erase while visiting and at the end just check if the set is empty.

> I tried it, and it's exactly equivalent, with a minor regression in the elegance of the code (due to missing erase_range, SmallPtrSetImpl stuff), and the same computational complexity. I've simplified the check at the end.

Okay. I wasn't sure what's simpler.

> > Instead of a set, only track the number of instructions. This would work if the instructions are visited once - is that the case?

> No, the instructions are visited trip-count times.

If that's exactly trip-count times per instruction, we could compare against the number multiplied by TC.
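
A minimal sketch of the counting idea, assuming every instruction really is visited exactly TC times. `VisitCounter` is a hypothetical type introduced only for this illustration and does not appear in the patch.

```cpp
#include <cstddef>

// Hypothetical counter that would replace the Visited set.
struct VisitCounter {
  std::size_t NumVisited = 0;

  // Called wherever the analysis touches an instruction. The ID is ignored,
  // which is what makes this cheaper, and weaker, than a set.
  void visit(int /*InstrId*/) { ++NumVisited; }

  // Passes when the total matches NumInstrs * TripCount.
  bool complete(std::size_t NumInstrs, std::size_t TripCount) const {
    return NumVisited == NumInstrs * TripCount;
  }
};
```

The check is only as strong as the "exactly" in the premise: with three instructions and TC = 2, visiting one instruction four times and another twice also totals six, so an unvisited third instruction would go unnoticed.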

@artagnon
Contributor Author

> > > Instead of a set, only track the number of instructions. This would work if the instructions are visited once - is that the case?

> > No, the instructions are visited trip-count times.

> If that's exactly trip-count times per instruction, we could compare against the number multiplied by TC.

True, but the code would become a lot more cryptic, with various ++NumVisited statements.

@pfusik
Contributor

pfusik commented Jul 14, 2025

> > If that's exactly trip-count times per instruction, we could compare against the number multiplied by TC.

> True, but the code would become a lot more cryptic, with various ++NumVisited statements.

I agree. Let's keep it as-is.

Contributor

@pfusik pfusik left a comment

This is a step in a good direction to make sure the loop calculates a CRC and is free from side effects.

@artagnon artagnon merged commit 8ef1a0e into llvm:main Jul 16, 2025
9 checks passed
@artagnon artagnon deleted the hr-visitedset branch July 16, 2025 14:28
swift-ci pushed a commit to swiftlang/llvm-project that referenced this pull request Jul 16, 2025
Require that all Instructions in the Loop are visited by ValueEvolution,
as any stray instructions would complicate life for the optimization.
bonsthie pushed a commit to bonsthie/H2BLB-custom-backend that referenced this pull request Jul 22, 2025
Require that all Instructions in the Loop are visited by ValueEvolution,
as any stray instructions would complicate life for the optimization.