-
Notifications
You must be signed in to change notification settings - Fork 14.9k
[HashRecognize] Strip ValueEvolution #148620
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-llvm-ir @llvm/pr-subscribers-llvm-analysis Author: Ramkumar Ramachandra (artagnon) ChangesThe ConstantRange of the LHS of the ICmp in the Select(ICmp()) significant-bit check is not checked in the big-endian case, leading to a potential miscompile. Fix this. -- 8< -- Full diff: https://github.com/llvm/llvm-project/pull/148620.diff 2 Files Affected:
diff --git a/llvm/lib/Analysis/HashRecognize.cpp b/llvm/lib/Analysis/HashRecognize.cpp
index 2cc3ad5f18482..ae65d5c3e5188 100644
--- a/llvm/lib/Analysis/HashRecognize.cpp
+++ b/llvm/lib/Analysis/HashRecognize.cpp
@@ -90,6 +90,7 @@ class ValueEvolution {
const bool ByteOrderSwapped;
APInt GenPoly;
StringRef ErrStr;
+ unsigned AtIter;
// Compute the KnownBits of a BinaryOperator.
KnownBits computeBinOp(const BinaryOperator *I);
@@ -102,8 +103,8 @@ class ValueEvolution {
public:
// ValueEvolution is meant to be constructed with the TripCount of the loop,
- // and whether the polynomial algorithm is big-endian, for the significant-bit
- // check.
+ // and a boolean indicating whether the polynomial algorithm is big-endian
+ // (for the significant-bit check).
ValueEvolution(unsigned TripCount, bool ByteOrderSwapped);
// Given a list of PHI nodes along with their incoming value from within the
@@ -115,6 +116,10 @@ class ValueEvolution {
// precise error message.
StringRef getError() const { return ErrStr; }
+ // A set of Instructions visited by ValueEvolution. The only unvisited
+ // instructions will be ones not on the use-def chain of the PHIs' evolutions.
+ SmallPtrSet<const Instruction *, 16> Visited;
+
// The computed KnownBits for each PHI node, which is populated after
// computeEvolutions is called.
KnownPhiMap KnownPhis;
@@ -177,6 +182,9 @@ KnownBits ValueEvolution::computeBinOp(const BinaryOperator *I) {
KnownBits ValueEvolution::computeInstr(const Instruction *I) {
unsigned BitWidth = I->getType()->getScalarSizeInBits();
+ // computeInstr is the only entry-point that needs to update the Visited set.
+ Visited.insert(I);
+
// We look up in the map that contains the KnownBits of the PHI from the
// previous iteration.
if (const PHINode *P = dyn_cast<PHINode>(I))
@@ -185,34 +193,47 @@ KnownBits ValueEvolution::computeInstr(const Instruction *I) {
// Compute the KnownBits for a Select(Cmp()), forcing it to take the branch
// that is predicated on the (least|most)-significant-bit check.
CmpPredicate Pred;
- Value *L, *R, *TV, *FV;
- if (match(I, m_Select(m_ICmp(Pred, m_Value(L), m_Value(R)), m_Value(TV),
- m_Value(FV)))) {
- // We need to check LCR against [0, 2) in the little-endian case, because
- // the RCR check is insufficient: it is simply [0, 1).
- if (!ByteOrderSwapped) {
- KnownBits KnownL = compute(L);
- unsigned ICmpBW = KnownL.getBitWidth();
- auto LCR = ConstantRange::fromKnownBits(KnownL, false);
- auto CheckLCR = ConstantRange(APInt::getZero(ICmpBW), APInt(ICmpBW, 2));
- if (LCR != CheckLCR) {
- ErrStr = "Bad LHS of significant-bit-check";
- return {BitWidth};
- }
- }
+ Value *L, *R;
+ Instruction *TV, *FV;
+ if (match(I, m_Select(m_ICmp(Pred, m_Value(L), m_Value(R)), m_Instruction(TV),
+ m_Instruction(FV)))) {
+ Visited.insert(cast<Instruction>(I->getOperand(0)));
// Check that the predication is on (most|least) significant bit.
+ KnownBits KnownL = compute(L);
+ unsigned ICmpBW = KnownL.getBitWidth();
+ auto LCR = ConstantRange::fromKnownBits(KnownL, false);
+ // Check LCR against full-set, [0, -1), [0, -3), [0, -7), etc. depending on
+ // AtIter in the big-endian case, and against [0, 2) in the little-endian
+ // case.
+ auto CheckLCR = ConstantRange::getNonEmpty(
+ APInt::getZero(ICmpBW), ByteOrderSwapped
+ ? -APInt::getLowBitsSet(ICmpBW, AtIter)
+ : APInt(ICmpBW, 2));
+ if (LCR != CheckLCR) {
+ ErrStr = "Bad LHS of significant-bit-check";
+ return {BitWidth};
+ }
+
KnownBits KnownR = compute(R);
- unsigned ICmpBW = KnownR.getBitWidth();
auto RCR = ConstantRange::fromKnownBits(KnownR, false);
auto AllowedR = ConstantRange::makeAllowedICmpRegion(Pred, RCR);
- ConstantRange CheckRCR(APInt::getZero(ICmpBW),
- ByteOrderSwapped ? APInt::getSignedMinValue(ICmpBW)
- : APInt(ICmpBW, 1));
- if (AllowedR == CheckRCR)
+ // Check AllowedR against [0, smin) in the big-endian case, and against
+ // [0, 1) in the little-endian case.
+ ConstantRange CheckAllowedR(
+ APInt::getZero(ICmpBW),
+ ByteOrderSwapped ? APInt::getSignedMinValue(ICmpBW) : APInt(ICmpBW, 1));
+
+ // We only compute KnownBits of either TV or FV, as the other value would
+ // just be a bit-shift as checked by isBigEndianBitShift.
+ if (AllowedR == CheckAllowedR) {
+ Visited.insert(FV);
return compute(TV);
- if (AllowedR.inverse() == CheckRCR)
+ }
+ if (AllowedR.inverse() == CheckAllowedR) {
+ Visited.insert(TV);
return compute(FV);
+ }
ErrStr = "Bad RHS of significant-bit-check";
return {BitWidth};
@@ -247,9 +268,11 @@ KnownBits ValueEvolution::compute(const Value *V) {
}
bool ValueEvolution::computeEvolutions(ArrayRef<PhiStepPair> PhiEvolutions) {
- for (unsigned I = 0; I < TripCount; ++I)
+ for (unsigned I = 0; I < TripCount; ++I) {
+ AtIter = I;
for (auto [Phi, Step] : PhiEvolutions)
KnownPhis.emplace_or_assign(Phi, computeInstr(Step));
+ }
return ErrStr.empty();
}
@@ -634,6 +657,17 @@ HashRecognize::recognizeCRC() const {
return VE.getError();
KnownBits ResultBits = VE.KnownPhis.at(ConditionalRecurrence.Phi);
+ // There must be exactly four unvisited instructions, corresponding to the
+ // IndVar PHI. Any other unvisited instructions from the KnownBits propagation
+ // can complicate the optimization, which replaces the entire loop with the
+ // table-lookup version of the hash algorithm.
+ std::initializer_list<const Instruction *> AugmentVisited = {
+ IndVar, Latch->getTerminator(), L.getLatchCmpInst(),
+ cast<Instruction>(IndVar->getIncomingValueForBlock(Latch))};
+ VE.Visited.insert_range(AugmentVisited);
+ if (std::distance(Latch->begin(), Latch->end()) != VE.Visited.size())
+ return "Found stray unvisited instructions";
+
unsigned N = std::min(TC, ResultBits.getBitWidth());
auto IsZero = [](const KnownBits &K) { return K.isZero(); };
if (!checkExtractBits(ResultBits, N, IsZero, *ByteOrderSwapped))
diff --git a/llvm/test/Analysis/HashRecognize/cyclic-redundancy-check.ll b/llvm/test/Analysis/HashRecognize/cyclic-redundancy-check.ll
index 247a105940e6e..8ebe6eaaaf0be 100644
--- a/llvm/test/Analysis/HashRecognize/cyclic-redundancy-check.ll
+++ b/llvm/test/Analysis/HashRecognize/cyclic-redundancy-check.ll
@@ -649,7 +649,7 @@ exit: ; preds = %loop
define i16 @not.crc.wrong.sb.check.const(i8 %msg, i16 %checksum) {
; CHECK-LABEL: 'not.crc.wrong.sb.check.const'
; CHECK-NEXT: Did not find a hash algorithm
-; CHECK-NEXT: Reason: Bad RHS of significant-bit-check
+; CHECK-NEXT: Reason: Bad LHS of significant-bit-check
;
entry:
br label %loop
@@ -676,7 +676,7 @@ exit: ; preds = %loop
define i16 @not.crc.wrong.sb.check.pred(i16 %crc.init) {
; CHECK-LABEL: 'not.crc.wrong.sb.check.pred'
; CHECK-NEXT: Did not find a hash algorithm
-; CHECK-NEXT: Reason: Bad RHS of significant-bit-check
+; CHECK-NEXT: Reason: Bad LHS of significant-bit-check
;
entry:
br label %loop
@@ -857,10 +857,10 @@ exit: ; preds = %loop
ret i16 %crc.next
}
-define i16 @crc1.tc8.sb.check.endian.mismatch(i8 %msg, i16 %checksum) {
-; CHECK-LABEL: 'crc1.tc8.sb.check.endian.mismatch'
+define i16 @not.crc.sb.check.endian.mismatch(i8 %msg, i16 %checksum) {
+; CHECK-LABEL: 'not.crc.sb.check.endian.mismatch'
; CHECK-NEXT: Did not find a hash algorithm
-; CHECK-NEXT: Reason: Bad RHS of significant-bit-check
+; CHECK-NEXT: Reason: Bad LHS of significant-bit-check
;
entry:
br label %loop
@@ -909,10 +909,10 @@ exit: ; preds = %loop
ret i16 %crc.next
}
-define i16 @not.crc.bad.cast(i8 %msg, i16 %checksum) {
-; CHECK-LABEL: 'not.crc.bad.cast'
+define i16 @not.crc.bad.endian.swapped.sb.check(i8 %msg, i16 %checksum) {
+; CHECK-LABEL: 'not.crc.bad.endian.swapped.sb.check'
; CHECK-NEXT: Did not find a hash algorithm
-; CHECK-NEXT: Reason: Expected bottom 8 bits zero (????????00001011)
+; CHECK-NEXT: Reason: Bad LHS of significant-bit-check
;
entry:
br label %loop
@@ -1189,3 +1189,78 @@ loop: ; preds = %loop, %entry
exit: ; preds = %loop
ret i16 %crc.next
}
+
+define i16 @not.crc.stray.unvisited.call(i16 %crc.init) {
+; CHECK-LABEL: 'not.crc.stray.unvisited.call'
+; CHECK-NEXT: Did not find a hash algorithm
+; CHECK-NEXT: Reason: Found stray unvisited instructions
+;
+entry:
+ br label %loop
+
+loop: ; preds = %loop, %entry
+ %iv = phi i32 [ 0, %entry ], [ %iv.next, %loop ]
+ %crc = phi i16 [ %crc.init, %entry ], [ %crc.next, %loop ]
+ %crc.shl = shl i16 %crc, 1
+ %crc.xor = xor i16 %crc.shl, 4129
+ %check.sb = icmp slt i16 %crc, 0
+ %crc.next = select i1 %check.sb, i16 %crc.xor, i16 %crc.shl
+ call void @print(i16 %crc.next)
+ %iv.next = add nuw nsw i32 %iv, 1
+ %exit.cond = icmp samesign ult i32 %iv, 7
+ br i1 %exit.cond, label %loop, label %exit
+
+exit: ; preds = %loop
+ ret i16 %crc.next
+}
+
+declare void @print(i16)
+
+define i16 @not.crc.call.sb.check(i16 %crc.init) {
+; CHECK-LABEL: 'not.crc.call.sb.check'
+; CHECK-NEXT: Did not find a hash algorithm
+; CHECK-NEXT: Reason: Bad LHS of significant-bit-check
+;
+entry:
+ br label %loop
+
+loop: ; preds = %loop, %entry
+ %iv = phi i32 [ 0, %entry ], [ %iv.next, %loop ]
+ %crc = phi i16 [ %crc.init, %entry ], [ %crc.next, %loop ]
+ %crc.shl = shl i16 %crc, 1
+ %crc.xor = xor i16 %crc.shl, 4129
+ %call = call i16 @side.effect()
+ %check.sb = icmp slt i16 %call, 0
+ %crc.next = select i1 %check.sb, i16 %crc.xor, i16 %crc.shl
+ %iv.next = add nuw nsw i32 %iv, 1
+ %exit.cond = icmp samesign ult i32 %iv, 7
+ br i1 %exit.cond, label %loop, label %exit
+
+exit: ; preds = %loop
+ ret i16 %crc.next
+}
+
+declare i16 @side.effect()
+
+define i16 @not.crc.bad.lhs.sb.check.be(i16 %crc.init) {
+; CHECK-LABEL: 'not.crc.bad.lhs.sb.check.be'
+; CHECK-NEXT: Did not find a hash algorithm
+; CHECK-NEXT: Reason: Bad LHS of significant-bit-check
+;
+entry:
+ br label %loop
+
+loop: ; preds = %loop, %entry
+ %iv = phi i32 [ 0, %entry ], [ %iv.next, %loop ]
+ %crc = phi i16 [ %crc.init, %entry ], [ %crc.next, %loop ]
+ %crc.shl = shl i16 %crc, 1
+ %crc.xor = xor i16 %crc.shl, 4129
+ %check.sb = icmp slt i16 %crc.shl, 0
+ %crc.next = select i1 %check.sb, i16 %crc.xor, i16 %crc.shl
+ %iv.next = add nuw nsw i32 %iv, 1
+ %exit.cond = icmp samesign ult i32 %iv, 7
+ br i1 %exit.cond, label %loop, label %exit
+
+exit: ; preds = %loop
+ ret i16 %crc.next
+}
|
60e28b9
to
894c8e4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please explain the math behind this. :)
Sorry for the delay, but the logic behind this patch is easy to see with examples. Let us look at a big-endian example first: define i16 @crc16.be(i16 %crc.init) {
entry:
br label %loop
loop: ; preds = %loop, %entry
%iv = phi i32 [ 0, %entry ], [ %iv.next, %loop ]
%crc = phi i16 [ %crc.init, %entry ], [ %crc.next, %loop ]
%crc.shl = shl i16 %crc, 1
%crc.xor = xor i16 %crc.shl, 4129
%check.sb = icmp sge i16 %crc, 0
%crc.next = select i1 %check.sb, i16 %crc.shl, i16 %crc.xor
%iv.next = add nuw nsw i32 %iv, 1
%exit.cond = icmp samesign ult i32 %iv, 7
br i1 %exit.cond, label %loop, label %exit
exit: ; preds = %loop
ret i16 %crc.next
} Notice See the little-endian case: define i32 @crc32.le(i32 %checksum, i32 %msg) {
entry:
br label %loop
loop: ; preds = %loop, %entry
%crc = phi i32 [ %checksum, %entry ], [ %crc.next, %loop ]
%data = phi i32 [ %msg, %entry ], [ %data.next, %loop ]
%iv = phi i8 [ 0, %entry ], [ %iv.next, %loop ]
%xor.crc.data = xor i32 %crc, %data
%sb.crc.data = and i32 %xor.crc.data, 1
%check.sb = icmp eq i32 %sb.crc.data, 0
%crc.lshr = lshr i32 %crc, 1
%crc.xor = xor i32 %crc.lshr, 33800
%crc.next = select i1 %check.sb, i32 %crc.lshr, i32 %crc.xor
%iv.next = add nuw nsw i8 %iv, 1
%data.next = lshr i32 %data, 1
%exit.cond = icmp samesign ult i8 %iv, 7
br i1 %exit.cond, label %loop, label %exit
exit: ; preds = %loop
ret i32 %crc.next
} Notice |
What do you mean by LHS/RHS here? If the compare's operands, in |
Sorry, I meant AllowedR: it is the allowed icmp region. when you say something sge 0, isn't the allowed icmp region [0, smax)? |
And what's "LHS" ? |
LHS is simply KnownBits KnownL = compute(L);
auto LCR = ConstantRange::fromKnownBits(KnownL, false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the delay - I was on vacation. :)
I still don't understand why go TripCount
times through ValueEvolution
if we seem to be matching a specific compare and a bit shift. Can't we match it once?
So the KnownBits will evolve through the different iterations, and the ConstantRange computation depends on the KnownBits at each iteration: if iteration 0 didn't surface a corruption, but a later iteration surfaces it, would we not want to check that? Note that we check both, the final KnownBits as well as the sub-KnownBits at the compare, and this is needed for correctness. |
894c8e4
to
8f00ddd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean the whole ValueEvolution
class is flawed if it only checks if we are shifting in zero bits.
For example, this get matched:
for.body: ; preds = %entry, %for.body
%i.011 = phi i16 [ 0, %entry ], [ %inc, %for.body ]
%crc.addr.010 = phi i16 [ %crc, %entry ], [ %cond, %for.body ]
%shl = shl i16 %crc.addr.010, 1
%evil1 = and i16 %i.011, 2
%evil2 = add i16 %evil1, 1
%evil = mul i16 %shl, %evil2
%xor = xor i16 %evil, 4129
%tobool.not9 = icmp slt i16 %crc.addr.010, 0
%cond = select i1 %tobool.not9, i16 %xor, i16 %evil
%inc = add nuw nsw i16 %i.011, 1
%exitcond.not = icmp eq i16 %inc, 8
br i1 %exitcond.not, label %for.cond.cleanup, label %for.body, !llvm.loop !9
Consider removing ValueEvolution
and instead pattern-match the CRC calculation instructions. The whole analysis would be much simpler by matching the entire loop instead of looking for specific patterns here and there.
Hm, I agree that it is flawed if it is just checking the shifting in of zero bits, but I think the problem is more subtle: the branches-taken should be swapped, and instead of comparing the result against zero, we need to compare the result against the interleaved generating-polynomial (I thought I was doing this, but I think I got muddled up in the final stages of the development). Pattern-matching a CRC is going to be inherently more fragile. |
What exactly do you mean by "fragile" ? There's already pattern-matching for the shift, the XOR, the select and the compare. |
Wouldn't be robust to changes in InstCombine, I mean. No, of course there shouldn't be miscompiles: let me think a bit about whether the ValueEvolution approach could work in theory, and implement pattern-matching otherwise. |
The instruction sequence for CRC seems simple enough that I wouldn't expect many variations. Our tests should catch the changes. Perhaps testing against C sources or on an IR right after parsing might help? |
8f00ddd
to
7c78576
Compare
7c78576
to
d971aba
Compare
Now this is a major simplification of the analysis. Thank you. I'm still not happy with |
I'm not happy with it either: at the moment, I just tried exacting the logic from ValueEvolution, and getting rid of it. There's an additional problem with computeKnownBits: it is limited by depth, and cannot be used for a correctness-check.
Still thinking about what the best way to do this would be. I'd be more inclined to pattern-match piece-wise: so, keeping (Simple|Conditional)Recurrence, and matching more and more of the loop in steps; I think this approach would be least fragile. |
d971aba
to
17e5efb
Compare
Sorry about the enormous delay: some VPlan work took priority. All the issues are fixed now, and there is no longer a computeKnownBits call: kindly let me know if you find holes. |
ef82b49
to
047330f
Compare
The ConstantRange of the LHS of the ICmp in the Select(ICmp()) significant-bit check is not checked in the big-endian case, leading to a potential miscompile. Fix this. Co-authored-by: Piotr Fusik <p.fusik@samsung.com>
Co-authored-by: Piotr Fusik <p.fusik@samsung.com>
047330f
to
a695878
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. This PR came a long way as I wanted to avoid misdetection of some other code as CRC. Thanks for your patience addressing all my comments.
The ValueEvolution logic is deeply flawed, and checking that zero-bits are shifted can be exploited for miscompiles. In an effort to redo HashRecognize with a pattern-matching based approach, extract and fix the core logic of ValueEvolution, and strip it completely, showing that none of the tests rely on the KnownBits computation of ValueEvolution.