[SCEV] Fix NSW flag propagation in getGEPExpr #155269
SCEV was losing NSW flags during AddRec operations, which caused Dependence Analysis (DA) to add unnecessary runtime assumptions for inbounds GEPs.

This patch fixes getGEPExpr so that it inherits flags from the index expressions when the GEP has no explicit flags, which lets NSW flags on AddRec indices propagate to the final GEP result.

This eliminates spurious runtime assumptions in DA for expressions like {0,+,(4 * %N * %M)} derived from inbounds GEPs, so dependence analysis can proceed without conservative runtime checks.
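The inheritance rule described above can be sketched outside of LLVM as a small standalone model. This is only an illustration under stated assumptions: `WrapFlags`, `IndexExpr`, and `inheritOffsetWrap` are hypothetical stand-ins, not the real `SCEV::NoWrapFlags` or `SCEVAddRecExpr` types, and `GEPFlags` stands for the offset-wrap flags already derived from the GEP itself.

```cpp
#include <vector>

// Toy stand-ins for SCEV's no-wrap flags; the bit values are illustrative.
enum WrapFlags : unsigned {
  FlagAnyWrap = 0,
  FlagNUW = 1u << 0,
  FlagNSW = 1u << 1,
};

// Hypothetical simplified index expression. Only AddRecs are trusted to
// carry usable flags, matching the conservative branch in the patch.
struct IndexExpr {
  bool IsAddRec;
  unsigned Flags;
};

// Inherit a flag only when the GEP contributes no flags of its own and
// every index is an AddRec carrying that flag.
unsigned inheritOffsetWrap(unsigned GEPFlags,
                           const std::vector<IndexExpr> &Indices) {
  if (GEPFlags != FlagAnyWrap)
    return GEPFlags; // Flags from the GEP itself win; nothing is inherited.

  // Both start out true, so an empty index list inherits both flags,
  // exactly as the loop in the patch would.
  bool AllNSW = true, AllNUW = true;
  for (const IndexExpr &I : Indices) {
    if (!I.IsAddRec) {
      AllNSW = AllNUW = false; // Be conservative for non-AddRec indices.
      break;
    }
    AllNSW = AllNSW && (I.Flags & FlagNSW) != 0;
    AllNUW = AllNUW && (I.Flags & FlagNUW) != 0;
  }

  unsigned Result = FlagAnyWrap;
  if (AllNSW)
    Result |= FlagNSW;
  if (AllNUW)
    Result |= FlagNUW;
  return Result;
}
```

In this model, a single index without a given flag, or any non-AddRec index, is enough to block inheritance of that flag.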
@llvm/pr-subscribers-llvm-transforms @llvm/pr-subscribers-llvm-analysis
Author: Sebastian Pop (sebpop)
Patch is 22.04 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/155269.diff
9 Files Affected:
diff --git a/llvm/lib/Analysis/ScalarEvolution.cpp b/llvm/lib/Analysis/ScalarEvolution.cpp
index d2c445f1ffaa0..d285afb308ea5 100644
--- a/llvm/lib/Analysis/ScalarEvolution.cpp
+++ b/llvm/lib/Analysis/ScalarEvolution.cpp
@@ -3760,6 +3760,31 @@ ScalarEvolution::getGEPExpr(GEPOperator *GEP,
   if (NW.hasNoUnsignedWrap())
     OffsetWrap = setFlags(OffsetWrap, SCEV::FlagNUW);
+  // Inherit flags from index expressions when GEP has no explicit flags.
+  if (OffsetWrap == SCEV::FlagAnyWrap) {
+    // Check if all index expressions have compatible no-wrap flags
+    bool AllHaveNSW = true, AllHaveNUW = true;
+    for (const SCEV *IndexExpr : IndexExprs) {
+      if (auto *AR = dyn_cast<SCEVAddRecExpr>(IndexExpr)) {
+        if (!AR->hasNoSignedWrap())
+          AllHaveNSW = false;
+        if (!AR->hasNoUnsignedWrap())
+          AllHaveNUW = false;
+      } else {
+        // Be conservative for non-AddRec expressions.
+        AllHaveNSW = false;
+        AllHaveNUW = false;
+        break;
+      }
+    }
+    // Inherit NSW if all have NSW.
+    if (AllHaveNSW)
+      OffsetWrap = setFlags(OffsetWrap, SCEV::FlagNSW);
+    // Inherit NUW if all have NUW.
+    if (AllHaveNUW)
+      OffsetWrap = setFlags(OffsetWrap, SCEV::FlagNUW);
+  }
+
   Type *CurTy = GEP->getType();
   bool FirstIter = true;
   SmallVector<const SCEV *, 4> Offsets;
diff --git a/llvm/test/Analysis/Delinearization/fixed_size_array.ll b/llvm/test/Analysis/Delinearization/fixed_size_array.ll
index 634850bb4a5a2..b85b04e813d77 100644
--- a/llvm/test/Analysis/Delinearization/fixed_size_array.ll
+++ b/llvm/test/Analysis/Delinearization/fixed_size_array.ll
@@ -12,7 +12,7 @@ define void @a_i_j_k(ptr %a) {
; CHECK-LABEL: 'a_i_j_k'
; CHECK-NEXT: Inst: store i32 1, ptr %idx, align 4
; CHECK-NEXT: In Loop with Header: for.k
-; CHECK-NEXT: AccessFunction: {{\{\{\{}}0,+,1024}<nuw><nsw><%for.i.header>,+,128}<nw><%for.j.header>,+,4}<nw><%for.k>
+; CHECK-NEXT: AccessFunction: {{\{\{\{}}0,+,1024}<nuw><nsw><%for.i.header>,+,128}<nuw><nsw><%for.j.header>,+,4}<nuw><nsw><%for.k>
; CHECK-NEXT: Base offset: %a
; CHECK-NEXT: ArrayDecl[UnknownSize][8][32] with elements of 4 bytes.
; CHECK-NEXT: ArrayRef[{0,+,1}<nuw><nsw><%for.i.header>][{0,+,1}<nuw><nsw><%for.j.header>][{0,+,1}<nuw><nsw><%for.k>]
@@ -61,7 +61,7 @@ define void @a_i_nj_k(ptr %a) {
; CHECK-LABEL: 'a_i_nj_k'
; CHECK-NEXT: Inst: store i32 1, ptr %idx, align 4
; CHECK-NEXT: In Loop with Header: for.k
-; CHECK-NEXT: AccessFunction: {{\{\{\{}}896,+,1024}<nuw><nsw><%for.i.header>,+,-128}<nw><%for.j.header>,+,4}<nw><%for.k>
+; CHECK-NEXT: AccessFunction: {{\{\{\{}}896,+,1024}<nuw><nsw><%for.i.header>,+,-128}<nsw><%for.j.header>,+,4}<nuw><nsw><%for.k>
; CHECK-NEXT: Base offset: %a
; CHECK-NEXT: ArrayDecl[UnknownSize][8][32] with elements of 4 bytes.
; CHECK-NEXT: ArrayRef[{0,+,1}<nuw><nsw><%for.i.header>][{7,+,-1}<nsw><%for.j.header>][{0,+,1}<nuw><nsw><%for.k>]
@@ -117,14 +117,14 @@ define void @a_ijk_b_i2jk(ptr %a, ptr %b) {
; CHECK-LABEL: 'a_ijk_b_i2jk'
; CHECK-NEXT: Inst: store i32 1, ptr %a.idx, align 4
; CHECK-NEXT: In Loop with Header: for.k
-; CHECK-NEXT: AccessFunction: {{\{\{\{}}0,+,1024}<nuw><nsw><%for.i.header>,+,256}<nw><%for.j.header>,+,4}<nw><%for.k>
+; CHECK-NEXT: AccessFunction: {{\{\{\{}}0,+,1024}<nuw><nsw><%for.i.header>,+,256}<nuw><nsw><%for.j.header>,+,4}<nuw><nsw><%for.k>
; CHECK-NEXT: Base offset: %a
; CHECK-NEXT: ArrayDecl[UnknownSize][4][64] with elements of 4 bytes.
; CHECK-NEXT: ArrayRef[{0,+,1}<nuw><nsw><%for.i.header>][{0,+,1}<nuw><nsw><%for.j.header>][{0,+,1}<nuw><nsw><%for.k>]
; CHECK-EMPTY:
; CHECK-NEXT: Inst: store i32 1, ptr %b.idx, align 4
; CHECK-NEXT: In Loop with Header: for.k
-; CHECK-NEXT: AccessFunction: {{\{\{\{}}0,+,1024}<nuw><nsw><%for.i.header>,+,256}<nw><%for.j.header>,+,4}<nw><%for.k>
+; CHECK-NEXT: AccessFunction: {{\{\{\{}}0,+,1024}<nuw><nsw><%for.i.header>,+,256}<nuw><nsw><%for.j.header>,+,4}<nuw><nsw><%for.k>
; CHECK-NEXT: Base offset: %b
; CHECK-NEXT: ArrayDecl[UnknownSize][4][64] with elements of 4 bytes.
; CHECK-NEXT: ArrayRef[{0,+,1}<nuw><nsw><%for.i.header>][{0,+,1}<nuw><nsw><%for.j.header>][{0,+,1}<nuw><nsw><%for.k>]
@@ -181,10 +181,10 @@ define void @a_i_2j1_k(ptr %a) {
; CHECK-LABEL: 'a_i_2j1_k'
; CHECK-NEXT: Inst: store i32 1, ptr %idx, align 4
; CHECK-NEXT: In Loop with Header: for.k
-; CHECK-NEXT: AccessFunction: {{\{\{\{}}128,+,1024}<nuw><nsw><%for.i.header>,+,256}<nw><%for.j.header>,+,4}<nw><%for.k>
+; CHECK-NEXT: AccessFunction: {{\{\{\{}}128,+,1024}<nuw><nsw><%for.i.header>,+,256}<nuw><nsw><%for.j.header>,+,4}<nuw><nsw><%for.k>
; CHECK-NEXT: Base offset: %a
; CHECK-NEXT: ArrayDecl[UnknownSize][4][64] with elements of 4 bytes.
-; CHECK-NEXT: ArrayRef[{0,+,1}<nuw><nsw><%for.i.header>][{0,+,1}<nuw><%for.j.header>][{32,+,1}<nw><%for.k>]
+; CHECK-NEXT: ArrayRef[{0,+,1}<nuw><nsw><%for.i.header>][{0,+,1}<nuw><nsw><%for.j.header>][{32,+,1}<nuw><nsw><%for.k>]
;
entry:
br label %for.i.header
@@ -235,7 +235,7 @@ define void @a_i_3j_k(ptr %a) {
; CHECK-LABEL: 'a_i_3j_k'
; CHECK-NEXT: Inst: store i32 1, ptr %idx, align 4
; CHECK-NEXT: In Loop with Header: for.k
-; CHECK-NEXT: AccessFunction: {{\{\{\{}}0,+,1024}<nuw><nsw><%for.i.header>,+,384}<nw><%for.j.header>,+,4}<nw><%for.k>
+; CHECK-NEXT: AccessFunction: {{\{\{\{}}0,+,1024}<nuw><nsw><%for.i.header>,+,384}<nuw><nsw><%for.j.header>,+,4}<nuw><nsw><%for.k>
; CHECK-NEXT: failed to delinearize
;
entry:
@@ -287,7 +287,7 @@ define void @a_i_j_3k(ptr %a) {
; CHECK-LABEL: 'a_i_j_3k'
; CHECK-NEXT: Inst: store i32 1, ptr %idx, align 4
; CHECK-NEXT: In Loop with Header: for.k
-; CHECK-NEXT: AccessFunction: {{\{\{\{}}0,+,1024}<nuw><nsw><%for.i.header>,+,128}<nw><%for.j.header>,+,12}<nw><%for.k>
+; CHECK-NEXT: AccessFunction: {{\{\{\{}}0,+,1024}<nuw><nsw><%for.i.header>,+,128}<nuw><nsw><%for.j.header>,+,12}<nuw><nsw><%for.k>
; CHECK-NEXT: Base offset: %a
; CHECK-NEXT: ArrayDecl[UnknownSize][8][32] with elements of 4 bytes.
; CHECK-NEXT: ArrayRef[{0,+,1}<nuw><nsw><%for.i.header>][{0,+,1}<nuw><nsw><%for.j.header>][{0,+,3}<nuw><nsw><%for.k>]
@@ -503,7 +503,7 @@ define void @non_divisible_by_element_size(ptr %a) {
; CHECK-LABEL: 'non_divisible_by_element_size'
; CHECK-NEXT: Inst: store i32 1, ptr %idx, align 4
; CHECK-NEXT: In Loop with Header: for.k
-; CHECK-NEXT: AccessFunction: {{\{\{\{}}0,+,256}<nuw><nsw><%for.i.header>,+,32}<nw><%for.j.header>,+,1}<nw><%for.k>
+; CHECK-NEXT: AccessFunction: {{\{\{\{}}0,+,256}<nuw><nsw><%for.i.header>,+,32}<nuw><nsw><%for.j.header>,+,1}<nuw><nsw><%for.k>
; CHECK-NEXT: failed to delinearize
;
entry:
diff --git a/llvm/test/Analysis/DependenceAnalysis/scev-nsw-flags-enable-analysis.ll b/llvm/test/Analysis/DependenceAnalysis/scev-nsw-flags-enable-analysis.ll
new file mode 100644
index 0000000000000..89b837e51fa4f
--- /dev/null
+++ b/llvm/test/Analysis/DependenceAnalysis/scev-nsw-flags-enable-analysis.ll
@@ -0,0 +1,44 @@
+; NOTE: Assertions have been autogenerated by utils/update_analyze_test_checks.py UTC_ARGS: --version 5
+; RUN: opt < %s -disable-output "-passes=print<da>" -aa-pipeline=basic-aa 2>&1 | FileCheck %s
+
+; Test that SCEV NSW flag preservation enables dependence analysis to work
+; correctly. Previously, SCEV would lose NSW flags when combining AddRec
+; expressions from GEP operations, causing dependence analysis to incorrectly
+; classify expressions as "wrapping" and fail analysis.
+
+; Test showing different GEPs with same pattern work correctly
+define void @test_da_different_geps(ptr %A) {
+; CHECK-LABEL: 'test_da_different_geps'
+; CHECK-NEXT: Src: store i32 %conv, ptr %gep1, align 4 --> Dst: store i32 %conv, ptr %gep1, align 4
+; CHECK-NEXT: da analyze - none!
+; CHECK-NEXT: Src: store i32 %conv, ptr %gep1, align 4 --> Dst: %val = load i32, ptr %gep2, align 4
+; CHECK-NEXT: da analyze - flow [*|<]!
+; CHECK-NEXT: Src: %val = load i32, ptr %gep2, align 4 --> Dst: %val = load i32, ptr %gep2, align 4
+; CHECK-NEXT: da analyze - none!
+;
+
+entry:
+  br label %loop
+
+loop:
+  %i = phi i64 [ 0, %entry ], [ %i.next, %loop ]
+
+  ; NSW-flagged arithmetic
+  %mul = mul nsw i64 %i, 3
+  %sub = add nsw i64 %mul, -6
+
+  ; Two different access patterns that DA can now analyze correctly
+  %gep1 = getelementptr inbounds [100 x i32], ptr %A, i64 %sub, i64 %sub
+  %gep2 = getelementptr inbounds [100 x i32], ptr %A, i64 %i, i64 %i
+
+  %conv = trunc i64 %i to i32
+  store i32 %conv, ptr %gep1
+  %val = load i32, ptr %gep2
+
+  %i.next = add nsw i64 %i, 1
+  %cond = icmp ult i64 %i.next, 50
+  br i1 %cond, label %loop, label %exit
+
+exit:
+  ret void
+}
diff --git a/llvm/test/Analysis/LoopAccessAnalysis/retry-runtime-checks-after-dependence-analysis-forked-pointers.ll b/llvm/test/Analysis/LoopAccessAnalysis/retry-runtime-checks-after-dependence-analysis-forked-pointers.ll
index d1d1ecb2af888..ef5af4881b992 100644
--- a/llvm/test/Analysis/LoopAccessAnalysis/retry-runtime-checks-after-dependence-analysis-forked-pointers.ll
+++ b/llvm/test/Analysis/LoopAccessAnalysis/retry-runtime-checks-after-dependence-analysis-forked-pointers.ll
@@ -122,10 +122,10 @@ define void @dependency_check_and_runtime_checks_needed_select_of_ptr_add_recs(p
; CHECK-NEXT: Member: {%a,+,4}<nuw><%loop>
; CHECK-NEXT: Group GRP1:
; CHECK-NEXT: (Low: %b High: ((4 * %n) + %b))
-; CHECK-NEXT: Member: {%b,+,4}<%loop>
+; CHECK-NEXT: Member: {%b,+,4}<nw><%loop>
; CHECK-NEXT: Group GRP2:
; CHECK-NEXT: (Low: %c High: ((4 * %n) + %c))
-; CHECK-NEXT: Member: {%c,+,4}<%loop>
+; CHECK-NEXT: Member: {%c,+,4}<nw><%loop>
; CHECK-NEXT: Group GRP3:
; CHECK-NEXT: (Low: ((4 * %offset) + %a) High: ((4 * %offset) + (4 * %n) + %a))
; CHECK-NEXT: Member: {((4 * %offset) + %a),+,4}<%loop>
diff --git a/llvm/test/Analysis/ScalarEvolution/flags-from-poison.ll b/llvm/test/Analysis/ScalarEvolution/flags-from-poison.ll
index 593888f5f7bd5..1ccd2613eaeac 100644
--- a/llvm/test/Analysis/ScalarEvolution/flags-from-poison.ll
+++ b/llvm/test/Analysis/ScalarEvolution/flags-from-poison.ll
@@ -102,7 +102,7 @@ define void @test-add-scope-invariant(ptr %input, i32 %needle) {
; CHECK-NEXT: %of_interest = add nuw nsw i32 %i.next, %offset
; CHECK-NEXT: --> {(1 + %offset)<nuw><nsw>,+,1}<nuw><%loop> U: [1,0) S: [1,0) Exits: %needle LoopDispositions: { %loop: Computable }
; CHECK-NEXT: %gep2 = getelementptr i32, ptr %input, i32 %of_interest
-; CHECK-NEXT: --> ((4 * (sext i32 {(1 + %offset)<nuw><nsw>,+,1}<nuw><%loop> to i64))<nsw> + %input) U: full-set S: full-set Exits: ((4 * (sext i32 %needle to i64))<nsw> + %input) LoopDispositions: { %loop: Computable }
+; CHECK-NEXT: --> ((4 * (sext i32 {(1 + %offset)<nuw><nsw>,+,1}<nuw><%loop> to i64))<nuw><nsw> + %input) U: full-set S: full-set Exits: ((4 * (sext i32 %needle to i64))<nuw><nsw> + %input) LoopDispositions: { %loop: Computable }
; CHECK-NEXT: Determining loop execution counts for: @test-add-scope-invariant
; CHECK-NEXT: Loop %loop: backedge-taken count is (-1 + (-1 * %offset) + %needle)
; CHECK-NEXT: Loop %loop: constant max backedge-taken count is i32 -1
@@ -133,7 +133,7 @@ define void @test-add-scope-bound(ptr %input, i32 %needle) {
; CHECK-NEXT: %i = phi i32 [ %i.next, %loop ], [ 0, %entry ]
; CHECK-NEXT: --> {0,+,1}<nuw><%loop> U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %loop: Computable }
; CHECK-NEXT: %gep = getelementptr i32, ptr %input, i32 %i
-; CHECK-NEXT: --> ((4 * (sext i32 {0,+,1}<nuw><%loop> to i64))<nsw> + %input) U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %loop: Computable }
+; CHECK-NEXT: --> ((4 * (sext i32 {0,+,1}<nuw><%loop> to i64))<nuw><nsw> + %input) U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %loop: Computable }
; CHECK-NEXT: %offset = load i32, ptr %gep, align 4
; CHECK-NEXT: --> %offset U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %loop: Variant }
; CHECK-NEXT: %i.next = add nuw i32 %i, 1
@@ -174,7 +174,7 @@ define void @test-add-scope-bound-unkn-preheader(ptr %input, i32 %needle) {
; CHECK-NEXT: %i.next = add nuw i32 %i, %offset
; CHECK-NEXT: --> {%offset,+,%offset}<nuw><%loop> U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %loop: Computable }
; CHECK-NEXT: %gep2 = getelementptr i32, ptr %input, i32 %i.next
-; CHECK-NEXT: --> ((4 * (sext i32 {%offset,+,%offset}<nuw><%loop> to i64))<nsw> + %input) U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %loop: Computable }
+; CHECK-NEXT: --> ((4 * (sext i32 {%offset,+,%offset}<nuw><%loop> to i64))<nuw><nsw> + %input) U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %loop: Computable }
; CHECK-NEXT: Determining loop execution counts for: @test-add-scope-bound-unkn-preheader
; CHECK-NEXT: Loop %loop: Unpredictable backedge-taken count.
; CHECK-NEXT: Loop %loop: Unpredictable constant max backedge-taken count.
@@ -205,7 +205,7 @@ define void @test-add-scope-bound-unkn-preheader-neg1(ptr %input, i32 %needle) {
; CHECK-NEXT: %i.next = add nuw i32 %i, %offset
; CHECK-NEXT: --> {%offset,+,%offset}<nuw><%loop> U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %loop: Computable }
; CHECK-NEXT: %gep2 = getelementptr i32, ptr %input, i32 %i.next
-; CHECK-NEXT: --> ((4 * (sext i32 {%offset,+,%offset}<nuw><%loop> to i64))<nsw> + %input) U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %loop: Computable }
+; CHECK-NEXT: --> ((4 * (sext i32 {%offset,+,%offset}<nuw><%loop> to i64))<nuw><nsw> + %input) U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %loop: Computable }
; CHECK-NEXT: Determining loop execution counts for: @test-add-scope-bound-unkn-preheader-neg1
; CHECK-NEXT: Loop %loop: Unpredictable backedge-taken count.
; CHECK-NEXT: Loop %loop: Unpredictable constant max backedge-taken count.
diff --git a/llvm/test/Analysis/ScalarEvolution/nsw.ll b/llvm/test/Analysis/ScalarEvolution/nsw.ll
index 4d668d1ffef11..1480ea223e34e 100644
--- a/llvm/test/Analysis/ScalarEvolution/nsw.ll
+++ b/llvm/test/Analysis/ScalarEvolution/nsw.ll
@@ -13,19 +13,19 @@ define void @test1(ptr %p) nounwind {
; CHECK-NEXT: %tmp2 = sext i32 %i.01 to i64
; CHECK-NEXT: --> {0,+,1}<nuw><nsw><%bb> U: [0,-9223372036854775808) S: [0,-9223372036854775808) Exits: <<Unknown>> LoopDispositions: { %bb: Computable }
; CHECK-NEXT: %tmp3 = getelementptr double, ptr %p, i64 %tmp2
-; CHECK-NEXT: --> {%p,+,8}<%bb> U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %bb: Computable }
+; CHECK-NEXT: --> {%p,+,8}<nw><%bb> U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %bb: Computable }
; CHECK-NEXT: %tmp6 = sext i32 %i.01 to i64
; CHECK-NEXT: --> {0,+,1}<nuw><nsw><%bb> U: [0,-9223372036854775808) S: [0,-9223372036854775808) Exits: <<Unknown>> LoopDispositions: { %bb: Computable }
; CHECK-NEXT: %tmp7 = getelementptr double, ptr %p, i64 %tmp6
-; CHECK-NEXT: --> {%p,+,8}<%bb> U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %bb: Computable }
+; CHECK-NEXT: --> {%p,+,8}<nw><%bb> U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %bb: Computable }
; CHECK-NEXT: %tmp8 = add nsw i32 %i.01, 1
; CHECK-NEXT: --> {1,+,1}<nuw><nsw><%bb> U: [1,-2147483648) S: [1,-2147483648) Exits: <<Unknown>> LoopDispositions: { %bb: Computable }
; CHECK-NEXT: %p.gep = getelementptr double, ptr %p, i32 %tmp8
-; CHECK-NEXT: --> {(8 + %p),+,8}<%bb> U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %bb: Computable }
+; CHECK-NEXT: --> {(8 + %p),+,8}<nw><%bb> U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %bb: Computable }
; CHECK-NEXT: %phitmp = sext i32 %tmp8 to i64
; CHECK-NEXT: --> {1,+,1}<nuw><nsw><%bb> U: [1,-9223372036854775808) S: [1,-9223372036854775808) Exits: <<Unknown>> LoopDispositions: { %bb: Computable }
; CHECK-NEXT: %tmp9 = getelementptr inbounds double, ptr %p, i64 %phitmp
-; CHECK-NEXT: --> {(8 + %p),+,8}<%bb> U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %bb: Computable }
+; CHECK-NEXT: --> {(8 + %p),+,8}<nw><%bb> U: full-set S: full-set Exits: <<Unknown>> LoopDispositions: { %bb: Computable }
; CHECK-NEXT: Determining loop execution counts for: @test1
; CHECK-NEXT: Loop %bb: Unpredictable backedge-taken count.
; CHECK-NEXT: Loop %bb: Unpredictable constant max backedge-taken count.
diff --git a/llvm/test/Analysis/ScalarEvolution/ptrtoint.ll b/llvm/test/Analysis/ScalarEvolution/ptrtoint.ll
index e784d25385980..d638b6f890923 100644
--- a/llvm/test/Analysis/ScalarEvolution/ptrtoint.ll
+++ b/llvm/test/Analysis/ScalarEvolution/ptrtoint.ll
@@ -222,9 +222,9 @@ define void @ptrtoint_of_addrec(ptr %in, i32 %count) {
; X64-NEXT: %i6 = phi i64 [ 0, %entry ], [ %i9, %loop ]
; X64-NEXT: --> {0,+,1}<nuw><nsw><%loop> U: [0,-9223372036854775808) S: [0,-9223372036854775808) Exits: (-1 + (zext i32 %count to i64))<nsw> LoopDispositions: { %loop: Computable }
; X64-NEXT: %i7 = getelementptr inbounds i32, ptr %in, i64 %i6
-; X64-NEXT: --> {%in,+,4}<%loop> U: full-set S: full-set Exits: (-4 + (4 * (zext i32 %count to i64))<nuw><nsw> + %in) LoopDispositions: { %loop: Computable }
+; X64-NEXT: --> {%in,+,4}<nw><%loop> U: full-set S: full-set Exits: (-4 + (4 * (zext i32 %count to i64))<nuw><nsw> + %in) LoopDispositions: { %loop: Computable }
; X64-NEXT: %i8 = ptrtoint ptr %i7 to i64
-; X64-NEXT: --> {(ptrtoint ptr %in to i64),+,4}<%loop> U: full-set S: full-set Exits: (-4 + (4 * (zext i32 %count to i64))<nuw><nsw> + (ptrtoint ptr %in to i64)) LoopDispositions: { %loop: Computable }
+; X64-NEXT: --> {(ptrtoint ptr %in to i64),+,4}<nw><%loop> U: full-set S: full-set Exits: (-4 + (4 * (zext i32 %count to i64))<nuw><nsw> + (ptrtoint ptr %in to i64)) LoopDispositions: { %loop: Computable }
; X64-NEXT: %i9 = add nuw nsw i64 %i6, 1
; X64-NEXT: --> {1,+,1}<nuw><%loop> U: [1,0) S: [1,0) Exits: (zext i32 %count to i64) LoopDispositions: { %loop: Computable }
; X64-NEXT: Determining loop execution counts for: @ptrtoint_of_addrec
diff --git a/llvm/test/Analysis/ScalarEvolution/trip-count-scalable-stride.ll b/llvm/test/Analysis/ScalarEvolution/trip-count-scalable-stride.ll
index 30a095fd144fa..6d4d6b1293d15 100644
--- a/llvm/test/Analysis/ScalarEvolution/trip-count-scalable-stride.ll
+++ b/llvm/test/Analysis/ScalarEvolution/trip-count-scalable-stride.ll
@@ -237,7 +237,7 @@ define void @vscale_slt_with_vp_plain(ptr nocapture %A, i32 %n) mustprogress vsc
; CHECK-NEXT: %i.05 = phi i32 [ %add, %for.body ], [ 0, %entry ]
; CHECK-NEXT: --> {0,+,(4 * vscale)<nuw><nsw>}<nuw><nsw><%for.body> U: [0,-2147483648) S: [0,2147483645) Exits: (4 * vscale * ((-1 + %n) /u (4 * vscale)<nuw><nsw>)) LoopDispositions: { %for.body: Computable }
; CHECK-NEXT: %arrayidx = getelementptr inbounds i32, ptr %A, i32 %i.05
-; CHECK-NEXT: --> {%A,+,(16 * vscale)<nuw><nsw>}<%for.body> U: full-set S: full-set Exits: ((16 * vscale * ((-1 + %n) /u (4 * vscale)<nuw><nsw>)) + %A) LoopDispositions: { %for.body: Computable }
+; CHECK-NEXT: --> {%A,+,(16 * vscale)<nuw><nsw>}<nw><%for.body> U: full-set S: full-set Exits: ((16 * vscale * ((-1 + %n) /u (4 * vscale)<nuw><nsw>)) + %A) LoopDispositions: { %for.body: Computable }
; CHECK-NEXT: %add = add nsw i32 %i.05, %VF
; CHECK-NEXT: --> {(4 * vscale)<nuw><nsw>,+,(4 * vscale)<nuw><nsw>}<nuw><nsw><%for.body> U: [8,-2147483648) S: [8,2147483645) Exits: (vscale * (4 + (4 * ((-1 + %n) /u (4 * vscale)<nuw><nsw>))<nuw><nsw>)<nuw>) LoopDispositions: { %for.body: Computable }
; CHECK-NEXT: Determining loop execution counts for: @vscale_slt_with_vp_plain
@@ -278,7 +278,7 @@ define void @vscale_slt_with_vp_umin(ptr nocapture %A, i32 %n) mustprogress vsca
; CHECK-NEXT: %i.05 = phi i32 [ %add, %for.body ], [ 0, %entry ]
; CHECK-NEXT: --> {0,+,(4 * vscale)<nuw><nsw>}<nuw><nsw><%for.body> U: [0,-2147483648) S: [0,2147483645) Exits: (4 * vscale * ((-1 + %n) /u (4 * vscale)<nuw><nsw>)) LoopDispositions: { %for.body: Computable }
; CHECK-NEXT: %arrayidx = getelementptr inbounds i32, ptr %A, i32 %i.05
-; CHECK-NEXT: --> {%A,+,(16 * vscale)<nuw><nsw>}<%for.body> U: full-set S: full-set Exits: ((16 * vscale * ((-1 + %n) /u (4 * vscale)<nuw><nsw>)) + %A) LoopDispositions: { %for.body: Comput...
[truncated]
Cross-posting previous review: #155145 (comment)
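For readers less familiar with the notation, an AddRec such as {0,+,(4 * %N * %M)} denotes a value that starts at 0 and grows by 4 * %N * %M on every loop iteration, and an NSW flag on it asserts that this never overflows as a signed integer. The following standalone sketch evaluates that recurrence with explicit signed-overflow checks. It uses plain C++ rather than LLVM APIs; `checkedMul` and `addRecValue` are hypothetical helper names, and a 64-bit index width is assumed.

```cpp
#include <cstdint>
#include <limits>

// Signed 64-bit multiply. Returns false instead of wrapping on overflow.
bool checkedMul(int64_t A, int64_t B, int64_t &Out) {
  const int64_t Max = std::numeric_limits<int64_t>::max();
  const int64_t Min = std::numeric_limits<int64_t>::min();
  if (A != 0 && B != 0) {
    bool Overflow;
    if (A > 0)
      Overflow = B > 0 ? A > Max / B : B < Min / A;
    else
      Overflow = B > 0 ? A < Min / B : A < Max / B;
    if (Overflow)
      return false;
  }
  Out = A * B;
  return true;
}

// Value of {0,+,4*N*M} at iteration Iter. Returns false if any
// intermediate signed product overflows, i.e. the no-signed-wrap
// assumption would not hold for these inputs.
bool addRecValue(int64_t N, int64_t M, int64_t Iter, int64_t &Out) {
  int64_t Step;
  return checkedMul(4, N, Step) && checkedMul(Step, M, Step) &&
         checkedMul(Step, Iter, Out);
}
```

With N = 10, M = 20 and Iter = 3 the helper produces 2400, while N = M = 2^31 is rejected because the step 4 * N * M no longer fits in a signed 64-bit integer.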