buggfg commented Jul 23, 2025

As I mentioned in the discussion, this patch specifically addresses the masking issue found in common loop structures.

This patch resolves the masking issue by adding the nsw/nuw flags to the trunc instruction, which allows InstCombinePass to subsequently remove the trunc.
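
To make the masking issue concrete: comparing a truncated 64-bit value against a 32-bit value is equivalent to masking off the upper 32 bits (an `and` with 4294967295) before comparing in the wide type. The following is a minimal standalone C++ sketch, not LLVM code, and the helper names are made up for illustration. It shows that the mask is redundant only when the wide value is known to fit in 32 bits, which is the fact a `nuw` flag on the trunc would assert.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; none of these are LLVM APIs.

// Compare after truncating to 32 bits (models comparing `trunc i64 x to i32` with c).
constexpr bool eqAfterTrunc(std::uint64_t x, std::uint32_t c) {
    return static_cast<std::uint32_t>(x) == c;
}

// Same comparison done in the wide type with an explicit mask
// (models comparing `and i64 x, 4294967295` with c zero-extended).
constexpr bool eqWithMask(std::uint64_t x, std::uint32_t c) {
    return (x & 0xFFFFFFFFull) == static_cast<std::uint64_t>(c);
}

// Comparison in the wide type without a mask. This only agrees with the
// two functions above when x is known to fit in 32 bits.
constexpr bool eqWide(std::uint64_t x, std::uint32_t c) {
    return x == static_cast<std::uint64_t>(c);
}

// Value fits in 32 bits: all three forms agree.
static_assert(eqAfterTrunc(99, 99) && eqWithMask(99, 99) && eqWide(99, 99),
              "all forms agree when the upper bits are zero");

// Upper bits set: the truncating forms still match, the unmasked one does not.
static_assert(eqAfterTrunc((1ull << 32) + 99, 99), "trunc drops the upper bits");
static_assert(eqWithMask((1ull << 32) + 99, 99), "mask drops the upper bits");
static_assert(!eqWide((1ull << 32) + 99, 99), "wide compare sees the upper bits");
```

Without a guarantee about the upper bits, an optimizer has to keep either the trunc or the mask.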

With this patch, the following common loop is unrolled by a factor of 8, resulting in a 2x performance improvement (without vectorization):

void func(int result[], int start) {
  for (int i = start; i < 100; i++)
    result[i] += 1;
}

Additionally, we have validated the functional correctness and effectiveness of this patch through testing on the SPEC CPU2006 and SPEC CPU2017 benchmarks.

Thank you for considering this change!

Comment on lines 1065 to 1066
TruncInst->setHasNoSignedWrap();
TruncInst->setHasNoUnsignedWrap();
Contributor

I think NUW and NSW imply NW, not the other way around, so I don't think we can apply NUW/NSW on trunc here
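
As a rough illustration of why a no-self-wrap (NW) recurrence need not be NUW or NSW, here is a small standalone C++ model of an 8-bit counter with step 1. It is a simplification and not how SCEV computes its flags: the struct and function names are invented, and "self-wrap" is modeled simply as the counter coming back around to its start value.

```cpp
#include <cstdint>

// Simplified model for illustration only (not LLVM's ScalarEvolution).
struct WrapSummary {
    bool unsignedWrap; // stepped from 255 to 0
    bool signedWrap;   // stepped from 127 to 128 (127 to -128 when read as signed)
    bool selfWrap;     // came back around to the start value
};

// Step an 8-bit counter from `start` by +1 for `steps` iterations and record
// which kinds of wrap-around were observed along the way.
constexpr WrapSummary simulateI8Counter(std::uint8_t start, unsigned steps) {
    WrapSummary s{false, false, false};
    std::uint8_t v = start;
    for (unsigned i = 0; i < steps; ++i) {
        const std::uint8_t next = static_cast<std::uint8_t>(v + 1);
        if (v == 0xFF) s.unsignedWrap = true;
        if (v == 0x7F) s.signedWrap = true;
        if (next == start) s.selfWrap = true;
        v = next;
    }
    return s;
}
```

In this model, `simulateI8Counter(100, 200)` reports both a signed and an unsigned wrap but no self-wrap, so knowing only NW says nothing about NUW or NSW.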

Author

Hi, I understand your point. However, TruncIV only has the nw flag, and TruncAR->hasNoUnsignedWrap() and TruncAR->hasNoSignedWrap() both return false, so with the following the masking issue remains unresolved:

CmpIndVar = Builder.CreateTrunc(
    CmpIndVar, ExitCnt->getType(), "lftr.wideiv",
    TruncAR->hasNoUnsignedWrap(), TruncAR->hasNoSignedWrap());

Given that TruncIV is marked as nw and truncation does not change the sign, can we assume that the nw flag of TruncIV retains the sign of the original IV? Specifically, if the original IV is signed, does TruncIV's nw imply that it behaves like nsw?

Contributor

lukel97 commented Jul 25, 2025

Given that TruncIV is marked as nw and truncation does not change the sign, can we assume that the nw flag of TruncIV retains the sign of the original IV? Specifically, if the original IV is signed, does TruncIV's nw imply that it behaves like nsw?

I think in general truncation can change the sign, e.g. i16 0x8000 truncated to i8 will be 0x00, so the sign can change.
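
The i16 to i8 example can be checked with a few lines of standalone C++. This is only a sketch working on raw bit patterns, and the helper names are invented for illustration.

```cpp
#include <cstdint>

// Illustration only: keep the low 8 bits of a 16-bit pattern,
// which is what `trunc i16 %x to i8` does.
constexpr std::uint8_t truncToI8Bits(std::uint16_t bits) {
    return static_cast<std::uint8_t>(bits & 0xFFu);
}

// Sign bit of a 16-bit and of an 8-bit pattern.
constexpr bool isNegative16(std::uint16_t bits) { return (bits & 0x8000u) != 0; }
constexpr bool isNegative8(std::uint8_t bits) { return (bits & 0x80u) != 0; }

// 0x8000 is negative as an i16, but its low byte 0x00 is non-negative as an i8.
static_assert(isNegative16(0x8000), "0x8000 is negative as an i16");
static_assert(truncToI8Bits(0x8000) == 0x00, "the low byte is zero");
static_assert(!isNegative8(truncToI8Bits(0x8000)), "0x00 is non-negative as an i8");

// The opposite direction also happens: 0x0080 is +128 as an i16 but -128 as an i8.
static_assert(!isNegative16(0x0080) && isNegative8(truncToI8Bits(0x0080)),
              "a positive i16 can become a negative i8");
```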

I tried out just adding the nuw/nsw wrap flags if the condition predicate is unsigned or signed, but it looks like that's unsound:

--- a/llvm/lib/Transforms/Scalar/IndVarSimplify.cpp
+++ b/llvm/lib/Transforms/Scalar/IndVarSimplify.cpp
@@ -1050,8 +1050,10 @@ linearFunctionTestReplace(Loop *L, BasicBlock *ExitingBB,
       bool Discard;
       L->makeLoopInvariant(ExitCnt, Discard);
     } else
-      CmpIndVar = Builder.CreateTrunc(CmpIndVar, ExitCnt->getType(),
-                                      "lftr.wideiv");
+      CmpIndVar =
+          Builder.CreateTrunc(CmpIndVar, ExitCnt->getType(), "lftr.wideiv",
+                              cast<ICmpInst>(BI->getCondition())->isUnsigned(),
+                              cast<ICmpInst>(BI->getCondition())->isSigned());
   }
diff --git a/llvm/test/Transforms/IndVarSimplify/lftr.ll b/llvm/test/Transforms/IndVarSimplify/lftr.ll
index 5ee62ba357ab..3825a49563a6 100644
--- a/llvm/test/Transforms/IndVarSimplify/lftr.ll
+++ b/llvm/test/Transforms/IndVarSimplify/lftr.ll
@@ -415,7 +415,7 @@ define void @wide_trip_count_test1(ptr %autoc,
 ; CHECK-NEXT:    [[ADD3:%.*]] = fadd float [[TEMP2]], [[MUL]]
 ; CHECK-NEXT:    store float [[ADD3]], ptr [[ARRAYIDX2]], align 4
 ; CHECK-NEXT:    [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1
-; CHECK-NEXT:    [[LFTR_WIDEIV:%.*]] = trunc i64 [[INDVARS_IV_NEXT]] to i32
+; CHECK-NEXT:    [[LFTR_WIDEIV:%.*]] = trunc nuw i64 [[INDVARS_IV_NEXT]] to i32
 ; CHECK-NEXT:    [[EXITCOND:%.*]] = icmp ne i32 [[LFTR_WIDEIV]], [[SUB]]
 ; CHECK-NEXT:    br i1 [[EXITCOND]], label [[FOR_BODY]], label [[FOR_END_LOOPEXIT:%.*]]
 ; CHECK:       for.end.loopexit:

Passing the change to wide_trip_count_test1 through alive2 shows that this eventually triggers poison where it didn't before.
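
For reference, the LangRef rules are that `trunc nuw` yields poison if any truncated bit is non-zero, and `trunc nsw` yields poison if any truncated bit differs from the top bit of the result. A minimal standalone C++ model of those two rules for i64 to i32 might look like this; the function names are invented here, and the functions only report whether the result would be poison.

```cpp
#include <cstdint>

// Illustrative model only; these helpers are not LLVM APIs.

// `trunc nuw i64 -> i32` is poison when any of the 32 dropped high bits is set,
// i.e. when the value does not fit in an unsigned 32-bit integer.
constexpr bool truncNuwIsPoison(std::uint64_t bits) {
    return (bits >> 32) != 0;
}

// `trunc nsw i64 -> i32` is poison when the dropped high bits are not all copies
// of bit 31, i.e. when the value does not fit in a signed 32-bit integer.
constexpr bool truncNswIsPoison(std::uint64_t bits) {
    // Bits 63..31 (33 bits) must be all zeros or all ones.
    return (bits >> 31) != 0 && (bits >> 31) != 0x1FFFFFFFFull;
}

// A small non-negative value is fine under both flags.
static_assert(!truncNuwIsPoison(99) && !truncNswIsPoison(99), "99 fits both ways");

// -1 (all ones) fits as signed but not as unsigned.
static_assert(truncNuwIsPoison(0xFFFFFFFFFFFFFFFFull), "-1 is poison under nuw");
static_assert(!truncNswIsPoison(0xFFFFFFFFFFFFFFFFull), "-1 is fine under nsw");

// 2^31 fits as unsigned but not as signed.
static_assert(!truncNuwIsPoison(0x80000000ull), "2^31 is fine under nuw");
static_assert(truncNswIsPoison(0x80000000ull), "2^31 is poison under nsw");
```

A flag is only sound if the corresponding function returns false for every value the IV can take at that point.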

I wonder if it's possible to infer the NUW/NSW flags on TruncatedIV where possible? I think part of the NUW/NSW information is being lost when the checks are pulled to outside the loop, e.g. in your C example from the original issue:

; Function Attrs: nounwind vscale_range(2,1024)
define dso_local void @func(ptr noundef captures(none) %result, i32 noundef signext %start) local_unnamed_addr #0 {
entry:
  %cmp3 = icmp slt i32 %start, 100
  br i1 %cmp3, label %for.body.preheader, label %for.cond.cleanup

for.body.preheader:                               ; preds = %entry
  %0 = sext i32 %start to i64
  br label %for.body

for.cond.cleanup.loopexit:                        ; preds = %for.body
  br label %for.cond.cleanup

for.cond.cleanup:                                 ; preds = %for.cond.cleanup.loopexit, %entry
  ret void

for.body:                                         ; preds = %for.body.preheader, %for.body
  %indvars.iv = phi i64 [ %0, %for.body.preheader ], [ %indvars.iv.next, %for.body ]
  %i.04 = phi i32 [ %inc, %for.body ], [ %start, %for.body.preheader ]
  %idxprom = sext i32 %i.04 to i64
  %arrayidx = getelementptr inbounds i32, ptr %result, i64 %indvars.iv
  %1 = load i32, ptr %arrayidx, align 4, !tbaa !6
  %add = add nsw i32 %1, 1
  store i32 %add, ptr %arrayidx, align 4, !tbaa !6
  %indvars.iv.next = add nsw i64 %indvars.iv, 1
  %inc = add nsw i32 %i.04, 1
  %cmp = icmp slt i64 %indvars.iv, 99
  br i1 %cmp, label %for.body, label %for.cond.cleanup.loopexit, !llvm.loop !10
}

I don't think SCEV sees the %cmp3 = icmp slt i32 %start, 100 condition and so it doesn't realise that %indvars.iv.next can't signed wrap in i32?
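
The effect of the guard can be checked by hand. When `%start < 100` holds, `%indvars.iv.next` only takes values from `start + 1` up to 100, all of which fit in a signed i32, whereas a negative `start` gives values whose upper 32 bits are set after sign extension. A small standalone C++ sketch of that reasoning follows; the helper names are hypothetical, and this is not SCEV's actual logic.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration; not LLVM APIs.

// True if every value in [lo, hi] fits in a signed 32-bit integer.
constexpr bool rangeFitsInI32(std::int64_t lo, std::int64_t hi) {
    return lo >= INT32_MIN && hi <= INT32_MAX;
}

// True if every value in [lo, hi] fits in an unsigned 32-bit integer.
constexpr bool rangeFitsInU32(std::int64_t lo, std::int64_t hi) {
    return lo >= 0 && hi <= static_cast<std::int64_t>(UINT32_MAX);
}

// Range of %indvars.iv.next given the guard start < 100: the IV starts at
// sext(start), so the first incremented value is start + 1 and the last is 100.
constexpr bool ivNextFitsSigned(std::int32_t start) {
    return rangeFitsInI32(static_cast<std::int64_t>(start) + 1, 100);
}
constexpr bool ivNextFitsUnsigned(std::int32_t start) {
    return rangeFitsInU32(static_cast<std::int64_t>(start) + 1, 100);
}

// Any start below 100 keeps the incremented IV inside signed i32.
static_assert(ivNextFitsSigned(INT32_MIN) && ivNextFitsSigned(99),
              "signed range holds at both extremes");

// A negative start breaks the unsigned range.
static_assert(!ivNextFitsUnsigned(-5), "negative values do not fit in u32");
static_assert(ivNextFitsUnsigned(0), "non-negative starts fit in u32");
```

In this model a signed no-wrap fact holds for every `start` below 100, while an unsigned one does not hold for negative starts.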

Author

We found that the design of linearFunctionTestReplace() can help determine whether truncation will change the sign:

  1. The induction variable (IV) must be a LoopCounter, so its step is guaranteed to be 1.
  2. The ICmpInst::Predicate can only be eq or ne, which means that ExitCnt must be the final value of IV.

Therefore, when the initial value (start) of the IV fits within ExitCntSize bits, the IV's range [start, end) will not cause signed or unsigned wrap.

For instance, in llvm/test/Transforms/IndVarSimplify/lftr.ll, the initial value start = 68719476736 = 2^36 is within the range of i32, the step is 1, and the final value %sub is also i32. Thus, the IV remains within the range of i32. temp3 = trunc nuw i64 %indvars.iv.next to i32 is valid.

So I propose adding a check on the type width of the initial IV value. If it matches the target type of the truncation, the trunc instruction is annotated with the nsw or nuw flag. Looking forward to your reply :)

Comment on lines 1060 to 1061
if (const SCEVAddRecExpr *TruncAR =
dyn_cast<SCEVAddRecExpr>(TruncatedIV)) {
Contributor

Because it's an induction variable, is TruncatedIV always guaranteed to be a SCEVAddRecExpr? Does it work if we do a cast instead?

Suggested change
-if (const SCEVAddRecExpr *TruncAR =
-        dyn_cast<SCEVAddRecExpr>(TruncatedIV)) {
+auto *TruncAR = cast<SCEVAddRecExpr>(TruncatedIV);

Author

Because it's an induction variable, is TruncatedIV always guaranteed to be a SCEVAddRecExpr? Does it work if we do a cast instead?

For safety, I prefer using dyn_cast, even though the genLoopLimit function in this file directly uses cast<SCEVAddRecExpr>(SE->getSCEV(IndVar)).

buggfg marked this pull request as ready for review July 24, 2025 09:25
buggfg changed the title from "[WIP][IndVarSimplify] Fix Masking Issue by Adding nsw/nuw Flags to Trunc Instruction" to "[IndVarSimplify] Fix Masking Issue by Adding nsw/nuw Flags to Trunc Instruction" Jul 24, 2025
Member

llvmbot commented Jul 24, 2025

@llvm/pr-subscribers-llvm-transforms

Author: bernadate (buggfg)

Full diff: https://github.com/llvm/llvm-project/pull/150179.diff

5 Files Affected:

  • (modified) llvm/lib/Transforms/Scalar/IndVarSimplify.cpp (+20-1)
  • (modified) llvm/test/Transforms/IndVarSimplify/X86/eliminate-trunc.ll (+4-4)
  • (modified) llvm/test/Transforms/IndVarSimplify/lftr-pr41998.ll (+1-1)
  • (modified) llvm/test/Transforms/IndVarSimplify/lftr.ll (+1-1)
  • (modified) llvm/test/Transforms/PhaseOrdering/AArch64/constraint-elimination-placement.ll (+1-2)
diff --git a/llvm/lib/Transforms/Scalar/IndVarSimplify.cpp b/llvm/lib/Transforms/Scalar/IndVarSimplify.cpp
index 334c911191cb8..04a1f4831b8d8 100644
--- a/llvm/lib/Transforms/Scalar/IndVarSimplify.cpp
+++ b/llvm/lib/Transforms/Scalar/IndVarSimplify.cpp
@@ -1049,9 +1049,28 @@ linearFunctionTestReplace(Loop *L, BasicBlock *ExitingBB,
     if (Extended) {
       bool Discard;
       L->makeLoopInvariant(ExitCnt, Discard);
-    } else
+    } else{
       CmpIndVar = Builder.CreateTrunc(CmpIndVar, ExitCnt->getType(),
                                       "lftr.wideiv");
+
+      // Set the correct wrap flag to avoid the masking issue.
+      Instruction *TruncInst = dyn_cast<Instruction>(CmpIndVar);
+      
+      // The TruncatedIV is incrementing.
+      if (const SCEVAddRecExpr *TruncAR =
+              dyn_cast<SCEVAddRecExpr>(TruncatedIV)) {
+        // If TruncIV does not cause self-wrap, explicitly add the nsw and nuw
+        // flags to TruncInst.
+        if (TruncAR->hasNoSelfWrap()) {
+          TruncInst->setHasNoSignedWrap();
+          TruncInst->setHasNoUnsignedWrap();
+        } else if (TruncAR->hasNoSignedWrap()) {
+          TruncInst->setHasNoSignedWrap();
+        } else if (TruncAR->hasNoUnsignedWrap()) {
+          TruncInst->setHasNoUnsignedWrap();
+        }
+      }
+    }  
   }
   LLVM_DEBUG(dbgs() << "INDVARS: Rewriting loop exit condition to:\n"
                     << "      LHS:" << *CmpIndVar << '\n'
diff --git a/llvm/test/Transforms/IndVarSimplify/X86/eliminate-trunc.ll b/llvm/test/Transforms/IndVarSimplify/X86/eliminate-trunc.ll
index 565ac5c8743d4..7e7a3f192f998 100644
--- a/llvm/test/Transforms/IndVarSimplify/X86/eliminate-trunc.ll
+++ b/llvm/test/Transforms/IndVarSimplify/X86/eliminate-trunc.ll
@@ -227,7 +227,7 @@ define void @test_01_unsigned(i32 %n) {
 ; CHECK:       loop:
 ; CHECK-NEXT:    [[IV:%.*]] = phi i64 [ 0, [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
 ; CHECK-NEXT:    [[IV_NEXT]] = add nuw nsw i64 [[IV]], 1
-; CHECK-NEXT:    [[LFTR_WIDEIV:%.*]] = trunc i64 [[IV_NEXT]] to i32
+; CHECK-NEXT:    [[LFTR_WIDEIV:%.*]] = trunc nuw nsw i64 [[IV_NEXT]] to i32
 ; CHECK-NEXT:    [[EXITCOND:%.*]] = icmp ne i32 [[LFTR_WIDEIV]], [[TMP0]]
 ; CHECK-NEXT:    br i1 [[EXITCOND]], label [[LOOP]], label [[EXIT:%.*]]
 ; CHECK:       exit:
@@ -255,7 +255,7 @@ define void @test_02_unsigned(i32 %n) {
 ; CHECK:       loop:
 ; CHECK-NEXT:    [[IV:%.*]] = phi i64 [ 4294967294, [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
 ; CHECK-NEXT:    [[IV_NEXT]] = add nuw nsw i64 [[IV]], 1
-; CHECK-NEXT:    [[LFTR_WIDEIV:%.*]] = trunc i64 [[IV_NEXT]] to i32
+; CHECK-NEXT:    [[LFTR_WIDEIV:%.*]] = trunc nuw nsw i64 [[IV_NEXT]] to i32
 ; CHECK-NEXT:    [[EXITCOND:%.*]] = icmp ne i32 [[LFTR_WIDEIV]], [[TMP0]]
 ; CHECK-NEXT:    br i1 [[EXITCOND]], label [[LOOP]], label [[EXIT:%.*]]
 ; CHECK:       exit:
@@ -304,7 +304,7 @@ define void @test_04_unsigned(i32 %n) {
 ; CHECK:       loop:
 ; CHECK-NEXT:    [[IV:%.*]] = phi i64 [ 0, [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
 ; CHECK-NEXT:    [[IV_NEXT]] = add nuw nsw i64 [[IV]], 1
-; CHECK-NEXT:    [[LFTR_WIDEIV:%.*]] = trunc i64 [[IV_NEXT]] to i32
+; CHECK-NEXT:    [[LFTR_WIDEIV:%.*]] = trunc nuw nsw i64 [[IV_NEXT]] to i32
 ; CHECK-NEXT:    [[EXITCOND:%.*]] = icmp ne i32 [[LFTR_WIDEIV]], [[TMP0]]
 ; CHECK-NEXT:    br i1 [[EXITCOND]], label [[LOOP]], label [[EXIT:%.*]]
 ; CHECK:       exit:
@@ -332,7 +332,7 @@ define void @test_05_unsigned(i32 %n) {
 ; CHECK:       loop:
 ; CHECK-NEXT:    [[IV:%.*]] = phi i64 [ 1, [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
 ; CHECK-NEXT:    [[IV_NEXT]] = add nuw nsw i64 [[IV]], 1
-; CHECK-NEXT:    [[LFTR_WIDEIV:%.*]] = trunc i64 [[IV_NEXT]] to i32
+; CHECK-NEXT:    [[LFTR_WIDEIV:%.*]] = trunc nuw nsw i64 [[IV_NEXT]] to i32
 ; CHECK-NEXT:    [[EXITCOND:%.*]] = icmp ne i32 [[LFTR_WIDEIV]], [[TMP0]]
 ; CHECK-NEXT:    br i1 [[EXITCOND]], label [[LOOP]], label [[EXIT:%.*]]
 ; CHECK:       exit:
diff --git a/llvm/test/Transforms/IndVarSimplify/lftr-pr41998.ll b/llvm/test/Transforms/IndVarSimplify/lftr-pr41998.ll
index b7f4756b2757f..376ef1ac5ffac 100644
--- a/llvm/test/Transforms/IndVarSimplify/lftr-pr41998.ll
+++ b/llvm/test/Transforms/IndVarSimplify/lftr-pr41998.ll
@@ -13,7 +13,7 @@ define void @test_int(i32 %start, ptr %p) {
 ; CHECK-NEXT:    [[I2:%.*]] = phi i32 [ 0, [[ENTRY:%.*]] ], [ [[I2_INC:%.*]], [[LOOP]] ]
 ; CHECK-NEXT:    [[I2_INC]] = add nuw nsw i32 [[I2]], 1
 ; CHECK-NEXT:    store volatile i32 [[I2_INC]], ptr [[P:%.*]], align 4
-; CHECK-NEXT:    [[LFTR_WIDEIV:%.*]] = trunc i32 [[I2_INC]] to i3
+; CHECK-NEXT:    [[LFTR_WIDEIV:%.*]] = trunc nuw nsw i32 [[I2_INC]] to i3
 ; CHECK-NEXT:    [[EXITCOND:%.*]] = icmp eq i3 [[LFTR_WIDEIV]], [[TMP1]]
 ; CHECK-NEXT:    br i1 [[EXITCOND]], label [[END:%.*]], label [[LOOP]]
 ; CHECK:       end:
diff --git a/llvm/test/Transforms/IndVarSimplify/lftr.ll b/llvm/test/Transforms/IndVarSimplify/lftr.ll
index 5ee62ba357ab6..cfa4baa2d3b11 100644
--- a/llvm/test/Transforms/IndVarSimplify/lftr.ll
+++ b/llvm/test/Transforms/IndVarSimplify/lftr.ll
@@ -415,7 +415,7 @@ define void @wide_trip_count_test1(ptr %autoc,
 ; CHECK-NEXT:    [[ADD3:%.*]] = fadd float [[TEMP2]], [[MUL]]
 ; CHECK-NEXT:    store float [[ADD3]], ptr [[ARRAYIDX2]], align 4
 ; CHECK-NEXT:    [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1
-; CHECK-NEXT:    [[LFTR_WIDEIV:%.*]] = trunc i64 [[INDVARS_IV_NEXT]] to i32
+; CHECK-NEXT:    [[LFTR_WIDEIV:%.*]] = trunc nuw nsw i64 [[INDVARS_IV_NEXT]] to i32
 ; CHECK-NEXT:    [[EXITCOND:%.*]] = icmp ne i32 [[LFTR_WIDEIV]], [[SUB]]
 ; CHECK-NEXT:    br i1 [[EXITCOND]], label [[FOR_BODY]], label [[FOR_END_LOOPEXIT:%.*]]
 ; CHECK:       for.end.loopexit:
diff --git a/llvm/test/Transforms/PhaseOrdering/AArch64/constraint-elimination-placement.ll b/llvm/test/Transforms/PhaseOrdering/AArch64/constraint-elimination-placement.ll
index bbdbd95c6017a..ddd98f21c36a4 100644
--- a/llvm/test/Transforms/PhaseOrdering/AArch64/constraint-elimination-placement.ll
+++ b/llvm/test/Transforms/PhaseOrdering/AArch64/constraint-elimination-placement.ll
@@ -33,8 +33,7 @@ define i1 @test_order_1(ptr %this, ptr noalias %other, i1 %tobool9.not, i32 %cal
 ; CHECK-NEXT:    br i1 [[CMP44]], label [[FOR_BODY45]], label [[FOR_COND]]
 ; CHECK:       for.inc57:
 ; CHECK-NEXT:    [[INDVARS_IV_NEXT]] = add nsw i64 [[INDVARS_IV]], 1
-; CHECK-NEXT:    [[TMP1:%.*]] = and i64 [[INDVARS_IV_NEXT]], 4294967295
-; CHECK-NEXT:    [[EXITCOND:%.*]] = icmp eq i64 [[TMP1]], 1
+; CHECK-NEXT:    [[EXITCOND:%.*]] = icmp eq i64 [[INDVARS_IV]], 0
 ; CHECK-NEXT:    br i1 [[EXITCOND]], label [[FOR_COND41_PREHEADER_PREHEADER]], label [[FOR_COND41_PREHEADER]]
 ; CHECK:       exit:
 ; CHECK-NEXT:    ret i1 false

buggfg and others added 4 commits July 24, 2025 17:28
Author

buggfg commented Aug 12, 2025

Hi @lukel97, I've updated the code. We found that if the initial value of the IV is within the range of the target type of the trunc instruction, we can reasonably add the nsw/nuw flags to it. Could you take a look when you have a chance?
