Skip to content

Conversation

AZero13
Copy link
Contributor

@AZero13 AZero13 commented Aug 21, 2025

This is so that we don't expand to include unneeded 0 checks.

Also fix the logic error in LegalizerInfo so it is NOT legal on Thumb1 in Fast-ISEL.

Finally, Remove the README entry regarding this issue.

@llvmbot
Copy link
Member

llvmbot commented Aug 21, 2025

@llvm/pr-subscribers-backend-arm

Author: AZero13 (AZero13)

Changes

This is so that we don't expand to include unneeded 0 checks.

Also fix the logic error in LegalizerInfo so it is NOT legal on Thumb1 in Fast-ISEL.

Finally, Remove the README entry regarding this issue.


Full diff: https://github.com/llvm/llvm-project/pull/154848.diff

4 Files Affected:

  • (modified) llvm/lib/Target/ARM/ARMISelLowering.cpp (+1-1)
  • (modified) llvm/lib/Target/ARM/ARMLegalizerInfo.cpp (+1-1)
  • (modified) llvm/lib/Target/ARM/README.txt (-16)
  • (modified) llvm/test/CodeGen/ARM/clz.ll (+33-4)
diff --git a/llvm/lib/Target/ARM/ARMISelLowering.cpp b/llvm/lib/Target/ARM/ARMISelLowering.cpp
index 830156359e9e8..c6cf6f7d8978a 100644
--- a/llvm/lib/Target/ARM/ARMISelLowering.cpp
+++ b/llvm/lib/Target/ARM/ARMISelLowering.cpp
@@ -21398,7 +21398,7 @@ bool ARMTargetLowering::isCheapToSpeculateCttz(Type *Ty) const {
 }
 
 bool ARMTargetLowering::isCheapToSpeculateCtlz(Type *Ty) const {
-  return Subtarget->hasV6T2Ops();
+  return Subtarget->hasV5TOps() && !Subtarget->isThumb1Only();
 }
 
 bool ARMTargetLowering::isMaskAndCmp0FoldingBeneficial(
diff --git a/llvm/lib/Target/ARM/ARMLegalizerInfo.cpp b/llvm/lib/Target/ARM/ARMLegalizerInfo.cpp
index fc12f050fa5a5..cdff649ecfa57 100644
--- a/llvm/lib/Target/ARM/ARMLegalizerInfo.cpp
+++ b/llvm/lib/Target/ARM/ARMLegalizerInfo.cpp
@@ -206,7 +206,7 @@ ARMLegalizerInfo::ARMLegalizerInfo(const ARMSubtarget &ST) : ST(ST) {
 
   getActionDefinitionsBuilder({G_FREM, G_FPOW}).libcallFor({s32, s64});
 
-  if (ST.hasV5TOps()) {
+  if (ST.hasV5TOps() && !ST.isThumb1Only()) {
     getActionDefinitionsBuilder(G_CTLZ)
         .legalFor({s32, s32})
         .clampScalar(1, s32, s32)
diff --git a/llvm/lib/Target/ARM/README.txt b/llvm/lib/Target/ARM/README.txt
index def67cfae7277..ff84e07fa084a 100644
--- a/llvm/lib/Target/ARM/README.txt
+++ b/llvm/lib/Target/ARM/README.txt
@@ -697,22 +697,6 @@ target-neutral one.
 
 //===---------------------------------------------------------------------===//
 
-Optimize unnecessary checks for zero with __builtin_clz/ctz.  Those builtins
-are specified to be undefined at zero, so portable code must check for zero
-and handle it as a special case.  That is unnecessary on ARM where those
-operations are implemented in a way that is well-defined for zero.  For
-example:
-
-int f(int x) { return x ? __builtin_clz(x) : sizeof(int)*8; }
-
-should just be implemented with a CLZ instruction.  Since there are other
-targets, e.g., PPC, that share this behavior, it would be best to implement
-this in a target-independent way: we should probably fold that (when using
-"undefined at zero" semantics) to set the "defined at zero" bit and have
-the code generator expand out the right code.
-
-//===---------------------------------------------------------------------===//
-
 Clean up the test/MC/ARM files to have more robust register choices.
 
 R0 should not be used as a register operand in the assembler tests as it's then
diff --git a/llvm/test/CodeGen/ARM/clz.ll b/llvm/test/CodeGen/ARM/clz.ll
index 0f49fbba11845..9e1e9f6ce6daa 100644
--- a/llvm/test/CodeGen/ARM/clz.ll
+++ b/llvm/test/CodeGen/ARM/clz.ll
@@ -1,12 +1,41 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
 ; RUN: llc -mtriple=arm-eabi -mattr=+v5t %s -o - | FileCheck %s -check-prefixes=CHECK,INLINE
 ; RUN: llc -mtriple=arm-eabi %s -o - | FileCheck %s -check-prefixes=CHECK,LIBCALL
 
 declare i32 @llvm.ctlz.i32(i32, i1)
 
-define i32 @test(i32 %x) {
-; CHECK-LABEL: test
-; INLINE: clz r0, r0
-; LIBCALL: b __clzsi2
+define i32 @undef_zero(i32 %x) {
+; INLINE-LABEL: undef_zero:
+; INLINE:       @ %bb.0:
+; INLINE-NEXT:    clz r0, r0
+; INLINE-NEXT:    bx lr
+;
+; LIBCALL-LABEL: undef_zero:
+; LIBCALL:       @ %bb.0:
+; LIBCALL-NEXT:    b __clzsi2
         %tmp.1 = call i32 @llvm.ctlz.i32( i32 %x, i1 true )
         ret i32 %tmp.1
 }
+
+define i32 @no_undef_zero(i32 %x) {
+; INLINE-LABEL: no_undef_zero:
+; INLINE:       @ %bb.0:
+; INLINE-NEXT:    clz r0, r0
+; INLINE-NEXT:    bx lr
+;
+; LIBCALL-LABEL: no_undef_zero:
+; LIBCALL:       @ %bb.0:
+; LIBCALL-NEXT:    cmp r0, #0
+; LIBCALL-NEXT:    moveq r0, #32
+; LIBCALL-NEXT:    moveq pc, lr
+; LIBCALL-NEXT:  .LBB1_1: @ %cond.false
+; LIBCALL-NEXT:    .save {r11, lr}
+; LIBCALL-NEXT:    push {r11, lr}
+; LIBCALL-NEXT:    bl __clzsi2
+; LIBCALL-NEXT:    pop {r11, lr}
+; LIBCALL-NEXT:    mov pc, lr
+        %tmp.1 = call i32 @llvm.ctlz.i32( i32 %x, i1 false )
+        ret i32 %tmp.1
+}
+;; NOTE: These prefixes are unused and the list is autogenerated. Do not add tests below this line:
+; CHECK: {{.*}}

@AZero13
Copy link
Contributor Author

AZero13 commented Aug 21, 2025

@davemgreen

Copy link
Collaborator

@efriedma-quic efriedma-quic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(For reference, see af1b48b .)

Please also fix isCheapToSpeculateCttz.

@AZero13
Copy link
Contributor Author

AZero13 commented Aug 21, 2025

(For reference, see af1b48b .)

Please also fix isCheapToSpeculateCttz.

Done!

@efriedma-quic
Copy link
Collaborator

Maybe also add a RUN line to llvm/test/CodeGen/ARM/cttz.ll?

…or hasV5TOps and no Thumb 1

This is so that we don't expand to include unneeded 0 checks.

Also fix the logic error in LegalizerInfo so it is NOT legal on Thumb1 in Fast-ISEL.

Finally, Remove the README entry regarding this issue.
@AZero13
Copy link
Contributor Author

AZero13 commented Aug 22, 2025

@efriedma-quic Done!

Copy link
Collaborator

@efriedma-quic efriedma-quic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

(You could maybe make an argument that we shouldn't speculate ctz without rbit... but it close enough that it probably doesn't matter much either way.)

@AZero13
Copy link
Contributor Author

AZero13 commented Aug 25, 2025

Thank you. @efriedma-quic i don't have merge permissions. Can you please do it?

@efriedma-quic efriedma-quic merged commit 79dfe48 into llvm:main Aug 25, 2025
9 checks passed
@AZero13 AZero13 deleted the moveq branch August 25, 2025 20:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants