Skip to content

Conversation

woruyu
Copy link
Member

@woruyu woruyu commented Aug 15, 2025

Summary
This PR resolves #153543

@llvmbot llvmbot added the llvm:SelectionDAG SelectionDAGISel as well label Aug 15, 2025
@llvmbot
Copy link
Member

llvmbot commented Aug 15, 2025

@llvm/pr-subscribers-llvm-selectiondag

Author: woruyu (woruyu)

Changes

Summary
This PR resolves #153543


Full diff: https://github.com/llvm/llvm-project/pull/153762.diff

1 Files Affected:

  • (modified) llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (+2-2)
diff --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
index d343b644e41cb..245070b8b30be 100644
--- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -10872,8 +10872,8 @@ SDValue DAGCombiner::visitSRA(SDNode *N) {
       // on that type, and the truncate to that type is both legal and free,
       // perform the transform.
       if ((ShiftAmt > 0) &&
-          TLI.isOperationLegalOrCustom(ISD::SIGN_EXTEND, TruncVT) &&
-          TLI.isOperationLegalOrCustom(ISD::TRUNCATE, VT) &&
+          TLI.isOperationLegalOrCustom(ISD::SIGN_EXTEND, VT) &&
+          TLI.isOperationLegalOrCustom(ISD::TRUNCATE, TruncVT) &&
           TLI.isTruncateFree(VT, TruncVT)) {
         SDValue Amt = DAG.getShiftAmountConstant(ShiftAmt, VT, DL);
         SDValue Shift = DAG.getNode(ISD::SRL, DL, VT,

@woruyu woruyu requested a review from RKSimon August 15, 2025 08:08
@woruyu
Copy link
Member Author

woruyu commented Aug 15, 2025

Just check type exchanged! Is it necessary to add a testcase?

@RKSimon
Copy link
Collaborator

RKSimon commented Aug 15, 2025

Just check type exchanged! Is it necessary to add a testcase?

If at all possible - yes

@RKSimon RKSimon requested a review from topperc August 15, 2025 08:41
@woruyu
Copy link
Member Author

woruyu commented Aug 15, 2025

I tried, but finding a case where SIGN_EXTEND to TruncVT is LegalOrCustom and TRUNCATE from VT is also LegalOrCustom (and still reaches instruction selection to trigger the error) feels like searching for a needle in a haystack. For a lit test, do you have any suggestions, @akiva-pripas, or would it make more sense to add a unit test instead, and if so, in which file?

@akiva-pripas
Copy link

I tried, but finding a case where SIGN_EXTEND to TruncVT is LegalOrCustom and TRUNCATE from VT is also LegalOrCustom (and still reaches instruction selection to trigger the error) feels like searching for a needle in a haystack. For a lit test, do you have any suggestions, @akiva-pripas, or would it make more sense to add a unit test instead, and if so, in which file?

The crash came from an out of tree target that I can't share, so I'm afraid I can't supply a test for it.

@akiva-pripas
Copy link

akiva-pripas commented Aug 17, 2025

I can provide the selectionDAG before and after the optimization takes place when it shouldn't:

Vector/type-legalized selection DAG: %bb.11 'vector_aint_shift_aint:vector.body27'
SelectionDAG has 44 nodes:
  t0: ch,glue = EntryToken
  t2: i32,ch = CopyFromReg t0, Register:i32 %17
  t13: i32,ch = CopyFromReg t0, Register:i32 %16
  t57: v8i64 = BUILD_VECTOR Constant:i64<31>, Constant:i64<31>, Constant:i64<31>, Constant:i64<31>, Constant:i64<31>, Constant:i64<31>, Constant:i64<31>, Constant:i64<31>
  t43: v8i64,ch = load<(load (s512) from %ir.lsr.iv45, align 4, !tbaa !33)> t0, t2, undef:i32
    t23: i32,ch = CopyFromReg t0, Register:i32 %15
  t25: i32 = add nsw t23, Constant:i32<-16>
    t45: i32 = add nuw t2, Constant:i32<64>
  t46: v8i64,ch = load<(load (s512) from %ir.lsr.iv45 + 64, align 4, !tbaa !33)> t0, t45, undef:i32
  t47: ch = TokenFactor t43:1, t46:1
  t69: v8i64 = BUILD_VECTOR Constant:i64<63>, Constant:i64<63>, Constant:i64<63>, Constant:i64<63>, Constant:i64<63>, Constant:i64<63>, Constant:i64<63>, Constant:i64<63>
          t16: i32 = add t2, Constant:i32<128>
        t18: ch = CopyToReg t0, Register:i32 %18, t16
          t19: i32 = add t13, Constant:i32<128>
        t21: ch = CopyToReg t0, Register:i32 %19, t19
        t27: ch = CopyToReg t0, Register:i32 %20, t25
              t63: v8i64 = srl t43, t57
            t70: v8i64 = shl t63, t69
          t71: v8i64 = sra t70, t69
        t52: ch = store<(store (s512) into %ir.lsr.iv47, align 4, !tbaa !33)> t47, t71, t13, undef:i32
              t58: v8i64 = srl t46, t57
            t72: v8i64 = shl t58, t69
          t73: v8i64 = sra t72, t69
          t53: i32 = add nuw t13, Constant:i32<64>
        t54: ch = store<(store (s512) into %ir.lsr.iv47 + 64, align 4, !tbaa !33)> t47, t73, t53, undef:i32
      t67: ch = TokenFactor t18, t21, t27, t52, t54
    t40: ch = br_cc t67, setne:ch, t25, Constant:i32<0>, BasicBlock:ch<vector.body27 0x2a12f621658>
  t36: ch = br t40, BasicBlock:ch<middle.block21 0x2a12f621910>
Optimized vector-legalized selection DAG: %bb.11 'vector_aint_shift_aint:vector.body27'
SelectionDAG has 42 nodes:
  t0: ch,glue = EntryToken
  t2: i32,ch = CopyFromReg t0, Register:i32 %17
  t13: i32,ch = CopyFromReg t0, Register:i32 %16
  t57: v8i64 = BUILD_VECTOR Constant:i64<31>, Constant:i64<31>, Constant:i64<31>, Constant:i64<31>, Constant:i64<31>, Constant:i64<31>, Constant:i64<31>, Constant:i64<31>
  t43: v8i64,ch = load<(load (s512) from %ir.lsr.iv45, align 4, !tbaa !33)> t0, t2, undef:i32
    t23: i32,ch = CopyFromReg t0, Register:i32 %15
  t25: i32 = add nsw t23, Constant:i32<-16>
    t45: i32 = add nuw t2, Constant:i32<64>
  t46: v8i64,ch = load<(load (s512) from %ir.lsr.iv45 + 64, align 4, !tbaa !33)> t0, t45, undef:i32
  t47: ch = TokenFactor t43:1, t46:1
          t16: i32 = add t2, Constant:i32<128>
        t18: ch = CopyToReg t0, Register:i32 %18, t16
          t19: i32 = add t13, Constant:i32<128>
        t21: ch = CopyToReg t0, Register:i32 %19, t19
        t27: ch = CopyToReg t0, Register:i32 %20, t25
              t83: v8i64 = srl t43, t57
            t84: v8i1 = truncate t83
          t85: v8i64 = sign_extend t84
        t52: ch = store<(store (s512) into %ir.lsr.iv47, align 4, !tbaa !33)> t47, t85, t13, undef:i32
              t77: v8i64 = srl t46, t57
            t78: v8i1 = truncate t77
          t79: v8i64 = sign_extend t78
          t53: i32 = add nuw t13, Constant:i32<64>
        t54: ch = store<(store (s512) into %ir.lsr.iv47 + 64, align 4, !tbaa !33)> t47, t79, t53, undef:i32
      t67: ch = TokenFactor t18, t21, t27, t52, t54
    t40: ch = br_cc t67, setne:ch, t25, Constant:i32<0>, BasicBlock:ch<vector.body27 0x2a12f621658>
  t36: ch = br t40, BasicBlock:ch<middle.block21 0x2a12f621910>

@woruyu
Copy link
Member Author

woruyu commented Aug 22, 2025

Thank you for the DAG! @akiva-pripas, however, I tried to create a lit test that hits this change, but it’s very hard to reproduce reliably. Would it be acceptable to land the fix without a test?(just for trade-off).
I do have a small test that reaches the new legality gate in DAGCombine and exits early, But most backends return isTruncateFree = false for vector types; AMDGPU can return true for some sizes, but I still couldn’t find a stable (target, MVT) pair where both TRUNCATE(VT→TruncVT) and SIGN_EXTEND(TruncVT→VT) are LegalOrCustom and the transform survives to isel and then create a error.

; RUN: llc -mtriple=x86_64-pc-linux-gnu -mattr=+avx512f -O2 < %s | FileCheck %s

define void @extract_bit31_mask_v8i64(<8 x i64>* %dst, <8 x i64>* %src) {


entry:
  %x = load <8 x i64>, <8 x i64>* %src, align 64
  %v31 = shufflevector <8 x i64> <i64 31, i64 31, i64 31, i64 31, i64 31, i64 31, i64 31, i64 31>,
                        <8 x i64> undef,
                        <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>
  %v63 = shufflevector <8 x i64> <i64 63, i64 63, i64 63, i64 63, i64 63, i64 63, i64 63, i64 63>,
                        <8 x i64> undef,
                        <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>

  %shr = lshr <8 x i64> %x, %v31
%shl = shl <8 x i64> %shr,
            <i64 62, i64 62, i64 62, i64 62,
             i64 62, i64 62, i64 62, i64 62>

  %sra = ashr <8 x i64> %shl, %v63
  store <8 x i64> %sra, <8 x i64>* %dst, align 64
  ret void
}

@arsenm
Copy link
Contributor

arsenm commented Aug 23, 2025

I'd still add that testcase that shows the pattern, though you should fix the opaque pointers and use of undef in it

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
llvm:SelectionDAG SelectionDAGISel as well
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Wrong types checked at DAGCombiner::visitSRA(SDNode *N)
5 participants