[VectorCombine] fix division by zero in foldSelectShuffle #156779

davidberard98 · 2025-09-04T00:15:45Z

In pytorch/pytorch#161371, we see that MaxElementsInVector can be 0, causing a division by zero in AddShuffleMaskAdjustedCost.

This change clamps MaxElementsInVector to a minimum of 1 to avoid the division by zero.
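To make the failure mode concrete, here is a minimal standalone sketch; it is not the LLVM code itself. The helper names `maxElementsUnclamped` and `maxElementsClamped` are hypothetical, and the 32-bit register width paired with 64-bit elements is an assumed input chosen so that the unsigned division truncates to zero.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper mirroring the original expression: unsigned integer
// division truncates, so a register narrower than one element yields 0.
unsigned maxElementsUnclamped(unsigned MaxVectorSize, unsigned ElementSize) {
  assert(ElementSize != 0 && "element size must be nonzero");
  return MaxVectorSize / ElementSize;
}

// Hypothetical helper mirroring the patched expression: clamp the result to
// at least 1 so that later divisions by it are well-defined.
unsigned maxElementsClamped(unsigned MaxVectorSize, unsigned ElementSize) {
  assert(ElementSize != 0 && "element size must be nonzero");
  return std::max<unsigned>(1, MaxVectorSize / ElementSize);
}
```

With the assumed inputs (32, 64), the unclamped form returns 0, so any later division by that value is undefined, while the clamped form returns 1. For inputs where the register is at least as wide as the element, both forms agree.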

llvmbot · 2025-09-04T00:16:14Z

@llvm/pr-subscribers-vectorizers

@llvm/pr-subscribers-llvm-transforms

Author: David Berard (davidberard98)

Changes

In pytorch/pytorch#161371, we see that MaxElementsInVector can be 0, causing a division by zero in AddShuffleMaskAdjustedCost.

This will set MaxElementsInVector to 1 to avoid division by zero.

Full diff: https://github.com/llvm/llvm-project/pull/156779.diff

2 Files Affected:

(modified) llvm/lib/Transforms/Vectorize/VectorCombine.cpp (+1-1)
(added) llvm/test/Transforms/VectorCombine/fold-select-shuffle.ll (+21)

diff --git a/llvm/lib/Transforms/Vectorize/VectorCombine.cpp b/llvm/lib/Transforms/Vectorize/VectorCombine.cpp
index 6e46547b15b2b..3a17332274617 100644
--- a/llvm/lib/Transforms/Vectorize/VectorCombine.cpp
+++ b/llvm/lib/Transforms/Vectorize/VectorCombine.cpp
@@ -3900,7 +3900,7 @@ bool VectorCombine::foldSelectShuffle(Instruction &I, bool FromReduction) {
   unsigned ElementSize = VT->getElementType()->getPrimitiveSizeInBits();
   unsigned MaxVectorSize =
       TTI.getRegisterBitWidth(TargetTransformInfo::RGK_FixedWidthVector);
-  unsigned MaxElementsInVector = MaxVectorSize / ElementSize;
+  unsigned MaxElementsInVector = std::max<unsigned>(1, MaxVectorSize / ElementSize);
   // When there are multiple shufflevector operations on the same input,
   // especially when the vector length is larger than the register size,
   // identical shuffle patterns may occur across different groups of elements.
diff --git a/llvm/test/Transforms/VectorCombine/fold-select-shuffle.ll b/llvm/test/Transforms/VectorCombine/fold-select-shuffle.ll
new file mode 100644
index 0000000000000..e898689b8f61b
--- /dev/null
+++ b/llvm/test/Transforms/VectorCombine/fold-select-shuffle.ll
@@ -0,0 +1,21 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -passes=vector-combine -S < %s | FileCheck %s
+
+define ptx_kernel void @shuffle_ptx_i64() {
+; CHECK-LABEL: define ptx_kernel void @shuffle_ptx_i64() {
+; CHECK-NEXT:  [[_LR_PH:.*:]]
+; CHECK-NEXT:    [[TMP0:%.*]] = shufflevector <8 x i64> zeroinitializer, <8 x i64> zeroinitializer, <8 x i32> <i32 0, i32 1, i32 8, i32 9, i32 4, i32 5, i32 6, i32 7>
+; CHECK-NEXT:    [[TMP1:%.*]] = shufflevector <8 x i64> zeroinitializer, <8 x i64> zeroinitializer, <8 x i32> <i32 0, i32 1, i32 8, i32 9, i32 4, i32 5, i32 6, i32 7>
+; CHECK-NEXT:    [[TMP2:%.*]] = or <8 x i64> [[TMP0]], [[TMP1]]
+; CHECK-NEXT:    [[TMP3:%.*]] = shl <8 x i64> [[TMP0]], [[TMP1]]
+; CHECK-NEXT:    [[TMP4:%.*]] = shufflevector <8 x i64> [[TMP2]], <8 x i64> [[TMP3]], <8 x i32> <i32 8, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>
+; CHECK-NEXT:    ret void
+;
+.lr.ph:
+  %0 = shufflevector <8 x i64> zeroinitializer, <8 x i64> zeroinitializer, <8 x i32> <i32 0, i32 1, i32 8, i32 9, i32 4, i32 5, i32 6, i32 7>
+  %1 = shufflevector <8 x i64> zeroinitializer, <8 x i64> zeroinitializer, <8 x i32> <i32 0, i32 1, i32 8, i32 9, i32 4, i32 5, i32 6, i32 7>
+  %2 = or <8 x i64> %0, %1
+  %3 = shl <8 x i64> %0, %1
+  %4 = shufflevector <8 x i64> %2, <8 x i64> %3, <8 x i32> <i32 8, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>
+  ret void
+}

dtcxzyw · 2025-09-04T11:56:04Z

llvm/lib/Transforms/Vectorize/VectorCombine.cpp

@@ -3900,7 +3900,8 @@ bool VectorCombine::foldSelectShuffle(Instruction &I, bool FromReduction) {
  unsigned ElementSize = VT->getElementType()->getPrimitiveSizeInBits();
  unsigned MaxVectorSize =
      TTI.getRegisterBitWidth(TargetTransformInfo::RGK_FixedWidthVector);
-  unsigned MaxElementsInVector = MaxVectorSize / ElementSize;
+  unsigned MaxElementsInVector =


Can we early exit if MaxElementsInVector <= 1? The trivial case isn't profitable.

RKSimon · 2025-09-04

+1, an early-out seems the better approach.
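The early-exit alternative the reviewers describe could look roughly like the sketch below. This is an illustration under stated assumptions, not the code that landed: `shouldSkipFold` is a hypothetical name, and in the real pass the equivalent check would presumably sit inside `foldSelectShuffle` and return `false` from it.

```cpp
#include <cassert>

// Hypothetical predicate: returns true when the fold should bail out early
// instead of clamping MaxElementsInVector to 1.
bool shouldSkipFold(unsigned MaxVectorSize, unsigned ElementSize) {
  assert(ElementSize != 0 && "element size must be nonzero");
  unsigned MaxElementsInVector = MaxVectorSize / ElementSize;
  // 0 would later cause a division by zero; 1 is the trivial case that the
  // review comment describes as not profitable. Both take the early exit.
  return MaxElementsInVector <= 1;
}
```

Compared with clamping, this variant never runs the cost computation when at most one element fits in a register, which sidesteps the division entirely rather than making it safe.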

dtcxzyw · 2025-09-04T11:58:03Z

llvm/test/Transforms/VectorCombine/fold-select-shuffle.ll

@@ -0,0 +1,21 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -passes=vector-combine -S < %s | FileCheck %s


Suggested change

-; RUN: opt -passes=vector-combine -S < %s | FileCheck %s
+; RUN: opt -passes=vector-combine -mtriple=nvptx-- -S < %s | FileCheck %s

dtcxzyw · 2025-09-04T12:00:19Z

llvm/test/Transforms/VectorCombine/fold-select-shuffle.ll

+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -passes=vector-combine -S < %s | FileCheck %s
+
+define ptx_kernel void @shuffle_ptx_i64() {


Please move this test into the NVPTX subdir. I cannot reproduce the issue with other targets (e.g., -mtriple=x86_64).

llvm/test/Transforms/VectorCombine/
  NVPTX/
    fold-select-shuffle.ll
    lit.local.cfg

Content of NVPTX/lit.local.cfg:

if not "NVPTX" in config.root.targets:
    config.unsupported = True

llvmbot added vectorizers llvm:transforms llvm:vectorcombine labels Sep 4, 2025

Commit 0548cb2: [VectorCombine] fix division by zero in foldSelectShuffle

In pytorch/pytorch#161371, we see that MaxElementsInVector can be 0, causing a division by zero in `AddShuffleMaskAdjustedCost`. This will set MaxElementsInVector to 1 to avoid division by zero.

davidberard98 force-pushed the divide-zero-fold-select-shuffle branch from 1f7fc9d to 0548cb2 on September 4, 2025 00:17

dtcxzyw requested a review from RKSimon September 4, 2025 02:24

dtcxzyw reviewed Sep 4, 2025
