david-arm
Contributor

There were 5 X86 loop vectoriser tests that were piping the output from opt into llc. I think in the directory test/Transforms/LoopVectorize we should only be testing the output from the loop vectoriser pass. Any codegen tests should live in test/CodeGen/X86 instead.
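
To look for other tests with the same pattern, something like the following could be used. It is only a rough sketch: the helper names (`pipes_opt_into_llc`, `find_offending_run_lines`, `scan_directory`) are made up for illustration and are not part of lit or any LLVM tooling. It also assumes each RUN command sits on a single line and that `|` is only ever used as a pipe, so it would miss backslash-continued RUN lines.

```python
import re
from pathlib import Path

# Matches a lit RUN line in an .ll file; ';' is the LLVM IR comment marker.
RUN_LINE = re.compile(r"^\s*;\s*RUN:\s*(.*)$")


def pipes_opt_into_llc(run_command):
    """Return True if an `opt` stage is piped directly into an `llc` stage."""
    stages = [stage.split() for stage in run_command.split("|")]
    tools = [stage[0] if stage else "" for stage in stages]
    return any(a == "opt" and b == "llc" for a, b in zip(tools, tools[1:]))


def find_offending_run_lines(text):
    """Collect the RUN commands in `text` that pipe opt into llc."""
    hits = []
    for line in text.splitlines():
        match = RUN_LINE.match(line)
        if match and pipes_opt_into_llc(match.group(1)):
            hits.append(match.group(1).strip())
    return hits


def scan_directory(root):
    """Map each .ll file under `root` to its offending RUN commands."""
    results = {}
    for path in sorted(Path(root).rglob("*.ll")):
        hits = find_offending_run_lines(path.read_text(errors="replace"))
        if hits:
            results[str(path)] = hits
    return results
```

For example, `scan_directory("llvm/test/Transforms/LoopVectorize")` would return a dictionary keyed by file path, with one entry per test that still pipes `opt` into `llc`.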

avx512.ll: it looks like we were really just testing that we generate the right vector length, so the test now checks the vector types in the IR directly.
fp32_to_uint32-cost-model.ll/fp64_to_uint32-cost-model.ll: the tests only seem to care that we're not scalarising the fptoui, so I've modified them to check for vector ops. I've assumed there are already codegen tests for fptoui vector operations.
vectorization-remarks-loopid-dbg.ll: I've copied this test to CodeGen/X86/vectorization-remarks-loopid-dbg.ll for the llc RUN line variant.
vectorization-remarks.ll: the llc RUN line seems to test the same thing as the one in vectorization-remarks-loopid-dbg.ll, so I've removed it.
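
As a quick sanity check on the vector lengths that the updated avx512.ll looks for: the width of a fixed vector type is the element count times the element width, so `<16 x i32>` is 512 bits, `<8 x i32>` is 256 bits and `<4 x i32>` is 128 bits. A small sketch of that arithmetic follows. The `vector_bits` helper is hypothetical and only understands the handful of element types listed in it.

```python
import re

# Bit widths of the floating-point element types this sketch understands.
SCALAR_BITS = {"half": 16, "bfloat": 16, "float": 32, "double": 64}

# Fixed-width vector types such as "<16 x i32>" or "<4 x double>".
VECTOR_TYPE = re.compile(r"<(\d+) x (i\d+|half|bfloat|float|double)>")


def vector_bits(type_text):
    """Return the total width in bits of a fixed vector type string."""
    match = VECTOR_TYPE.fullmatch(type_text.strip())
    if match is None:
        raise ValueError("unsupported vector type: %r" % type_text)
    count, elem = int(match.group(1)), match.group(2)
    # Integer types carry their width in the name (i32 -> 32).
    elem_bits = int(elem[1:]) if elem.startswith("i") else SCALAR_BITS[elem]
    return count * elem_bits


for ty in ("<16 x i32>", "<8 x i32>", "<4 x i32>"):
    print(ty, vector_bits(ty))
```

The same helper gives 256 bits for both `<8 x float>` and `<4 x double>`, which are the types checked in the two fptoui cost-model tests.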

@llvmbot
Member

llvmbot commented Aug 21, 2025

@llvm/pr-subscribers-backend-x86

Author: David Sherwood (david-arm)


Full diff: https://github.com/llvm/llvm-project/pull/154759.diff

6 Files Affected:

  • (added) llvm/test/CodeGen/X86/vectorization-remarks-loopid-dbg.ll (+66)
  • (modified) llvm/test/Transforms/LoopVectorize/X86/avx512.ll (+22-31)
  • (modified) llvm/test/Transforms/LoopVectorize/X86/fp32_to_uint32-cost-model.ll (+2-2)
  • (modified) llvm/test/Transforms/LoopVectorize/X86/fp64_to_uint32-cost-model.ll (+2-2)
  • (modified) llvm/test/Transforms/LoopVectorize/X86/vectorization-remarks-loopid-dbg.ll (-4)
  • (modified) llvm/test/Transforms/LoopVectorize/X86/vectorization-remarks.ll (-4)
diff --git a/llvm/test/CodeGen/X86/vectorization-remarks-loopid-dbg.ll b/llvm/test/CodeGen/X86/vectorization-remarks-loopid-dbg.ll
new file mode 100644
index 0000000000000..31949403b4465
--- /dev/null
+++ b/llvm/test/CodeGen/X86/vectorization-remarks-loopid-dbg.ll
@@ -0,0 +1,66 @@
+; RUN: llc < %s -mtriple x86_64-pc-linux-gnu -o - | FileCheck -check-prefix=DEBUG-OUTPUT %s
+; DEBUG-OUTPUT-NOT: .loc
+; DEBUG-OUTPUT-NOT: {{.*}}.debug_info
+
+target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
+
+define i32 @foo(i32 %n) #0 !dbg !4 {
+entry:
+  %diff = alloca i32, align 4
+  %cb = alloca [16 x i8], align 16
+  %cc = alloca [16 x i8], align 16
+  store i32 0, ptr %diff, align 4, !tbaa !11
+  br label %for.body
+
+for.body:                                         ; preds = %for.body, %entry
+  %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ]
+  %add8 = phi i32 [ 0, %entry ], [ %add, %for.body ]
+  %arrayidx = getelementptr inbounds [16 x i8], ptr %cb, i64 0, i64 %indvars.iv
+  %0 = load i8, ptr %arrayidx, align 1, !tbaa !21
+  %conv = sext i8 %0 to i32
+  %arrayidx2 = getelementptr inbounds [16 x i8], ptr %cc, i64 0, i64 %indvars.iv
+  %1 = load i8, ptr %arrayidx2, align 1, !tbaa !21
+  %conv3 = sext i8 %1 to i32
+  %sub = sub i32 %conv, %conv3
+  %add = add nsw i32 %sub, %add8
+  %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
+  %exitcond = icmp eq i64 %indvars.iv.next, 16
+  br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !25
+
+for.end:                                          ; preds = %for.body
+  store i32 %add, ptr %diff, align 4, !tbaa !11
+  call void @ibar(ptr %diff) #2
+  ret i32 0
+}
+
+declare void @ibar(ptr) #1
+
+!llvm.module.flags = !{!7, !8}
+!llvm.ident = !{!9}
+!llvm.dbg.cu = !{!24}
+
+!1 = !DIFile(filename: "vectorization-remarks.c", directory: ".")
+!2 = !{}
+!3 = !{!4}
+!4 = distinct !DISubprogram(name: "foo", line: 5, isLocal: false, isDefinition: true, virtualIndex: 6, flags: DIFlagPrototyped, isOptimized: true, unit: !24, scopeLine: 6, file: !1, scope: !5, type: !6, retainedNodes: !2)
+!5 = !DIFile(filename: "vectorization-remarks.c", directory: ".")
+!6 = !DISubroutineType(types: !2)
+!7 = !{i32 2, !"Dwarf Version", i32 4}
+!8 = !{i32 1, !"Debug Info Version", i32 3}
+!9 = !{!"clang version 3.5.0 "}
+!10 = !DILocation(line: 8, column: 3, scope: !4)
+!11 = !{!12, !12, i64 0}
+!12 = !{!"int", !13, i64 0}
+!13 = !{!"omnipotent char", !14, i64 0}
+!14 = !{!"Simple C/C++ TBAA"}
+!15 = !DILocation(line: 17, column: 8, scope: !16)
+!16 = distinct !DILexicalBlock(line: 17, column: 8, file: !1, scope: !17)
+!17 = distinct !DILexicalBlock(line: 17, column: 8, file: !1, scope: !18)
+!18 = distinct !DILexicalBlock(line: 17, column: 3, file: !1, scope: !4)
+!19 = !DILocation(line: 18, column: 5, scope: !20)
+!20 = distinct !DILexicalBlock(line: 17, column: 27, file: !1, scope: !18)
+!21 = !{!13, !13, i64 0}
+!22 = !DILocation(line: 20, column: 3, scope: !4)
+!23 = !DILocation(line: 21, column: 3, scope: !4)
+!24 = distinct !DICompileUnit(language: DW_LANG_C89, file: !1, emissionKind: NoDebug)
+!25 = !{!25, !15}
diff --git a/llvm/test/Transforms/LoopVectorize/X86/avx512.ll b/llvm/test/Transforms/LoopVectorize/X86/avx512.ll
index 33d1d3f0d2219..b8e0697c8ac6d 100644
--- a/llvm/test/Transforms/LoopVectorize/X86/avx512.ll
+++ b/llvm/test/Transforms/LoopVectorize/X86/avx512.ll
@@ -1,5 +1,5 @@
-; RUN: opt -mattr=+avx512f -passes=loop-vectorize -S < %s | llc -mattr=+avx512f | FileCheck %s
-; RUN: opt -mattr=+avx512vl,+prefer-256-bit -passes=loop-vectorize -S < %s | llc -mattr=+avx512f | FileCheck %s --check-prefix=CHECK-PREFER-AVX256
+; RUN: opt -mattr=+avx512f -passes=loop-vectorize -S < %s | FileCheck %s --check-prefixes=CHECK,CHECK-NO-PREFER
+; RUN: opt -mattr=+avx512vl,+prefer-256-bit -passes=loop-vectorize -S < %s | FileCheck %s --check-prefixes=CHECK,CHECK-PREFER-AVX256
 
 target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
 target triple = "x86_64-apple-macosx10.9.0"
@@ -7,18 +7,19 @@ target triple = "x86_64-apple-macosx10.9.0"
 ; Verify that we generate 512-bit wide vectors for a basic integer memset
 ; loop.
 
-; CHECK-LABEL: _f:
-; CHECK: %vec.epilog.vector.body
-; CHECK: %ymm
-; CHECK: %vector.body
-; CHECK-NOT: %ymm
-; CHECK: vmovdqu64 %zmm{{.}},
+; CHECK-NO-PREFER-LABEL: @f(
+; CHECK-NO-PREFER: vector.body:
+; CHECK-NO-PREFER: store <16 x i32>
+; CHECK-NO-PREFER: vec.epilog.vector.body:
+; CHECK-NO-PREFER: store <8 x i32>
 
 ; Verify that we don't generate 512-bit wide vectors when subtarget feature says not to
 
-; CHECK-PREFER-AVX256-LABEL: f:
-; CHECK-PREFER-AVX256: vmovdqu %ymm{{.}},
-; CHECK-PREFER-AVX256-NOT: %zmm
+; CHECK-PREFER-AVX256-LABEL: @f(
+; CHECK-PREFER-AVX256: vector.body:
+; CHECK-PREFER-AVX256: store <8 x i32>
+; CHECK-PREFER-AVX256: vec.epilog.vector.body:
+; CHECK-PREFER-AVX256: store <4 x i32>
 
 define void @f(ptr %a, i32 %n) {
 entry:
@@ -47,13 +48,11 @@ for.end:                                          ; preds = %for.end.loopexit, %
 ; Verify that the "prefer-vector-width=256" attribute prevents the use of 512-bit
 ; vectors
 
-; CHECK-LABEL: _g:
-; CHECK: vmovdqu %ymm{{.}},
-; CHECK-NOT: %zmm
-
-; CHECK-PREFER-AVX256-LABEL: g:
-; CHECK-PREFER-AVX256: vmovdqu %ymm{{.}},
-; CHECK-PREFER-AVX256-NOT: %zmm
+; CHECK-LABEL: @g(
+; CHECK: vector.body:
+; CHECK: store <8 x i32>
+; CHECK: vec.epilog.vector.body:
+; CHECK: store <4 x i32>
 
 define void @g(ptr %a, i32 %n) "prefer-vector-width"="256" {
 entry:
@@ -82,19 +81,11 @@ for.end:                                          ; preds = %for.end.loopexit, %
 ; Verify that the "prefer-vector-width=512" attribute override the subtarget
 ; vectors
 
-; CHECK-LABEL: _h:
-; CHECK: %vec.epilog.vector.body
-; CHECK: %ymm
-; CHECK: %vector.body
-; CHECK: vmovdqu64 %zmm{{.}},
-; CHECK-NOT: %ymm
-
-; CHECK-PREFER-AVX256-LABEL: h:
-; CHECK-PREFER-AVX256: %vec.epilog.vector.body
-; CHECK-PREFER-AVX256: %ymm
-; CHECK-PREFER-AVX256: %vector.body
-; CHECK-PREFER-AVX256: vmovdqu64 %zmm{{.}},
-; CHECK-PREFER-AVX256-NOT: %ymm
+; CHECK-LABEL: @h(
+; CHECK: vector.body:
+; CHECK: store <16 x i32>
+; CHECK: vec.epilog.vector.body:
+; CHECK: store <8 x i32>
 
 define void @h(ptr %a, i32 %n) "prefer-vector-width"="512" {
 entry:
diff --git a/llvm/test/Transforms/LoopVectorize/X86/fp32_to_uint32-cost-model.ll b/llvm/test/Transforms/LoopVectorize/X86/fp32_to_uint32-cost-model.ll
index 4d92c1a3cf424..bb8a1228ec0d2 100644
--- a/llvm/test/Transforms/LoopVectorize/X86/fp32_to_uint32-cost-model.ll
+++ b/llvm/test/Transforms/LoopVectorize/X86/fp32_to_uint32-cost-model.ll
@@ -1,4 +1,4 @@
-; RUN: opt < %s -mcpu=core-avx2 -passes=loop-vectorize -S | llc -mcpu=core-avx2 | FileCheck %s
+; RUN: opt < %s -mcpu=core-avx2 -passes=loop-vectorize -S | FileCheck %s
 
 target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
 target triple = "x86_64-apple-macosx"
@@ -8,7 +8,7 @@ target triple = "x86_64-apple-macosx"
 
 ; If we need to scalarize the fptoui and then use inserts to build up the
 ; vector again, then there is certainly no value in going 256-bit wide.
-; CHECK-NOT: vinserti128
+; CHECK: fptoui <8 x float>
 
 define void @convert(i32 %N) {
 entry:
diff --git a/llvm/test/Transforms/LoopVectorize/X86/fp64_to_uint32-cost-model.ll b/llvm/test/Transforms/LoopVectorize/X86/fp64_to_uint32-cost-model.ll
index 03783d3a6c9fb..e2061756b19fb 100644
--- a/llvm/test/Transforms/LoopVectorize/X86/fp64_to_uint32-cost-model.ll
+++ b/llvm/test/Transforms/LoopVectorize/X86/fp64_to_uint32-cost-model.ll
@@ -1,4 +1,4 @@
-; RUN: opt < %s -mcpu=core-avx2 -passes=loop-vectorize -S | llc -mcpu=core-avx2 | FileCheck %s
+; RUN: opt < %s -mcpu=core-avx2 -passes=loop-vectorize -S | FileCheck %s
 
 target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
 target triple = "x86_64-apple-macosx"
@@ -9,7 +9,7 @@ target triple = "x86_64-apple-macosx"
 
 ; If we need to scalarize the fptoui and then use inserts to build up the
 ; vector again, then there is certainly no value in going 256-bit wide.
-; CHECK-NOT: vpinsrd
+; CHECK: fptoui <4 x double>
 
 define void @convert() {
 entry:
diff --git a/llvm/test/Transforms/LoopVectorize/X86/vectorization-remarks-loopid-dbg.ll b/llvm/test/Transforms/LoopVectorize/X86/vectorization-remarks-loopid-dbg.ll
index d774f778b7fdc..e1ecb70feb436 100644
--- a/llvm/test/Transforms/LoopVectorize/X86/vectorization-remarks-loopid-dbg.ll
+++ b/llvm/test/Transforms/LoopVectorize/X86/vectorization-remarks-loopid-dbg.ll
@@ -2,10 +2,6 @@
 ; RUN: opt < %s -passes=loop-vectorize -force-vector-width=1 -force-vector-interleave=4 -mtriple=x86_64-unknown-linux -S -pass-remarks='loop-vectorize' 2>&1 | FileCheck -check-prefix=UNROLLED %s
 ; RUN: opt < %s -passes=loop-vectorize -force-vector-width=1 -force-vector-interleave=1 -mtriple=x86_64-unknown-linux -S -pass-remarks-analysis='loop-vectorize' 2>&1 | FileCheck -check-prefix=NONE %s
 
-; RUN: llc < %s -mtriple x86_64-pc-linux-gnu -o - | FileCheck -check-prefix=DEBUG-OUTPUT %s
-; DEBUG-OUTPUT-NOT: .loc
-; DEBUG-OUTPUT-NOT: {{.*}}.debug_info
-
 ; VECTORIZED: remark: vectorization-remarks.c:17:8: vectorized loop (vectorization width: 4, interleaved count: 2)
 ; UNROLLED: remark: vectorization-remarks.c:17:8: interleaved loop (interleaved count: 4)
 ; NONE: remark: vectorization-remarks.c:17:8: loop not vectorized: vectorization and interleaving are explicitly disabled, or the loop has already been vectorized
diff --git a/llvm/test/Transforms/LoopVectorize/X86/vectorization-remarks.ll b/llvm/test/Transforms/LoopVectorize/X86/vectorization-remarks.ll
index f0b960c640562..8ba28042fcf2d 100644
--- a/llvm/test/Transforms/LoopVectorize/X86/vectorization-remarks.ll
+++ b/llvm/test/Transforms/LoopVectorize/X86/vectorization-remarks.ll
@@ -2,10 +2,6 @@
 ; RUN: opt < %s -passes=loop-vectorize -force-vector-width=1 -force-vector-interleave=4 -mtriple=x86_64-unknown-linux -S -pass-remarks='loop-vectorize' 2>&1 | FileCheck -check-prefix=UNROLLED %s
 ; RUN: opt < %s -passes=loop-vectorize -force-vector-width=1 -force-vector-interleave=1 -mtriple=x86_64-unknown-linux -S -pass-remarks-analysis='loop-vectorize' 2>&1 | FileCheck -check-prefix=NONE %s
 
-; RUN: llc < %s -mtriple x86_64-pc-linux-gnu -o - | FileCheck -check-prefix=DEBUG-OUTPUT %s
-; DEBUG-OUTPUT-NOT: .loc
-; DEBUG-OUTPUT-NOT: {{.*}}.debug_info
-
 ; VECTORIZED: remark: vectorization-remarks.c:17:8: vectorized loop (vectorization width: 4, interleaved count: 2)
 ; UNROLLED: remark: vectorization-remarks.c:17:8: interleaved loop (interleaved count: 4)
 ; NONE: remark: vectorization-remarks.c:17:8: loop not vectorized: vectorization and interleaving are explicitly disabled, or the loop has already been vectorized

@llvmbot
Member

llvmbot commented Aug 21, 2025

@llvm/pr-subscribers-llvm-transforms


llvm/test/Transforms/LoopVectorize/X86/fp32_to_uint32-cost-model.ll
@@ -8,7 +8,7 @@ target triple = "x86_64-apple-macosx"
 
 ; If we need to scalarize the fptoui and then use inserts to build up the
 ; vector again, then there is certainly no value in going 256-bit wide.
-; CHECK-NOT: vinserti128
+; CHECK: fptoui <8 x float>
Collaborator

@RKSimon RKSimon Aug 21, 2025

a better test might be:

; CHECK-NOT: fptoui <4 x float>

llvm/test/Transforms/LoopVectorize/X86/fp64_to_uint32-cost-model.ll
@@ -9,7 +9,7 @@ target triple = "x86_64-apple-macosx"
 
 ; If we need to scalarize the fptoui and then use inserts to build up the
 ; vector again, then there is certainly no value in going 256-bit wide.
-; CHECK-NOT: vpinsrd
+; CHECK: fptoui <4 x double>
Collaborator

a better test might be:

; CHECK-NOT: fptoui <2 x double>
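
The positive `CHECK` in the patch and the suggested `CHECK-NOT` accept different sets of outputs. The sketch below illustrates this with a deliberately simplified model of FileCheck: plain substring matching, with a lone `CHECK-NOT` applied to the whole input. The candidate lines are hypothetical one-line stand-ins written for this example, not real vectoriser output, and the helper names are invented.

```python
def passes_check(output, literal):
    """Simplified model of `; CHECK: <literal>`: the literal must appear."""
    return literal in output


def passes_check_not(output, literal):
    """Simplified model of a lone `; CHECK-NOT: <literal>`: it must not appear."""
    return literal not in output


# Hypothetical stand-ins for `opt -S` output at different vectorisation factors.
CANDIDATES = {
    "vf8": "%c = fptoui <8 x float> %v to <8 x i32>",
    "vf4": "%c = fptoui <4 x float> %v to <4 x i32>",
    "scalar": "%c = fptoui float %x to i32",
}


def accepted(predicate, literal):
    """Names of the candidate outputs that the given check would accept."""
    return sorted(name for name, text in CANDIDATES.items()
                  if predicate(text, literal))


print(accepted(passes_check, "fptoui <8 x float>"))      # ['vf8']
print(accepted(passes_check_not, "fptoui <4 x float>"))  # ['scalar', 'vf8']
```

Under this model the positive check pins the output to one specific vector type, while the negative check only rules one type out and so also accepts a loop that was not vectorised at all.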

Contributor

@fhahn fhahn left a comment

LGTM, thanks!

Collaborator

@RKSimon RKSimon left a comment

LGTM - cheers

@david-arm
Contributor Author

Rebased downstream and ran make check-all, seems fine.

@david-arm david-arm merged commit 958cec0 into llvm:main Aug 26, 2025
9 checks passed
@llvm-ci
Collaborator

llvm-ci commented Aug 26, 2025

LLVM Buildbot has detected a new failure on builder lldb-remote-linux-ubuntu running on as-builder-9 while building llvm at step 16 "test-check-lldb-api".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/195/builds/13752

Here is the relevant piece of the build log for the reference
Step 16 (test-check-lldb-api) failure: Test just built components: check-lldb-api completed (failure)
...
PASS: lldb-api :: functionalities/data-formatter/data-formatter-skip-summary/TestDataFormatterSkipSummary.py (388 of 1297)
PASS: lldb-api :: functionalities/data-formatter/data-formatter-python-synth/TestDataFormatterPythonSynth.py (389 of 1297)
PASS: lldb-api :: functionalities/data-formatter/data-formatter-stl/generic/atomic/TestDataFormatterStdAtomic.py (390 of 1297)
PASS: lldb-api :: functionalities/data-formatter/data-formatter-stl/generic/coroutine_handle/TestCoroutineHandle.py (391 of 1297)
PASS: lldb-api :: functionalities/data-formatter/data-formatter-stl/generic/function/TestDataFormatterStdFunction.py (392 of 1297)
PASS: lldb-api :: functionalities/data-formatter/data-formatter-stl/generic/chrono/TestDataFormatterStdChrono.py (393 of 1297)
PASS: lldb-api :: functionalities/data-formatter/data-formatter-stl/generic/initializer_list/TestDataFormatterStdInitializerList.py (394 of 1297)
PASS: lldb-api :: functionalities/data-formatter/data-formatter-stl/generic/bitset/TestDataFormatterGenericBitset.py (395 of 1297)
PASS: lldb-api :: functionalities/data-formatter/data-formatter-stl/generic/list/loop/TestDataFormatterGenericListLoop.py (396 of 1297)
UNRESOLVED: lldb-api :: functionalities/data-formatter/data-formatter-stl/generic/forward_list/TestDataFormatterGenericForwardList.py (397 of 1297)
******************** TEST 'lldb-api :: functionalities/data-formatter/data-formatter-stl/generic/forward_list/TestDataFormatterGenericForwardList.py' FAILED ********************
Script:
--
/usr/bin/python3.12 /home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/llvm-project/lldb/test/API/dotest.py -u CXXFLAGS -u CFLAGS --env LLVM_LIBS_DIR=/home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/./lib --env LLVM_INCLUDE_DIR=/home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/include --env LLVM_TOOLS_DIR=/home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/./bin --libcxx-include-dir /home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/include/c++/v1 --libcxx-include-target-dir /home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/include/aarch64-unknown-linux-gnu/c++/v1 --libcxx-library-dir /home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/./lib/aarch64-unknown-linux-gnu --arch aarch64 --build-dir /home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/lldb-test-build.noindex --lldb-module-cache-dir /home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/lldb-test-build.noindex/module-cache-lldb/lldb-api --clang-module-cache-dir /home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/lldb-test-build.noindex/module-cache-clang/lldb-api --executable /home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/./bin/lldb --compiler /home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/bin/clang --dsymutil /home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/./bin/dsymutil --make /usr/bin/gmake --llvm-tools-dir /home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/./bin --lldb-obj-root /home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/tools/lldb --lldb-libs-dir /home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/./lib --cmake-build-type Release --platform-url connect://jetson-agx-2198.lab.llvm.org:1234 --platform-working-dir /home/ubuntu/lldb-tests --sysroot /mnt/fs/jetson-agx-ubuntu --env ARCH_CFLAGS=-mcpu=cortex-a78 --platform-name remote-linux --skip-category=lldb-server 
/home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/llvm-project/lldb/test/API/functionalities/data-formatter/data-formatter-stl/generic/forward_list -p TestDataFormatterGenericForwardList.py
--
Exit Code: -11

Command Output (stdout):
--
lldb version 22.0.0git (https://github.com/llvm/llvm-project.git revision 958cec0ab1bbbdc47ea207460de72c5fee24be70)
  clang revision 958cec0ab1bbbdc47ea207460de72c5fee24be70
  llvm revision 958cec0ab1bbbdc47ea207460de72c5fee24be70

--
Command Output (stderr):
--
WARNING:root:Custom libc++ is not supported for remote runs: ignoring --libcxx arguments
UNSUPPORTED: LLDB (/home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/bin/clang-aarch64) :: test_libcpp_dsym (TestDataFormatterGenericForwardList.TestDataFormatterGenericForwardList.test_libcpp_dsym) (test case does not fall in any category of interest for this run) 
PASS: LLDB (/home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/bin/clang-aarch64) :: test_libcpp_dwarf (TestDataFormatterGenericForwardList.TestDataFormatterGenericForwardList.test_libcpp_dwarf)
PASS: LLDB (/home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/bin/clang-aarch64) :: test_libcpp_dwo (TestDataFormatterGenericForwardList.TestDataFormatterGenericForwardList.test_libcpp_dwo)
UNSUPPORTED: LLDB (/home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/bin/clang-aarch64) :: test_libstdcpp_dsym (TestDataFormatterGenericForwardList.TestDataFormatterGenericForwardList.test_libstdcpp_dsym) (test case does not fall in any category of interest for this run) 
PASS: LLDB (/home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/bin/clang-aarch64) :: test_libstdcpp_dwarf (TestDataFormatterGenericForwardList.TestDataFormatterGenericForwardList.test_libstdcpp_dwarf)
PASS: LLDB (/home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/bin/clang-aarch64) :: test_libstdcpp_dwo (TestDataFormatterGenericForwardList.TestDataFormatterGenericForwardList.test_libstdcpp_dwo)
UNSUPPORTED: LLDB (/home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/bin/clang-aarch64) :: test_msvcstl_dsym (TestDataFormatterGenericForwardList.TestDataFormatterGenericForwardList.test_msvcstl_dsym) (test case does not fall in any category of interest for this run) 
UNSUPPORTED: LLDB (/home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/bin/clang-aarch64) :: test_msvcstl_dwarf (TestDataFormatterGenericForwardList.TestDataFormatterGenericForwardList.test_msvcstl_dwarf) (test case does not fall in any category of interest for this run) 
UNSUPPORTED: LLDB (/home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/bin/clang-aarch64) :: test_msvcstl_dwo (TestDataFormatterGenericForwardList.TestDataFormatterGenericForwardList.test_msvcstl_dwo) (test case does not fall in any category of interest for this run) 
UNSUPPORTED: LLDB (/home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/bin/clang-aarch64) :: test_ptr_and_ref_libcpp_dsym (TestDataFormatterGenericForwardList.TestDataFormatterGenericForwardList.test_ptr_and_ref_libcpp_dsym) (test case does not fall in any category of interest for this run) 
PASS: LLDB (/home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/bin/clang-aarch64) :: test_ptr_and_ref_libcpp_dwarf (TestDataFormatterGenericForwardList.TestDataFormatterGenericForwardList.test_ptr_and_ref_libcpp_dwarf)
PASS: LLDB (/home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/bin/clang-aarch64) :: test_ptr_and_ref_libcpp_dwo (TestDataFormatterGenericForwardList.TestDataFormatterGenericForwardList.test_ptr_and_ref_libcpp_dwo)
UNSUPPORTED: LLDB (/home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/bin/clang-aarch64) :: test_ptr_and_ref_libstdcpp_dsym (TestDataFormatterGenericForwardList.TestDataFormatterGenericForwardList.test_ptr_and_ref_libstdcpp_dsym) (test case does not fall in any category of interest for this run) 
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Fatal Python error: Segmentation fault

Thread 0x000070bb54757080 (most recent call first):
  File "/home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/build/local/lib/python3.12/dist-packages/lldb/__init__.py", line 13074 in Launch
  File "/home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/llvm-project/lldb/packages/Python/lldbsuite/test/lldbutil.py", line 874 in run_to_breakpoint_do_run
  File "/home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/llvm-project/lldb/packages/Python/lldbsuite/test/lldbutil.py", line 1017 in run_to_source_breakpoint
  File "/home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/llvm-project/lldb/test/API/functionalities/data-formatter/data-formatter-stl/generic/forward_list/TestDataFormatterGenericForwardList.py", line 77 in do_test_ptr_and_ref
  File "/home/buildbot/worker/as-builder-9/lldb-remote-linux-ubuntu/llvm-project/lldb/test/API/functionalities/data-formatter/data-formatter-stl/generic/forward_list/TestDataFormatterGenericForwardList.py", line 161 in test_ptr_and_ref_libstdcpp
