[LoopVectorizer][AArch64] Add a -sve-vscale-for-tuning override option. #156916

davemgreen · 2025-09-04T15:48:56Z

It can be useful for debugging and tuning to be able to alter the VScaleForTuning. This adds a quick option to the aarch64 subtarget for it.

llvmbot · 2025-09-04T15:49:29Z

@llvm/pr-subscribers-backend-aarch64
@llvm/pr-subscribers-vectorizers

@llvm/pr-subscribers-llvm-transforms

Author: David Green (davemgreen)

Changes

It can be useful for debugging and tuning to be able to alter the VScaleForTuning. This adds a quick option to the vectorizer for it. It overrides the VScaleForTuning in the vectorizer even when the vscale is known, as the option is a "force".

Full diff: https://github.com/llvm/llvm-project/pull/156916.diff

2 Files Affected:

(modified) llvm/lib/Transforms/Vectorize/LoopVectorize.cpp (+9)
(modified) llvm/test/Transforms/LoopVectorize/AArch64/scalable-vectorization-cost-tuning.ll (+4)

diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index 3fbeef1211954..bf84a571e679e 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -303,6 +303,10 @@ static cl::opt<bool> ForceTargetSupportsScalableVectors(
         "Pretend that scalable vectors are supported, even if the target does "
         "not support them. This flag should only be used for testing."));
 
+static cl::opt<unsigned>
+    VScaleForTuningOpt("force-vscale-for-tuning", cl::Hidden,
+                       cl::desc("Force a vscale for tuning factor in the loop vectorizer"));
+
 static cl::opt<unsigned> SmallLoopCost(
     "small-loop-cost", cl::init(20), cl::Hidden,
     cl::desc(
@@ -1473,6 +1477,11 @@ class LoopVectorizationCostModel {
   /// vscale_range.min == vscale_range.max then return vscale_range.max, else
   /// return the value returned by the corresponding TTI method.
   void initializeVScaleForTuning() {
+    if (VScaleForTuningOpt.getNumOccurrences()) {
+      VScaleForTuning = VScaleForTuningOpt;
+      return;
+    }
+
     const Function *Fn = TheLoop->getHeader()->getParent();
     if (Fn->hasFnAttribute(Attribute::VScaleRange)) {
       auto Attr = Fn->getFnAttribute(Attribute::VScaleRange);
diff --git a/llvm/test/Transforms/LoopVectorize/AArch64/scalable-vectorization-cost-tuning.ll b/llvm/test/Transforms/LoopVectorize/AArch64/scalable-vectorization-cost-tuning.ll
index c4aee69db70b3..16d3786681ffa 100644
--- a/llvm/test/Transforms/LoopVectorize/AArch64/scalable-vectorization-cost-tuning.ll
+++ b/llvm/test/Transforms/LoopVectorize/AArch64/scalable-vectorization-cost-tuning.ll
@@ -7,6 +7,10 @@
 ; RUN:     -force-target-instruction-cost=1 -passes=loop-vectorize -S -debug-only=loop-vectorize --disable-output < %s 2>&1 \
 ; RUN:     | FileCheck %s --check-prefixes=VSCALEFORTUNING1
 
+; RUN: opt -mtriple=aarch64 -mattr=+sve -mcpu=generic -force-vscale-for-tuning=2 \
+; RUN:     -force-target-instruction-cost=1 -passes=loop-vectorize -S -debug-only=loop-vectorize --disable-output < %s 2>&1 \
+; RUN:     | FileCheck %s --check-prefixes=VSCALEFORTUNING2
+
 ; RUN: opt -mtriple=aarch64 -mcpu=neoverse-v1 \
 ; RUN:     -force-target-instruction-cost=1 -passes=loop-vectorize -S -debug-only=loop-vectorize --disable-output < %s 2>&1 \
 ; RUN:     | FileCheck %s --check-prefixes=VSCALEFORTUNING2
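
The lookup order that the `initializeVScaleForTuning` hunk establishes can be modelled outside LLVM. The following is a minimal standalone sketch, not the LLVM implementation: `pickVScaleForTuning` and `VScaleRange` are hypothetical names, and `std::optional` stands in for `getNumOccurrences()`, the `vscale_range` attribute, and the TTI query.

```cpp
#include <optional>

// Hypothetical stand-in for a function's vscale_range(min, max) attribute.
struct VScaleRange {
  unsigned Min;
  std::optional<unsigned> Max; // unset when the attribute has no upper bound
};

// Models the lookup order: a forced option wins, then a vscale_range whose
// bounds coincide, then whatever the target reports for tuning.
std::optional<unsigned>
pickVScaleForTuning(std::optional<unsigned> Forced,
                    std::optional<VScaleRange> Range,
                    std::optional<unsigned> TargetTuning) {
  if (Forced)
    return Forced; // the option is a "force": it is not checked against Range
  if (Range && Range->Max && Range->Min == *Range->Max)
    return Range->Max; // vscale is known exactly
  return TargetTuning;
}
```

In this model a forced value of 2 is returned even for a function whose range pins vscale to 4, which mirrors the early `return` in the hunk.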

github-actions · 2025-09-04T15:52:54Z

✅ With the latest revision this PR passed the C/C++ code formatter.

david-arm · 2025-09-04T15:54:38Z

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp

@@ -1473,6 +1477,11 @@ class LoopVectorizationCostModel {
  /// vscale_range.min == vscale_range.max then return vscale_range.max, else
  /// return the value returned by the corresponding TTI method.
  void initializeVScaleForTuning() {
+    if (VScaleForTuningOpt.getNumOccurrences()) {


Thanks for this! I like the idea of having a flag to force the choice of vscale, but should this be sanitised according to vscale_range on the function, or at least emit a warning that it's architecturally unsupported? For example, if the function has the vscale_range(1, 16) attribute then only power-of-2 vscale values are permitted, leading to choices of 1, 2, 4, 8 or 16.

davemgreen · 2025-09-08

I was thinking that, as it is a tuning / debug option, we would leave it to the user. I don't believe there is any technical reason why the factor we multiply by and the vscale_range need to match. It will just be the same as TTI.getVScaleForTuning() in the latest version though.
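
For reference, the sanitisation david-arm asks about, which the PR deliberately leaves out, could be sketched roughly as below. `isPlausibleVScale` is a hypothetical helper, not an LLVM API, and it takes the power-of-two restriction from the review comment as given.

```cpp
// Hypothetical helper: is a forced tuning value one that a function with
// vscale_range(Min, Max) could actually see at run time, assuming only
// power-of-two vscale values are architecturally valid?
bool isPlausibleVScale(unsigned Forced, unsigned Min, unsigned Max) {
  // A non-zero value is a power of two when it has exactly one bit set.
  bool IsPowerOfTwo = Forced != 0 && (Forced & (Forced - 1)) == 0;
  return IsPowerOfTwo && Forced >= Min && Forced <= Max;
}
```

A check like this would only be able to warn or reject; as the reply notes, nothing requires the tuning factor to match the range.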

fhahn · 2025-09-04T15:56:22Z

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp

+    if (VScaleForTuningOpt.getNumOccurrences()) {
+      VScaleForTuning = VScaleForTuningOpt;
+      return;
+    }


Is the purpose of the option to deliberately override the function attribute?

Does this have to be LV specific? Could this be handled in getVScaleForTuning?

davemgreen · 2025-09-08

> Is the purpose of the option to deliberately override the function attribute?

Either should be fine, I think. As I said above, I would let the user decide if they wanted to test a particular option, even if the vscale_range said otherwise.

> Does this have to be LV specific? Could this be handled in getVScaleForTuning?

Yeah, that is the other option, but it would mean duplicating it per target. In this case it is also used for the gather/scatter overheads, so it probably makes sense to override in the target. I will move it there in the new version.
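
The design point in this reply, that overriding in the target lets every consumer of the tuning value see the same forced number, can be illustrated with a small standalone model. This is a sketch under stated assumptions and not the AArch64 code: `TargetModel`, `estimatedLanes`, and `gatherOverhead` are hypothetical, and the cost formulas are placeholders.

```cpp
#include <optional>

// Hypothetical model of a target whose tuning vscale can be forced.
class TargetModel {
  unsigned DefaultVScaleForTuning;
  std::optional<unsigned> ForcedVScale; // set only when the option is given

public:
  explicit TargetModel(unsigned Default,
                       std::optional<unsigned> Forced = std::nullopt)
      : DefaultVScaleForTuning(Default), ForcedVScale(Forced) {}

  // Single source of truth, playing the role of getVScaleForTuning().
  unsigned getVScaleForTuning() const {
    return ForcedVScale.value_or(DefaultVScaleForTuning);
  }
};

// Two illustrative consumers; the formulas are placeholders, not LLVM's.
unsigned estimatedLanes(const TargetModel &T, unsigned MinLanes) {
  return MinLanes * T.getVScaleForTuning();
}

unsigned gatherOverhead(const TargetModel &T, unsigned PerElementCost,
                        unsigned MinLanes) {
  return PerElementCost * MinLanes * T.getVScaleForTuning();
}
```

Because both consumers go through the same accessor, forcing the value in one place changes the vectorizer-style estimate and the gather/scatter-style estimate together, which a vectorizer-only option would not do.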

It can be useful for debugging and tuning to be able to alter the VScaleForTuning. This adds a quick option to the vectorizer for it

david-arm

LGTM!

fhahn

LGTM, thanks

llvm-ci · 2025-09-09T09:50:01Z

LLVM Buildbot has detected a new failure on builder cross-project-tests-sie-ubuntu-dwarf5 running on doug-worker-1b while building llvm at step 6 "test-build-unified-tree-check-cross-project".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/163/builds/26126

Here is the relevant piece of the build log for the reference

Step 6 (test-build-unified-tree-check-cross-project) failure: test (failure)
******************** TEST 'cross-project-tests :: debuginfo-tests/dexter/feature_tests/commands/perfect/expect_step_kind/small_loop.cpp' FAILED ********************
Exit Code: 2

Command Output (stderr):
--
clang++ -O0 -glldb -std=gnu++11 /home/buildbot/buildbot-root/cross-project-tests-sie-ubuntu-dwarf5/llvm-project/cross-project-tests/debuginfo-tests/dexter/feature_tests/commands/perfect/expect_step_kind/small_loop.cpp -o /home/buildbot/buildbot-root/cross-project-tests-sie-ubuntu-dwarf5/build/projects/cross-project-tests/debuginfo-tests/dexter/feature_tests/commands/perfect/expect_step_kind/Output/small_loop.cpp.tmp # RUN: at line 8
+ clang++ -O0 -glldb -std=gnu++11 /home/buildbot/buildbot-root/cross-project-tests-sie-ubuntu-dwarf5/llvm-project/cross-project-tests/debuginfo-tests/dexter/feature_tests/commands/perfect/expect_step_kind/small_loop.cpp -o /home/buildbot/buildbot-root/cross-project-tests-sie-ubuntu-dwarf5/build/projects/cross-project-tests/debuginfo-tests/dexter/feature_tests/commands/perfect/expect_step_kind/Output/small_loop.cpp.tmp
"/usr/bin/python3.10" "/home/buildbot/buildbot-root/cross-project-tests-sie-ubuntu-dwarf5/llvm-project/cross-project-tests/debuginfo-tests/dexter/dexter.py" test --fail-lt 1.0 -w --debugger lldb-dap --lldb-executable "/home/buildbot/buildbot-root/cross-project-tests-sie-ubuntu-dwarf5/build/bin/lldb-dap"  --dap-message-log=/home/buildbot/buildbot-root/cross-project-tests-sie-ubuntu-dwarf5/build/projects/cross-project-tests/debuginfo-tests/dexter/feature_tests/commands/perfect/expect_step_kind/Output/small_loop.cpp.tmp.dap.log --binary /home/buildbot/buildbot-root/cross-project-tests-sie-ubuntu-dwarf5/build/projects/cross-project-tests/debuginfo-tests/dexter/feature_tests/commands/perfect/expect_step_kind/Output/small_loop.cpp.tmp -- /home/buildbot/buildbot-root/cross-project-tests-sie-ubuntu-dwarf5/llvm-project/cross-project-tests/debuginfo-tests/dexter/feature_tests/commands/perfect/expect_step_kind/small_loop.cpp | /home/buildbot/buildbot-root/cross-project-tests-sie-ubuntu-dwarf5/build/bin/FileCheck --dump-input-context=999999999 /home/buildbot/buildbot-root/cross-project-tests-sie-ubuntu-dwarf5/llvm-project/cross-project-tests/debuginfo-tests/dexter/feature_tests/commands/perfect/expect_step_kind/small_loop.cpp # RUN: at line 9
+ /home/buildbot/buildbot-root/cross-project-tests-sie-ubuntu-dwarf5/build/bin/FileCheck --dump-input-context=999999999 /home/buildbot/buildbot-root/cross-project-tests-sie-ubuntu-dwarf5/llvm-project/cross-project-tests/debuginfo-tests/dexter/feature_tests/commands/perfect/expect_step_kind/small_loop.cpp
+ /usr/bin/python3.10 /home/buildbot/buildbot-root/cross-project-tests-sie-ubuntu-dwarf5/llvm-project/cross-project-tests/debuginfo-tests/dexter/dexter.py test --fail-lt 1.0 -w --debugger lldb-dap --lldb-executable /home/buildbot/buildbot-root/cross-project-tests-sie-ubuntu-dwarf5/build/bin/lldb-dap --dap-message-log=/home/buildbot/buildbot-root/cross-project-tests-sie-ubuntu-dwarf5/build/projects/cross-project-tests/debuginfo-tests/dexter/feature_tests/commands/perfect/expect_step_kind/Output/small_loop.cpp.tmp.dap.log --binary /home/buildbot/buildbot-root/cross-project-tests-sie-ubuntu-dwarf5/build/projects/cross-project-tests/debuginfo-tests/dexter/feature_tests/commands/perfect/expect_step_kind/Output/small_loop.cpp.tmp -- /home/buildbot/buildbot-root/cross-project-tests-sie-ubuntu-dwarf5/llvm-project/cross-project-tests/debuginfo-tests/dexter/feature_tests/commands/perfect/expect_step_kind/small_loop.cpp
-> {
  "type": "request",
  "command": "initialize",
  "arguments": {
    "clientID": "dexter",
    "adapterID": "lldb-dap",
    "pathFormat": "path",
    "linesStartAt1": true,
    "columnsStartAt1": true,
    "supportsVariableType": true,
    "supportsVariablePaging": true,
    "supportsRunInTerminalRequest": false
  },
  "seq": 1
}
<- {
  "body": {
    "$__lldb_version": "lldb version 22.0.0git (https://github.com/llvm/llvm-project.git revision 204917ea971517fdbe46ece977e42d766f0cfe77)\n  clang revision 204917ea971517fdbe46ece977e42d766f0cfe77\n  llvm revision 204917ea971517fdbe46ece977e42d766f0cfe77",
    "completionTriggerCharacters": [
      ".",
      " ",
      "\t"
    ],
    "exceptionBreakpointFilters": [
      {
        "description": "C++ Catch",
        "filter": "cpp_catch",
        "label": "C++ Catch",
        "supportsCondition": true
      },
      {
        "description": "C++ Throw",
        "filter": "cpp_throw",
        "label": "C++ Throw",
        "supportsCondition": true
      },
      {
        "description": "Objective-C Catch",
        "filter": "objc_catch",
        "label": "Objective-C Catch",
...

davemgreen requested review from fhahn, SamTebbs33, lukel97 and david-arm September 4, 2025 15:48

llvmbot added vectorizers llvm:transforms labels Sep 4, 2025

davemgreen force-pushed the gh-lv-vscalefortuningopt branch from bc701eb to 23ced2e Compare September 4, 2025 15:53

david-arm reviewed Sep 4, 2025

fhahn reviewed Sep 4, 2025

[LoopVectorizer] Add a -force-vscale-for-tuning override option.

fb034f0

It can be useful for debugging and tuning to be able to alter the VScaleForTuning. This adds a quick option to the vectorizer for it

davemgreen changed the title ~~[LoopVectorizer] Add a -force-vscale-for-tuning override option.~~ [LoopVectorizer][AArch64] Add a -sve-vscale-for-tuning override option. Sep 8, 2025

Move to AArch64

bccc85b

davemgreen force-pushed the gh-lv-vscalefortuningopt branch from 23ced2e to bccc85b Compare September 8, 2025 10:33

llvmbot added the backend:AArch64 label Sep 8, 2025

david-arm approved these changes Sep 9, 2025

fhahn approved these changes Sep 9, 2025

davemgreen merged commit 204917e into llvm:main Sep 9, 2025
10 checks passed

davemgreen deleted the gh-lv-vscalefortuningopt branch September 9, 2025 09:46
