Collaborator

@davemgreen davemgreen commented Sep 4, 2025

It can be useful for debugging and tuning to be able to alter the VScaleForTuning. This adds a quick option to the aarch64 subtarget for it.

@llvmbot
Member

llvmbot commented Sep 4, 2025

@llvm/pr-subscribers-backend-aarch64
@llvm/pr-subscribers-vectorizers

@llvm/pr-subscribers-llvm-transforms

Author: David Green (davemgreen)

Changes

It can be useful for debugging and tuning to be able to alter the VScaleForTuning. This adds a quick option to the vectorizer for it. It overrides the VScaleForTuning in the vectorizer even when the vscale is known, as the option is a "force".


Full diff: https://github.com/llvm/llvm-project/pull/156916.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/Vectorize/LoopVectorize.cpp (+9)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/scalable-vectorization-cost-tuning.ll (+4)
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index 3fbeef1211954..bf84a571e679e 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -303,6 +303,10 @@ static cl::opt<bool> ForceTargetSupportsScalableVectors(
         "Pretend that scalable vectors are supported, even if the target does "
         "not support them. This flag should only be used for testing."));
 
+static cl::opt<unsigned>
+    VScaleForTuningOpt("force-vscale-for-tuning", cl::Hidden,
+                       cl::desc("Force a vscale for tuning factor in the loop vectorizer"));
+
 static cl::opt<unsigned> SmallLoopCost(
     "small-loop-cost", cl::init(20), cl::Hidden,
     cl::desc(
@@ -1473,6 +1477,11 @@ class LoopVectorizationCostModel {
   /// vscale_range.min == vscale_range.max then return vscale_range.max, else
   /// return the value returned by the corresponding TTI method.
   void initializeVScaleForTuning() {
+    if (VScaleForTuningOpt.getNumOccurrences()) {
+      VScaleForTuning = VScaleForTuningOpt;
+      return;
+    }
+
     const Function *Fn = TheLoop->getHeader()->getParent();
     if (Fn->hasFnAttribute(Attribute::VScaleRange)) {
       auto Attr = Fn->getFnAttribute(Attribute::VScaleRange);
diff --git a/llvm/test/Transforms/LoopVectorize/AArch64/scalable-vectorization-cost-tuning.ll b/llvm/test/Transforms/LoopVectorize/AArch64/scalable-vectorization-cost-tuning.ll
index c4aee69db70b3..16d3786681ffa 100644
--- a/llvm/test/Transforms/LoopVectorize/AArch64/scalable-vectorization-cost-tuning.ll
+++ b/llvm/test/Transforms/LoopVectorize/AArch64/scalable-vectorization-cost-tuning.ll
@@ -7,6 +7,10 @@
 ; RUN:     -force-target-instruction-cost=1 -passes=loop-vectorize -S -debug-only=loop-vectorize --disable-output < %s 2>&1 \
 ; RUN:     | FileCheck %s --check-prefixes=VSCALEFORTUNING1
 
+; RUN: opt -mtriple=aarch64 -mattr=+sve -mcpu=generic -force-vscale-for-tuning=2 \
+; RUN:     -force-target-instruction-cost=1 -passes=loop-vectorize -S -debug-only=loop-vectorize --disable-output < %s 2>&1 \
+; RUN:     | FileCheck %s --check-prefixes=VSCALEFORTUNING2
+
 ; RUN: opt -mtriple=aarch64 -mcpu=neoverse-v1 \
 ; RUN:     -force-target-instruction-cost=1 -passes=loop-vectorize -S -debug-only=loop-vectorize --disable-output < %s 2>&1 \
 ; RUN:     | FileCheck %s --check-prefixes=VSCALEFORTUNING2
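
The precedence that the patched `initializeVScaleForTuning()` follows can be sketched outside of LLVM. This is a minimal model, assuming plain `std::optional` values in place of the real `cl::opt`, function attribute and TTI query; `VScaleRange` and `resolveVScaleForTuning` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-in for the vscale_range function attribute.
struct VScaleRange {
  unsigned Min;
  std::optional<unsigned> Max; // empty when the range has no upper bound
};

// Models the lookup order after the patch:
//   1. a forced value from the command line, if one was given;
//   2. vscale_range.max when vscale_range.min == vscale_range.max;
//   3. otherwise whatever the target reports.
std::optional<unsigned>
resolveVScaleForTuning(std::optional<unsigned> Forced,
                       std::optional<VScaleRange> Range,
                       std::optional<unsigned> TargetValue) {
  assert((!Range || !Range->Max || Range->Min <= *Range->Max) &&
         "vscale_range min must not exceed max");
  if (Forced)
    return Forced; // the option is a "force": it wins even if vscale is known
  if (Range && Range->Max && Range->Min == *Range->Max)
    return Range->Max;
  return TargetValue;
}
```

With a forced value of 2 and a fixed `vscale_range(1, 1)`, this model returns 2, which matches the "overrides even when the vscale is known" behaviour described above.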


github-actions bot commented Sep 4, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@davemgreen davemgreen force-pushed the gh-lv-vscalefortuningopt branch from bc701eb to 23ced2e Compare September 4, 2025 15:53
@@ -1473,6 +1477,11 @@ class LoopVectorizationCostModel {
   /// vscale_range.min == vscale_range.max then return vscale_range.max, else
   /// return the value returned by the corresponding TTI method.
   void initializeVScaleForTuning() {
+    if (VScaleForTuningOpt.getNumOccurrences()) {
Contributor

Thanks for this! I like the idea of having a flag to force the choice of vscale, but should this be sanitised according to vscale_range on the function or at least emit a warning that it's architecturally unsupported? For example, if the function has the vscale_range(1, 16) attribute then only power-of-2 vscale values are permitted leading to choices of 1,2,4,8 or 16.
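
For illustration, the kind of check being suggested here might look like the sketch below. It is not part of the PR: `isPowerOfTwo` and `isVScaleAllowed` are hypothetical helpers, and the power-of-2 rule is taken from the comment above rather than from any LLVM API.

```cpp
#include <cassert>

// Hypothetical helper: true for 1, 2, 4, 8, ...
bool isPowerOfTwo(unsigned V) { return V != 0 && (V & (V - 1)) == 0; }

// Hypothetical check of a forced vscale against vscale_range(Min, Max),
// following the rule stated in the review comment: the value must lie in
// the range and be a power of two.
bool isVScaleAllowed(unsigned Forced, unsigned Min, unsigned Max) {
  assert(Min <= Max && "vscale_range min must not exceed max");
  return isPowerOfTwo(Forced) && Forced >= Min && Forced <= Max;
}
```

For `vscale_range(1, 16)` this accepts 1, 2, 4, 8 and 16 and rejects values such as 3 or 32. A caller could use the result to emit a warning instead of rejecting the value outright.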

Collaborator Author

I was thinking that as it is a tuning / debug option we would leave it to the user. I don't believe there is any technical reason why the factor we multiply by and the vscale_range need to match. It will just be the same as TTI.getVScaleForTuning() in the latest version though.
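
To make "the factor we multiply by" concrete, here is a small sketch of how a tuning vscale can scale the estimated lane count of a scalable vectorization factor. `SketchVF` and `estimateLanes` are hypothetical names, and this is a simplified model rather than the cost model's actual code.

```cpp
#include <cassert>

// Hypothetical stand-in for a vectorization factor: a known minimum number
// of lanes, plus a flag saying whether it is multiplied by vscale at runtime.
struct SketchVF {
  unsigned KnownMin;
  bool Scalable;
};

// Estimated lanes per vector iteration. Fixed-width VFs ignore the tuning
// value; scalable VFs are multiplied by it.
unsigned estimateLanes(SketchVF VF, unsigned VScaleForTuning) {
  assert(VF.KnownMin > 0 && "a VF needs at least one lane");
  return VF.Scalable ? VF.KnownMin * VScaleForTuning : VF.KnownMin;
}
```

In this model the tuning value is only a multiplier, so a value such as 3 still produces a usable estimate (12 lanes for `vscale x 4`) even though it falls outside what a `vscale_range` would allow.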

Comment on lines 1480 to 1483
+    if (VScaleForTuningOpt.getNumOccurrences()) {
+      VScaleForTuning = VScaleForTuningOpt;
+      return;
+    }
Contributor

Is the purpose of the option to deliberately override the function attribute?

Does this have to be LV specific? Could this be handled in getVScaleForTuning?

Collaborator Author

> Is the purpose of the option to deliberately override the function attribute?

Either should be fine, I think. As I said above, I would let the user decide if they want to test a particular option, even if the vscale_range says otherwise.

> Does this have to be LV specific? Could this be handled in getVScaleForTuning?

Yeah, that is the other option, but it would mean duplicating it per target. In this case it is also used for the gather/scatter overheads, so it probably makes sense to override it in the target. I will move it there in the new version.
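
A rough sketch of why a target-level override reaches every consumer follows. All names here (`SketchSubtarget`, `vectorizerLaneEstimate`, `gatherScatterLaneEstimate`) are hypothetical and the arithmetic is deliberately simplified; it only illustrates the design choice and does not reproduce the AArch64 implementation.

```cpp
#include <cassert>
#include <optional>

// Hypothetical subtarget: holds the default tuning vscale and an optional
// override, as a debug option might supply.
struct SketchSubtarget {
  unsigned DefaultVScaleForTuning;
  std::optional<unsigned> Override;

  // Single query point: every consumer sees the override, if there is one.
  unsigned getVScaleForTuning() const {
    return Override.value_or(DefaultVScaleForTuning);
  }
};

// Two illustrative consumers that both go through the same hook.
unsigned vectorizerLaneEstimate(const SketchSubtarget &ST, unsigned KnownMin) {
  assert(KnownMin > 0 && "need at least one lane");
  return KnownMin * ST.getVScaleForTuning();
}

unsigned gatherScatterLaneEstimate(const SketchSubtarget &ST,
                                   unsigned KnownMin) {
  assert(KnownMin > 0 && "need at least one lane");
  return KnownMin * ST.getVScaleForTuning();
}
```

The trade-off is the one mentioned above: an override placed in the target hook has to be repeated for each target, whereas one in the vectorizer is written once but is invisible to other users of the hook.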

It can be useful for debugging and tuning to be able to alter the
VScaleForTuning. This adds a quick option to the vectorizer for it
@davemgreen davemgreen changed the title from "[LoopVectorizer] Add a -force-vscale-for-tuning override option." to "[LoopVectorizer][AArch64] Add a -sve-vscale-for-tuning override option." on Sep 8, 2025
Contributor

@david-arm david-arm left a comment

LGTM!

Contributor

@fhahn fhahn left a comment

LGTM, thanks

@davemgreen davemgreen merged commit 204917e into llvm:main Sep 9, 2025
10 checks passed
@davemgreen davemgreen deleted the gh-lv-vscalefortuningopt branch September 9, 2025 09:46
@llvm-ci
Collaborator

llvm-ci commented Sep 9, 2025

LLVM Buildbot has detected a new failure on builder cross-project-tests-sie-ubuntu-dwarf5 running on doug-worker-1b while building llvm at step 6 "test-build-unified-tree-check-cross-project".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/163/builds/26126

Here is the relevant piece of the build log for reference:
Step 6 (test-build-unified-tree-check-cross-project) failure: test (failure)
******************** TEST 'cross-project-tests :: debuginfo-tests/dexter/feature_tests/commands/perfect/expect_step_kind/small_loop.cpp' FAILED ********************
Exit Code: 2

Command Output (stderr):
--
clang++ -O0 -glldb -std=gnu++11 /home/buildbot/buildbot-root/cross-project-tests-sie-ubuntu-dwarf5/llvm-project/cross-project-tests/debuginfo-tests/dexter/feature_tests/commands/perfect/expect_step_kind/small_loop.cpp -o /home/buildbot/buildbot-root/cross-project-tests-sie-ubuntu-dwarf5/build/projects/cross-project-tests/debuginfo-tests/dexter/feature_tests/commands/perfect/expect_step_kind/Output/small_loop.cpp.tmp # RUN: at line 8
+ clang++ -O0 -glldb -std=gnu++11 /home/buildbot/buildbot-root/cross-project-tests-sie-ubuntu-dwarf5/llvm-project/cross-project-tests/debuginfo-tests/dexter/feature_tests/commands/perfect/expect_step_kind/small_loop.cpp -o /home/buildbot/buildbot-root/cross-project-tests-sie-ubuntu-dwarf5/build/projects/cross-project-tests/debuginfo-tests/dexter/feature_tests/commands/perfect/expect_step_kind/Output/small_loop.cpp.tmp
"/usr/bin/python3.10" "/home/buildbot/buildbot-root/cross-project-tests-sie-ubuntu-dwarf5/llvm-project/cross-project-tests/debuginfo-tests/dexter/dexter.py" test --fail-lt 1.0 -w --debugger lldb-dap --lldb-executable "/home/buildbot/buildbot-root/cross-project-tests-sie-ubuntu-dwarf5/build/bin/lldb-dap"  --dap-message-log=/home/buildbot/buildbot-root/cross-project-tests-sie-ubuntu-dwarf5/build/projects/cross-project-tests/debuginfo-tests/dexter/feature_tests/commands/perfect/expect_step_kind/Output/small_loop.cpp.tmp.dap.log --binary /home/buildbot/buildbot-root/cross-project-tests-sie-ubuntu-dwarf5/build/projects/cross-project-tests/debuginfo-tests/dexter/feature_tests/commands/perfect/expect_step_kind/Output/small_loop.cpp.tmp -- /home/buildbot/buildbot-root/cross-project-tests-sie-ubuntu-dwarf5/llvm-project/cross-project-tests/debuginfo-tests/dexter/feature_tests/commands/perfect/expect_step_kind/small_loop.cpp | /home/buildbot/buildbot-root/cross-project-tests-sie-ubuntu-dwarf5/build/bin/FileCheck --dump-input-context=999999999 /home/buildbot/buildbot-root/cross-project-tests-sie-ubuntu-dwarf5/llvm-project/cross-project-tests/debuginfo-tests/dexter/feature_tests/commands/perfect/expect_step_kind/small_loop.cpp # RUN: at line 9
+ /home/buildbot/buildbot-root/cross-project-tests-sie-ubuntu-dwarf5/build/bin/FileCheck --dump-input-context=999999999 /home/buildbot/buildbot-root/cross-project-tests-sie-ubuntu-dwarf5/llvm-project/cross-project-tests/debuginfo-tests/dexter/feature_tests/commands/perfect/expect_step_kind/small_loop.cpp
+ /usr/bin/python3.10 /home/buildbot/buildbot-root/cross-project-tests-sie-ubuntu-dwarf5/llvm-project/cross-project-tests/debuginfo-tests/dexter/dexter.py test --fail-lt 1.0 -w --debugger lldb-dap --lldb-executable /home/buildbot/buildbot-root/cross-project-tests-sie-ubuntu-dwarf5/build/bin/lldb-dap --dap-message-log=/home/buildbot/buildbot-root/cross-project-tests-sie-ubuntu-dwarf5/build/projects/cross-project-tests/debuginfo-tests/dexter/feature_tests/commands/perfect/expect_step_kind/Output/small_loop.cpp.tmp.dap.log --binary /home/buildbot/buildbot-root/cross-project-tests-sie-ubuntu-dwarf5/build/projects/cross-project-tests/debuginfo-tests/dexter/feature_tests/commands/perfect/expect_step_kind/Output/small_loop.cpp.tmp -- /home/buildbot/buildbot-root/cross-project-tests-sie-ubuntu-dwarf5/llvm-project/cross-project-tests/debuginfo-tests/dexter/feature_tests/commands/perfect/expect_step_kind/small_loop.cpp
-> {
  "type": "request",
  "command": "initialize",
  "arguments": {
    "clientID": "dexter",
    "adapterID": "lldb-dap",
    "pathFormat": "path",
    "linesStartAt1": true,
    "columnsStartAt1": true,
    "supportsVariableType": true,
    "supportsVariablePaging": true,
    "supportsRunInTerminalRequest": false
  },
  "seq": 1
}
<- {
  "body": {
    "$__lldb_version": "lldb version 22.0.0git (https://github.com/llvm/llvm-project.git revision 204917ea971517fdbe46ece977e42d766f0cfe77)\n  clang revision 204917ea971517fdbe46ece977e42d766f0cfe77\n  llvm revision 204917ea971517fdbe46ece977e42d766f0cfe77",
    "completionTriggerCharacters": [
      ".",
      " ",
      "\t"
    ],
    "exceptionBreakpointFilters": [
      {
        "description": "C++ Catch",
        "filter": "cpp_catch",
        "label": "C++ Catch",
        "supportsCondition": true
      },
      {
        "description": "C++ Throw",
        "filter": "cpp_throw",
        "label": "C++ Throw",
        "supportsCondition": true
      },
      {
        "description": "Objective-C Catch",
        "filter": "objc_catch",
        "label": "Objective-C Catch",
...
