-
Notifications
You must be signed in to change notification settings - Fork 15k
[LoopVectorizer][AArch64] Add a -sve-vscale-for-tuning override option. #156916
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-backend-aarch64 @llvm/pr-subscribers-llvm-transforms Author: David Green (davemgreen) ChangesIt can be useful for debugging and tuning to be able to alter the VScaleForTuning. This adds a quick option to the vectorizer for it. It overrides the VScaleForTuning in the vectorizer even when the vscale is known, as the options is a "force". Full diff: https://github.com/llvm/llvm-project/pull/156916.diff 2 Files Affected:
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index 3fbeef1211954..bf84a571e679e 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -303,6 +303,10 @@ static cl::opt<bool> ForceTargetSupportsScalableVectors(
"Pretend that scalable vectors are supported, even if the target does "
"not support them. This flag should only be used for testing."));
+static cl::opt<unsigned>
+ VScaleForTuningOpt("force-vscale-for-tuning", cl::Hidden,
+ cl::desc("Force a vscale for tuning factor in the loop vectorizer"));
+
static cl::opt<unsigned> SmallLoopCost(
"small-loop-cost", cl::init(20), cl::Hidden,
cl::desc(
@@ -1473,6 +1477,11 @@ class LoopVectorizationCostModel {
/// vscale_range.min == vscale_range.max then return vscale_range.max, else
/// return the value returned by the corresponding TTI method.
void initializeVScaleForTuning() {
+ if (VScaleForTuningOpt.getNumOccurrences()) {
+ VScaleForTuning = VScaleForTuningOpt;
+ return;
+ }
+
const Function *Fn = TheLoop->getHeader()->getParent();
if (Fn->hasFnAttribute(Attribute::VScaleRange)) {
auto Attr = Fn->getFnAttribute(Attribute::VScaleRange);
diff --git a/llvm/test/Transforms/LoopVectorize/AArch64/scalable-vectorization-cost-tuning.ll b/llvm/test/Transforms/LoopVectorize/AArch64/scalable-vectorization-cost-tuning.ll
index c4aee69db70b3..16d3786681ffa 100644
--- a/llvm/test/Transforms/LoopVectorize/AArch64/scalable-vectorization-cost-tuning.ll
+++ b/llvm/test/Transforms/LoopVectorize/AArch64/scalable-vectorization-cost-tuning.ll
@@ -7,6 +7,10 @@
; RUN: -force-target-instruction-cost=1 -passes=loop-vectorize -S -debug-only=loop-vectorize --disable-output < %s 2>&1 \
; RUN: | FileCheck %s --check-prefixes=VSCALEFORTUNING1
+; RUN: opt -mtriple=aarch64 -mattr=+sve -mcpu=generic -force-vscale-for-tuning=2 \
+; RUN: -force-target-instruction-cost=1 -passes=loop-vectorize -S -debug-only=loop-vectorize --disable-output < %s 2>&1 \
+; RUN: | FileCheck %s --check-prefixes=VSCALEFORTUNING2
+
; RUN: opt -mtriple=aarch64 -mcpu=neoverse-v1 \
; RUN: -force-target-instruction-cost=1 -passes=loop-vectorize -S -debug-only=loop-vectorize --disable-output < %s 2>&1 \
; RUN: | FileCheck %s --check-prefixes=VSCALEFORTUNING2
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
bc701eb
to
23ced2e
Compare
@@ -1473,6 +1477,11 @@ class LoopVectorizationCostModel { | |||
/// vscale_range.min == vscale_range.max then return vscale_range.max, else | |||
/// return the value returned by the corresponding TTI method. | |||
void initializeVScaleForTuning() { | |||
if (VScaleForTuningOpt.getNumOccurrences()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this! I like the idea of having a flag to force the choice of vscale, but should this be sanitised according to vscale_range on the function or at least emit a warning that it's architecturally unsupported? For example, if the function has the vscale_range(1, 16)
attribute then only power-of-2 vscale values are permitted leading to choices of 1,2,4,8 or 16.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking that as it is a tuning / debug option we would leave it to the user. I don't believe there is any technical reason why the factor we multiply by and the vscale_range need to match. It will just be the same as TTI.getVScaleForTuning() in the latest version though.
if (VScaleForTuningOpt.getNumOccurrences()) { | ||
VScaleForTuning = VScaleForTuningOpt; | ||
return; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the purpose of the option to deliberately override the function attribute?
Does this have to be LV specific? Could this be handled in getVScaleForTuning
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the purpose of the option to deliberately override the function attribute?
Either I think should be fine. As I said above I would let the user decide if they wanted to test a particular option, even if the vscale_range said otherwise.
Does this have to be LV specific? Could this be handled in getVScaleForTuning?
Yeah that is the other option but it would mean duplicating per target. In this case it is also used for the gather/scatter overheads, so probably makes sense to override in the target. I will move it there in the new version.
It can be useful for debugging and tuning to be able to alter the VScaleForTuning. This adds a quick option to the vectorizer for it
23ced2e
to
bccc85b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thannks
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/163/builds/26126 Here is the relevant piece of the build log for the reference
|
It can be useful for debugging and tuning to be able to alter the VScaleForTuning. This adds a quick option to the aarch64 subtarget for it.