Skip to content

Conversation

MaskRay
Copy link
Member

@MaskRay MaskRay commented Sep 4, 2025

As noted in #156787

Created using spr 1.3.5-bogner
@llvmbot
Copy link
Member

llvmbot commented Sep 4, 2025

@llvm/pr-subscribers-backend-aarch64

Author: Fangrui Song (MaskRay)

Changes

As noted in #156787


Full diff: https://github.com/llvm/llvm-project/pull/156802.diff

1 Files Affected:

  • (modified) llvm/lib/Target/AArch64/AArch64TargetMachine.cpp (+2-1)
diff --git a/llvm/lib/Target/AArch64/AArch64TargetMachine.cpp b/llvm/lib/Target/AArch64/AArch64TargetMachine.cpp
index e67bd5869ccd1..4650b2d0c8151 100644
--- a/llvm/lib/Target/AArch64/AArch64TargetMachine.cpp
+++ b/llvm/lib/Target/AArch64/AArch64TargetMachine.cpp
@@ -589,7 +589,8 @@ void AArch64TargetMachine::registerPassBuilderCallbacks(PassBuilder &PB) {
 
   PB.registerLateLoopOptimizationsEPCallback(
       [=](LoopPassManager &LPM, OptimizationLevel Level) {
-        LPM.addPass(LoopIdiomVectorizePass());
+        if (Level != OptimizationLevel::O0)
+          LPM.addPass(LoopIdiomVectorizePass());
       });
   if (getTargetTriple().isOSWindows())
     PB.registerPipelineEarlySimplificationEPCallback(

@madhur13490
Copy link
Contributor

I may be missing the broader context, while I do see the tagged issue. Is this purely because x86 does not run the pass? How do we decide what goes in O0?

.
Created using spr 1.3.5-bogner
@arsenm
Copy link
Contributor

arsenm commented Sep 4, 2025

I may be missing the broader context, while I do see the tagged issue. Is this purely because x86 does not run the pass? How do we decide what goes in O0?

As little was possible. The passes that are run at -O0 are maintaining historical semantic accidents, like always-inliner and should include no optimizations

I'd expect the pass builder to not call this hook in the first place, rather than requiring all targets to check this

@davemgreen
Copy link
Collaborator

Can we add a test that checks -mtriple aarch64 -print-pipeline-passes -O0? https://godbolt.org/z/Yf8ex6KMh

@davemgreen davemgreen requested a review from david-arm September 4, 2025 07:14
@arsenm arsenm requested a review from aeubanks September 4, 2025 07:21
@@ -589,7 +589,8 @@ void AArch64TargetMachine::registerPassBuilderCallbacks(PassBuilder &PB) {

PB.registerLateLoopOptimizationsEPCallback(
[=](LoopPassManager &LPM, OptimizationLevel Level) {
LPM.addPass(LoopIdiomVectorizePass());
if (Level != OptimizationLevel::O0)
LPM.addPass(LoopIdiomVectorizePass());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like these are invoked from buildO0DefaultPipeline which is strange, but then we also call invokeCGSCCOptimizerLateEPCallbacks, invokeLoopOptimizerEndEPCallbacks and invokeScalarOptimizerLateEPCallbacks from there too.

Perhaps better to guard the entire register function with the check, i.e.

  if (Level != OptimizationLevel::O0) {
    PB.registerLateLoopOptimizationsEPCallback(
        [=](LoopPassManager &LPM, OptimizationLevel Level) {
          LPM.addPass(LoopIdiomVectorizePass());
        });
  }

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Level variable is an argument to the lambda it doesn't exist outside.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, that's true, but I just thought it must be easy enough to find out what the optimisation level is and just not bothering calling registerLateLoopOptimizationsEPCallback at all. I guess it's fine at the moment because we're only registering one pass.

Created using spr 1.3.5-bogner
Copy link
Member Author

@MaskRay MaskRay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added llvm/test/CodeGen/AArch64/print-pipeline-passes.ll . CodeGen/AArch64 doesn't test this option before.

@@ -107,6 +107,7 @@ add_llvm_target(AArch64CodeGen
Core
GlobalISel
MC
Passes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this adding a new dependency?

Copy link
Member Author

@MaskRay MaskRay Sep 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OptimizationLevel::O0 is a variable defined by Passes .cpp file. This is needed to avoid an undefined symbol linker error. I don't find a way to move the definition from .cpp to .h as a static constexpr variable.

The class OptimizationLevel is not considered a literal class before }. Declararing a member constexpr variable of the same type is not allowed.

struct A {
    int a, b;
    constexpr A(int a, int b) : a(a), b(b) {}
    
    static constexpr A default_value{0, 0};
};

// would cause
error: constexpr variable cannot have non-literal type 'const A'
    6 |   static constexpr inline A default_value{0, 0};

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree it does seem a shame to add the extra dependency. Does #157057 help?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, it looks like #157057 may not go anywhere, so don't let me hold up this PR.

Copy link
Collaborator

@davemgreen davemgreen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding the test. LGTM.

; RUN: opt -mtriple=aarch64 -S -passes='default<O2>' -print-pipeline-passes < %s | FileCheck %s

; CHECK: loop-idiom-vectorize
; O0-NOT: loop-idiom-vectorize
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking of testing the entire pipeline for -O0 as it should be very small, and would prevent us from adding extra passes to it again. The other optimization levels are a bit long for that.

david-arm added a commit to david-arm/llvm-project that referenced this pull request Sep 5, 2025
Sometimes code depends upon the constant variables declared
in OptimizationLevel (e.g. see llvm#156802), which then requires
them to add a dependency on the Passes directory. Given that
code is much more likely to depend upon Support than Passes
I believe it makes sense to move OptimizationLevel to
Support. The Passes component already depends upon Support.
david-arm added a commit to david-arm/llvm-project that referenced this pull request Sep 5, 2025
Sometimes code depends upon the constant variables declared
in OptimizationLevel (e.g. see llvm#156802), which then requires
them to add a dependency on the Passes directory. Given that
code is much more likely to depend upon Support than Passes
I believe it makes sense to move OptimizationLevel to
Support. The Passes component already depends upon Support.
Created using spr 1.3.5-bogner
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants