Skip to content

Conversation

wsmoses
Copy link
Member

@wsmoses wsmoses commented Aug 22, 2025

This removes the closure indirection, and removes the recursion on isValidSymbol. The rewriting of the recursion is particularly helpful to avoid redundant checks of isPure and checking the isValidSymbol of the operands for each parent region check

@llvmbot
Copy link
Member

llvmbot commented Aug 22, 2025

@llvm/pr-subscribers-mlir-affine

@llvm/pr-subscribers-mlir

Author: William Moses (wsmoses)

Changes

This adds an early exit condition, and removes the closure indirection, potentially enabling tail recursion elim, etc.


Full diff: https://github.com/llvm/llvm-project/pull/154924.diff

1 Files Affected:

  • (modified) mlir/lib/Dialect/Affine/IR/AffineOps.cpp (+10-4)
diff --git a/mlir/lib/Dialect/Affine/IR/AffineOps.cpp b/mlir/lib/Dialect/Affine/IR/AffineOps.cpp
index 22608a16cc1ab..53538d9c01428 100644
--- a/mlir/lib/Dialect/Affine/IR/AffineOps.cpp
+++ b/mlir/lib/Dialect/Affine/IR/AffineOps.cpp
@@ -465,10 +465,16 @@ bool mlir::affine::isValidSymbol(Value value, Region *region) {
     return true;
 
   // `Pure` operation that whose operands are valid symbolic identifiers.
-  if (isPure(defOp) && llvm::all_of(defOp->getOperands(), [&](Value operand) {
-        return affine::isValidSymbol(operand, region);
-      })) {
-    return true;
+  if (isPure(defOp)) {
+    bool allValid = true;
+    for (auto operand : defOp->getOperands()) {
+      if (!affine::isValidSymbol(operand, region)) {
+        allValid = false;
+        break;
+      }
+    }
+    if (allValid)
+      return true;
   }
 
   // Dim op results could be valid symbols at any level.

@joker-eph
Copy link
Collaborator

joker-eph commented Aug 22, 2025

This adds an early exit condition

Mmmm, how so?

potentially enabling tail recursion elim

Ditto: I don't see it?

@wsmoses wsmoses force-pushed the users/wsmoses/sym branch from c772acf to 1953029 Compare August 22, 2025 10:59
@wsmoses
Copy link
Member Author

wsmoses commented Aug 22, 2025

okay updated both the PR and comment. In particular, I'm trying to fix things when a single pass takes >2 mins with this profile (not tested this fixes, but is generally good regardless -- and seems likely):

Screenshot 2025-08-22 at 12 41 56 PM

@wsmoses
Copy link
Member Author

wsmoses commented Aug 22, 2025

In particular I believe the PR in its current form reduces the runtime from O( (# of ancestor regions) * (# of operands + [potentially recursive] isPure cost) ) -> O( (# of ancestor regions) + (# of operands + [potentially recursive] isPure cost) )

@wsmoses
Copy link
Member Author

wsmoses commented Aug 22, 2025

okay confirmed the impact of this diff:

BEFORE:
(base) wmoses@hydra:~/git/Enzyme-JaX$ time sudo perf record --call-graph=fp ./bazel-bin/enzymexlamlir-opt plaa.mlir --pass-pipeline="any(llvm-to-affine-access)" -o /dev/null
[ perf record: Woken up 585 times to write data ]
[ perf record: Captured and wrote 148.074 MB perf.data (490695 samples) ]

real    2m4.680s
user    0m0.003s
sys     0m0.008s
(base) wmoses@hydra:~/git/Enzyme-JaX$ sudo perf report

AFTER:
(base) wmoses@hydra:~/git/Enzyme-JaX$ time sudo perf record --call-graph=fp ./bazel-bin/enzymexlamlir-opt plaa.mlir --pass-pipeline="any(llvm-to-affine-access)" -o /dev/null
[sudo] password for wmoses: 
[ perf record: Woken up 96 times to write data ]
[ perf record: Captured and wrote 25.857 MB perf.data (50007 samples) ]

real    0m17.450s
user    0m0.008s
sys     0m0.011s

}
}
if (allValid)
return true;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain this hunk?
I

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I essentially inlined the llvm:all_of. So specifically it checks each operand if it is valid, and if any is not valid, doesn't return true. In addition to inlining (and thus not capturing/etc), there's an early break if it is known not valid. llvm::all_of forwards to std::all_of, which I don't know if is required to have an early break vs always iterate the entire range.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

which I don't know if is required to have an early break vs always iterate the entire range.

Yeah, well in absence of more evidence, please rollback this change.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so per https://en.cppreference.com/w/cpp/algorithm/all_any_none_of.html it looks like it's up to the stdc++ implementation. Which means that the existing llvm::all_of does not guarantee to early exit, whereas this does.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you believe this is a problem, please send a patch to fix llvm::all_of.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

made like a change from 3m56s on the big code to 3m57s, putting this part back to all_of

@wsmoses
Copy link
Member Author

wsmoses commented Aug 22, 2025

on our larger test code this also is confirmed to reduce a compile time of 1hr (and counting) to 3 minutes

Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
@wsmoses wsmoses merged commit 551cd1b into main Aug 22, 2025
9 checks passed
@wsmoses wsmoses deleted the users/wsmoses/sym branch August 22, 2025 12:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants