-
Notifications
You must be signed in to change notification settings - Fork 14.9k
[mlir][Transforms] Dialect Conversion: Fix folder implementation #150775
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[mlir][Transforms] Dialect Conversion: Fix folder implementation #150775
Conversation
@llvm/pr-subscribers-mlir-core @llvm/pr-subscribers-mlir Author: Matthias Springer (matthias-springer) ChangesFolders are almost like patterns: they can modify IR (in-place op modification) and create new IR (constant op materialization of a folded attribute). If the modified op or the newly-created op cannot be legalized, the folding must be rolled back. The previous implementation did not roll back in-place op modifications. This issue became apparent when moving the Various test cases started failing after moving the Full diff: https://github.com/llvm/llvm-project/pull/150775.diff 2 Files Affected:
diff --git a/mlir/lib/Transforms/Utils/DialectConversion.cpp b/mlir/lib/Transforms/Utils/DialectConversion.cpp
index df255cfcf3ec1..3ecd1fd43bf44 100644
--- a/mlir/lib/Transforms/Utils/DialectConversion.cpp
+++ b/mlir/lib/Transforms/Utils/DialectConversion.cpp
@@ -17,6 +17,7 @@
#include "mlir/IR/Operation.h"
#include "mlir/Interfaces/FunctionInterfaces.h"
#include "mlir/Rewrite/PatternApplicator.h"
+#include "llvm/ADT/ScopeExit.h"
#include "llvm/ADT/SmallPtrSet.h"
#include "llvm/Support/Debug.h"
#include "llvm/Support/FormatVariadic.h"
@@ -2216,22 +2217,45 @@ OperationLegalizer::legalizeWithFold(Operation *op,
rewriterImpl.logger.startLine() << "* Fold {\n";
rewriterImpl.logger.indent();
});
- (void)rewriterImpl;
+
+ // Clear pattern state, so that the next pattern application starts with a
+ // clean slate. (The op/block sets are populated by listener notifications.)
+ auto cleanup = llvm::make_scope_exit([&]() {
+ rewriterImpl.patternNewOps.clear();
+ rewriterImpl.patternModifiedOps.clear();
+ rewriterImpl.patternInsertedBlocks.clear();
+ });
+
+ // Upon failure, undo all changes made by the folder.
+ RewriterState curState = rewriterImpl.getCurrentState();
+ auto undoFolding = [&]() {
+ rewriterImpl.resetState(curState, std::string(op->getName().getStringRef()) + " folder");
+ return failure();
+ };
// Try to fold the operation.
StringRef opName = op->getName().getStringRef();
SmallVector<Value, 2> replacementValues;
SmallVector<Operation *, 2> newOps;
rewriter.setInsertionPoint(op);
+ rewriter.startOpModification(op);
if (failed(rewriter.tryFold(op, replacementValues, &newOps))) {
LLVM_DEBUG(logFailure(rewriterImpl.logger, "unable to fold"));
+ rewriter.cancelOpModification(op);
return failure();
}
+ rewriter.finalizeOpModification(op);
// An empty list of replacement values indicates that the fold was in-place.
// As the operation changed, a new legalization needs to be attempted.
- if (replacementValues.empty())
- return legalize(op, rewriter);
+ if (replacementValues.empty()) {
+ if (succeeded(legalize(op, rewriter)))
+ return success();
+ return undoFolding();
+ }
+
+ // Insert a replacement for 'op' with the folded replacement values.
+ rewriter.replaceOp(op, replacementValues);
// Recursively legalize any new constant operations.
for (Operation *newOp : newOps) {
@@ -2245,16 +2269,10 @@ OperationLegalizer::legalizeWithFold(Operation *op,
"op '" + opName +
"' folder rollback of IR modifications requested");
}
- // Legalization failed: erase all materialized constants.
- for (Operation *op : newOps)
- rewriter.eraseOp(op);
- return failure();
+ return undoFolding();
}
}
- // Insert a replacement for 'op' with the folded replacement values.
- rewriter.replaceOp(op, replacementValues);
-
LLVM_DEBUG(logSuccess(rewriterImpl.logger, ""));
return success();
}
diff --git a/mlir/test/Transforms/test-legalize-type-conversion.mlir b/mlir/test/Transforms/test-legalize-type-conversion.mlir
index db8bd0f6378d2..9bffe92b374d5 100644
--- a/mlir/test/Transforms/test-legalize-type-conversion.mlir
+++ b/mlir/test/Transforms/test-legalize-type-conversion.mlir
@@ -104,8 +104,8 @@ func.func @test_signature_conversion_no_converter() {
"test.signature_conversion_no_converter"() ({
// expected-error@below {{failed to legalize unresolved materialization from ('f64') to ('f32') that remained live after conversion}}
^bb0(%arg0: f32):
- "test.type_consumer"(%arg0) : (f32) -> ()
// expected-note@below{{see existing live user here}}
+ "test.type_consumer"(%arg0) : (f32) -> ()
"test.return"(%arg0) : (f32) -> ()
}) : () -> ()
return
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
@@ -2245,16 +2269,10 @@ OperationLegalizer::legalizeWithFold(Operation *op, | |||
"op '" + opName + | |||
"' folder rollback of IR modifications requested"); | |||
} | |||
// Legalization failed: erase all materialized constants. | |||
for (Operation *op : newOps) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: This is no longer necessary. resetState
will erase these ops.
1edd3fd
to
45803b6
Compare
45803b6
to
5d65053
Compare
5d65053
to
7aefa35
Compare
7aefa35
to
3024f0a
Compare
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/94/builds/9240 Here is the relevant piece of the build log for the reference
|
…m#150775) Operation folders can do two things: 1. Modify IR (in-place op modification). Failing to legalize an in-place folded operation does not trigger an immediate rollback. This happens only if the driver decides to try a different lowering path, requiring it to roll back a bunch of modifications, including the application of the folder. 2. Create new IR (constant op materialization of a folded attribute). Failing to legalize a newly created constant op triggers an immediate rollback. In-place op modifications should be guarded by `startOpModification`/`finalizeOpModification` because they are no different from other in-place op modifications. (They just happen outside of a pattern, but that does not mean that we should not track those changes; we are tracking everything else.) This commit adds those two function calls. This commit also moves the `rewriter.replaceOp(op, replacementValues);` function call before the loop nest that legalizes the newly created constant ops (and therefore `replacementValues`). Conceptually, the folded op must be replaced before attempting to legalize the constants because the constant ops may themselves be replaced as part of their own legalization process. The previous implementation happened to work in the current conversion driver, but is incompatible with the One-Shot Dialect Conversion driver, which expects to see the most recent IR at all time. From an end-user perspective, this commit should be NFC. A common folder-rollback pattern that is exercised by multiple tests cases: A `memref.dim` is folded to `arith.constant`, but `arith.constant` is not marked as legal as per the conversion target, triggering a rollback. Note: Folding is generally unsafe in a dialect conversion (see llvm#92683), but that's a different issue. (In a One-Shot Dialect Conversion, it will no longer be unsafe.)
We have a couple test failures that bisect to this commit (e.g. bufferize.mlir). The issue appears to happen when What's the expected behavior here? Is this an issue w/ the test, e.g. it should only use tools that have all dialects registered? Or is this a regression introduced by this commit? |
Could it be that the root cause will be fixed with #151847 ? |
Yes, it does. Thanks @brnorris03 for the fix! |
Hey @matthias-springer! It seems that this PR has caused an tsan failure related to Shardy verification. The issue seems to be a race condition originating in verifyOnExit where it verifies ops in parallel: parallelForEach(op.getContext(), opsWithIsolatedRegions, [&](Operation *o){...});. The ops it runs over are all ops that are marked w/ the IsIsolatedFromAbove trait. For a given set of ops, we sometimes have the race: Read: For some context, Shardy uses symbols to define top level MeshOps, then attributes (either discardable or properties of SDY ops) reference the mesh in TensorShardingAttr. The SDY op that has IIUC, we aren't violating the constraints of A potential solution is to mark our ops as |
I am missing some context here. Is the folder-undo modifying the module? What is the race condition here? And does the verifier run during the dialect conversion? That seems a bit unusual.
What's not allowed: Accessing the parent op (ModuleOp) in a pass that operates on a nested op (function, etc.). That can cause race conditions because the pass processes the functions in parallel. (Maybe just reading is fine, but there seems to be both read + write here.) I'm not quite sure what that means for the design of a verifier.
|
Thanks for the quick response! The race is a read of
This definitely makes sense, but we are accessing it in the verify method of an op, not in the pass directly.
I'm not sure about this TBH, the pass is here Using However, as I mentioned above, we can't do this for verification of |
The context should always be the same. So indeed no observable difference.
I'm not sure how |
Hey Matthias, The issue isn't fully resolved it seems. Now we see a similar issue with This doesn't seem like an issue with Shardy verification, but maybe an issue with Dialect Conversion? (still related to |
Can you post the stack traces for the READ and WRITE? |
Operation folders can do two things:
In-place op modifications should be guarded by
startOpModification
/finalizeOpModification
because they are no different from other in-place op modifications. (They just happen outside of a pattern, but that does not mean that we should not track those changes; we are tracking everything else.) This commit adds those two function calls.This commit also moves the
rewriter.replaceOp(op, replacementValues);
function call before the loop nest that legalizes the newly created constant ops (and thereforereplacementValues
). Conceptually, the folded op must be replaced before attempting to legalize the constants because the constant ops may themselves be replaced as part of their own legalization process. The previous implementation happened to work in the current conversion driver, but is incompatible with the One-Shot Dialect Conversion driver, which expects to see the most recent IR at all time.From an end-user perspective, this commit should be NFC. A common folder-rollback pattern that is exercised by multiple tests cases: A
memref.dim
is folded toarith.constant
, butarith.constant
is not marked as legal as per the conversion target, triggering a rollback.Note: Folding is generally unsafe in a dialect conversion (see #92683), but that's a different issue. (In a One-Shot Dialect Conversion, it will no longer be unsafe.)