-
Notifications
You must be signed in to change notification settings - Fork 24.9k
[jit] DeadCodeEliminator Mark(block) improvement #152348
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
This appears to be a diff that was exported from phabricator, but the PR author does not have sufficient permissions to run CI. @shinyehtsai, please do step 2 of internal wiki to get write access so you do not need to get CI approvals in the future. If you think this is a mistake, please contact the Pytorch Dev Infra team. |
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/152348
Note: Links to docs will display an error until the docs builds have been completed. ✅ You can merge normally! (1 Unrelated Failure)As of commit 41d799c with merge base 3a43dba ( BROKEN TRUNK - The following job failed but were present on the merge base:👉 Rebase onto the `viable/strict` branch to avoid these failures
This comment was automatically generated by Dr. CI and updates every 15 minutes. |
This pull request was exported from Phabricator. Differential Revision: D73476431 |
5a43806
to
f9260eb
Compare
This pull request was exported from Phabricator. Differential Revision: D73476431 |
f9260eb
to
fdfa1fd
Compare
This pull request was exported from Phabricator. Differential Revision: D73476431 |
fdfa1fd
to
166b417
Compare
This pull request was exported from Phabricator. Differential Revision: D73476431 |
166b417
to
d601a5d
Compare
This pull request was exported from Phabricator. Differential Revision: D73476431 |
d601a5d
to
f3512a4
Compare
This pull request was exported from Phabricator. Differential Revision: D73476431 |
f3512a4
to
45ce6db
Compare
This pull request was exported from Phabricator. Differential Revision: D73476431 |
45ce6db
to
9fe0374
Compare
This pull request was exported from Phabricator. Differential Revision: D73476431 |
This pull request was exported from Phabricator. Differential Revision: D73476431 |
In my latest run on a complicated model, the overall performance difference is around 40%. |
Summary: This diff seeks to optimize the DeadCodeEliminator within the mark(block) function. The primary concept is to prevent redundant traversals of a fully marked block, particularly in the markLoop scenario, if all nodes within a block are marked, we can subsequently mark the block as fully marked. Test Plan: Existing unittest. Differential Revision: D73476431
28257db
to
6e55810
Compare
Summary: This diff seeks to optimize the DeadCodeEliminator within the mark(block) function. The primary concept is to prevent redundant traversals of a fully marked block, particularly in the markLoop scenario, if all nodes within a block are marked, we can subsequently mark the block as fully marked. Test Plan: Existing unittest. Differential Revision: D73476431
This pull request was exported from Phabricator. Differential Revision: D73476431 |
@@ -3050,6 +3050,25 @@ TEST(TestShapeGraphLinting, Basic) { | |||
} | |||
} | |||
|
|||
TEST(DeadCodeEliminatorTest, TestConstructor) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This unit test doesn't seem to be meaningful. Could you add some meaningful test for dce pass? There wasn't comprehensive test before but now complexity is higher, we should have more test cases. You can add more here:
https://github.com/pytorch/pytorch/blob/main/test/jit/test_dce.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a new unittest is added.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you can add some python tests for more complex dce cases in test_dce.py, that would be great - otherwise LGTM!
6e55810
to
372b90e
Compare
Summary: **TL;DR**: This diff seeks to optimize the DCE within the mark(block) function. **Details** The goal of this PR is to optimize this function: ``` MarkResult mark(Block* block) { bool anyMarked = false; bool allMarked = true; // If all nodes within this block are already marked, we can safely bypass // revisiting it. This check is the primary driver of our performance // optimization. if (alreadyAllMarked_.count(block)) { return MarkResult(false, true); } ``` The primary concept is to prevent redundant traversals of a fully marked block, particularly in the markLoop scenario, if all nodes within a block are marked, we can subsequently mark the block as fully marked. Test Plan: Rely on unittest Differential Revision: D73476431
This pull request was exported from Phabricator. Differential Revision: D73476431 |
372b90e
to
3bf45b7
Compare
Summary: **TL;DR**: This diff seeks to optimize the DCE within the mark(block) function. **Details** The goal of this PR is to optimize this function: ``` MarkResult mark(Block* block) { bool anyMarked = false; bool allMarked = true; // If all nodes within this block are already marked, we can safely bypass // revisiting it. This check is the primary driver of our performance // optimization. if (alreadyAllMarked_.count(block)) { return MarkResult(false, true); } ``` The primary concept is to prevent redundant traversals of a fully marked block, particularly in the markLoop scenario, if all nodes within a block are marked, we can subsequently mark the block as fully marked. Test Plan: Rely on unittest Reviewed By: davidberard98 Differential Revision: D73476431
This pull request was exported from Phabricator. Differential Revision: D73476431 |
3bf45b7
to
875fd9e
Compare
Summary: **TL;DR**: This diff seeks to optimize the DCE within the mark(block) function. **Details** The goal of this PR is to optimize this function: ``` MarkResult mark(Block* block) { bool anyMarked = false; bool allMarked = true; // If all nodes within this block are already marked, we can safely bypass // revisiting it. This check is the primary driver of our performance // optimization. if (alreadyAllMarked_.count(block)) { return MarkResult(false, true); } ``` The primary concept is to prevent redundant traversals of a fully marked block, particularly in the markLoop scenario, if all nodes within a block are marked, we can subsequently mark the block as fully marked. Test Plan: Rely on unittest Reviewed By: davidberard98 Differential Revision: D73476431
This pull request was exported from Phabricator. Differential Revision: D73476431 |
1 similar comment
This pull request was exported from Phabricator. Differential Revision: D73476431 |
875fd9e
to
ad0364d
Compare
Summary: Pull Request resolved: #152348 **TL;DR**: This diff seeks to optimize the DCE within the mark(block) function. **Details** The goal of this PR is to optimize this function: ``` MarkResult mark(Block* block) { bool anyMarked = false; bool allMarked = true; // If all nodes within this block are already marked, we can safely bypass // revisiting it. This check is the primary driver of our performance // optimization. if (alreadyAllMarked_.count(block)) { return MarkResult(false, true); } ``` The primary concept is to prevent redundant traversals of a fully marked block, particularly in the markLoop scenario, if all nodes within a block are marked, we can subsequently mark the block as fully marked. Test Plan: Rely on unittest Reviewed By: davidberard98 Differential Revision: D73476431
Summary: Pull Request resolved: pytorch#152348 **TL;DR**: This diff seeks to optimize the DCE within the mark(block) function. **Details** The goal of this PR is to optimize this function: ``` MarkResult mark(Block* block) { bool anyMarked = false; bool allMarked = true; // If all nodes within this block are already marked, we can safely bypass // revisiting it. This check is the primary driver of our performance // optimization. if (alreadyAllMarked_.count(block)) { return MarkResult(false, true); } ``` The primary concept is to prevent redundant traversals of a fully marked block, particularly in the markLoop scenario, if all nodes within a block are marked, we can subsequently mark the block as fully marked. Test Plan: Rely on unittest Reviewed By: davidberard98 Differential Revision: D73476431
This pull request was exported from Phabricator. Differential Revision: D73476431 |
ad0364d
to
41d799c
Compare
Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as |
Summary:
This diff seeks to optimize the DeadCodeEliminator within the mark(block) function.
The primary concept is to prevent redundant traversals of a fully marked block, particularly in the markLoop scenario, if all nodes within a block are marked, we can subsequently mark the block as fully marked.
Test Plan: Existing unittest. Will add new soon
Differential Revision: D73476431
cc @EikanWang @jgong5 @wenzhe-nrv @sanchitintel