Skip to content

[jit] DeadCodeEliminator Mark(block) improvement #152348

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

shinyehtsai
Copy link

@shinyehtsai shinyehtsai commented Apr 28, 2025

Summary:
This diff seeks to optimize the DeadCodeEliminator within the mark(block) function.

The primary concept is to prevent redundant traversals of a fully marked block, particularly in the markLoop scenario, if all nodes within a block are marked, we can subsequently mark the block as fully marked.

Test Plan: Existing unittest. Will add new soon

Differential Revision: D73476431

cc @EikanWang @jgong5 @wenzhe-nrv @sanchitintel

Copy link

pytorch-bot bot commented Apr 28, 2025

This appears to be a diff that was exported from phabricator, but the PR author does not have sufficient permissions to run CI. @shinyehtsai, please do step 2 of internal wiki to get write access so you do not need to get CI approvals in the future. If you think this is a mistake, please contact the Pytorch Dev Infra team.

@pytorch-bot pytorch-bot bot added the release notes: jit release notes category label Apr 28, 2025
Copy link

pytorch-bot bot commented Apr 28, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/152348

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit 41d799c with merge base 3a43dba (image):

BROKEN TRUNK - The following job failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the oncall: jit Add this issue/PR to JIT oncall triage queue label Apr 28, 2025
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D73476431

@shinyehtsai shinyehtsai changed the title [Draft] DeadCodeEliminator Mark(block) improvement [WIP] DeadCodeEliminator Mark(block) improvement Apr 28, 2025
shinyehtsai added a commit to shinyehtsai/pytorch that referenced this pull request Apr 28, 2025
Summary:

This diff seeks to optimize the DeadCodeEliminator within the mark(block) function.

The primary concept is to prevent redundant traversals of a fully marked block, particularly in the markLoop scenario, if all nodes within a block are marked, we can subsequently mark the block as fully marked.

Test Plan: Existing unittest. Will add new soon

Differential Revision: D73476431
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D73476431

shinyehtsai added a commit to shinyehtsai/pytorch that referenced this pull request Apr 28, 2025
Summary:

This diff seeks to optimize the DeadCodeEliminator within the mark(block) function.

The primary concept is to prevent redundant traversals of a fully marked block, particularly in the markLoop scenario, if all nodes within a block are marked, we can subsequently mark the block as fully marked.

Test Plan: Existing unittest. Will add new soon

Differential Revision: D73476431
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D73476431

shinyehtsai added a commit to shinyehtsai/pytorch that referenced this pull request Apr 28, 2025
Summary:

This diff seeks to optimize the DeadCodeEliminator within the mark(block) function.

The primary concept is to prevent redundant traversals of a fully marked block, particularly in the markLoop scenario, if all nodes within a block are marked, we can subsequently mark the block as fully marked.

Test Plan: Existing unittest. Will add new soon

Differential Revision: D73476431
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D73476431

shinyehtsai added a commit to shinyehtsai/pytorch that referenced this pull request Apr 28, 2025
Summary:

This diff seeks to optimize the DeadCodeEliminator within the mark(block) function.

The primary concept is to prevent redundant traversals of a fully marked block, particularly in the markLoop scenario, if all nodes within a block are marked, we can subsequently mark the block as fully marked.

Test Plan: Existing unittest. Will add new soon

Differential Revision: D73476431
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D73476431

shinyehtsai added a commit to shinyehtsai/pytorch that referenced this pull request Apr 30, 2025
Summary:

This diff seeks to optimize the DeadCodeEliminator within the mark(block) function.

The primary concept is to prevent redundant traversals of a fully marked block, particularly in the markLoop scenario, if all nodes within a block are marked, we can subsequently mark the block as fully marked.

Test Plan: Existing unittest. Will add new soon

Differential Revision: D73476431
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D73476431

@shinyehtsai shinyehtsai requested a review from gmagogsfm April 30, 2025 20:23
shinyehtsai added a commit to shinyehtsai/pytorch that referenced this pull request Apr 30, 2025
Summary:

This diff seeks to optimize the DeadCodeEliminator within the mark(block) function.

The primary concept is to prevent redundant traversals of a fully marked block, particularly in the markLoop scenario, if all nodes within a block are marked, we can subsequently mark the block as fully marked.

Test Plan: Existing unittest. Will add new soon

Differential Revision: D73476431
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D73476431

shinyehtsai added a commit to shinyehtsai/pytorch that referenced this pull request Apr 30, 2025
Summary:

This diff seeks to optimize the DeadCodeEliminator within the mark(block) function.

The primary concept is to prevent redundant traversals of a fully marked block, particularly in the markLoop scenario, if all nodes within a block are marked, we can subsequently mark the block as fully marked.

Test Plan: Existing unittest. Will add new soon

Differential Revision: D73476431
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D73476431

@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D73476431

@shinyehtsai
Copy link
Author

Did you get any benchmarking results on this pass? like how much faster it is than before and how much additional memory is used.

In my latest run on a complicated model, the overall performance difference is around 40%.
106773ms -> 66071 ms (post-PR).

shinyehtsai added a commit to shinyehtsai/pytorch that referenced this pull request May 12, 2025
Summary:

This diff seeks to optimize the DeadCodeEliminator within the mark(block) function.

The primary concept is to prevent redundant traversals of a fully marked block, particularly in the markLoop scenario, if all nodes within a block are marked, we can subsequently mark the block as fully marked.

Test Plan: Existing unittest.

Differential Revision: D73476431
@shinyehtsai shinyehtsai requested a review from gmagogsfm May 12, 2025 18:56
@shinyehtsai shinyehtsai changed the title [WIP] DeadCodeEliminator Mark(block) improvement [jit] DeadCodeEliminator Mark(block) improvement May 12, 2025
@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label May 12, 2025
shinyehtsai added a commit to shinyehtsai/pytorch that referenced this pull request May 12, 2025
Summary:

This diff seeks to optimize the DeadCodeEliminator within the mark(block) function.

The primary concept is to prevent redundant traversals of a fully marked block, particularly in the markLoop scenario, if all nodes within a block are marked, we can subsequently mark the block as fully marked.

Test Plan: Existing unittest.

Differential Revision: D73476431
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D73476431

@@ -3050,6 +3050,25 @@ TEST(TestShapeGraphLinting, Basic) {
}
}

TEST(DeadCodeEliminatorTest, TestConstructor) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This unit test doesn't seem to be meaningful. Could you add some meaningful test for dce pass? There wasn't comprehensive test before but now complexity is higher, we should have more test cases. You can add more here:
https://github.com/pytorch/pytorch/blob/main/test/jit/test_dce.py

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a new unittest is added.

Copy link
Contributor

@davidberard98 davidberard98 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you can add some python tests for more complex dce cases in test_dce.py, that would be great - otherwise LGTM!

pytorch-bot bot pushed a commit that referenced this pull request Jun 5, 2025
Summary:

**TL;DR**: This diff seeks to optimize the DCE within the mark(block) function.

**Details**
The goal of this PR is to optimize this function:
```
MarkResult mark(Block* block) {
    bool anyMarked = false;
    bool allMarked = true;

    // If all nodes within this block are already marked, we can safely bypass
    // revisiting it. This check is the primary driver of our performance
    // optimization.
    if (alreadyAllMarked_.count(block)) {
      return MarkResult(false, true);
    }

```

The primary concept is to prevent redundant traversals of a fully marked block, particularly in the markLoop scenario, if all nodes within a block are marked, we can subsequently mark the block as fully marked.

Test Plan: Rely on unittest

Differential Revision: D73476431
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D73476431

@shinyehtsai shinyehtsai requested a review from gmagogsfm June 5, 2025 23:01
shinyehtsai added a commit to shinyehtsai/pytorch that referenced this pull request Jun 6, 2025
Summary:

**TL;DR**: This diff seeks to optimize the DCE within the mark(block) function.

**Details**
The goal of this PR is to optimize this function:
```
MarkResult mark(Block* block) {
    bool anyMarked = false;
    bool allMarked = true;

    // If all nodes within this block are already marked, we can safely bypass
    // revisiting it. This check is the primary driver of our performance
    // optimization.
    if (alreadyAllMarked_.count(block)) {
      return MarkResult(false, true);
    }

```

The primary concept is to prevent redundant traversals of a fully marked block, particularly in the markLoop scenario, if all nodes within a block are marked, we can subsequently mark the block as fully marked.

Test Plan: Rely on unittest

Reviewed By: davidberard98

Differential Revision: D73476431
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D73476431

pytorch-bot bot pushed a commit that referenced this pull request Jun 9, 2025
Summary:

**TL;DR**: This diff seeks to optimize the DCE within the mark(block) function.

**Details**
The goal of this PR is to optimize this function:
```
MarkResult mark(Block* block) {
    bool anyMarked = false;
    bool allMarked = true;

    // If all nodes within this block are already marked, we can safely bypass
    // revisiting it. This check is the primary driver of our performance
    // optimization.
    if (alreadyAllMarked_.count(block)) {
      return MarkResult(false, true);
    }

```

The primary concept is to prevent redundant traversals of a fully marked block, particularly in the markLoop scenario, if all nodes within a block are marked, we can subsequently mark the block as fully marked.

Test Plan: Rely on unittest

Reviewed By: davidberard98

Differential Revision: D73476431
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D73476431

1 similar comment
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D73476431

pytorch-bot bot pushed a commit that referenced this pull request Jun 11, 2025
Summary:
Pull Request resolved: #152348

**TL;DR**: This diff seeks to optimize the DCE within the mark(block) function.

**Details**
The goal of this PR is to optimize this function:
```
MarkResult mark(Block* block) {
    bool anyMarked = false;
    bool allMarked = true;

    // If all nodes within this block are already marked, we can safely bypass
    // revisiting it. This check is the primary driver of our performance
    // optimization.
    if (alreadyAllMarked_.count(block)) {
      return MarkResult(false, true);
    }

```

The primary concept is to prevent redundant traversals of a fully marked block, particularly in the markLoop scenario, if all nodes within a block are marked, we can subsequently mark the block as fully marked.

Test Plan: Rely on unittest

Reviewed By: davidberard98

Differential Revision: D73476431
Summary:
Pull Request resolved: pytorch#152348

**TL;DR**: This diff seeks to optimize the DCE within the mark(block) function.

**Details**
The goal of this PR is to optimize this function:
```
MarkResult mark(Block* block) {
    bool anyMarked = false;
    bool allMarked = true;

    // If all nodes within this block are already marked, we can safely bypass
    // revisiting it. This check is the primary driver of our performance
    // optimization.
    if (alreadyAllMarked_.count(block)) {
      return MarkResult(false, true);
    }

```

The primary concept is to prevent redundant traversals of a fully marked block, particularly in the markLoop scenario, if all nodes within a block are marked, we can subsequently mark the block as fully marked.

Test Plan: Rely on unittest

Reviewed By: davidberard98

Differential Revision: D73476431
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D73476431

Copy link
Contributor

Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as Stale.
Feel free to remove the Stale label if you feel this was a mistake.
If you are unable to remove the Stale label please contact a maintainer in order to do so.
If you want the bot to never mark this PR stale again, add the no-stale label.
Stale pull requests will automatically be closed after 30 days of inactivity.

@github-actions github-actions bot added the Stale label Aug 10, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ciflow/trunk Trigger trunk jobs on your pull request fb-exported oncall: jit Add this issue/PR to JIT oncall triage queue release notes: jit release notes category Stale
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants