[jit] DeadCodeEliminator Mark(block) improvement #152348

shinyehtsai · 2025-04-28T17:34:01Z

Summary:
This diff seeks to optimize the DeadCodeEliminator within the mark(block) function.

The main idea is to avoid redundant traversals of a fully marked block, particularly in the markLoop scenario. Once all nodes within a block are marked, the block itself can be recorded as fully marked and skipped on later visits.

Test Plan: Existing unit tests. New tests will be added soon.

Differential Revision: D73476431

cc @EikanWang @jgong5 @wenzhe-nrv @sanchitintel

pytorch-bot · 2025-04-28T17:34:05Z

This appears to be a diff that was exported from Phabricator, but the PR author does not have sufficient permissions to run CI. @shinyehtsai, please complete step 2 of the internal wiki to get write access so you do not need CI approvals in the future. If you think this is a mistake, please contact the PyTorch Dev Infra team.

pytorch-bot · 2025-04-28T17:34:08Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/152348

✅ You can merge normally! (1 Unrelated Failure)

As of commit 41d799c with merge base 3a43dba:

BROKEN TRUNK - The following job failed but was also present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

pull / cuda12.8-py3.10-gcc9-sm75 / test (pr_time_benchmarks, 1, 1, linux.g4dn.metal.nvidia.gpu) (gh) (trunk failure)
MISSING REGRESSION TEST

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot · 2025-04-28T17:34:15Z

This pull request was exported from Phabricator. Differential Revision: D73476431

torch/csrc/jit/passes/dead_code_elimination.cpp

shinyehtsai · 2025-05-12T18:50:38Z

> Did you get any benchmarking results on this pass? Like how much faster it is than before and how much additional memory is used.

In my latest run on a complicated model, the overall running time dropped by roughly 38% (about a 1.6x speedup):
106773 ms -> 66071 ms (post-PR).
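To make the quoted timings easier to compare, here is a small sketch of the arithmetic. The helper names `timeReduction` and `speedup` are illustrative only and do not come from the PR. With the two numbers above, the reduction works out to about 38% and the speedup to about 1.62x.

```cpp
#include <cassert>

// Fraction of wall-clock time saved: (before - after) / before.
double timeReduction(double beforeMs, double afterMs) {
  assert(beforeMs > 0.0);
  return (beforeMs - afterMs) / beforeMs;
}

// Speedup factor: before / after.
double speedup(double beforeMs, double afterMs) {
  assert(afterMs > 0.0);
  return beforeMs / afterMs;
}
```

Calling `timeReduction(106773.0, 66071.0)` and `speedup(106773.0, 66071.0)` reproduces both figures from the reported run.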

gmagogsfm · 2025-05-13T03:42:39Z

test/cpp/jit/test_misc.cpp

@@ -3050,6 +3050,25 @@ TEST(TestShapeGraphLinting, Basic) {
  }
 }

+TEST(DeadCodeEliminatorTest, TestConstructor) {


This unit test doesn't seem to be meaningful. Could you add some meaningful tests for the DCE pass? There weren't comprehensive tests before, but now that the complexity is higher, we should have more test cases. You can add more here:
https://github.com/pytorch/pytorch/blob/main/test/jit/test_dce.py

A new unit test has been added.

davidberard98

If you can add some Python tests for more complex DCE cases in test_dce.py, that would be great. Otherwise LGTM!

Summary:
**TL;DR**: This diff seeks to optimize the DCE within the mark(block) function.

**Details**
The goal of this PR is to optimize this function:
```
MarkResult mark(Block* block) {
  bool anyMarked = false;
  bool allMarked = true;
  // If all nodes within this block are already marked, we can safely bypass
  // revisiting it. This check is the primary driver of our performance
  // optimization.
  if (alreadyAllMarked_.count(block)) {
    return MarkResult(false, true);
  }
```
The main idea is to avoid redundant traversals of a fully marked block, particularly in the markLoop scenario. Once all nodes within a block are marked, the block itself can be recorded as fully marked and skipped on later visits.

Test Plan: Rely on unit tests.

Differential Revision: D73476431
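The early return described above can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not the code in dead_code_elimination.cpp. `Node`, `Block`, `ToyMarker`, `markUntilStable`, and the `live` flag are hypothetical stand-ins, and the real pass decides liveness very differently. Only the names `mark`, `MarkResult`, `anyMarked`, `allMarked`, and `alreadyAllMarked_` are taken from the excerpt. The sketch also assumes marks are never removed, which is what makes caching a fully marked block safe.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Hypothetical, simplified stand-ins for the JIT IR (not the torch::jit types).
struct Block;

struct Node {
  bool live = false;           // assumed outcome of some liveness rule
  std::vector<Block*> blocks;  // nested sub-blocks, e.g. a loop body
};

struct Block {
  std::vector<Node*> nodes;
};

struct MarkResult {
  bool anyMarked;  // did this call mark at least one new node?
  bool allMarked;  // is every node in the block and its sub-blocks marked?
};

class ToyMarker {
 public:
  MarkResult mark(Block* block) {
    assert(block != nullptr);
    // Fast path: a fully marked block cannot gain new marks, so skip it.
    if (alreadyAllMarked_.count(block)) {
      return MarkResult{false, true};
    }
    ++traversals_;
    bool anyMarked = false;
    bool allMarked = true;
    for (Node* node : block->nodes) {
      for (Block* sub : node->blocks) {
        MarkResult r = mark(sub);
        anyMarked = anyMarked || r.anyMarked;
        allMarked = allMarked && r.allMarked;
      }
      if (marked_.count(node)) {
        continue;  // already marked on an earlier pass
      }
      if (node->live) {
        marked_.insert(node);
        anyMarked = true;
      } else {
        allMarked = false;
      }
    }
    // Remember fully marked blocks so later passes can return early.
    if (allMarked) {
      alreadyAllMarked_.insert(block);
    }
    return MarkResult{anyMarked, allMarked};
  }

  // Re-marks `body` until a pass marks nothing new; returns the pass count.
  // This only imitates the shape of a fixed-point loop such as markLoop.
  int markUntilStable(Block* body) {
    int passes = 0;
    bool changed = true;
    while (changed) {
      changed = mark(body).anyMarked;
      ++passes;
    }
    return passes;
  }

  // Number of times a block was actually walked (cache misses).
  int traversals() const { return traversals_; }

 private:
  std::unordered_set<Node*> marked_;
  std::unordered_set<Block*> alreadyAllMarked_;
  int traversals_ = 0;
};
```

In this model, a body whose nodes are all live is walked once. The second pass of `markUntilStable` hits the cache and returns immediately, so `traversals()` stays at 1 even though two passes ran. A block that still contains an unmarked node is not cached and is walked again on the next call.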

github-actions · 2025-08-10T21:35:27Z

Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as Stale.
Feel free to remove the Stale label if you feel this was a mistake.
If you are unable to remove the Stale label please contact a maintainer in order to do so.
If you want the bot to never mark this PR stale again, add the no-stale label.
Stale pull requests will automatically be closed after 30 days of inactivity.

pytorch-bot bot added the release notes: jit release notes category label Apr 28, 2025

facebook-github-bot added the oncall: jit Add this issue/PR to JIT oncall triage queue label Apr 28, 2025

facebook-github-bot added the fb-exported label Apr 28, 2025

shinyehtsai changed the title ~~[Draft] DeadCodeEliminator Mark(block) improvement~~ [WIP] DeadCodeEliminator Mark(block) improvement Apr 28, 2025

shinyehtsai force-pushed the export-D73476431 branch from 5a43806 to f9260eb Compare April 28, 2025 17:38

shinyehtsai force-pushed the export-D73476431 branch from f9260eb to fdfa1fd Compare April 28, 2025 18:37

shinyehtsai force-pushed the export-D73476431 branch from fdfa1fd to 166b417 Compare April 28, 2025 22:32

shinyehtsai force-pushed the export-D73476431 branch from 166b417 to d601a5d Compare April 28, 2025 23:33

gmagogsfm requested changes Apr 29, 2025

shinyehtsai force-pushed the export-D73476431 branch from d601a5d to f3512a4 Compare April 30, 2025 20:21

shinyehtsai requested a review from gmagogsfm April 30, 2025 20:23

shinyehtsai force-pushed the export-D73476431 branch from f3512a4 to 45ce6db Compare April 30, 2025 20:29

shinyehtsai force-pushed the export-D73476431 branch from 45ce6db to 9fe0374 Compare April 30, 2025 20:52

shinyehtsai requested a review from gmagogsfm May 12, 2025 18:56

shinyehtsai changed the title ~~[WIP] DeadCodeEliminator Mark(block) improvement~~ [jit] DeadCodeEliminator Mark(block) improvement May 12, 2025

pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label May 12, 2025

shinyehtsai force-pushed the export-D73476431 branch from 28257db to 6e55810 Compare May 12, 2025 23:10

gmagogsfm reviewed May 13, 2025

davidberard98 approved these changes Jun 5, 2025

davidberard98 reviewed Jun 5, 2025

shinyehtsai force-pushed the export-D73476431 branch from 6e55810 to 372b90e Compare June 5, 2025 23:00

shinyehtsai requested a review from gmagogsfm June 5, 2025 23:01

shinyehtsai force-pushed the export-D73476431 branch from 372b90e to 3bf45b7 Compare June 6, 2025 16:48

shinyehtsai force-pushed the export-D73476431 branch from 3bf45b7 to 875fd9e Compare June 9, 2025 23:56

shinyehtsai force-pushed the export-D73476431 branch from 875fd9e to ad0364d Compare June 11, 2025 17:56

shinyehtsai force-pushed the export-D73476431 branch from ad0364d to 41d799c Compare June 11, 2025 18:02

github-actions bot added the Stale label Aug 10, 2025
