Skip to content

[inductor] fix the unligned variable ranges issue in fuse node #138568

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 7 commits into from

Conversation

Valentine233
Copy link
Collaborator

@Valentine233 Valentine233 commented Oct 22, 2024

Fixes #138550.

Description

In the fusion of two nodes, one node with less variables (node_to_recomp) would make its variable ranges aligned with the other node (ref_node). In detail, node_to_recomp would change its variable ranges to the original ranges of ref_node. However, if both of the nodes have changed its ranges, i.e., the simplified variable ranges are different from its original ones, the issue comes up.

Solution

For the case where the ref_node also changes its variable ranges, we recompute the size and body for it, to ensure the nodes are simplified to the same size.

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang @aakhundov

Copy link

pytorch-bot bot commented Oct 22, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/138568

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 54dcda8 with merge base f4ee5a2 (image):
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@Valentine233
Copy link
Collaborator Author

The new UT would fail for another fma VecN issue. Land after the fix: #138655.

@Valentine233 Valentine233 marked this pull request as ready for review October 23, 2024 05:42
@colesbury colesbury added the triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module label Oct 23, 2024
Copy link
Collaborator

@jgong5 jgong5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CI failures. Can you also add a summary on the root cause of the issues and how you address it in the PR description?

@Valentine233
Copy link
Collaborator Author

CI failures. Can you also add a summary on the root cause of the issues and how you address it in the PR description?

Thanks! The failed UT triggers an issue about fma VectorizedN, which depends on the fix of #138655. The PR descriptions are updated.

@Valentine233 Valentine233 force-pushed the fix_fuse_node branch 2 times, most recently from a446180 to 2a764fa Compare October 25, 2024 05:26
@Valentine233 Valentine233 requested a review from jgong5 October 25, 2024 08:29
@Valentine233 Valentine233 added the ciflow/trunk Trigger trunk jobs on your pull request label Oct 25, 2024
Copy link
Collaborator

@jgong5 jgong5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UT failures seem related?

@Valentine233
Copy link
Collaborator Author

@vfdev-5 @eellison Could you help review the PR? Thanks!

@Valentine233 Valentine233 requested a review from jansel November 6, 2024 03:18
@Valentine233
Copy link
Collaborator Author

@pytorchbot merge

@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

pobin6 pushed a commit to pobin6/pytorch that referenced this pull request Dec 5, 2024
…ch#138568)

Fixes pytorch#138550.

### Description
In the fusion of two nodes, one node with less variables (`node_to_recomp`) would make its variable ranges aligned with the other node (`ref_node`). In detail, `node_to_recomp` would change its variable ranges to the original ranges of `ref_node`. However, if both of the nodes have changed its ranges, i.e., the simplified variable ranges are different from its original ones, the issue comes up.

### Solution
For the case where the `ref_node` also changes its variable ranges, we recompute the size and body for it, to ensure the nodes are simplified to the same size.

Pull Request resolved: pytorch#138568
Approved by: https://github.com/jgong5, https://github.com/leslie-fang-intel, https://github.com/jansel
@github-actions github-actions bot deleted the fix_fuse_node branch December 8, 2024 02:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ciflow/inductor ciflow/trunk Trigger trunk jobs on your pull request Merged module: inductor open source topic: not user facing topic category triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Inductor] unaligned variable ranges during node fusion
8 participants