Fix an example: Resolve broadcasting error in attn_bias and attn_mask… #130209
Conversation
… addition, fix device assignment for newly created variables in method
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/130209

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures as of commit 81dfdda with merge base 9983242.

This comment was automatically generated by Dr. CI and updates every 15 minutes.
fyi @drisspg
A more elegant implementation of the device assignment from @mikaylagawarecki. Co-authored-by: mikaylagawarecki <mikaylagawarecki@gmail.com>
@pytorchbot merge
Merge failed. Reason: This PR needs a label; to add one, you can comment to pytorchbot.
Details for Dev Infra team: raised by workflow job.
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
The merge job was canceled or timed out. This most often happens if two merge requests were issued for the same PR, or if the merge job was waiting for more than 6 hours for tests to finish. In the latter case, please do not hesitate to reissue the merge command.
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
pytorch#130209: Fix an example: Resolve broadcasting error in attn_bias and attn_mask addition, fix device assignment for newly created variables in the method.

1. `attn_bias += attn_mask` would cause a broadcasting error: because the shape of `attn_bias` is (L, S), the output of the in-place addition is expected to have shape (L, S) too. When the shape of the input `attn_mask` is (N, num_heads, L, S), broadcasting is triggered and the output would have shape (N, num_heads, L, S), which is unexpected.
2. `attn_bias` is a newly created variable in the method, which is not assigned a device.

**This is my retry of pytorch#130200.** I used a wrong account in that PR.

Co-authored-by: mikaylagawarecki <mikaylagawarecki@gmail.com>
Pull Request resolved: pytorch#130209
Approved by: https://github.com/mikaylagawarecki
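The in-place vs. out-of-place distinction behind point 1 can be illustrated with a small NumPy sketch (the same broadcasting rule applies to PyTorch tensors; the concrete sizes here are arbitrary stand-ins for L, S, N, and num_heads):

```python
import numpy as np

L, S = 4, 5          # query/key sequence lengths
N, num_heads = 2, 3  # batch size and number of heads

attn_bias = np.zeros((L, S))               # bias created with shape (L, S)
attn_mask = np.ones((N, num_heads, L, S))  # 4-D additive mask

# In-place addition must keep attn_bias's (L, S) shape, so broadcasting the
# result up to (N, num_heads, L, S) is an error:
try:
    attn_bias += attn_mask
except ValueError as e:
    print("in-place add failed:", e)

# Out-of-place addition broadcasts freely and returns a 4-D result:
out = attn_bias + attn_mask
print(out.shape)  # (2, 3, 4, 5)
```

This is why the fix rewrites the in-place `+=` as an out-of-place addition.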
This should be reopened and merged again, because the change has been overridden. @mikaylagawarecki Thank you.
#135427: Fix example: Address broadcasting error in the addition of `attn_bias` and `attn_mask`, and correct device assignment for newly created variables in the method.

1. Adding `attn_bias += attn_mask` results in a broadcasting error. The expected shape of `attn_bias` is (L, S), so the output should also have the shape (L, S). However, when the input shape is (N, num_heads, L, S), broadcasting occurs, leading to an output shape of (N, num_heads, L, S), which is not desired.
2. `attn_bias` is a newly created variable within the method, but it is not assigned to the correct device.

**This is my retry of PR #130209. The PR has been merged into commit `d4a79d4a7c746068d25fe5cf9333495561f4ce1f`, but the modifications were overwritten by subsequent commits.**

@mikaylagawarecki provided a more elegant implementation.
Co-authored-by: mikaylagawarecki <mikaylagawarecki@gmail.com>
Pull Request resolved: #135427
Approved by: https://github.com/ezyang
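Putting both fixes together, a minimal sketch of what the corrected documentation example might look like. This is a hand-written illustration of a standard SDPA reference implementation under the two fixes described above, not the exact code merged in the PR; the function name and simplified signature are assumptions for the sketch:

```python
import math
import torch

def scaled_dot_product_attention(query, key, value, attn_mask=None):
    # Simplified sketch of a reference SDPA implementation.
    L, S = query.size(-2), key.size(-2)
    scale = 1 / math.sqrt(query.size(-1))
    # Fix 2: create attn_bias on the same device (and dtype) as the inputs,
    # instead of defaulting to CPU.
    attn_bias = torch.zeros(L, S, dtype=query.dtype, device=query.device)
    if attn_mask is not None:
        if attn_mask.dtype == torch.bool:
            attn_bias = attn_bias.masked_fill(attn_mask.logical_not(), float("-inf"))
        else:
            # Fix 1: out-of-place addition, so a 4-D attn_mask can broadcast
            # against the 2-D attn_bias. The in-place `attn_bias += attn_mask`
            # would raise, since the result cannot keep shape (L, S).
            attn_bias = attn_bias + attn_mask
    attn_weight = query @ key.transpose(-2, -1) * scale
    attn_weight = torch.softmax(attn_weight + attn_bias, dim=-1)
    return attn_weight @ value

# Usage: a 4-D additive mask now broadcasts without error.
q = torch.randn(2, 3, 4, 8)   # (N, num_heads, L, E)
k = torch.randn(2, 3, 5, 8)   # (N, num_heads, S, E)
v = torch.randn(2, 3, 5, 8)
mask = torch.zeros(2, 3, 4, 5)
out = scaled_dot_product_attention(q, k, v, attn_mask=mask)
print(out.shape)  # torch.Size([2, 3, 4, 8])
```

With the original in-place `+=`, the call above would fail; with the original `torch.zeros(L, S)` (no `device=`), it would fail whenever `query` lives on a non-default device.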