Make device check error message more descriptive #150750
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/150750
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit 6ab42b7 with merge base dfcfad2.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@pytorchbot label "topic: not user facing"
Hello @mikaylagawarecki, please help review the change, thanks!
@pytorchbot merge
Merge started: Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Merge failed. Reason: 1 job has failed; the first few are: trunk / macos-py3-arm64-mps / test (mps, 1, 1, macos-m2-15). Details for Dev Infra team: raised by workflow job.
@pytorchbot merge
Merge started: Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
@pytorchbot revert -m 'Sorry for reverting your change but it seems to cause a test to fail in trunk' -c nosignal
Failing test: test_sparse.py::TestSparseOneOff::test_cuda_from_cpu (GH job link, HUD commit link)
@pytorchbot successfully started a revert job. Check the current status here.
This reverts commit 8253970. Reverted #150750 on behalf of https://github.com/huydhn due to: Sorry for reverting your change but it seems to cause a test to fail in trunk.
@zeshengzong, your PR has been successfully reverted.
This PR was reopened (likely due to being reverted), so your approval was removed. Please request another review.
Fixed the test, please review the change, thanks!
@pytorchbot merge
Merge started: Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Merge failed. Reason: 2 mandatory check(s) failed. Dig deeper by viewing the failures on hud.
You don't have permission to rebase this PR since you are a first-time contributor. If you think this is a mistake, please contact PyTorch Dev Infra.
@pytorchbot rebase -b main
@pytorchbot started a rebase job onto refs/remotes/origin/main. Check the current status here.
Successfully rebased 6a49d61 to 6ab42b7.
@pytorchbot merge
Merge started: Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Fixes #122757

The fix was lost after reverting and rebasing the previous PR #150750 (only the test changes were merged).

## Test Result

```python
>>> import torch
>>>
>>> model_output = torch.randn(10, 5).cuda()
>>> labels = torch.randint(0, 5, (10,)).cuda()
>>> weights = torch.randn(5)
>>>
>>> loss_fn = torch.nn.CrossEntropyLoss(weight=weights)
>>> loss = loss_fn(input=model_output, target=labels)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/zong/code/pytorch/torch/nn/modules/module.py", line 1767, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/zong/code/pytorch/torch/nn/modules/module.py", line 1778, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/zong/code/pytorch/torch/nn/modules/loss.py", line 1297, in forward
    return F.cross_entropy(
           ^^^^^^^^^^^^^^^^
  File "/home/zong/code/pytorch/torch/nn/functional.py", line 3476, in cross_entropy
    return torch._C._nn.cross_entropy_loss(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Expected all tensors to be on the same device, but got weight is on cpu, different from other tensors on cuda:0 (when checking argument in method wrapper_CUDA_nll_loss_forward)
```

Pull Request resolved: #155085
Approved by: https://github.com/mikaylagawarecki
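As a side note (not part of this PR's change), a minimal usage sketch of how a caller avoids the error reported above is to place the class weights on the same device as the input and target before constructing the loss. The tensor shapes and the device-selection line below are illustrative assumptions, not code from the PR:

```python
import torch

# Illustrative sketch: keep input, target, and weight on one device so the
# device check in cross_entropy_loss is satisfied.
device = "cuda" if torch.cuda.is_available() else "cpu"

model_output = torch.randn(10, 5, device=device)      # logits
labels = torch.randint(0, 5, (10,), device=device)    # class indices
weights = torch.randn(5, device=device)                # per-class weights on the same device

loss_fn = torch.nn.CrossEntropyLoss(weight=weights)
loss = loss_fn(input=model_output, target=labels)
print(loss.item())
```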