Add DeviceAllocator as the base device allocator #138222
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/138222
Note: Links to docs will display an error until the docs builds have been completed.
As of commit 5521326 with merge base 178515d: ⏳ 5 Pending, 2 Unrelated Failures (FLAKY - the following jobs failed but were likely due to flakiness present on trunk).
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as `Stale`.
@pytorchbot merge -i
Merge started
Your change will be merged while ignoring the following 6 checks: Check Labels / Check labels, Check mergeability of ghstack PR / ghstack-mergeability-check, pull / linux-jammy-py3_9-clang9-xla / test (xla, 1, 1, linux.12xlarge, unstable), rocm / linux-jammy-rocm-py3.10 / test (default, 2, 6, linux.rocm.gpu.2), xpu / linux-jammy-xpu-2025.1-py3.9 / test (default, 2, 6, linux.idc.xpu), xpu / linux-jammy-xpu-2025.1-py3.9 / test (default, 5, 6, linux.idc.xpu)
Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team
# Motivation
The following APIs will be put under torch.accelerator:
- empty_cache
- max_memory_allocated
- max_memory_reserved
- memory_allocated
- memory_reserved
- memory_stats
- reset_accumulated_memory_stats
- reset_peak_memory_stats

Pull Request resolved: #152932
Approved by: https://github.com/albanD
ghstack dependencies: #138222
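For reference, a minimal usage sketch of these APIs, assuming a CUDA build and that each call defaults to the current accelerator device when no device argument is given; only the API names come from the list above, the surrounding usage is illustrative:

```python
import torch

# Allocate something so the allocator statistics are non-trivial.
x = torch.empty(1024, 1024, device="cuda")

print(torch.accelerator.memory_allocated())      # bytes currently allocated by tensors
print(torch.accelerator.max_memory_allocated())  # peak allocated bytes since the last reset
print(torch.accelerator.memory_reserved())       # bytes held by the caching allocator

del x
torch.accelerator.empty_cache()                  # release cached blocks back to the device
torch.accelerator.reset_peak_memory_stats()      # reset the peak counters
stats = torch.accelerator.memory_stats()         # allocator statistics (assumed to mirror torch.cuda.memory_stats)
```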
Pull Request resolved: #155200
Approved by: https://github.com/albanD
ghstack dependencies: #138222, #152932
@pytorchbot revert -c nosignal -m "Broke ROCm periodic runs on MI300 e.g. https://github.com/pytorch/pytorch/actions/runs/16764977800/job/47470050573"
cc @guangyey If this revert doesn't go through because it's part of a stack, please forward fix the issue.
@pytorchbot successfully started a revert job. Check the current status here.
This reverts commit 4604f04. Reverted #155200 on behalf of https://github.com/jithunnair-amd due to Broke ROCm periodic runs on MI300, e.g. https://github.com/pytorch/pytorch/actions/runs/16764977800/job/47470050573
This reverts commit 15f1173. Reverted #152932 on behalf of https://github.com/jithunnair-amd due to Broke ROCm periodic runs on MI300, e.g. https://github.com/pytorch/pytorch/actions/runs/16764977800/job/47470050573
This reverts commit f7a66da. Reverted #138222 on behalf of https://github.com/jithunnair-amd due to Broke ROCm periodic runs on MI300, e.g. https://github.com/pytorch/pytorch/actions/runs/16764977800/job/47470050573
@guangyey your PR has been successfully reverted.
@jithunnair-amd I think the failure is not introduced by this PR; I see the same failure
Starting merge as part of PR stack under #155200
# Motivation
In line with [RFC] [A device-agnostic Python device memory related API design for stream-based accelerators](pytorch#134978), some memory-related APIs are widely used in popular repositories; HuggingFace Accelerate, for example, contains [many if-else conditional blocks around torch.cuda.empty_cache](https://github.com/search?q=repo%3Ahuggingface%2Faccelerate%20torch.cuda.empty_cache&type=code). We would like to introduce a generic API set under the torch.accelerator namespace to generalize these use cases.

| Device-specific memory APIs (torch.xxx.foo) | Device-agnostic memory APIs (torch.accelerator.foo) |
| --- | --- |
| `torch.xxx.empty_cache` | `torch.accelerator.empty_cache` |
| `torch.xxx.reset_peak_memory_stats` | `torch.accelerator.reset_peak_memory_stats` |
| `torch.xxx.reset_accumulated_memory_stats` | `torch.accelerator.reset_accumulated_memory_stats` |
| `torch.xxx.memory_stats` | `torch.accelerator.memory_stats` |
| `torch.xxx.memory_allocated` | `torch.accelerator.memory_allocated` |
| `torch.xxx.max_memory_allocated` | `torch.accelerator.max_memory_allocated` |
| `torch.xxx.memory_reserved` | `torch.accelerator.memory_reserved` |
| `torch.xxx.max_memory_reserved` | `torch.accelerator.max_memory_reserved` |

# Solution
This design follows a similar pattern to `HostAllocator`. We're introducing a base class `DeviceAllocator`, from which `CUDAAllocator` and `XPUAllocator` will inherit. This allows us to provide a unified call path like `torch.accelerator.empty_cache()` -> `GetDeviceAllocator(allocator)->empty_cache()`.

Pull Request resolved: pytorch#138222
Approved by: https://github.com/albanD, https://github.com/Camyll
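To make the motivation concrete, below is a minimal sketch of the branching pattern the device-agnostic API removes. The specific backends checked are illustrative and not quoted from any repository; only torch.accelerator.empty_cache comes from the API set above.

```python
import torch

# The kind of per-backend branching found in downstream libraries today.
# The branch set here is illustrative only.
def empty_cache_device_specific() -> None:
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
    elif hasattr(torch, "xpu") and torch.xpu.is_available():
        torch.xpu.empty_cache()
    # ... one more branch for every supported backend

# With the unified namespace, a single call covers every backend that
# registers a caching allocator.
def empty_cache_device_agnostic() -> None:
    torch.accelerator.empty_cache()
```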
Stack from ghstack (oldest at bottom):
Motivation
In line with [RFC] A device-agnostic Python device memory related API design for stream-based accelerators, some memory-related APIs are widely used in popular repositories; HuggingFace Accelerate, for example, contains many if-else conditional blocks around torch.cuda.empty_cache. We would like to introduce a generic API set under the torch.accelerator namespace to generalize these use cases.
Solution
This design follows a similar pattern to `HostAllocator`. We're introducing a base class `DeviceAllocator`, from which `CUDAAllocator` and `XPUAllocator` will inherit. This allows us to provide a unified call path like `torch.accelerator.empty_cache()` -> `GetDeviceAllocator(allocator)->empty_cache()`.

cc @albanD @EikanWang
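The allocators themselves live in C++, but a small Python analogy of the call path may help. This is a sketch only: register_allocator, get_device_allocator, and the registry dict are hypothetical names standing in for the GetDeviceAllocator lookup, not the real c10 interface.

```python
from abc import ABC, abstractmethod

# Base class every backend allocator derives from, mirroring the
# DeviceAllocator / CUDAAllocator / XPUAllocator hierarchy described above.
class DeviceAllocator(ABC):
    @abstractmethod
    def empty_cache(self) -> None:
        ...

class CUDAAllocator(DeviceAllocator):
    def empty_cache(self) -> None:
        print("releasing cached CUDA blocks")

class XPUAllocator(DeviceAllocator):
    def empty_cache(self) -> None:
        print("releasing cached XPU blocks")

# Hypothetical registry standing in for GetDeviceAllocator.
_allocators: dict[str, DeviceAllocator] = {}

def register_allocator(device_type: str, allocator: DeviceAllocator) -> None:
    _allocators[device_type] = allocator

def get_device_allocator(device_type: str) -> DeviceAllocator:
    return _allocators[device_type]

# A device-agnostic frontend such as torch.accelerator.empty_cache() would
# resolve the current backend and forward the call:
register_allocator("cuda", CUDAAllocator())
register_allocator("xpu", XPUAllocator())
get_device_allocator("cuda").empty_cache()
```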