Skip to content

Move CUDAEvent to c10 #158219

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 16 commits into
base: gh/guangyey/168/base
Choose a base branch
from
Open

Conversation

guangyey
Copy link
Collaborator

@guangyey guangyey commented Jul 14, 2025

Stack from ghstack (oldest at bottom):

Motivation

When I refactored the caching allocator, I noticed that there are two separate pieces of code of EventPool : one in aten/cuda/CachingHostAllocator.cpp and another in c10/cuda/CUDACachingAllocator. I would like to refactor these so that they share a single implementation.
To achieve this, I have to move aten/cuda/CUDAEvent.h to c10/cuda, which I understand this is a big change. However, I think it makes sense conceptually - CUDAStream and CUDAEvent are both fundamental CUDA abstractions, and since CUDAStream is already in c10/cuda, placing CUDAEvent there as well seems reasonable for consistency.

@guangyey guangyey requested review from eqy and syed-ahmed as code owners July 14, 2025 07:21
Copy link

pytorch-bot bot commented Jul 14, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/158219

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 42fac35 with merge base b1a6027 (image):
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

[ghstack-poisoned]
@albanD
Copy link
Collaborator

albanD commented Jul 14, 2025

@eqy what do you think of this stack?

Copy link
Collaborator

@eqy eqy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In theory I don't have any objections, a bit surprised that this was in ATen to begin with

@guangyey guangyey added ciflow/trunk Trigger trunk jobs on your pull request topic: not user facing topic category release notes: cuda release notes category labels Jul 15, 2025
is_created_ = true;
}

void moveHelper(CUDAEvent& other) {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change from (CUDAEvent&& other) since linter error

 Error (CLANGTIDY) [cppcoreguidelines-rvalue-reference-param-not-moved,-warnings-as-errors]
    rvalue reference parameter 'other' is never moved from inside the function
    body

CUDAEvent& operator=(const CUDAEvent&) = delete;

CUDAEvent(CUDAEvent&& other) noexcept {
moveHelper(other);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

change for void moveHelper(CUDAEvent& other), the original code is moveHelper(std::move(other));

}
CUDAEvent& operator=(CUDAEvent&& other) noexcept {
if (this != &other) {
moveHelper(other);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

change for void moveHelper(CUDAEvent& other), the original code is moveHelper(std::move(other));

(*interp)->trace_gpu_event_deletion(
at::kCUDA, reinterpret_cast<uintptr_t>(event_));
}
C10_CUDA_CHECK_WARN(cudaEventDestroy(event_));
Copy link
Collaborator Author

@guangyey guangyey Jul 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove try-catch because linter error, use C10_CUDA_CHECK_WARN instead.

Error (CLANGTIDY) [bugprone-empty-catch,-warnings-as-errors]
    empty catch statements hide issues; to handle exceptions appropriately,
    consider re-throwing, handling, or avoiding catch altogether

@guangyey guangyey marked this pull request as ready for review July 15, 2025 07:11
@guangyey guangyey added the ciflow/rocm Trigger "default" config CI on ROCm label Jul 15, 2025
guangyey added 7 commits July 15, 2025 12:34
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
guangyey added 2 commits July 15, 2025 20:47
[ghstack-poisoned]
[ghstack-poisoned]
@guangyey guangyey requested review from eqy and albanD July 17, 2025 09:38
@guangyey
Copy link
Collaborator Author

@albanD @eqy @jeffdaily May I know if you have any comments on this stack of PRs?

guangyey added 4 commits July 17, 2025 11:55
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
@guangyey
Copy link
Collaborator Author

@albanD @eqy @jeffdaily May I know if you have any comments on this stack of PRs?

guangyey added 2 commits July 29, 2025 15:42
[ghstack-poisoned]
[ghstack-poisoned]
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ciflow/rocm Trigger "default" config CI on ROCm ciflow/trunk Trigger trunk jobs on your pull request open source release notes: cuda release notes category topic: not user facing topic category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants