🚀 The feature, motivation and pitch
PyTorch relies on CUDA streams and events to overlap computation and communication. However, these overlapped kernels can interfere with one another, causing throughput degradation and unstable performance. We propose introducing CUDA Green Contexts [1] in FSDP2 to provide contexts that isolate SM resources for computation and communication. For example, on an H100 we can split the 132 SMs into two partitions, 104 SMs for compute and 24 SMs for communication (this wastes 4 SMs, since Green Contexts require each partition's SM count to be a multiple of 8).
FlashInfer [2] has already integrated this experimental feature into their framework using the API provided by cuda-python.
I think a naive implementation would be:
- Split the SMs into two green contexts, one for computation and one for communication (a single context shared by all-gather / reduce-scatter / all-reduce).
- Create streams from each context: one stream for overlapped compute, and for communication simply replace the normal CUDA streams with streams created from the communication green context (a minimal sketch follows this list).
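A rough sketch of that split, loosely following FlashInfer's integration via the cuda-python driver bindings. The helper names (`_check`, `make_green_ctx_streams`) are hypothetical, and the exact cuda-python calling convention (error code first, out-params returned) is an assumption that should be checked against the cuda-python docs and FlashInfer's `green_ctx.py`:

```python
# Rough sketch only. The driver entry points (cuDeviceGetDevResource,
# cuDevSmResourceSplitByCount, cuDevResourceGenerateDesc, cuGreenCtxCreate,
# cuGreenCtxStreamCreate) are documented in [1]; the Python calling
# convention below is assumed from FlashInfer's usage.
import torch
from cuda.bindings import driver


def _check(ret):
    # cuda-python returns (CUresult, *out_params)
    err, *out = ret
    assert err == driver.CUresult.CUDA_SUCCESS, err
    return out[0] if len(out) == 1 else tuple(out)


def make_green_ctx_streams(device_id: int = 0, compute_sms: int = 104):
    """Carve out `compute_sms` SMs for compute and use the remaining SMs
    for communication; return one stream per partition."""
    torch.cuda.init()  # ensure the driver and primary context are initialized
    cu_dev = _check(driver.cuDeviceGet(device_id))

    sm_resource = _check(driver.cuDeviceGetDevResource(
        cu_dev, driver.CUdevResourceType.CU_DEV_RESOURCE_TYPE_SM))
    # One group of `compute_sms` SMs (rounded to the 8-SM granularity on H100);
    # `remaining` holds the leftover SMs used for communication.
    groups, _, remaining = _check(driver.cuDevSmResourceSplitByCount(
        1, sm_resource, 0, compute_sms))

    streams = []
    for res in (groups[0], remaining):
        desc = _check(driver.cuDevResourceGenerateDesc([res], 1))
        gctx = _check(driver.cuGreenCtxCreate(
            desc, cu_dev,
            driver.CUgreenCtxCreate_flags.CU_GREEN_CTX_DEFAULT_STREAM.value))
        raw = _check(driver.cuGreenCtxStreamCreate(
            gctx, driver.CUstream_flags.CU_STREAM_NON_BLOCKING.value, 0))
        # Wrap the raw CUstream so PyTorch ops can be scheduled onto it.
        streams.append(torch.cuda.ExternalStream(int(raw), device=device_id))

    compute_stream, comm_stream = streams
    return compute_stream, comm_stream
```

The returned `comm_stream` could then be handed to FSDP2 wherever it currently creates its all-gather / reduce-scatter streams.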
Since we want to use the green-context stream for overlapped computation and the default stream for non-overlapped computation, we need a hook-like mechanism that switches the current CUDA stream to the green-context stream just before an all-gather or reduce-scatter call begins, and switches back to the default stream when the communication finishes, so that we can fully utilize all GPU resources. However, I am not sure whether such frequent stream switching would introduce significant overhead. Perhaps, instead of swapping streams back and forth, we could dispatch the different kernels onto two separate streams from the start; that way we overlap communication on the green-context stream with computation on the default stream without incurring the cost of repeated stream switches.
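On the stream-switching question, here is a minimal sketch of the swap-around-collectives idea using only stable PyTorch stream APIs. `comm_stream` is assumed to be the `torch.cuda.ExternalStream` from the sketch above, and how FSDP2 / ProcessGroupNCCL would actually pick this stream up internally is left open:

```python
import torch
from contextlib import contextmanager


@contextmanager
def comm_on_green_ctx(comm_stream: torch.cuda.Stream):
    """Run the enclosed collective on the communication green-context stream.

    The waits mirror the ordering FSDP2 already establishes when hopping
    between its default and communication streams; in practice the second
    wait would be deferred until the gathered data is actually consumed.
    """
    main_stream = torch.cuda.current_stream()
    comm_stream.wait_stream(main_stream)   # comm waits for pending compute
    with torch.cuda.stream(comm_stream):
        yield
    main_stream.wait_stream(comm_stream)   # compute waits for the collective


# Hypothetical usage around a collective (pg, out, inp not defined here):
# with comm_on_green_ctx(comm_stream):
#     torch.distributed.all_gather_into_tensor(out, inp, group=pg)
```

Since both waits are GPU-side event waits, the "switch" itself is only a host-side change of the current stream, which suggests the per-call overhead should be small, but that is exactly what would need to be measured.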
[1] https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__GREEN__CONTEXTS.html
[2] flashinfer-ai/flashinfer#1163
Alternatives
No response
Additional context
No response
cc @H-Huang @awgu @wanchaol @fegin @fduwjj @wz337 @wconstab @d4l3k @pragupta