Use device agnostic APIs for RNG #159021
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/159021
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit a8ebf2c with merge base 4d5b3f2.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Pull Request Overview
This PR updates test files to use device-agnostic APIs for random number generator (RNG) seeding instead of hardcoded CUDA APIs, making the code more portable across different device types. It also adds proper type annotations to fix mypy linting issues.
- Replace `torch.cuda.manual_seed()` with `torch.get_device_module(self.device_type).manual_seed()`
- Add comprehensive type annotations to all test methods and helper functions
- Fix a missing manual seed initialization in the distribute region test
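For illustration, a minimal sketch of the seeding pattern the PR adopts, assuming a PyTorch version that provides `torch.get_device_module` (the `device_type` string and seed value below are placeholders; the tests take them from `self.device_type` and the test logic):

```python
import torch

def seed_rng(device_type: str, seed: int) -> None:
    """Seed the RNG for the active backend without hardcoding torch.cuda.

    torch.get_device_module(device_type) resolves to the matching backend
    module (e.g. torch.cuda or torch.xpu), each of which exposes manual_seed.
    """
    torch.get_device_module(device_type).manual_seed(seed)

if torch.cuda.is_available():
    seed_rng("cuda", 42)  # equivalent to the old torch.cuda.manual_seed(42)
```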
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| test/distributed/tensor/test_random_ops.py | Adds type annotations to all methods and adds a device-agnostic manual seed call for the distribute region test |
| test/distributed/tensor/parallel/test_tp_random_state.py | Replaces the CUDA-specific manual_seed with the device-agnostic API and adds comprehensive type annotations |
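As a rough sketch of what the annotation changes look like (the class and method names below are hypothetical stand-ins, not the actual names in these test files):

```python
import unittest

import torch

class RandomOpsExample(unittest.TestCase):
    # Hypothetical stand-in for the real DTensor test classes; in the actual
    # tests, device_type is supplied by the distributed test framework.
    device_type: str = "cuda"

    def _seed_all(self, seed: int) -> None:  # annotated helper
        torch.get_device_module(self.device_type).manual_seed(seed)

    def test_rng_state(self) -> None:  # explicit "-> None" satisfies mypy
        self._seed_all(0)
```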
```diff
@@ -130,6 +131,7 @@ def test_meta_tensor_init(self):
        )

        # Test 2: disable the distribute region for RNG
+       torch.get_device_module(self.device_type).manual_seed(self.rank)
```
This change feels suspicious because it adds a new set seed where there previously wasn't one. Also it missed changing a cuda-specific seed setting in the same test further above.
@wconstab: //Also it missed changing a cuda-specific seed setting in the same test further above.
Fixed this with commit a8ebf2c.
//This change feels suspicious because it adds a new set seed where there previously wasn't one
We observed that, with the latest RNG changes, a uniform distribution was not achieved after setting `random._rng_tracker.distribute_region_enabled = False`. Setting the manual seed helped to mitigate that. We also verified this fix on CUDA devices to make sure it does not break any functionality.
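For context, a rough sketch of the scenario described above; the import path for DTensor's random module is an assumption and may differ across PyTorch versions, and `rank`/`device_type` are placeholders for `self.rank`/`self.device_type` in the tests:

```python
import torch
import torch.distributed.tensor._random as random  # assumed import path

rank = 0  # placeholder; the tests use self.rank
device_type = "cuda"  # placeholder; the tests use self.device_type

# Test 2 (sketch): disabling the distribute region stops DTensor from
# coordinating per-rank RNG offsets, so each rank falls back to its own
# local generator state. The tracker is created lazily, hence the guard.
if random._rng_tracker is not None:
    random._rng_tracker.distribute_region_enabled = False

# Without an explicit seed at this point, the local generator state was
# observed to produce a non-uniform distribution; seeding each rank with
# its own rank restores well-defined per-rank RNG state.
torch.get_device_module(device_type).manual_seed(rank)
```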
MOTIVATION
To make the test files more device agnostic, this PR uses device-agnostic APIs to set the manual seed rather than hardcoded CUDA APIs. We have also verified that this does not break any of the existing CUDA functionality.
CHANGES
cc @H-Huang @awgu @wanchaol @fegin @fduwjj @wz337 @wconstab @d4l3k @pragupta @EikanWang @kwen2501 @ankurneog