
Add meta kernel for sdpa_math_for_mps #159695


Closed
wants to merge 7 commits

Conversation


pytorch-bot bot commented Aug 2, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/159695

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEV

There is 1 currently active SEV. If your PR is affected, please view it below:

✅ You can merge normally! (3 Unrelated Failures)

As of commit 085fc1a with merge base aeb5321:

UNSTABLE - The following jobs are marked as unstable, possibly due to flakiness on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

pytorch-bot added the ciflow/inductor, ciflow/mps (Run MPS tests, subset of trunk), and module: inductor labels on Aug 2, 2025
angelayi added a commit that referenced this pull request Aug 2, 2025
ghstack-source-id: 4132ec7
Pull Request resolved: #159695
```
@@ -9446,6 +9447,75 @@ def test_fast_full_attention(self, dtype, contiguous, head_dim, with_mask):
         self.run_fast_attention_test(q, k, v, with_mask)
 
 
 class TestSDPAMetaDispatchMode(TorchDispatchMode):
```
Contributor

Is this test really necessary here? It feels like it's being indirectly tested by the AOT tests.

Or, if we want an explicit test, shouldn't it be added to test_meta.py or something?

angelayi (Contributor, Author) commented Aug 5, 2025

@pytorchbot merge -f "graphql query errors"

pytorchmergebot (Collaborator) commented:

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f only as a last resort; consider -i/--ignore-current instead, which merges while ignoring current failures and lets pending tests finish and report signal before the merge.

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here

pytorchmergebot pushed a commit that referenced this pull request Aug 6, 2025
In some cases, MPS kernels are reused across higher-order-op subgraphs and the top-level code. Currently, however, we initialize the variable for an MPS kernel at its first use, which breaks when that first use happens inside a subgraph: the kernel variable is then only initialized within the subgraph's scope. For instance:
```
if ...
    auto mps_lib_0_func = ...
    mps_lib_0_func->run()

// since we already used mps_lib_0 once, we don't re-initialize it
mps_lib_0_func->run()  // error, mps_lib_0_func not initialized
```
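As a standalone illustration of the pitfall (with hypothetical names, not the actual generated code), a local declared inside a conditional block is simply out of scope after the closing brace:

```
#include <iostream>

int main(int argc, char**) {
    if (argc > 1) {
        // First use: the handle is declared inside the nested "subgraph" scope.
        auto mps_lib_0_func = 42;  // stand-in for the kernel handle
        std::cout << mps_lib_0_func << "\n";
    }
    // A second, top-level use would not compile: 'mps_lib_0_func' was
    // declared inside the if-block and is out of scope here.
    // std::cout << mps_lib_0_func << "\n";  // error: use of undeclared identifier
    return 0;
}
```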

So the solution we took here is to wrap each kernel in a getter declared at the beginning of the file, so the handle is initialized once and reachable from any scope:
```
const std::shared_ptr<at::native::mps::MetalKernelFunction> get_mps_lib_0() {
    static const auto func = mps_lib_0.getKernelFunction("generated_kernel");
    return func;
}
AOTIMetalKernelFunctionHandle get_mps_lib_0_handle() {
    static const auto handle = AOTIMetalKernelFunctionHandle(get_mps_lib_0().get());
    return handle;
}
...
if ...
    get_mps_lib_0()->run()

get_mps_lib_0()->run()  // success
```
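This works because a C++ function-local static is initialized exactly once, on the first call and from whatever scope that call happens in, and the object then persists for the lifetime of the program (initialization is thread-safe since C++11). Here is a minimal standalone sketch of the pattern, with illustrative names rather than the actual AOTInductor codegen:

```
#include <iostream>
#include <memory>

struct Kernel {
    Kernel() { std::cout << "initialized once\n"; }
    void run() { std::cout << "run\n"; }
};

// Function-local static: lazy, one-time, thread-safe initialization,
// no matter which scope makes the first call.
std::shared_ptr<Kernel> get_kernel() {
    static const auto kernel = std::make_shared<Kernel>();
    return kernel;
}

int main(int argc, char**) {
    if (argc > 1) {
        get_kernel()->run();  // first use inside a nested scope is fine
    }
    get_kernel()->run();  // later top-level use reuses the same instance
    return 0;
}
```

Whether the first call happens inside the conditional or at the top level, both call sites see the same initialized kernel, which is exactly the property the generated getters rely on.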

Pull Request resolved: #159753
Approved by: https://github.com/malfet
ghstack dependencies: #159456, #159695