Add torch.compile support for torch.mm(out_dtype=...) #159026


Open · yf225 wants to merge 7 commits into main from mm_out_dtype_compile

Conversation

@yf225 (Contributor) commented on Jul 24, 2025
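
For context, a minimal usage sketch of what this PR enables; the `out_dtype` keyword follows the `mm.dtype` overload named in the title, and the CUDA/bfloat16 setup here is an assumption:

```python
import torch

# Minimal sketch, not from the PR: compile a function that calls the
# out_dtype variant of mm. The CUDA + bfloat16 setup is an assumption.
@torch.compile
def f(a, b):
    return torch.mm(a, b, out_dtype=torch.float32)

a = torch.randn(64, 32, device="cuda", dtype=torch.bfloat16)
b = torch.randn(32, 16, device="cuda", dtype=torch.bfloat16)
out = f(a, b)
assert out.dtype == torch.float32  # bf16 inputs, fp32 output
```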

@yf225 requested review from jansel, ngimel and PaulZhang12 on Jul 24, 2025 at 08:53
pytorch-bot commented on Jul 24, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/159026

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit 381da56 with merge base ee4c5c7:

UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@PaulZhang12 (Contributor) left a comment


Nice!

@ngimel (Collaborator) commented on Jul 24, 2025

Test failures are real; looks like those functions need to be added to the AOTI shim?

```diff
@@ -2421,6 +2421,19 @@ def meta_mm(a, b):
     return a.new_empty(N, P)
 
 
+@register_meta(aten.mm.dtype)
+def meta_mm_dtype(a, b, out_dtype):
```
Collaborator review comment on this hunk:

Shouldn't there also be a check that a and b are on the same device, or is that not necessary? (Seems like other metas need that check anyway.)
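
For illustration, a sketch (not the PR's actual body) of how the meta kernel could include the same-device check raised above; the shape logic mirrors `meta_mm` from the hunk, and the `torch._check` style is an assumption borrowed from other metas:

```python
# Sketch only. This would live in torch/_meta_registrations.py, where
# register_meta and aten are already in scope.
@register_meta(aten.mm.dtype)
def meta_mm_dtype(a, b, out_dtype):
    torch._check(a.dim() == 2, lambda: "self must be a matrix")
    torch._check(b.dim() == 2, lambda: "mat2 must be a matrix")
    torch._check(
        a.size(1) == b.size(0),
        lambda: f"mat1 and mat2 shapes cannot be multiplied ({a.shape} x {b.shape})",
    )
    # The same-device check asked about in the review comment above:
    torch._check(
        a.device == b.device,
        lambda: f"expected both tensors on the same device, got {a.device} and {b.device}",
    )
    return a.new_empty(a.size(0), b.size(1), dtype=out_dtype)
```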

@ngimel (Collaborator) commented on Jul 25, 2025

@desertfire can you help with AOTI? Right now we are skipping a test, but ideally we'd like to make sure that it's handled correctly.

@yf225 force-pushed the mm_out_dtype_compile branch from 7b44e0c to 381da56 on Jul 25, 2025 at 20:01
@yf225 (Contributor, Author) commented on Jul 25, 2025

@desertfire for more context: I tried adding `"aten.mm.dtype_out": {},` to fallback_ops.py, which in turn generates an `aoti_torch_cuda_mm_dtype_out` stub. However, the stub should have been `aoti_torch_cuda__mm_dtype_out_cuda` to match the `_mm_dtype_out_cuda` ATen C++ function name, hence there is a mismatch error: https://github.com/pytorch/pytorch/actions/runs/16514249751/job/46703501461?pr=159026.

I wonder, is there a way to register custom C++ kernel names in `fallback_ops.py`?
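
For reference, a sketch of the `fallback_ops.py` entry described above; the surrounding `inductor_fallback_ops` dict name is an assumption about that file's layout:

```python
# torch/_inductor/fallback_ops.py (sketch; dict name is assumed)
inductor_fallback_ops = {
    # ... existing entries ...
    "aten.mm.dtype_out": {},  # codegens an aoti_torch_cuda_mm_dtype_out stub
}
```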

```diff
@@ -570,6 +570,13 @@ def lazy_register_extern_choice(fn):
 
 aten_mm = ExternKernelChoice(torch.mm, "at::mm_out")
 
+aten_mm_dtype = ExternKernelChoice(
+    torch.mm,
+    "at::_mm_dtype_out_cuda",
```
Contributor review comment on this hunk:

Suggested change:

```diff
-    "at::_mm_dtype_out_cuda",
+    "at::mm_dtype_out",
```

This will make cpp_wrapper/AOTI happy.
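
With the suggestion applied, the registration from the hunk above would read as follows (a sketch; the arguments truncated from the hunk are left as a comment):

```python
aten_mm_dtype = ExternKernelChoice(
    torch.mm,
    "at::mm_dtype_out",  # shim-friendly kernel name, per the suggestion
    # ... any remaining arguments from the original hunk unchanged ...
)
```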

Contributor review comment:

And you need to bring back your change to fallback_ops.py.

Development

Successfully merging this pull request may close these issues:

torch.compile doesn't work with torch.mm(out_dtype=...) variant
6 participants