Skip to content

Allow exposing more functions during initial template expansion #159554

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

charlie-wt
Copy link
Contributor

@charlie-wt charlie-wt commented Jul 31, 2025

Also adds a `_register_hook` utility, and documents & type annotates
PartialRender.
Copy link

pytorch-bot bot commented Jul 31, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/159554

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There are 1 currently active SEVs. If your PR is affected, please view them below:

❌ 4 New Failures, 2 Unrelated Failures

As of commit 4c5f0c4 with merge base 2a286cb (image):

NEW FAILURES - The following jobs have failed:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@charlie-wt
Copy link
Contributor Author

@pytorchbot label "topic: not user facing"

@janeyx99 janeyx99 requested a review from eellison August 4, 2025 23:03
@janeyx99 janeyx99 added the triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module label Aug 4, 2025
@eellison eellison requested review from laithsakka and removed request for eellison August 5, 2025 01:44
@eellison
Copy link
Contributor

eellison commented Aug 5, 2025

@laithsakka want to take this ?

@charlie-wt
Copy link
Contributor Author

charlie-wt commented Aug 6, 2025

these failures seem to be coming from the assert hook_name not in self.render_hooks in the new TritonTemplateKernel._register_hook, for the hook <ARGDEFS>; sure enough, this was the one place i replaced with a call to _register_hook where there wasn't that assert before. i can have a look in the meantime, but there'll be others more familiar with that part of the code that might be able to tell me if it's intentional that the <ARGDEFS> can be overwritten where the others aren't.

@charlie-wt
Copy link
Contributor Author

have added an allow_overwriting param to _register_hook, only set to True in gen_argdefs—still happy to hear if people have other opinions

@charlie-wt
Copy link
Contributor Author

charlie-wt commented Aug 6, 2025

don't think the 'new failures' relate to me: they're all cudaErrorNoKernelImageForDevices, which i believe are complaining about an incompatibility between the compiled-for cuda version and the one supported by the hardware. that might relate to this earlier log message:

2025-08-06T14:46:54.3732695Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:283: UserWarning: 
2025-08-06T14:46:54.3733931Z     Found GPU0 NVIDIA L4 which is of cuda capability 8.9.
2025-08-06T14:46:54.3734423Z     Minimum and Maximum cuda capability supported by this version of PyTorch is
2025-08-06T14:46:54.3734834Z     (5.2) - (5.2)
2025-08-06T14:46:54.3735024Z     
2025-08-06T14:46:54.3735207Z   warnings.warn(
2025-08-06T14:46:54.3735629Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:304: UserWarning: 
2025-08-06T14:46:54.3736141Z     Please install PyTorch with a following CUDA
2025-08-06T14:46:54.3736510Z     configurations:  12.6 12.8 12.9 following instructions at
2025-08-06T14:46:54.3736885Z     https://pytorch.org/get-started/locally/
2025-08-06T14:46:54.3737184Z     
2025-08-06T14:46:54.3737384Z   warnings.warn(matched_cuda_warn.format(matched_arches))
2025-08-06T14:46:54.3737821Z /opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:326: UserWarning: 
2025-08-06T14:46:54.3738322Z NVIDIA L4 with CUDA capability sm_89 is not compatible with the current PyTorch installation.
2025-08-06T14:46:54.3738769Z The current PyTorch install supports CUDA capabilities sm_52.
2025-08-06T14:46:54.3739259Z If you want to use the NVIDIA L4 GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/

the single 'flaky' failure is for the same reason too.

locally, two of the 'newly-failing' tests pass but i don't have enough memory for the other two.

@laithsakka
Copy link
Contributor

can you explain in the summary more what is the user case of this change, and some context.
for example who is not very familiar with the code.

assert hook != "finalized", "hook_key can only be called once"
self._code = self._code.replace(hook_key, hook())

self.replacement_hooks[hook_key] = "finalized"
Copy link
Contributor

@laithsakka laithsakka Aug 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did you consider using None as proxy for finalized?
MaybeHookFn being Optional[HookFn]

you can use FINALIZED=None.

Copy link
Contributor Author

@charlie-wt charlie-wt Aug 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, to be honest i kept it as something other than None because it actually used to be None but got changed to something else; tbh i'm not totally sure the reason for the change (i can't find any references to FINALIZED_HOOK outside of PartialRender). i just made it a literal cause i couldn't see a way to type-annotate the raw object() nicely.


def _register_extra_template_env_fns(self, *fns: Callable[..., Any]):
"""
Register some extra functions to expose when performing the initial
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you explain more what "to expose" means here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i guess you could say it means "to make available to be used by jinja expressions": jinja template strings can include expressions inside double curly brackets, and these expressions can include things like calls to python functions. the template_env dictionary being passed to template.render inside TritonTemplateKernel.render is laying out what functions you want to be in scope for any expressions in the template being rendered.

shall i add more info to the comment along these lines?

@laithsakka laithsakka requested a review from kundaMwiza August 11, 2025 12:49
@charlie-wt
Copy link
Contributor Author

can you explain in the summary more what is the user case of this change, and some context. for example who is not very familiar with the code.

the main use case i had for this was for people implementing their own triton inductor backends, who'd want to reuse most existing constructs but may want to use their own hooks when rendering template kernels; at the moment the list of hooks passed to template.render is hard-coded so you'd need to monkey-patch to make them available during rendering, but this lets you register them more properly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
module: inductor open source topic: not user facing topic category triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants