[DO NOT MERGE] Autograd Onboarding Lab #160264

Open

justinHe123 wants to merge 3 commits into main
Conversation

justinHe123

Hi! I'm interested in learning more about the internals of PyTorch and becoming a contributor to the project. As part of this, I've been following along with the Core Frontend Onboarding. This branch contains my implementation of the Autograd Onboarding Lab.

For the maintainers of PyTorch: PTAL! No urgency in reviewing this stack of diffs, but I would appreciate feedback if you're able to give it!

cc @albanD (since the onboarding lab page mentions you)

Following parts 1 and 2 of https://github.com/pytorch/pytorch/wiki/Autograd-Onboarding-Lab

NOTE: Do NOT merge this diff!

Learnings:
- When deriving the backward function analytically, it's easiest to break the forward function into steps and apply the chain rule to each step
- Deriving grad_a showed that we must be careful to account for both the local gradient and the upstream gradient contributions
- gradcheck and gradgradcheck are clever ways of validating the analytical formulas against numerical (finite-difference) approximations (see the sketch after this list)
- In general, how to write a test and an operator
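To make the gradcheck point concrete, here is a minimal sketch (not the lab's actual attention operator; the toy op and its formulas are made up for illustration) of a custom autograd.Function whose hand-derived backward is validated numerically:

```python
import torch
from torch.autograd import gradcheck, gradgradcheck

class MulSum(torch.autograd.Function):
    """Toy op: out = (a * b).sum(). Breaking the forward into steps and
    applying the chain rule gives d(out)/da = b and d(out)/db = a, each
    scaled by the upstream gradient."""

    @staticmethod
    def forward(ctx, a, b):
        ctx.save_for_backward(a, b)
        return (a * b).sum()

    @staticmethod
    def backward(ctx, grad_output):
        a, b = ctx.saved_tensors
        # upstream gradient (grad_output) times the local gradients
        return grad_output * b, grad_output * a

# gradcheck compares the analytical gradients against finite differences;
# double-precision inputs keep the numerical comparison stable.
a = torch.randn(4, dtype=torch.double, requires_grad=True)
b = torch.randn(4, dtype=torch.double, requires_grad=True)
print(gradcheck(MulSum.apply, (a, b)))      # True
print(gradgradcheck(MulSum.apply, (a, b)))  # True
```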

Testing:
Run `python3 test/test_autograd_lab.py`
TSIA. Following part 3 of the onboarding lab: https://github.com/pytorch/pytorch/wiki/Autograd-Onboarding-Lab#iii-write-native-composite-function-and-opinfo

Learnings:
- Using the `-k` option to run only a subset of tests (e.g. `python3 test/test_ops.py -k attention`) can save a lot of time!
- Adding a native function involves 1) registering it in `native_functions.yaml`, 2) implementing it in a .cpp/.h pair, and 3) registering an OpInfo in `common_methods_invocations.py`
- In the C++ implementation, make sure to include the headers for any functions you need from the `at` namespace.
- Newly added native operators can be accessed via `torch.ops.aten.operator_name`. Tensors produced by these operators store a pointer to their backward function (see the sketch after this list)
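A quick sketch of that last point, using the existing `aten::mul` as a stand-in (the lab's operator would be reachable the same way once registered):

```python
import torch

a = torch.randn(3, requires_grad=True)
b = torch.randn(3, requires_grad=True)

# Native operators are reachable under torch.ops.aten by name.
out = torch.ops.aten.mul(a, b)

# The result carries a pointer to its backward node in the autograd graph.
print(out.grad_fn)  # e.g. <MulBackward0 object at 0x...>
```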

Testing:
Run `python3 test/test_ops.py -k attention`, `python3 test/test_autograd_lab.py`

TSIA. Following part 4 of the onboarding lab: https://github.com/pytorch/pytorch/wiki/Autograd-Onboarding-Lab

Learnings:
- Gradient expressions in `derivatives.yaml` are essentially templates for C++ code, with pre-defined variables for accessing the forward results and their gradients (see the sketch after this list)
- Consequently, you can create custom functions to call from `derivatives.yaml` by adding them to `FunctionsManual.cpp`
- You should specify a gradient expression for each of your differentiable outputs!
- If your operator has multiple outputs, mark which ones are differentiable in `derivatives.yaml` using `output_differentiability`!
- Make sure to update the corresponding entry's `dispatch` in `native_functions.yaml` to `CompositeExplicitAutograd`, so that autograd uses your formula from `derivatives.yaml` instead of differentiating through the composite implementation
- Tensors can be undefined! If you're uncertain whether a tensor will be defined, check `tensor.defined()` before operating on it. (For example, an output may not be used in the loss function, and therefore no gradient is computed for it.)
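A hypothetical sketch of how these pieces fit together (the operator `my_attention`, its schema, and the helper `my_attention_backward` are all made up for illustration, not the lab's actual code):

```yaml
# derivatives.yaml: a joint gradient formula for both inputs.
# `grads[0]` is the upstream gradient of the first output; the helper
# my_attention_backward would be added to FunctionsManual.cpp and
# return a std::tuple<Tensor, Tensor> of gradients for a and b.
- name: my_attention(Tensor a, Tensor b) -> (Tensor, Tensor)
  a, b: my_attention_backward(grads[0], a, b)
  # only the first output is differentiable
  output_differentiability: [True, False]

# native_functions.yaml: dispatch switched to CompositeExplicitAutograd
# so the formula above is used, rather than autograd tracing through
# the composite implementation.
- func: my_attention(Tensor a, Tensor b) -> (Tensor, Tensor)
  dispatch:
    CompositeExplicitAutograd: my_attention
```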

NOTE: `test_fake_autocast` kept failing on my code. I've elected to skip it, since I don't have enough personal time to dedicate to debugging how this test works and why it is failing.

Testing:
Run `python3 test/test_ops.py -k attention`, `python3 test/test_autograd_lab.py`

pytorch-bot bot commented Aug 10, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/160264

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 8016d2b with merge base 01f66d0:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.


CLA Not Signed


Attention! native_functions.yaml was changed

If you are adding a new function or defaulted argument to native_functions.yaml, you cannot use it from pre-existing Python frontend code until our FC window passes (two weeks). Split your PR into two PRs, one which adds the new C++ functionality, and one that makes use of it from Python, and land them two weeks apart. See https://github.com/pytorch/pytorch/wiki/PyTorch's-Python-Frontend-Backward-and-Forward-Compatibility-Policy#forwards-compatibility-fc for more info.


