Tensor and Operator Basics
- Understand what a tensor is (including stride, dtype, and layout)
- Understand what an operator is in PyTorch
- Understand what views are
- Understand how to author an operator in PyTorch
- Understand how to test operators in PyTorch
- Understand what TensorIterator is
Tensors are specialized data structures that are very similar to arrays and matrices. In PyTorch, we use tensors to encode the inputs and outputs of a model, as well as the model's parameters. Tensors are similar to NumPy's ndarrays, except that tensors can run on GPUs or other hardware accelerators. In fact, tensors and NumPy arrays can often share the same underlying memory, eliminating the need to copy data. Tensors are also optimized for automatic differentiation. (Source: PyTorch Basics Tutorial.)
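As a quick illustration of the points above (inferred dtype, moving to an accelerator, and NumPy memory sharing), here is a minimal sketch using only standard torch and numpy calls; the values are arbitrary:

```python
import numpy as np
import torch

# dtype is inferred from the data (int64 here); shape is the sizes metadata.
t = torch.tensor([[1, 2], [3, 4]])
print(t.dtype, t.shape)   # torch.int64 torch.Size([2, 2])

# Tensors can be moved to an accelerator when one is available.
if torch.cuda.is_available():
    t = t.to("cuda")

# A tensor built with torch.from_numpy shares memory with the ndarray,
# so an in-place write on the array is visible through the tensor.
arr = np.ones(3)
shared = torch.from_numpy(arr)
arr[0] = 5.0
print(shared)  # tensor([5., 1., 1.], dtype=torch.float64)
```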
A Tensor consists of:
- data_ptr, a pointer to a chunk of memory
- some sizes metadata
- some strides metadata
- a storage offset
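To make these four pieces of metadata concrete, the sketch below inspects them through the public Tensor accessors (data_ptr, size, stride, storage_offset); the shapes chosen are arbitrary:

```python
import torch

x = torch.arange(12, dtype=torch.float32).reshape(3, 4)
print(x.data_ptr())        # address of the start of the underlying storage
print(x.size())            # sizes metadata: torch.Size([3, 4])
print(x.stride())          # strides metadata: (4, 1) -- the next row is 4 elements away
print(x.storage_offset())  # 0 -- x starts at the beginning of its storage

# A view such as a column slice reuses the same storage; only the
# metadata (sizes, strides, storage offset) changes.
col = x[:, 1]
print(col.size(), col.stride(), col.storage_offset())  # torch.Size([3]) (4,) 1
print(col.data_ptr() == x.data_ptr() + col.storage_offset() * x.element_size())  # True
```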
A kernel is a function that accepts Tensors and/or raw pointers to memory and performs a useful computation (for example, matrix multiplication or attention).
An operator is glue code for the PyTorch runtime that tells it about the computation. A single operator can be associated with multiple kernels (for example, torch.add has a kernel for CPU and a kernel for CUDA). The glue code is necessary to get PyTorch subsystems (like torch.compile and torch.autograd) to compose with the computation.
Standalone kernels may work directly with PyTorch but will not compose with the majority of PyTorch subsystems. In order to get them to compose, please register an operator for them.
(Source: The Custom Operator Manual, provided in the official PyTorch Custom Operators documentation.)
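As a hedged sketch of what that glue code can look like, the snippet below wraps a plain NumPy kernel in an operator using the torch.library.custom_op API (available in recent PyTorch releases); the mylib::numpy_sin name and the fake kernel are illustrative choices, not part of the text above:

```python
import numpy as np
import torch

# The operator registration: "mylib::numpy_sin" is a hypothetical name.
@torch.library.custom_op("mylib::numpy_sin", mutates_args=())
def numpy_sin(x: torch.Tensor) -> torch.Tensor:
    # The kernel: a plain computation on memory (here via NumPy on CPU).
    return torch.from_numpy(np.sin(x.cpu().numpy())).to(x.device)

# A "fake" (meta) kernel describes output metadata so subsystems like
# torch.compile can reason about the op without running the real kernel.
@numpy_sin.register_fake
def _(x):
    return torch.empty_like(x)

print(numpy_sin(torch.linspace(0, 3.14, 5)))
```

With the operator registered, the kernel composes with tracing-based subsystems; autograd support, if needed, is added separately (for example via register_autograd in the same API).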
- After completing the views lab, if you have extra time, feel free to work through Sasha Rush's Tensor Puzzles.