4/16/25, 8:22 PM pytorch.org/tutorials/_sources/beginner/ptcheat.rst.txt
PyTorch Cheat Sheet
******************************
Imports
=========
General
-------
.. code-block:: python

    import torch                                        # root package
    from torch.utils.data import Dataset, DataLoader    # dataset representation and loading

Neural Network API
------------------
.. code-block:: python

    import torch.autograd as autograd         # computation graph
    from torch import Tensor                  # tensor node in the computation graph
    import torch.nn as nn                     # neural networks
    import torch.nn.functional as F           # layers, activations and more
    import torch.optim as optim               # optimizers e.g. gradient descent, ADAM, etc.

See `autograd <https://pytorch.org/docs/stable/autograd.html>`__,
`nn <https://pytorch.org/docs/stable/nn.html>`__,
`functional <https://pytorch.org/docs/stable/nn.html#torch-nn-functional>`__
and `optim <https://pytorch.org/docs/stable/optim.html>`__
ONNX
----
.. code-block:: python

    import onnx                                             # ONNX Python package (installed separately)

    torch.onnx.export(model, dummy_input, "xxxx.proto")     # exports an ONNX formatted
                                                            # model using a trained model, dummy
                                                            # data and the desired file name

    model = onnx.load("alexnet.proto")                      # load an ONNX model
    onnx.checker.check_model(model)                         # check that the model
                                                            # IR is well formed

    print(onnx.helper.printable_graph(model.graph))         # print a human readable
                                                            # representation of the graph

See `onnx <https://pytorch.org/docs/stable/onnx.html>`__
Vision
------
.. code-block:: python

    from torchvision import datasets, models, transforms     # vision datasets,
                                                             # architectures &
                                                             # transforms

    import torchvision.transforms as transforms              # composable transforms

See
`torchvision <https://pytorch.org/vision/stable/index.html>`__
Distributed Training
--------------------
.. code-block:: python

    import torch.distributed as dist             # distributed communication
    from torch.multiprocessing import Process    # memory sharing processes

See `distributed <https://pytorch.org/docs/stable/distributed.html>`__
and
`multiprocessing <https://pytorch.org/docs/stable/multiprocessing.html>`__
Tensors
=========
Creation
--------
.. code-block:: python

    x = torch.randn(*size)              # tensor with independent N(0,1) entries
    x = torch.[ones|zeros](*size)       # tensor with all 1's [or 0's]
    x = torch.tensor(L)                 # create tensor from [nested] list or ndarray L
    y = x.clone()                       # clone of x
    with torch.no_grad():               # code wrap that stops autograd from tracking tensor history
    requires_grad=True                  # arg, when set to True, tracks computation
                                        # history for future derivative calculations

See `tensor <https://pytorch.org/docs/stable/tensors.html>`__
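
The creation calls above can be tried together in a short session. This is a minimal sketch, assuming a small illustrative shape ``size = (2, 3)`` and made-up values; the variable names are not part of the cheat sheet.

```python
import torch

size = (2, 3)                        # hypothetical shape chosen for illustration
x = torch.randn(*size)               # independent N(0, 1) entries
ones = torch.ones(*size)             # all 1's
zeros = torch.zeros(*size)           # all 0's
t = torch.tensor([[1.0, 2.0], [3.0, 4.0]])   # built from a nested list

y = t.clone()                        # independent copy of t
y[0, 0] = 100.0                      # modifies y only; t is unchanged

# requires_grad=True asks autograd to record operations on w
w = torch.tensor([1.0, 2.0], requires_grad=True)
tracked = (w * 2).sum()              # part of the computation graph
with torch.no_grad():
    untracked = (w * 2).sum()        # no history is recorded inside this block

print(tracked.requires_grad, untracked.requires_grad)
```

The last line shows the practical difference: the result computed inside ``torch.no_grad()`` cannot be backpropagated through.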
Dimensionality
--------------
.. code-block:: python

    x.size()                                  # return tuple-like object of dimensions
    x = torch.cat(tensor_seq, dim=0)          # concatenates tensors along dim
    y = x.view(a,b,...)                       # reshapes x into size (a,b,...)
    y = x.view(-1,a)                          # reshapes x into size (b,a) for some b
    y = x.transpose(a,b)                      # swaps dimensions a and b
    y = x.permute(*dims)                      # permutes dimensions
    y = x.unsqueeze(dim)                      # tensor with added axis
    y = x.unsqueeze(dim=2)                    # (a,b,c) tensor -> (a,b,1,c) tensor
    y = x.squeeze()                           # removes all dimensions of size 1 (a,1,b,1) -> (a,b)
    y = x.squeeze(dim=1)                      # removes specified dimension of size 1 (a,1,b,1) -> (a,b,1)

See `tensor <https://pytorch.org/docs/stable/tensors.html>`__
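
To make the shape comments above concrete, the following sketch applies each operation to tensors of known size and records the resulting shapes. The sizes ``(2, 3, 4)`` and ``(2, 1, 3, 1)`` are arbitrary choices for illustration.

```python
import torch

x = torch.zeros(2, 3, 4)

flat = x.view(-1, 4)              # shape (6, 4): -1 is inferred as 2 * 3
swapped = x.transpose(0, 2)       # shape (4, 3, 2)
perm = x.permute(1, 2, 0)         # shape (3, 4, 2)
added = x.unsqueeze(dim=2)        # shape (2, 3, 1, 4)
cat = torch.cat([x, x], dim=0)    # shape (4, 3, 4)

z = torch.zeros(2, 1, 3, 1)
sq_all = z.squeeze()              # shape (2, 3): every size-1 dimension removed
sq_one = z.squeeze(dim=1)         # shape (2, 3, 1): only dimension 1 removed

for name, t in [("flat", flat), ("swapped", swapped), ("perm", perm),
                ("added", added), ("cat", cat),
                ("sq_all", sq_all), ("sq_one", sq_one)]:
    print(name, tuple(t.size()))
```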
Algebra
-------
.. code-block:: python

    ret = A.mm(B)       # matrix multiplication
    ret = A.mv(x)       # matrix-vector multiplication
    x = x.t()           # matrix transpose

See `math
operations <https://pytorch.org/docs/stable/torch.html?highlight=mm#math-operations>`__
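
A small worked example of the three algebra calls, using hand-picked 2x2 matrices so the results can be checked by hand. The values are illustrative only.

```python
import torch

A = torch.tensor([[1.0, 2.0],
                  [3.0, 4.0]])
B = torch.tensor([[0.0, 1.0],
                  [1.0, 0.0]])     # swaps the two columns when multiplied on the right
v = torch.tensor([1.0, 1.0])

AB = A.mm(B)    # matrix multiplication: [[2, 1], [4, 3]]
Av = A.mv(v)    # matrix-vector multiplication: row sums [3, 7]
At = A.t()      # transpose: [[1, 3], [2, 4]]

print(AB)
print(Av)
print(At)
```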
GPU Usage
---------
.. code-block:: python

    torch.cuda.is_available()                                   # check for cuda
    x = x.cuda()                                                # move x's data from
                                                                # CPU to GPU and return new object

    x = x.cpu()                                                 # move x's data from GPU to CPU
                                                                # and return new object

    if not args.disable_cuda and torch.cuda.is_available():     # device agnostic code
        args.device = torch.device('cuda')                      # and modularity
    else:                                                       #
        args.device = torch.device('cpu')                       #

    net.to(device)                                              # recursively convert the module's
                                                                # parameters and buffers to
                                                                # device specific tensors

    x = x.to(device)                                            # copy your tensors to a device
                                                                # (gpu, cpu)

See `cuda <https://pytorch.org/docs/stable/cuda.html>`__
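
The device-agnostic pattern above can be condensed into a few lines. This sketch assumes no command-line ``args`` object and simply picks CUDA when it is available; the layer sizes are placeholders.

```python
import torch
import torch.nn as nn

# Choose the device once and reuse it everywhere
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

net = nn.Linear(4, 2).to(device)       # moves parameters and buffers to the device
x = torch.randn(8, 4).to(device)       # inputs must live on the same device as the model

out = net(x)
print(out.device.type, tuple(out.shape))
```

Because both the module and the input are moved with ``.to(device)``, the same script runs unchanged on machines with or without a GPU.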
Deep Learning
=============
.. code-block:: python

    nn.Linear(m,n)                                # fully connected layer from
                                                  # m to n units

    nn.ConvXd(m,n,s)                              # X dimensional conv layer from
                                                  # m to n channels where X∈{1,2,3}
                                                  # and the kernel size is s

    nn.MaxPoolXd(s)                               # X dimension pooling layer
                                                  # (notation as above)

    nn.BatchNormXd                                # batch norm layer
    nn.RNN/LSTM/GRU                               # recurrent layers
    nn.Dropout(p=0.5, inplace=False)              # dropout layer for any dimensional input
    nn.Dropout2d(p=0.5, inplace=False)            # 2-dimensional channel-wise dropout
    nn.Embedding(num_embeddings, embedding_dim)   # (tensor-wise) mapping from
                                                  # indices to embedding vectors

See `nn <https://pytorch.org/docs/stable/nn.html>`__
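
As an illustration of how these layers compose, here is a minimal sketch of a small convolutional classifier. The class name ``TinyNet``, the channel counts and the 28x28 single-channel input are all assumptions made for the example, not something the cheat sheet prescribes.

```python
import torch
import torch.nn as nn

class TinyNet(nn.Module):              # hypothetical model for illustration
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, 3),        # 1 -> 8 channels, 3x3 kernel: 28x28 -> 26x26
            nn.BatchNorm2d(8),         # normalizes each of the 8 channels
            nn.ReLU(),
            nn.MaxPool2d(2),           # 26x26 -> 13x13
        )
        self.dropout = nn.Dropout(p=0.5)
        self.classifier = nn.Linear(8 * 13 * 13, num_classes)

    def forward(self, x):
        x = self.features(x)
        x = torch.flatten(x, start_dim=1)    # keep the batch dimension
        return self.classifier(self.dropout(x))

net = TinyNet()
logits = net(torch.randn(4, 1, 28, 28))      # batch of 4 fake images
print(tuple(logits.shape))

# nn.Embedding maps integer indices to learned vectors
emb = nn.Embedding(num_embeddings=100, embedding_dim=16)
vectors = emb(torch.tensor([[1, 2, 3]]))     # one sequence of three indices
print(tuple(vectors.shape))
```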
Loss Functions
--------------
.. code-block:: python

    nn.X                                  # where X is L1Loss, MSELoss, CrossEntropyLoss,
                                          # CTCLoss, NLLLoss, PoissonNLLLoss,
                                          # KLDivLoss, BCELoss, BCEWithLogitsLoss,
                                          # MarginRankingLoss, HingeEmbeddingLoss,
                                          # MultiLabelMarginLoss, SmoothL1Loss,
                                          # SoftMarginLoss, MultiLabelSoftMarginLoss,
                                          # CosineEmbeddingLoss, MultiMarginLoss,
                                          # or TripletMarginLoss

See `loss
functions <https://pytorch.org/docs/stable/nn.html#loss-functions>`__
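
Loss classes are instantiated first and then called on predictions and targets. The sketch below uses made-up logits and targets, and also checks the documented relationship that ``CrossEntropyLoss`` combines ``LogSoftmax`` and ``NLLLoss``.

```python
import torch
import torch.nn as nn

logits = torch.tensor([[2.0, 0.5, -1.0],
                       [0.1, 0.2, 3.0]])     # raw scores for 2 samples, 3 classes
targets = torch.tensor([0, 2])               # correct class index per sample

ce = nn.CrossEntropyLoss()(logits, targets)

# Same value computed in two steps: log-softmax followed by negative log-likelihood
nll = nn.NLLLoss()(nn.LogSoftmax(dim=1)(logits), targets)

# MSELoss averages squared errors: ((1-1)^2 + (2-4)^2) / 2 = 2.0
mse = nn.MSELoss()(torch.tensor([1.0, 2.0]), torch.tensor([1.0, 4.0]))

print(ce.item(), nll.item(), mse.item())
```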
Activation Functions
--------------------
.. code-block:: python

    nn.X                                  # where X is ReLU, ReLU6, ELU, SELU, PReLU, LeakyReLU,
                                          # RReLU, CELU, GELU, Threshold, Hardshrink, Hardtanh,
                                          # Sigmoid, LogSigmoid, Softplus, Softshrink,
                                          # Softsign, Tanh, Tanhshrink, Softmin, Softmax,
                                          # Softmax2d, LogSoftmax or AdaptiveLogSoftmaxWithLoss

See `activation
functions <https://pytorch.org/docs/stable/nn.html#non-linear-activations-weighted-sum-nonlinearity>`__
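
Most activations exist both as ``nn`` modules and as functions in ``torch.nn.functional``. A brief sketch with hand-picked inputs shows that the two forms agree:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.tensor([[-1.0, 0.0, 2.0]])

relu_module = nn.ReLU()(x)          # module form: negatives clamped to 0
relu_functional = F.relu(x)         # functional form, same result

probs = nn.Softmax(dim=1)(x)        # each row becomes a probability distribution
log_probs = nn.LogSoftmax(dim=1)(x) # log of the same distribution

print(relu_module)
print(probs.sum(dim=1))
```

The module form is convenient inside ``nn.Sequential``; the functional form is convenient inside a custom ``forward`` method.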
Optimizers
----------
.. code-block:: python

    opt = optim.X(model.parameters(), ...)    # create optimizer
    opt.step()                                # update weights
    opt.zero_grad()                           # clear the gradients
    optim.X                                   # where X is SGD, AdamW, Adam,
                                              # Adafactor, NAdam, RAdam, Adadelta,
                                              # Adagrad, SparseAdam, Adamax, ASGD,
                                              # LBFGS, RMSprop or Rprop

See `optimizers <https://pytorch.org/docs/stable/optim.html>`__
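
The optimizer calls fit into a training loop in a fixed order: clear gradients, compute the loss, backpropagate, then step. The following is a minimal sketch on synthetic data; the model size, learning rate and number of iterations are arbitrary assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

torch.manual_seed(0)                          # reproducible synthetic data

model = nn.Linear(3, 1)
opt = optim.SGD(model.parameters(), lr=0.1)   # create optimizer

x = torch.randn(16, 3)
y = x.sum(dim=1, keepdim=True)                # synthetic target: sum of the features

losses = []
for _ in range(50):
    opt.zero_grad()                           # clear gradients from the previous step
    loss = F.mse_loss(model(x), y)
    loss.backward()                           # populate .grad on each parameter
    opt.step()                                # update weights
    losses.append(loss.item())

print(losses[0], losses[-1])
```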
Learning rate scheduling
------------------------
.. code-block:: python

    scheduler = optim.lr_scheduler.X(optimizer, ...)   # create lr scheduler
    scheduler.step()                          # update lr after optimizer updates weights
    optim.lr_scheduler.X                      # where X is LambdaLR, MultiplicativeLR,
                                              # StepLR, MultiStepLR, ExponentialLR,
                                              # CosineAnnealingLR, ReduceLROnPlateau, CyclicLR,
                                              # OneCycleLR or CosineAnnealingWarmRestarts

See `learning rate
scheduler <https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate>`__
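
As a sketch of how a scheduler changes the learning rate over time, the example below uses ``StepLR`` with an assumed ``step_size=2`` and ``gamma=0.5`` and records the rate seen at the start of each epoch. The forward and backward passes are omitted to keep the focus on the schedule.

```python
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(2, 1)
opt = optim.SGD(model.parameters(), lr=0.1)
scheduler = optim.lr_scheduler.StepLR(opt, step_size=2, gamma=0.5)

lrs = []
for epoch in range(6):
    lrs.append(opt.param_groups[0]["lr"])   # learning rate used during this epoch
    opt.step()                              # normally preceded by forward/backward passes
    scheduler.step()                        # halves the lr after every 2 epochs

print(lrs)
```

The recorded rates stay constant for two epochs at a time and are multiplied by ``gamma`` at each boundary.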
Data Utilities
==============
Datasets
--------
.. code-block:: python

    Dataset                    # abstract class representing dataset
    TensorDataset              # labelled dataset in the form of tensors
    ConcatDataset              # concatenation of Datasets

See
`datasets <https://pytorch.org/docs/stable/data.html?highlight=dataset#torch.utils.data.Dataset>`__
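
The three dataset classes can be combined as follows. This is a minimal sketch: ``SquaresDataset`` is a hypothetical map-style dataset written for the example, and the data are synthetic.

```python
import torch
from torch.utils.data import Dataset, TensorDataset, ConcatDataset

class SquaresDataset(Dataset):        # hypothetical custom dataset
    """Returns (x, x**2) pairs for x = 0 .. n-1."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n

    def __getitem__(self, idx):
        x = torch.tensor([float(idx)])
        return x, x ** 2

# TensorDataset wraps tensors that share the same first dimension
features = torch.arange(5, dtype=torch.float32).unsqueeze(1)   # shape (5, 1)
labels = features ** 2
tensor_ds = TensorDataset(features, labels)

# ConcatDataset chains datasets: indices 0-4 come from tensor_ds, 5-7 from SquaresDataset
combined = ConcatDataset([tensor_ds, SquaresDataset(3)])

print(len(combined))
x6, y6 = combined[6]                  # second dataset, local index 1
print(x6, y6)
```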
Dataloaders and ``DataSamplers``
--------------------------------
.. code-block:: python

    DataLoader(dataset, batch_size=1, ...)      # loads data batches agnostic
                                                # of structure of individual data points

    sampler.Sampler(dataset,...)                # abstract class dealing with
                                                # ways to sample from dataset

    sampler.XSampler where ...                  # Sequential, Random, SubsetRandom,
                                                # WeightedRandom, Batch, Distributed

See
`dataloader <https://pytorch.org/docs/stable/data.html?highlight=dataloader#torch.utils.data.DataLoader>`__
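
A short sketch of how a sampler controls which items a ``DataLoader`` yields. The dataset, batch sizes and weights are illustrative assumptions; the weights give zero probability to the first five items so the effect is easy to see.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from torch.utils.data.sampler import SequentialSampler, WeightedRandomSampler

# Ten items whose label equals their index
ds = TensorDataset(torch.arange(10, dtype=torch.float32).unsqueeze(1),
                   torch.arange(10))

# SequentialSampler visits indices 0..9 in order; the last batch is smaller
loader = DataLoader(ds, batch_size=4, sampler=SequentialSampler(ds))
batch_sizes = [xb.shape[0] for xb, yb in loader]
print(batch_sizes)

# WeightedRandomSampler draws indices in proportion to the given weights
weights = [0.0] * 5 + [1.0] * 5          # items 0-4 are never drawn
sampler = WeightedRandomSampler(weights, num_samples=20, replacement=True)
drawn = [int(yb) for _, yb in DataLoader(ds, batch_size=1, sampler=sampler)]
print(sorted(set(drawn)))
```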
Also see
--------
- `Deep Learning with PyTorch: A 60 Minute
Blitz <https://pytorch.org/tutorials/beginner/deep_learning_60min_blitz.html>`__
- `PyTorch Forums <https://discuss.pytorch.org/>`__
- `PyTorch for Numpy
users <https://github.com/wkentaro/pytorch-for-numpy-users>`__