
Commit ec1eee9

some edits

Signed-off-by: cjyabraham <cjyabraham@gmail.com>
1 parent 6aadd92 commit ec1eee9

File tree

2 files changed: +8 -12 lines changed

_config.yml

Lines changed: 1 addition & 1 deletion

@@ -1,7 +1,7 @@
# Site settings
title: "PyTorch Website"
author: "Facebook"
-default_author: Team Pytorch
+default_author: Team PyTorch
description: "Scientific Computing..."
latest_version: 1.0
timezone: America/Los_Angeles

_posts/2023-03-13-pytorch-2.0-release.md

Lines changed: 7 additions & 11 deletions

@@ -3,7 +3,7 @@ layout: blog_detail
title: "PyTorch 2.0: Our next generation release that is faster, more Pythonic and Dynamic as ever"
---

-We are excited to announce the release of [PyTorch® 2.0](https://github.com/pytorch/pytorch/releases) which we highlighted during the [PyTorch Conference](https://www.youtube.com/@PyTorch/playlists?view=50&sort=dd&shelf_id=2) on 12/2/22! PyTorch 2.0 offers the same eager-mode development and user experience, while fundamentally changing and supercharging how PyTorch operates at compiler level under the hood with faster performance and support for Dynamic Shapes and Distributed.
+We are excited to announce the release of [PyTorch® 2.0](https://github.com/pytorch/pytorch/releases/tag/v2.0.0) which we highlighted during the [PyTorch Conference](https://www.youtube.com/@PyTorch/playlists?view=50&sort=dd&shelf_id=2) on 12/2/22! PyTorch 2.0 offers the same eager-mode development and user experience, while fundamentally changing and supercharging how PyTorch operates at compiler level under the hood with faster performance and support for Dynamic Shapes and Distributed.

This next-generation release includes a Stable version of Accelerated Transformers (formerly called Better Transformers); Beta includes torch.compile as the main API for PyTorch 2.0, the scaled_dot_product_attention function as part of torch.nn.functional, the MPS backend, functorch APIs in the torch.func module; and other Beta/Prototype improvements across various inferences, performance and training optimization features on GPUs and CPUs. For a comprehensive introduction and technical overview of torch.compile, please visit the 2.0 [Get Started page](/get-started/pytorch-2.0).
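The torch.compile API referenced in the hunk above is a one-line opt-in. A minimal sketch, assuming a toy model and random input (both illustrative, not from the post):

```python
import torch
import torch.nn as nn

# A toy model stands in for any eager-mode PyTorch module.
model = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 10))

compiled_model = torch.compile(model)  # eager-mode semantics are preserved

x = torch.randn(32, 64)
out = compiled_model(x)  # first call compiles; later calls reuse the compiled graph
```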

@@ -148,21 +148,17 @@ Summary:
## Stable Features


-### [Stable] Accelerated PyTorch 2 Transformers (previously known as “Better Transformer”)
+### [Stable] Accelerated PyTorch 2 Transformers

-The PyTorch 2.0 release includes a new high-performance implementation of the PyTorch Transformer API, formerly known as “Better Transformer API, “ now renamed Accelerated PyTorch 2 Transformers. In releasing accelerated PT2 Transformers, our goal is to make training and deployment of state-of-the-art Transformer models affordable across the industry. This release introduces high-performance support for training and inference using a custom kernel architecture for scaled dot product attention (SPDA).
+The PyTorch 2.0 release includes a new high-performance implementation of the PyTorch Transformer API. In releasing Accelerated PT2 Transformers, our goal is to make training and deployment of state-of-the-art Transformer models affordable across the industry. This release introduces high-performance support for training and inference using a custom kernel architecture for scaled dot product attention (SPDA), extending the inference “fastpath” architecture, previously known as "Better Transformer."

Similar to the “fastpath” architecture, custom kernels are fully integrated into the PyTorch Transformer API – thus, using the native Transformer and MultiHeadAttention API will enable users to:

-
-
* transparently see significant speed improvements;
* support many more use cases including models using Cross-Attention, Transformer Decoders, and for training models; and
* continue to use fastpath inference for fixed and variable sequence length Transformer Encoder and Self Attention use cases.
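A hedged illustration of the point made in the list above, that the custom kernels are reached through the existing native APIs; the layer sizes and inputs are illustrative assumptions, not taken from the post:

```python
import torch
import torch.nn as nn

# Standard nn.TransformerEncoder usage; no new API is required.
encoder_layer = nn.TransformerEncoderLayer(d_model=256, nhead=8, batch_first=True)
encoder = nn.TransformerEncoder(encoder_layer, num_layers=6)
encoder.eval()  # fastpath/fused kernels apply during inference

src = torch.rand(16, 128, 256)  # (batch, sequence, embedding)
with torch.inference_mode():
    out = encoder(src)  # eligible inputs are routed to the fused SDPA kernels
```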

-To take full advantage of different hardware models and Transformer use cases, multiple SDPA custom kernels are supported (see below), with custom kernel selection logic that will pick the highest-performance kernel for a given model and hardware type. In addition to the existing Transformer API, model developers may also use the
-
-[scaled dot product attention](#beta-scaled-dot-product-attention-20) kernels directly by calling the new scaled_dot_product_attention() operator. Accelerated PyTorch 2 Transformers are integrated with torch.compile() . To use your model while benefiting from the additional acceleration of PT2-compilation (for inference or training), pre-process the model with `model = torch.compile(model)`.
+To take full advantage of different hardware models and Transformer use cases, multiple SDPA custom kernels are supported (see below), with custom kernel selection logic that will pick the highest-performance kernel for a given model and hardware type. In addition to the existing Transformer API, model developers may also use the [scaled dot product attention](#beta-scaled-dot-product-attention-20) kernels directly by calling the new scaled_dot_product_attention() operator. Accelerated PyTorch 2 Transformers are integrated with torch.compile() . To use your model while benefiting from the additional acceleration of PT2-compilation (for inference or training), pre-process the model with `model = torch.compile(model)`.

We have achieved major speedups for training transformer models and in particular large language models with Accelerated PyTorch 2 Transformers using a combination of custom kernels and torch.compile().
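The scaled_dot_product_attention() operator named in the paragraph above can also be called directly. A minimal sketch, with tensor shapes as illustrative assumptions:

```python
import torch
import torch.nn.functional as F

# (batch, heads, sequence, head_dim) query/key/value tensors.
q = torch.randn(2, 8, 128, 64)
k = torch.randn(2, 8, 128, 64)
v = torch.randn(2, 8, 128, 64)

# Kernel selection (FlashAttention, memory-efficient, or the math fallback)
# is handled automatically based on the inputs and hardware.
out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
```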

@@ -223,9 +219,9 @@ Learn more with the [documentation](https://pytorch.org/docs/master/generated/to
### [Beta] functorch -> torch.func

Inspired by [Google JAX](https://github.com/google/jax), functorch is a library that offers composable vmap (vectorization) and autodiff transforms. It enables advanced autodiff use cases that would otherwise be tricky to express in PyTorch. Examples include:
-* [model ensembling](https://pytorch.org/functorch/1.13/notebooks/ensembling.html)
-* [efficiently computing jacobians and hessians](https://pytorch.org/functorch/1.13/notebooks/jacobians_hessians.html)
-* [computing per-sample-gradients (or other per-sample quantities)](https://pytorch.org/functorch/1.13/notebooks/per_sample_grads.html)
+* [model ensembling](https://pytorch.org/tutorials/intermediate/ensembling.html)
+* [efficiently computing jacobians and hessians](https://pytorch.org/tutorials/intermediate/jacobians_hessians.html)
+* [computing per-sample-gradients (or other per-sample quantities)](https://pytorch.org/tutorials/intermediate/per_sample_grads.html)
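A minimal sketch of the composable transforms listed above, using per-sample gradients as the example; the tiny linear model and random data are illustrative assumptions:

```python
import torch
from torch.func import functional_call, grad, vmap

model = torch.nn.Linear(4, 1)
params = dict(model.named_parameters())

def loss_fn(params, x, y):
    # functional_call runs the module with an explicit parameter dict.
    pred = functional_call(model, params, (x,))
    return ((pred - y) ** 2).mean()

x = torch.randn(8, 4)  # batch of 8 samples
y = torch.randn(8, 1)

# grad differentiates loss_fn w.r.t. params; vmap maps that gradient
# computation over the batch, yielding one gradient per sample.
per_sample_grads = vmap(grad(loss_fn), in_dims=(None, 0, 0))(params, x, y)
```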

We’re excited to announce that, as the final step of upstreaming and integrating functorch into PyTorch, the functorch APIs are now available in the torch.func module. Our function transform APIs are identical to before, but we have changed how the interaction with NN modules work. Please see the [docs](https://pytorch.org/docs/master/func.html) and the [migration guide](https://pytorch.org/docs/master/func.migrating.html) for more details.
