
Commit ec1eee9

some edits

Signed-off-by: cjyabraham <cjyabraham@gmail.com>
1 parent 6aadd92 commit ec1eee9

File tree

2 files changed: +8 -12 lines changed

_config.yml

Lines changed: 1 addition & 1 deletion

@@ -1,7 +1,7 @@
# Site settings
title: "PyTorch Website"
author: "Facebook"
-default_author: Team Pytorch
+default_author: Team PyTorch
description: "Scientific Computing..."
latest_version: 1.0
timezone: America/Los_Angeles

_posts/2023-03-13-pytorch-2.0-release.md

Lines changed: 7 additions & 11 deletions

@@ -3,7 +3,7 @@ layout: blog_detail
title: "PyTorch 2.0: Our next generation release that is faster, more Pythonic and Dynamic as ever"
---

-We are excited to announce the release of [PyTorch® 2.0](https://github.com/pytorch/pytorch/releases) which we highlighted during the [PyTorch Conference](https://www.youtube.com/@PyTorch/playlists?view=50&sort=dd&shelf_id=2) on 12/2/22! PyTorch 2.0 offers the same eager-mode development and user experience, while fundamentally changing and supercharging how PyTorch operates at compiler level under the hood with faster performance and support for Dynamic Shapes and Distributed.
+We are excited to announce the release of [PyTorch® 2.0](https://github.com/pytorch/pytorch/releases/tag/v2.0.0) which we highlighted during the [PyTorch Conference](https://www.youtube.com/@PyTorch/playlists?view=50&sort=dd&shelf_id=2) on 12/2/22! PyTorch 2.0 offers the same eager-mode development and user experience, while fundamentally changing and supercharging how PyTorch operates at compiler level under the hood with faster performance and support for Dynamic Shapes and Distributed.

This next-generation release includes a Stable version of Accelerated Transformers (formerly called Better Transformers); Beta includes torch.compile as the main API for PyTorch 2.0, the scaled_dot_product_attention function as part of torch.nn.functional, the MPS backend, functorch APIs in the torch.func module; and other Beta/Prototype improvements across various inferences, performance and training optimization features on GPUs and CPUs. For a comprehensive introduction and technical overview of torch.compile, please visit the 2.0 [Get Started page](/get-started/pytorch-2.0).
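The torch.compile API referenced in the hunk above is a one-line opt-in. A minimal sketch, assuming a toy model and random input (both illustrative, not from the post):

```python
import torch
import torch.nn as nn

# A toy model stands in for any eager-mode PyTorch module.
model = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 10))

compiled_model = torch.compile(model)  # eager-mode semantics are preserved

x = torch.randn(32, 64)
out = compiled_model(x)  # first call compiles; later calls reuse the compiled graph
```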

@@ -148,21 +148,17 @@ Summary:
## Stable Features


-### [Stable] Accelerated PyTorch 2 Transformers (previously known as “Better Transformer”)
+### [Stable] Accelerated PyTorch 2 Transformers

-The PyTorch 2.0 release includes a new high-performance implementation of the PyTorch Transformer API, formerly known as “Better Transformer API, “ now renamed Accelerated PyTorch 2 Transformers. In releasing accelerated PT2 Transformers, our goal is to make training and deployment of state-of-the-art Transformer models affordable across the industry. This release introduces high-performance support for training and inference using a custom kernel architecture for scaled dot product attention (SPDA).
+The PyTorch 2.0 release includes a new high-performance implementation of the PyTorch Transformer API. In releasing Accelerated PT2 Transformers, our goal is to make training and deployment of state-of-the-art Transformer models affordable across the industry. This release introduces high-performance support for training and inference using a custom kernel architecture for scaled dot product attention (SPDA), extending the inference “fastpath” architecture, previously known as "Better Transformer."

Similar to the “fastpath” architecture, custom kernels are fully integrated into the PyTorch Transformer API – thus, using the native Transformer and MultiHeadAttention API will enable users to:

-
-
* transparently see significant speed improvements;
* support many more use cases including models using Cross-Attention, Transformer Decoders, and for training models; and
* continue to use fastpath inference for fixed and variable sequence length Transformer Encoder and Self Attention use cases.
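A hedged illustration of the point made in the list above, that the custom kernels are reached through the existing native APIs; the layer sizes and inputs are illustrative assumptions, not taken from the post:

```python
import torch
import torch.nn as nn

# Standard nn.TransformerEncoder usage; no new API is required.
encoder_layer = nn.TransformerEncoderLayer(d_model=256, nhead=8, batch_first=True)
encoder = nn.TransformerEncoder(encoder_layer, num_layers=6)
encoder.eval()  # fastpath/fused kernels apply during inference

src = torch.rand(16, 128, 256)  # (batch, sequence, embedding)
with torch.inference_mode():
    out = encoder(src)  # eligible inputs are routed to the fused SDPA kernels
```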

-To take full advantage of different hardware models and Transformer use cases, multiple SDPA custom kernels are supported (see below), with custom kernel selection logic that will pick the highest-performance kernel for a given model and hardware type. In addition to the existing Transformer API, model developers may also use the
-
-[scaled dot product attention](#beta-scaled-dot-product-attention-20) kernels directly by calling the new scaled_dot_product_attention() operator. Accelerated PyTorch 2 Transformers are integrated with torch.compile() . To use your model while benefiting from the additional acceleration of PT2-compilation (for inference or training), pre-process the model with `model = torch.compile(model)`.
+To take full advantage of different hardware models and Transformer use cases, multiple SDPA custom kernels are supported (see below), with custom kernel selection logic that will pick the highest-performance kernel for a given model and hardware type. In addition to the existing Transformer API, model developers may also use the [scaled dot product attention](#beta-scaled-dot-product-attention-20) kernels directly by calling the new scaled_dot_product_attention() operator. Accelerated PyTorch 2 Transformers are integrated with torch.compile() . To use your model while benefiting from the additional acceleration of PT2-compilation (for inference or training), pre-process the model with `model = torch.compile(model)`.

We have achieved major speedups for training transformer models and in particular large language models with Accelerated PyTorch 2 Transformers using a combination of custom kernels and torch.compile().
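The scaled_dot_product_attention() operator named in the paragraph above can also be called directly. A minimal sketch, with tensor shapes as illustrative assumptions:

```python
import torch
import torch.nn.functional as F

# (batch, heads, sequence, head_dim) query/key/value tensors.
q = torch.randn(2, 8, 128, 64)
k = torch.randn(2, 8, 128, 64)
v = torch.randn(2, 8, 128, 64)

# Kernel selection (FlashAttention, memory-efficient, or the math fallback)
# is handled automatically based on the inputs and hardware.
out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
```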

@@ -223,9 +219,9 @@ Learn more with the [documentation](https://pytorch.org/docs/master/generated/to
### [Beta] functorch -> torch.func

Inspired by [Google JAX](https://github.com/google/jax), functorch is a library that offers composable vmap (vectorization) and autodiff transforms. It enables advanced autodiff use cases that would otherwise be tricky to express in PyTorch. Examples include:
-* [model ensembling](https://pytorch.org/functorch/1.13/notebooks/ensembling.html)
-* [efficiently computing jacobians and hessians](https://pytorch.org/functorch/1.13/notebooks/jacobians_hessians.html)
-* [computing per-sample-gradients (or other per-sample quantities)](https://pytorch.org/functorch/1.13/notebooks/per_sample_grads.html)
+* [model ensembling](https://pytorch.org/tutorials/intermediate/ensembling.html)
+* [efficiently computing jacobians and hessians](https://pytorch.org/tutorials/intermediate/jacobians_hessians.html)
+* [computing per-sample-gradients (or other per-sample quantities)](https://pytorch.org/tutorials/intermediate/per_sample_grads.html)
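A minimal sketch of the composable transforms listed above, using per-sample gradients as the example; the tiny linear model and random data are illustrative assumptions:

```python
import torch
from torch.func import functional_call, grad, vmap

model = torch.nn.Linear(4, 1)
params = dict(model.named_parameters())

def loss_fn(params, x, y):
    # functional_call runs the module with an explicit parameter dict.
    pred = functional_call(model, params, (x,))
    return ((pred - y) ** 2).mean()

x = torch.randn(8, 4)  # batch of 8 samples
y = torch.randn(8, 1)

# grad differentiates loss_fn w.r.t. params; vmap maps that gradient
# computation over the batch, yielding one gradient per sample.
per_sample_grads = vmap(grad(loss_fn), in_dims=(None, 0, 0))(params, x, y)
```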

We’re excited to announce that, as the final step of upstreaming and integrating functorch into PyTorch, the functorch APIs are now available in the torch.func module. Our function transform APIs are identical to before, but we have changed how the interaction with NN modules work. Please see the [docs](https://pytorch.org/docs/master/func.html) and the [migration guide](https://pytorch.org/docs/master/func.migrating.html) for more details.
