Commit b6200b3

Update 2022-6-22-introducing-torchx-fbgemm-and-other-library-updates-in-pytorch-1-12.md
Feedback updates
1 parent ccbf412 commit b6200b3

1 file changed, 14 insertions(+), 15 deletions(-)


_posts/2022-6-22-introducing-torchx-fbgemm-and-other-library-updates-in-pytorch-1-12.md

@@ -1,17 +1,17 @@
 ---
 layout: blog_detail
-title: "Introducing TorchX, FBGemm and other library updates in PyTorch 1.12"
+title: "New library updates in PyTorch 1.12"
 author: Team PyTorch
 featured-img: ''
 ---
 
-We are introducing the beta release of TorchRec and a number of improvements to the current PyTorch domain libraries, alongside the PyTorch 1.12 release. These updates demonstrate our focus on developing common and extensible APIs across all domains to make it easier for our community to build ecosystem projects on PyTorch.
+We are bringing a number of improvements to the current PyTorch domain libraries, alongside the PyTorch 1.12 release. These updates demonstrate our focus on developing common and extensible APIs across all domains to make it easier for our community to build ecosystem projects on PyTorch.
 
 Summary:
-- **TorchVision** - Added 4 new model families and 14 new classification datasets such as CLEVR, GTSRB, FER2013. See the release notes [here](https://github.com/pytorch/vision/releases).
-- **TorchAudio** - Added Enformer- and RNN-T-based models and recipes to support the full development lifecycle of a streaming ASR model. See the release notes [here](https://github.com/pytorch/audio/releases).
-- **TorchText** - Added beta support for RoBERTa and XLM-R models, byte-level BPE tokenizer, and text datasets backed by TorchData. See the release notes [here](https://github.com/pytorch/text/releases).
-- **TorchRec** a PyTorch domain library for Recommendation Systems, is available in beta. [View it on GitHub](https://github.com/pytorch/torchrec).
+- **TorchVision** - Added multi-weight support API, new architectures, model variants, and pretrained weight. See the release notes [here](https://github.com/pytorch/vision/releases).
+- **TorchAudio** - Introduced beta features including a streaming API, a CTC beam search decoder, and new beamforming modules and methods. See the release notes [here](https://github.com/pytorch/audio/releases).
+- **TorchText** - Extended support for scriptable BERT tokenizer and added datasets for GLUE benchmark. See the release notes [here](https://github.com/pytorch/text/releases).
+- **TorchRec** EmbeddingModule benchmarks, examples for TwoTower Retrieval model and sequential embedding, and demonstrated integration with production components. See the release notes [here](https://github.com/pytorch/torchrec/releases).
 - **TorchX** - Launch PyTorch trainers developed on local workspaces onto five different types of schedulers. See the release notes [here](https://github.com/pytorch/torchx/blob/main/CHANGELOG.md?plain=1#L3).
 - **FBGemm** - Added and improved kernels for Recommendation Systems inference workloads, including table batched embedding bag, jagged tensor operations, and other special-case optimizations.
 
@@ -212,10 +212,10 @@ We completely revamped our models documentation to make them easier to browse, a
 
 
 StreamReader is TorchAudio’s new I/O API. It is backed by FFmpeg†, and allows users to:
-- Decode various audio and video formats, including MP4 and AAC
-- Handle various input forms, such as local files, network protocols, microphones, webcams, screen captures and file-like objects
+- Decode audio and video formats, including MP4 and AAC
+- Handle input forms, such as local files, network protocols, microphones, webcams, screen captures and file-like objects
 - Iterate over and decode chunk-by-chunk, while changing the sample rate or frame rate
-- Apply various audio and video filters, such as low-pass filter and image scaling
+- Apply audio and video filters, such as low-pass filter and image scaling
 - Decode video with NVidia's hardware-based decoder (NVDEC)
 
 For usage details, please check out the [documentation](https://pytorch.org/audio/0.12.0/io.html#streamreader) and tutorials:
@@ -233,13 +233,13 @@ TorchAudio integrates the wav2letter CTC beam search decoder from [Flashlight](h
 
 Customizable lexicon and lexicon-free decoders are supported, and both are compatible with KenLM n-gram language models or without using a language model. TorchAudio additionally supports downloading token, lexicon, and pretrained KenLM files for the LibriSpeech dataset.
 
-For details of usage, please check out the [documentation](https://pytorch.org/audio/0.12.0/models.decoder.html#ctcdecoder) and [ASR inference tutorial](https://pytorch.org/audio/0.12.0/tutorials/asr_inference_with_ctc_decoder_tutorial.html).
+For usage details, please check out the [documentation](https://pytorch.org/audio/0.12.0/models.decoder.html#ctcdecoder) and [ASR inference tutorial](https://pytorch.org/audio/0.12.0/tutorials/asr_inference_with_ctc_decoder_tutorial.html).
 
 ### (BETA) New Beamforming Modules and Methods
 
-TorchAudio adds two new beamforming modules under torchaudio.transforms: [SoudenMVDR](https://pytorch.org/audio/0.12.0/transforms.html#soudenmvdr) and [RTFMVDR](https://pytorch.org/audio/0.12.0/transforms.html#rtfmvdr), to increase the flexibility of module usage. The main differences from the [torchaudio.transforms.MVDR](https://pytorch.org/audio/0.11.0/transforms.html#mvdr) module are:
-- Use power spectral density (PSD) matrices or the relative transfer function (RTF) matrix as inputs instead of time-frequency masks. The module can be integrated with neural networks that directly predict complex-valued STFT coefficients of speech and noise
-- Add reference_channel to the input arguments in the forward method, to let users select the reference microphone during model training
+To improve flexibility in usage, the release adds two new beamforming modules under torchaudio.transforms: [SoudenMVDR](https://pytorch.org/audio/0.12.0/transforms.html#soudenmvdr) and [RTFMVDR](https://pytorch.org/audio/0.12.0/transforms.html#rtfmvdr). The main differences from the [torchaudio.transforms.MVDR](https://pytorch.org/audio/0.11.0/transforms.html#mvdr) module are:
+- Use power spectral density (PSD) and relative transfer function (RTF) matrices as inputs instead of time-frequency masks. The module can be integrated with neural networks that directly predict complex-valued STFT coefficients of speech and noise
+- Add 'reference_channel' as an input argument in the forward method, to allow users to select the reference channel in model training or dynamically change the reference channel in inference
 
 Besides the two modules, new function-level beamforming methods are added under torchaudio.functional. These include:
 - [psd](https://pytorch.org/audio/0.12.0/functional.html#psd)
@@ -249,8 +249,7 @@ Besides the two modules, new function-level beamforming methods are added under
 - [rtf_power](https://pytorch.org/audio/0.12.0/functional.html#rtf-power)
 - [apply_beamforming](https://pytorch.org/audio/0.12.0/functional.html#apply-beamforming)
 
-
-For the details of the usage, please check out the [documentation](https://pytorch.org/audio/0.12.0/transforms.html#multi-channel) and the [Speech Enhancement with MVDR Beamforming tutorial](https://pytorch.org/audio/0.12.0/tutorials/mvdr_tutorial.html).
+For usage details, please check out the documentation at [torchaudio.transforms](https://pytorch.org/audio/0.12.0/transforms.html#multi-channel) and [torchaudio.functional](https://pytorch.org/audio/0.12.0/functional.html#multi-channel) and the [Speech Enhancement with MVDR Beamforming tutorial](https://pytorch.org/audio/0.12.0/tutorials/mvdr_tutorial.html).
 
 ## TorchText v0.13
 