1 file changed: +2 -2 lines changed

@@ -31,7 +31,7 @@ won't be possible on a single GPU.
🤗 Transformers integrates [DeepSpeed](https://github.com/microsoft/DeepSpeed) via 2 options:

- 1. Integration of the core DeepSpeed features via [`Trainer`]. This is everything done for your type
+ 1. Integration of the core DeepSpeed features via [`Trainer`]. This is an everything-done-for-you type
of integration - just supply your custom config file or use our template and you have nothing else to do. Most of
this document is focused on this feature.
2. If you don't use [`Trainer`] and want to use your own Trainer where you integrated DeepSpeed
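As a rough sketch of what "supply your custom config file" means for option 1 (not part of this diff): the config file path, the `output` directory, and the `model`/`train_dataset` objects below are placeholders, and only the `deepspeed` argument is the point.

```python
from transformers import Trainer, TrainingArguments

# Placeholder arguments for illustration only -- the `deepspeed` argument is
# what hands your DeepSpeed config file to Trainer.
training_args = TrainingArguments(
    output_dir="output",
    deepspeed="ds_config_zero2.json",  # your custom config or the template
)

trainer = Trainer(
    model=model,                  # placeholder: any pretrained model
    args=training_args,
    train_dataset=train_dataset,  # placeholder: your training dataset
)
trainer.train()
```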
@@ -604,7 +604,7 @@ The following is an example of configuration for ZeRO stage 2:
**Performance tuning:**

- enabling `offload_optimizer` should reduce GPU RAM usage (it requires `"stage": 2`)
- - `"overlap_comm": true` trade offs increased GPU RAM usage to lower all-reduce latency. `overlap_comm` uses 4.5x
+ - `"overlap_comm": true` trades off increased GPU RAM usage to lower all-reduce latency. `overlap_comm` uses 4.5x
the `allgather_bucket_size` and `reduce_bucket_size` values. So if they are set to 5e8, this requires a 9GB
footprint (`5e8 x 2Bytes x 2 x 4.5`). Therefore, if you have a GPU with 8GB or less RAM, to avoid getting
OOM-errors you will need to reduce those parameters to about `2e8`, which would require 3.6GB. You will want to do
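To make the bucket-size arithmetic above concrete, here is a small sketch of the settings discussed in this hunk, written as a Python dict mirroring the `zero_optimization` section of a ZeRO stage 2 config; the `device`/`pin_memory` and `contiguous_gradients` entries are illustrative additions, not part of this diff.

```python
# ZeRO stage 2 settings discussed above, in the shape of the config file's
# "zero_optimization" section.
zero2_config = {
    "zero_optimization": {
        "stage": 2,
        "offload_optimizer": {"device": "cpu", "pin_memory": True},  # illustrative
        "overlap_comm": True,
        "allgather_bucket_size": 2e8,
        "reduce_bucket_size": 2e8,
        "contiguous_gradients": True,  # illustrative
    }
}

# Footprint formula from the text: bucket_size x 2 bytes x 2 x 4.5
bucket = zero2_config["zero_optimization"]["allgather_bucket_size"]
print(f"~{bucket * 2 * 2 * 4.5 / 1e9:.1f} GB")  # 2e8 -> ~3.6 GB, 5e8 -> 9 GB
```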