@@ -105,9 +105,8 @@ Supported models
3. :doc:`BARThez <model_doc/barthez>` (from École polytechnique) released with the paper `BARThez: a Skilled Pretrained
French Sequence-to-Sequence Model <https://arxiv.org/abs/2010.12321>`__ by Moussa Kamal Eddine, Antoine J.-P.
Tixier, Michalis Vazirgiannis.
- 4. `BEiT <https://huggingface.co/transformers/master/model_doc/beit.html>`__ (from Microsoft) released with the paper
- `BEiT: BERT Pre-Training of Image Transformers <https://arxiv.org/abs/2106.08254>`__ by Hangbo Bao, Li Dong, Furu
- Wei.
+ 4. :doc:`BEiT <model_doc/beit>` (from Microsoft) released with the paper `BEiT: BERT Pre-Training of Image Transformers
+ <https://arxiv.org/abs/2106.08254>`__ by Hangbo Bao, Li Dong, Furu Wei.
5. :doc:`BERT <model_doc/bert>` (from Google) released with the paper `BERT: Pre-training of Deep Bidirectional
Transformers for Language Understanding <https://arxiv.org/abs/1810.04805>`__ by Jacob Devlin, Ming-Wei Chang,
Kenton Lee and Kristina Toutanova.
@@ -264,9 +263,9 @@ Supported models
55. :doc:`SpeechToTextTransformer <model_doc/speech_to_text>` (from Facebook), released together with the paper
`fairseq S2T: Fast Speech-to-Text Modeling with fairseq <https://arxiv.org/abs/2010.05171>`__ by Changhan Wang, Yun
Tang, Xutai Ma, Anne Wu, Dmytro Okhonko, Juan Pino.
- 56. `Splinter <https://huggingface.co/transformers/master/model_doc/splinter.html>`__ (from Tel Aviv University),
- released together with the paper `Few-Shot Question Answering by Pretraining Span Selection
- <https://arxiv.org/abs/2101.00438>`__ by Ori Ram, Yuval Kirstain, Jonathan Berant, Amir Globerson, Omer Levy.
+ 56. :doc:`Splinter <model_doc/splinter>` (from Tel Aviv University), released together with the paper `Few-Shot
+ Question Answering by Pretraining Span Selection <https://arxiv.org/abs/2101.00438>`__ by Ori Ram, Yuval Kirstain,
+ Jonathan Berant, Amir Globerson, Omer Levy.
57. :doc:`SqueezeBert <model_doc/squeezebert>` released with the paper `SqueezeBERT: What can computer vision teach NLP
about efficient neural networks? <https://arxiv.org/abs/2006.11316>`__ by Forrest N. Iandola, Albert E. Shaw, Ravi
Krishna, and Kurt W. Keutzer.