Improve documentation of pooler_output in ModelOutput (huggingface#13228)
* update documentation of pooler_output in modeling_outputs, making it clearer and suitable for generic usage
* Update src/transformers/modeling_outputs.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/modeling_outputs.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* run make style
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
src/transformers/modeling_outputs.py (8 additions, 6 deletions)
@@ -55,9 +55,10 @@ class BaseModelOutputWithPooling(ModelOutput):
         last_hidden_state (:obj:`torch.FloatTensor` of shape :obj:`(batch_size, sequence_length, hidden_size)`):
             Sequence of hidden-states at the output of the last layer of the model.
         pooler_output (:obj:`torch.FloatTensor` of shape :obj:`(batch_size, hidden_size)`):
-            Last layer hidden-state of the first token of the sequence (classification token) further processed by a
-            Linear layer and a Tanh activation function. The Linear layer weights are trained from the next sentence
-            prediction (classification) objective during pretraining.
+            Last layer hidden-state of the first token of the sequence (classification token) after further processing
+            through the layers used for the auxiliary pretraining task. E.g. for BERT-family of models, this returns
+            the classification token after processing through a linear layer and a tanh activation function. The linear
+            layer weights are trained from the next sentence prediction (classification) objective during pretraining.
         hidden_states (:obj:`tuple(torch.FloatTensor)`, `optional`, returned when ``output_hidden_states=True`` is passed or when ``config.output_hidden_states=True``):
             Tuple of :obj:`torch.FloatTensor` (one for the output of the embeddings + one for the output of each layer)
             of shape :obj:`(batch_size, sequence_length, hidden_size)`.
@@ -158,9 +159,10 @@ class BaseModelOutputWithPoolingAndCrossAttentions(ModelOutput):
         last_hidden_state (:obj:`torch.FloatTensor` of shape :obj:`(batch_size, sequence_length, hidden_size)`):
             Sequence of hidden-states at the output of the last layer of the model.
         pooler_output (:obj:`torch.FloatTensor` of shape :obj:`(batch_size, hidden_size)`):
-            Last layer hidden-state of the first token of the sequence (classification token) further processed by a
-            Linear layer and a Tanh activation function. The Linear layer weights are trained from the next sentence
-            prediction (classification) objective during pretraining.
+            Last layer hidden-state of the first token of the sequence (classification token) after further processing
+            through the layers used for the auxiliary pretraining task. E.g. for BERT-family of models, this returns
+            the classification token after processing through a linear layer and a tanh activation function. The linear
+            layer weights are trained from the next sentence prediction (classification) objective during pretraining.
         hidden_states (:obj:`tuple(torch.FloatTensor)`, `optional`, returned when ``output_hidden_states=True`` is passed or when ``config.output_hidden_states=True``):
             Tuple of :obj:`torch.FloatTensor` (one for the output of the embeddings + one for the output of each layer)
             of shape :obj:`(batch_size, sequence_length, hidden_size)`.
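The updated docstring describes the BERT-style pooling step: take the last-layer hidden state of the first token (the classification token) and pass it through a linear layer followed by a tanh activation. A minimal numpy sketch of that computation is below; the weight and bias names and all shapes are illustrative assumptions, not the actual transformers implementation (which uses a trained `nn.Linear` inside the model's pooler module).

```python
import numpy as np

# Illustrative sketch of the pooling step described in the docstring.
# W and b stand in for the trained pooler weights; shapes are made up.
rng = np.random.default_rng(0)
batch_size, seq_len, hidden_size = 2, 5, 8

# stand-in for last_hidden_state of shape (batch_size, seq_len, hidden_size)
last_hidden_state = rng.standard_normal((batch_size, seq_len, hidden_size))
W = rng.standard_normal((hidden_size, hidden_size))  # pooler linear weights (hypothetical)
b = np.zeros(hidden_size)                            # pooler linear bias (hypothetical)

first_token = last_hidden_state[:, 0]          # hidden state of the classification token
pooler_output = np.tanh(first_token @ W + b)   # linear layer + tanh activation

print(pooler_output.shape)  # (2, 8)
```

This also shows why `pooler_output` has shape `(batch_size, hidden_size)` while `last_hidden_state` has shape `(batch_size, sequence_length, hidden_size)`: only the first token's hidden state is pooled.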