
Commit ef83dc4

navjotts and sgugger authored
Improve documentation of pooler_output in ModelOutput (huggingface#13228)
* update documentation of pooler_output in modeling_outputs, making it more clear and available for generic usage

* Update src/transformers/modeling_outputs.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/modeling_outputs.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* run make style

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
1 parent 7828194 commit ef83dc4

File tree

1 file changed: +8 −6 lines changed


src/transformers/modeling_outputs.py

Lines changed: 8 additions & 6 deletions
@@ -55,9 +55,10 @@ class BaseModelOutputWithPooling(ModelOutput):
         last_hidden_state (:obj:`torch.FloatTensor` of shape :obj:`(batch_size, sequence_length, hidden_size)`):
             Sequence of hidden-states at the output of the last layer of the model.
         pooler_output (:obj:`torch.FloatTensor` of shape :obj:`(batch_size, hidden_size)`):
-            Last layer hidden-state of the first token of the sequence (classification token) further processed by a
-            Linear layer and a Tanh activation function. The Linear layer weights are trained from the next sentence
-            prediction (classification) objective during pretraining.
+            Last layer hidden-state of the first token of the sequence (classification token) after further processing
+            through the layers used for the auxiliary pretraining task. E.g. for BERT-family of models, this returns
+            the classification token after processing through a linear layer and a tanh activation function. The linear
+            layer weights are trained from the next sentence prediction (classification) objective during pretraining.
         hidden_states (:obj:`tuple(torch.FloatTensor)`, `optional`, returned when ``output_hidden_states=True`` is passed or when ``config.output_hidden_states=True``):
             Tuple of :obj:`torch.FloatTensor` (one for the output of the embeddings + one for the output of each layer)
             of shape :obj:`(batch_size, sequence_length, hidden_size)`.
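
For context, the "layers used for the auxiliary pretraining task" that the new docstring refers to are, in the BERT family, just a linear layer followed by a tanh applied to the first token's last-layer hidden state. A minimal sketch of such a pooler follows; the class name and constructor argument here are illustrative, though BertPooler in this repository follows the same pattern:

    import torch
    from torch import nn


    class Pooler(nn.Module):
        # Sketch of a BERT-style pooler: the first token's ([CLS]) last-layer
        # hidden state is passed through a linear layer and a tanh activation.
        def __init__(self, hidden_size: int):
            super().__init__()
            self.dense = nn.Linear(hidden_size, hidden_size)
            self.activation = nn.Tanh()

        def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
            # hidden_states: (batch_size, sequence_length, hidden_size)
            first_token = hidden_states[:, 0]  # (batch_size, hidden_size)
            return self.activation(self.dense(first_token))

Other model families may pool differently, which is exactly why the updated docstring describes the output generically rather than hard-coding the linear + tanh recipe.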
@@ -158,9 +159,10 @@ class BaseModelOutputWithPoolingAndCrossAttentions(ModelOutput):
         last_hidden_state (:obj:`torch.FloatTensor` of shape :obj:`(batch_size, sequence_length, hidden_size)`):
             Sequence of hidden-states at the output of the last layer of the model.
         pooler_output (:obj:`torch.FloatTensor` of shape :obj:`(batch_size, hidden_size)`):
-            Last layer hidden-state of the first token of the sequence (classification token) further processed by a
-            Linear layer and a Tanh activation function. The Linear layer weights are trained from the next sentence
-            prediction (classification) objective during pretraining.
+            Last layer hidden-state of the first token of the sequence (classification token) after further processing
+            through the layers used for the auxiliary pretraining task. E.g. for BERT-family of models, this returns
+            the classification token after processing through a linear layer and a tanh activation function. The linear
+            layer weights are trained from the next sentence prediction (classification) objective during pretraining.
         hidden_states (:obj:`tuple(torch.FloatTensor)`, `optional`, returned when ``output_hidden_states=True`` is passed or when ``config.output_hidden_states=True``):
             Tuple of :obj:`torch.FloatTensor` (one for the output of the embeddings + one for the output of each layer)
             of shape :obj:`(batch_size, sequence_length, hidden_size)`.
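
As a quick usage sketch of where pooler_output surfaces in practice (the checkpoint name is only an example):

    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModel.from_pretrained("bert-base-uncased")

    inputs = tokenizer("Hello, world!", return_tensors="pt")
    outputs = model(**inputs)

    # For BERT, `outputs` is a BaseModelOutputWithPoolingAndCrossAttentions:
    print(outputs.last_hidden_state.shape)  # (batch_size, sequence_length, hidden_size)
    print(outputs.pooler_output.shape)      # (batch_size, hidden_size)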
