Class TransformerDecoderLayerImpl#

Defined in File transformerlayer.h

Inheritance Relationships#

Base Type#

public torch::nn::Cloneable< TransformerDecoderLayerImpl > (Template Class Cloneable)

Class Documentation#

class TransformerDecoderLayerImpl : public torch::nn::Cloneable<TransformerDecoderLayerImpl>#

TransformerDecoderLayer is made up of self-attention, multi-head attention over the encoder output (memory), and a feedforward network.

This standard decoder layer is based on the paper “Attention Is All You Need” (Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Lukasz Kaiser, and Illia Polosukhin. 2017. In Advances in Neural Information Processing Systems, pages 6000-6010). Users may modify it or implement it differently for their application. See https://pytorch.org/docs/main/nn.html#transformer-layers to learn about the exact behavior of this module.

See the documentation for torch::nn::TransformerDecoderLayerOptions class to learn what constructor arguments are supported for this module.

Example:

TransformerDecoderLayer model(TransformerDecoderLayerOptions(512, 8).dropout(0.2));

Public Functions

inline TransformerDecoderLayerImpl(int64_t d_model, int64_t nhead)#

explicit TransformerDecoderLayerImpl(TransformerDecoderLayerOptions options_)#

virtual void reset() override#: reset() must perform initialization of all members with reference semantics, most importantly parameters, buffers and submodules.

void reset_parameters()#

Tensor forward(Tensor tgt, const Tensor &memory, const Tensor &tgt_mask = {}, const Tensor &memory_mask = {}, const Tensor &tgt_key_padding_mask = {}, const Tensor &memory_key_padding_mask = {})#

Pass the inputs (and mask) through the decoder layer.

Args:
  tgt: the sequence to the decoder layer (required).
  memory: the sequence from the last layer of the encoder (required).
  tgt_mask: the mask for the tgt sequence (optional).
  memory_mask: the mask for the memory sequence (optional).
  tgt_key_padding_mask: the mask for the tgt keys per batch (optional).
  memory_key_padding_mask: the mask for the memory keys per batch (optional).

Public Members

TransformerDecoderLayerOptions options#: The options used to configure this module.

MultiheadAttention self_attn = {nullptr}#: Self attention.

Dropout dropout1 = {nullptr}#: Dropout, post self attention.

LayerNorm norm1 = {nullptr}#: Normalization, post self attention.

MultiheadAttention multihead_attn = {nullptr}#: Multi-headed attention.

Dropout dropout2 = {nullptr}#: Dropout, post multi-headed attention.

LayerNorm norm2 = {nullptr}#: Normalization, post multi-headed attention.

Linear linear1 = {nullptr}#: Feed forward first linear layer.

Dropout dropout = {nullptr}#: Feed forward dropout layer.

Linear linear2 = {nullptr}#: Feed forward second linear layer.

Dropout dropout3 = {nullptr}#: Dropout, post feed forward.

LayerNorm norm3 = {nullptr}#: Normalization, post feed forward.

Protected Functions

inline virtual bool _forward_has_default_args() override#

Together with _forward_num_required_args() and _forward_populate_default_args(), this function allows a module with default arguments in its forward method to be used in a Sequential module.

You should NEVER override these functions manually. Instead, you should use the FORWARD_HAS_DEFAULT_ARGS macro.

inline virtual unsigned int _forward_num_required_args() override#

inline std::vector<torch::nn::AnyValue> _forward_populate_default_args(std::vector<torch::nn::AnyValue> &&arguments) override#

Tensor activation(const Tensor &input)#: Apply activation based on configuration.

Friends

friend struct torch::nn::AnyModuleHolder
