Model Components

Sequence-to-Sequence (Seq2Seq) Models

Seq2Seq models encode an input sequence into a fixed- or variable-length representation and decode it into an output sequence whose length and vocabulary may differ from the input's. Early seq2seq models were built with RNNs; modern ones typically use transformer encoder-decoder architectures.
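The encode-then-decode flow described above can be sketched in a few lines of Python. This is a toy illustration only, not a trained model: the names `TOY_LEXICON`, `encode`, `decode_step`, and `greedy_decode` are hypothetical, and a lookup table stands in for learned parameters. It shows just the control flow: the input is encoded once, then output tokens are produced one at a time from the encoder's representation and the tokens emitted so far, until an end-of-sequence marker appears.

```python
from typing import Dict, List, Sequence

BOS, EOS = "<bos>", "<eos>"

# Hypothetical lookup table standing in for learned model parameters.
TOY_LEXICON: Dict[str, List[str]] = {
    "hello": ["bonjour"],
    "world": ["le", "monde"],
    "thank": ["merci"],
    "you": [],
}


def encode(source: Sequence[str]) -> List[List[str]]:
    """Toy encoder: one representation per source token (variable-length)."""
    return [TOY_LEXICON.get(token, [token]) for token in source]


def decode_step(memory: List[List[str]], prefix: List[str]) -> str:
    """Toy decoder step: pick the next token from the encoder memory
    and the tokens emitted so far (prefix starts with BOS)."""
    candidates = [token for rep in memory for token in rep]
    position = len(prefix) - 1  # tokens emitted after BOS
    return candidates[position] if position < len(candidates) else EOS


def greedy_decode(source: Sequence[str], max_len: int = 20) -> List[str]:
    """Encode once, then generate tokens one at a time until EOS or max_len."""
    memory = encode(source)
    output = [BOS]
    for _ in range(max_len):
        next_token = decode_step(memory, output)
        if next_token == EOS:
            break
        output.append(next_token)
    return output[1:]  # drop the BOS marker


if __name__ == "__main__":
    print(greedy_decode(["hello", "world"]))
```

Note that the two-token input yields a three-token output, which is the property that distinguishes seq2seq from one-label-per-token models. In a real system the lookup table would be replaced by a trained RNN or transformer, and `decode_step` would return the highest-scoring token from a probability distribution.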

Applications include machine translation (English → French), summarization (long article → short summary), code generation (description → code), and dialogue (message → response). T5, BART, and mT5 are prominent seq2seq transformer models.

Authority Links

Seq2Seq — Wikipedia

Overview of sequence-to-sequence model architectures and applications.

T5 Paper — arXiv

Google's T5 paper, which explores the limits of transfer learning with a unified text-to-text transformer.

Related Terms

Model Components

Transformer Decoder

Transformer component that generates output sequences by attending to encoded inputs and prior outputs.

Model Components

Encoder

Transformer component that processes input sequences into rich contextual representations.

Techniques & Methods

Machine Translation

Software that automatically translates text or speech between languages.

Techniques & Methods

Attention Mechanism

Neural network technique enabling models to focus on the most relevant parts of input when producing each output.
