Model Components

Transformer Decoder

The transformer decoder generates output one token at a time. It uses masked self-attention, so each position sees only earlier tokens, and cross-attention, so each position can attend to the encoder's output. In encoder-decoder models (T5, BART), the decoder generates text conditioned on an encoded input.
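The two attention steps described above can be sketched in plain Python. This is a minimal illustration, not a full decoder layer: it assumes a single head and omits the learned query/key/value projections, residual connections, layer normalization, and the feed-forward sublayer. The names `attend` and `decoder_layer` are hypothetical and chosen for this example only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries, keys, values, causal=False):
    """Scaled dot-product attention over lists of vectors.

    With causal=True, query i may only attend to positions 0..i,
    which is the effect of the decoder's attention mask.
    """
    dim = len(keys[0])
    outputs = []
    for i, q in enumerate(queries):
        limit = i + 1 if causal else len(keys)
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(dim)
            for k in keys[:limit]
        ]
        weights = softmax(scores)
        # Weighted sum of the visible value vectors.
        out = [
            sum(w * v[j] for w, v in zip(weights, values[:limit]))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs


def decoder_layer(decoder_states, encoder_states):
    """Masked self-attention followed by cross-attention (simplified)."""
    # Self-attention: each decoder position sees itself and earlier positions.
    x = attend(decoder_states, decoder_states, decoder_states, causal=True)
    # Cross-attention: queries come from the decoder, keys and values from the encoder.
    return attend(x, encoder_states, encoder_states)
```

Because of the causal mask, changing a later decoder token leaves the self-attention output at earlier positions unchanged, which is what allows the decoder to be trained on whole sequences while still generating one token at a time.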

GPT-style models are decoder-only: they drop the encoder and cross-attention and keep a stack of decoder blocks with causal (left-to-right) masked self-attention, which suits them to autoregressive generation. Most modern LLMs are decoder-only architectures.
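The autoregressive loop that a decoder-only model runs at generation time can be sketched as follows. This is a toy illustration using greedy decoding; `generate` and `toy_scores` are hypothetical names, and `toy_scores` is only a stand-in for a real model's forward pass that returns one score per vocabulary entry.

```python
def generate(next_token_scores, prompt, max_new_tokens, eos_token):
    """Greedy autoregressive decoding over integer token ids."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        # The model only ever sees the tokens produced so far.
        scores = next_token_scores(tokens)
        # Greedy choice: pick the highest-scoring token id.
        next_token = max(range(len(scores)), key=scores.__getitem__)
        tokens.append(next_token)
        if next_token == eos_token:
            break
    return tokens


def toy_scores(tokens, vocab_size=5):
    """Hypothetical stand-in for a model: prefers (last token + 1) mod vocab_size."""
    target = (tokens[-1] + 1) % vocab_size
    return [1.0 if t == target else 0.0 for t in range(vocab_size)]


print(generate(toy_scores, prompt=[0], max_new_tokens=10, eos_token=4))
# prints [0, 1, 2, 3, 4]
```

Real systems often replace the greedy choice with sampling or beam search, but the loop structure (score, pick, append, repeat) stays the same.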

Authority Links

Transformer Decoder — Wikipedia

How the transformer decoder generates output sequences.

Hugging Face — Decoder Models

Decoder-only transformer models and their applications.

Related Terms

Model Components

Transformer

A neural-network architecture, introduced by Vaswani et al. in 2017, that uses self-attention and parallel computation across all sequence positions — the foundation under virtually every frontier language and multimodal model in production today.

Model Components

Encoder

Transformer component that processes input sequences into rich contextual representations.

Model Components

Sequence-to-Sequence (Seq2Seq) Models

Models that transform input sequences into output sequences, used in translation and summarization.

Model Components

Autoregressive Model

Model that generates each output element by conditioning on all previously generated elements.
