Kubnal Bridge

Techniques & Methods

Autoregression

In language modeling, autoregression means generating text one token at a time, with each token conditioned on all preceding tokens. The joint probability of a sequence, P(w1, w2, ..., wn), is factored by the chain rule into a product of conditional probabilities, P(w1) × P(w2 | w1) × ... × P(wn | w1, ..., wn-1), which makes generation inherently sequential.
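The factorization can be illustrated with a minimal Python sketch. The probability table and the names `BIGRAM`, `next_token_probs`, and `sequence_log_prob` are hypothetical, made up for illustration. As a simplifying assumption, the toy model looks only at the previous token, whereas a real LLM conditions on the entire prefix.

```python
import math

# Hypothetical toy model: the next-token distribution depends only on the
# previous token. A real LLM would condition on the whole prefix.
BIGRAM = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.5, "dog": 0.5},
    "a": {"cat": 0.7, "dog": 0.3},
    "cat": {"</s>": 1.0},
    "dog": {"</s>": 1.0},
}


def next_token_probs(prefix):
    """Return P(next token | prefix) for the toy model."""
    return BIGRAM[prefix[-1]]


def sequence_log_prob(tokens):
    """Chain rule: sum log P(w_i | w_1, ..., w_{i-1}) over the sequence."""
    prefix = ["<s>"]
    total = 0.0
    for tok in tokens:
        total += math.log(next_token_probs(prefix)[tok])
        prefix.append(tok)
    return total


# P(the, cat, </s>) = P(the | <s>) * P(cat | the) * P(</s> | cat)
prob = math.exp(sequence_log_prob(["the", "cat", "</s>"]))
print(round(prob, 6))
```

Summing log-probabilities rather than multiplying raw probabilities is a common choice because products of many small numbers underflow quickly on long sequences.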

The major LLM families (GPT, Claude, Gemini, Llama) are all autoregressive. This left-to-right generation process is simple and scalable, but each output token depends on the tokens before it, so inference is sequential and cannot be parallelized across the output sequence dimension.
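The sequential dependency shows up directly in a generation loop. The sketch below is illustrative only: the table `NEXT` and the function `generate` are hypothetical names, the model again conditions on just the last token, and it uses greedy decoding (always picking the most probable token), which is one of several possible decoding strategies.

```python
# Hypothetical next-token table standing in for a trained model.
NEXT = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.7, "dog": 0.3},
    "a": {"cat": 0.2, "dog": 0.8},
    "cat": {"</s>": 1.0},
    "dog": {"</s>": 1.0},
}


def generate(max_tokens=10):
    """Greedy left-to-right decoding with the toy table."""
    tokens = ["<s>"]
    for _ in range(max_tokens):
        # This step needs tokens[-1], which the previous iteration produced,
        # so the iterations cannot run in parallel.
        probs = NEXT[tokens[-1]]
        nxt = max(probs, key=probs.get)  # greedy: most probable next token
        tokens.append(nxt)
        if nxt == "</s>":
            break
    return tokens[1:]  # drop the start marker


output = generate()
print(output)
```

Each pass through the loop has to finish before the next can start, which is why the cost of generating a response grows with the number of output tokens.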
