Techniques & Methods
Joint Probability
Joint probability is the probability that multiple events occur together; for two events A and B it is written P(A ∩ B). In language modeling, the joint probability of a word sequence is factored with the chain rule: P(w1, w2, ..., wn) = P(w1) × P(w2 | w1) × ... × P(wn | w1, ..., wn-1). This factorization is the foundation of autoregressive language models.
Understanding joint probability is essential for grasping how language models work: they are trained to assign high joint probability to natural language sequences, and they generate text by sampling one token at a time from the conditional distributions in that factorization.
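The chain-rule factorization can be illustrated with a minimal sketch. It assumes a toy bigram model, in which each word is conditioned only on the previous word rather than on the full history, so it is a simplification of the general chain rule. The table `BIGRAM_PROBS`, its made-up probabilities, and the helper names `sequence_log_prob` and `sequence_prob` are all hypothetical and exist only for this example.

```python
import math

# Hypothetical toy bigram model: P(next word | previous word).
# "<s>" marks the start of a sequence. All numbers are made up for illustration.
BIGRAM_PROBS = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.5, "dog": 0.5},
    "a": {"cat": 0.3, "dog": 0.7},
    "cat": {"sat": 0.8, "ran": 0.2},
    "dog": {"sat": 0.1, "ran": 0.9},
}


def sequence_log_prob(words):
    """Sum the log conditional probabilities of each word given the previous one.

    Summing logs is the chain rule in log space; it avoids numerical
    underflow when many small probabilities are multiplied.
    """
    log_prob = 0.0
    prev = "<s>"
    for word in words:
        p = BIGRAM_PROBS.get(prev, {}).get(word, 0.0)
        if p == 0.0:
            # The toy model assigns zero probability to unseen transitions.
            return float("-inf")
        log_prob += math.log(p)
        prev = word
    return log_prob


def sequence_prob(words):
    """Joint probability of the whole sequence under the toy model."""
    return math.exp(sequence_log_prob(words))


# P(the, cat, sat) = P(the | <s>) * P(cat | the) * P(sat | cat)
joint = sequence_prob(["the", "cat", "sat"])
print(f"P(the, cat, sat) = {joint:.4f}")
```

Working in log space is a common design choice here because the product of many conditional probabilities quickly becomes too small to represent for long sequences.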
Related Terms
Model Components
Language Model
AI system that assigns probabilities to sequences of words and can generate coherent text.
Techniques & Methods
Autoregression
Statistical modeling approach where future values are predicted from past observed values.
Techniques & Methods
Inference
Using a trained AI model to generate predictions or responses on new, unseen data.
Techniques & Methods
Training
Teaching a model to make accurate predictions by exposing it to large datasets.

