Kubnal Bridge

Joint Probability

Joint probability P(A ∩ B) is the probability that events A and B both occur; the same idea extends to any number of events. In language modeling, the joint probability of a word sequence is factored with the chain rule: P(w1, w2, ..., wn) = P(w1) × P(w2|w1) × ... × P(wn|w1, ..., wn-1). This factorization is the foundation of autoregressive language models, which predict each word from the words that precede it.
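
As a rough illustration of the chain rule, the sketch below multiplies conditional probabilities to obtain the joint probability of a short sequence. The table `COND`, its probabilities, and the function names are invented for this example and do not come from any real model or library.

```python
import math

# Hypothetical toy model: maps a history (tuple of previous tokens) to a
# distribution over the next token. All numbers are made up for illustration.
COND = {
    (): {"the": 0.5, "a": 0.5},
    ("the",): {"cat": 0.4, "dog": 0.6},
    ("the", "cat"): {"sleeps": 0.25, "runs": 0.75},
}


def joint_probability(tokens):
    """P(w1, ..., wn) as the product of P(wi | w1, ..., wi-1)."""
    prob = 1.0
    for i, tok in enumerate(tokens):
        history = tuple(tokens[:i])
        # Unknown histories or tokens get probability 0 in this toy table.
        prob *= COND.get(history, {}).get(tok, 0.0)
    return prob


def joint_log_probability(tokens):
    """The same quantity in log space, which avoids underflow on long sequences."""
    total = 0.0
    for i, tok in enumerate(tokens):
        p = COND.get(tuple(tokens[:i]), {}).get(tok, 0.0)
        if p == 0.0:
            return float("-inf")
        total += math.log(p)
    return total
```

With this table, `joint_probability(["the", "cat", "sleeps"])` is the product 0.5 × 0.4 × 0.25. Real systems typically sum log-probabilities, as in the second function, because a product of many small factors quickly underflows.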

Understanding joint probability is essential for grasping how language models work: they are trained to assign high joint probability to natural language sequences, and they generate text by sampling from the learned distribution one token at a time.
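
Generation from such a distribution can be sketched as repeated sampling from the conditional for the next token given everything generated so far. The table `NEXT`, its probabilities, and the `generate` function are hypothetical stand-ins for a trained model, assumed here only for illustration.

```python
import random

# Hypothetical conditionals P(next token | history); values are invented.
NEXT = {
    (): {"the": 1.0},
    ("the",): {"cat": 0.5, "dog": 0.5},
    ("the", "cat"): {"sleeps": 1.0},
    ("the", "dog"): {"barks": 1.0},
}


def generate(rng, max_len=10):
    """Sample a sequence token by token from the conditional distributions."""
    tokens = []
    while len(tokens) < max_len:
        dist = NEXT.get(tuple(tokens))
        if dist is None:
            # No continuation is defined for this history, so stop.
            break
        words = list(dist)
        weights = [dist[w] for w in words]
        tokens.append(rng.choices(words, weights=weights, k=1)[0])
    return tokens
```

Calling `generate(random.Random(0))` returns one of the two sequences the table allows. Because each token is drawn from its conditional distribution, a whole sequence is produced with probability equal to its joint probability under the model.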
