Model Components

GPT-3 (Generative Pre-trained Transformer 3)

GPT-3 was a landmark model on its release by OpenAI in 2020. It showed that scaling a decoder-only Transformer to 175 billion parameters, trained largely on internet text, produced capabilities that surprised many researchers: translation, coding, simple arithmetic, and question answering, each from just a few examples in the prompt and with no gradient updates.
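To make "a few examples in the prompt" concrete, here is a minimal sketch of how a few-shot prompt might be assembled as plain text. The helper name `build_few_shot_prompt` and the "Input:/Output:" layout are illustrative assumptions, not a format prescribed by OpenAI; sending the prompt to a model is left out.

```python
def build_few_shot_prompt(instruction, examples, query):
    """Assemble a plain-text few-shot prompt (hypothetical helper).

    examples: list of (input_text, output_text) pairs shown to the model.
    query: the new input the model is expected to complete.
    """
    lines = [instruction, ""]
    for source, target in examples:
        lines.append(f"Input: {source}")
        lines.append(f"Output: {target}")
        lines.append("")
    # Leave the final "Output:" open so the model continues from here.
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)


prompt = build_few_shot_prompt(
    "Translate English to French.",
    [("cheese", "fromage"), ("sea otter", "loutre de mer")],
    "peppermint",
)
print(prompt)
```

The model's weights never change here; the demonstrations only condition the next-token predictions, which is what distinguishes this kind of few-shot use from fine-tuning.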

GPT-3 demonstrated that a single model could serve as a near-universal few-shot learner, catalyzing investment in large-scale AI and spawning the modern era of generative AI products. It also raised significant concerns about misuse, bias, and environmental cost.

Authority Links

GPT-3 Paper — arXiv

Original OpenAI paper introducing GPT-3 and its few-shot capabilities.

GPT-3 — Wikipedia

Overview of GPT-3's architecture, capabilities, and impact.

Related Terms

Model Components

Generative Pre-trained Transformer (GPT)

A family of decoder-only Transformer language models, pioneered by OpenAI, that combines large-scale unsupervised pre-training on text with subsequent task-specific fine-tuning or alignment to produce general-purpose text generation.

Model Components

Large Language Model (LLM)

A transformer-based neural network with billions to trillions of parameters, trained on broad text corpora to predict the next token and able to generate, summarize, classify, and reason over natural language.
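As a rough illustration of "predict the next token", the sketch below runs greedy decoding over a tiny hand-written probability table. The table `NEXT_TOKEN_PROBS` and the function `generate` are hypothetical stand-ins: a real LLM computes these probabilities with a neural network and conditions on the whole preceding context, not just the last token.

```python
# Hypothetical next-token probabilities, standing in for a trained model.
NEXT_TOKEN_PROBS = {
    "the": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 0.7, "ran": 0.3},
    "sat": {"<eos>": 1.0},
}


def generate(start, max_tokens=10):
    """Extend a sequence one token at a time using greedy decoding."""
    tokens = [start]
    for _ in range(max_tokens):
        dist = NEXT_TOKEN_PROBS.get(tokens[-1])
        if dist is None:
            break  # No known continuation for this token.
        # Greedy decoding: pick the most probable next token.
        next_token = max(dist, key=dist.get)
        if next_token == "<eos>":
            break  # End-of-sequence marker stops generation.
        tokens.append(next_token)
    return tokens


result = generate("the")
print(" ".join(result))
```

Production systems usually sample from the distribution (for example with a temperature setting) rather than always taking the top token; greedy selection is used here only to keep the example deterministic.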

Techniques & Methods

Few-Shot Learning

A model's ability to generalize to a new task from only a handful of labeled examples, which for models such as GPT-3 are supplied directly in the prompt.

Model Components

Transformer

A neural-network architecture, introduced by Vaswani et al. in 2017, that uses self-attention and parallel computation across all sequence positions — the foundation under virtually every frontier language and multimodal model in production today.
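The self-attention step can be sketched in a few lines of dependency-free Python. This is a simplified, single-head illustration under stated assumptions: the learned query, key, and value projections are omitted (the raw input vectors are used for all three), and no causal mask is applied, although decoder-only models such as GPT-3 do use one. The function names are chosen for this example.

```python
import math


def softmax(scores):
    # Subtract the max for numerical stability before exponentiating.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries, keys, values):
    """Scaled dot-product attention for one head, without masking.

    Each argument is a list of vectors, one per sequence position.
    """
    d_k = len(keys[0])
    dim = len(values[0])
    outputs = []
    for q in queries:
        # Score this position against every position in the sequence.
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # The output is the attention-weighted average of the value vectors.
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]
        )
    return outputs


# Toy input: three positions, two dimensions each.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out = self_attention(x, x, x)
print(out)
```

Each position's output depends only on dot products with the other positions, so all rows can be computed independently; this is the property that lets real implementations process every sequence position in parallel as matrix multiplications.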
