Techniques & Methods

Pre-training

Pre-training exposes a model to massive, diverse datasets using self-supervised objectives (such as predicting the next token) to build rich general-purpose representations. This phase is computationally expensive but only needs to happen once per foundation model.
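As a rough illustration of why next-token prediction is called self-supervised, the sketch below derives its training targets from the text itself and fits a tiny count-based bigram model. The toy corpus, the function names, and the bigram model are all assumptions made for illustration; real pre-training uses neural networks and far larger corpora, but the objective has the same shape.

```python
import math
from collections import Counter, defaultdict


def next_token_pairs(tokens):
    """Build (previous token, next token) pairs; the labels come from the text itself."""
    return list(zip(tokens[:-1], tokens[1:]))


def fit_bigram(pairs):
    """Count how often each token follows each previous token."""
    counts = defaultdict(Counter)
    for prev, nxt in pairs:
        counts[prev][nxt] += 1
    return counts


def cross_entropy(counts, pairs):
    """Average negative log-probability of the true next token.

    Assumes every pair was seen during fitting; an unseen pair would
    have zero probability and is not handled in this sketch.
    """
    total = 0.0
    for prev, nxt in pairs:
        row = counts[prev]
        prob = row[nxt] / sum(row.values())
        total -= math.log(prob)
    return total / len(pairs)


# Hypothetical toy corpus, whitespace-tokenized.
corpus = "the cat sat on the mat".split()
pairs = next_token_pairs(corpus)
model = fit_bigram(pairs)
loss = cross_entropy(model, pairs)
print(f"{len(pairs)} training pairs, loss = {loss:.4f}")
```

No human labeling is involved: shifting the sequence by one position yields every target, which is what lets this objective scale to very large unlabeled datasets.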

A pre-trained model becomes the base for many downstream applications via fine-tuning, so its one-time training cost is amortized across those use cases. The quality and diversity of the pre-training data, alongside model size and training compute, are major factors in a model's general capability.

Authority Links

Pre-training — Wikipedia

Role of pre-training in building general AI representations.

IBM — Foundation Models

How pre-training creates reusable foundation models.

Related Terms

Techniques & Methods

Fine-Tuning

Continuing the training of a pre-trained foundation model on a smaller, curated dataset to specialize its behavior, style, or domain expertise without losing its general capabilities.

Techniques & Methods

Transfer Learning

Leveraging knowledge learned from one task or domain to improve performance on a related one.

Model Components

Foundation Model

A large, versatile model trained on broad data that serves as a base for diverse downstream applications.

Miscellaneous

Training Data

The labeled or unlabeled dataset used to fit a model's parameters during the learning process.
