Techniques & Methods

Supervised Fine-Tuning

Supervised fine-tuning (SFT) takes a pre-trained foundation model and continues training it on a curated dataset of input-output pairs specific to the target task or domain. This adapts the model's behavior without training from scratch, preserving general capabilities while adding specialization.
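As a rough illustration of what training on input-output pairs can look like, the sketch below builds one training example and scores it. It assumes a common convention that the source does not spell out: the loss is computed only on the output (response) tokens, with the input (prompt) tokens masked. The names `IGNORE_INDEX`, `build_labels`, and `sft_loss` are hypothetical, the token IDs and probabilities are made up, and the one-position shift used in real next-token training is omitted for brevity.

```python
import math

# Hypothetical sentinel marking label positions that are excluded from the loss.
IGNORE_INDEX = -100


def build_labels(prompt_ids, response_ids):
    """Concatenate prompt and response tokens and mask the prompt in the labels."""
    input_ids = list(prompt_ids) + list(response_ids)
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(response_ids)
    return input_ids, labels


def sft_loss(token_log_probs, labels):
    """Mean negative log-likelihood over the unmasked (response) positions.

    token_log_probs[i] is the log-probability the model assigned to the
    correct token at position i (supplied directly here; a real model
    would compute it).
    """
    kept = [-lp for lp, lab in zip(token_log_probs, labels) if lab != IGNORE_INDEX]
    if not kept:
        raise ValueError("example has no response tokens to train on")
    return sum(kept) / len(kept)


# Toy example: a 3-token prompt followed by a 2-token response.
input_ids, labels = build_labels([11, 12, 13], [21, 22])

# Made-up per-position probabilities of the correct token.
log_probs = [math.log(0.9)] * 3 + [math.log(0.5), math.log(0.25)]

loss = sft_loss(log_probs, labels)
print(input_ids, labels, round(loss, 4))
```

Only the two response positions contribute to the loss here, so the prompt probabilities have no effect on the result. Whether to mask the prompt is a design choice; some setups train on the full sequence instead.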

SFT is typically the first stage of an RLHF pipeline for aligning LLMs. It teaches the model to follow instructions and produce task-appropriate output formats; reinforcement learning then refines that behavior using human preference data.

Authority Links

Fine-Tuning — Wikipedia

Concepts and approaches in fine-tuning neural networks.

InstructGPT Paper — arXiv

SFT and RLHF pipeline for aligning GPT models to instructions.

Related Terms

Techniques & Methods

Fine-Tuning

Continuing the training of a pre-trained foundation model on a smaller, curated dataset to specialize its behavior, style, or domain expertise without losing its general capabilities.

Techniques & Methods

Reinforcement Learning from Human Feedback (RLHF)

Training technique that refines AI models using feedback from human evaluators on output quality.

Techniques & Methods

Pre-training

Initial phase where a model learns general representations from large datasets before task-specific fine-tuning.

Miscellaneous

Training Data

The labeled or unlabeled dataset used to fit a model's parameters during the learning process.
