Techniques & Methods
Supervised Fine-Tuning
Supervised fine-tuning (SFT) takes a pre-trained foundation model and continues training it on a curated dataset of input-output pairs specific to the target task or domain. This adapts the model's behavior without training from scratch, preserving general capabilities while adding specialization.
SFT is typically the first stage after pre-training in RLHF pipelines for aligning LLMs. It teaches the model to follow instructions and produce task-appropriate formats before reinforcement learning further refines its behavior using human preference data.
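To make the training objective concrete, here is a minimal sketch of the loss usually minimized during SFT: the average negative log-likelihood of the target output tokens. It is an illustration under stated assumptions, not a reference implementation. The names `sft_loss`, `step_logits`, `target_ids`, and `loss_mask` are hypothetical, and excluding prompt tokens from the loss is one common convention rather than a requirement.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one position's vocabulary logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def sft_loss(step_logits, target_ids, loss_mask):
    """Mean negative log-likelihood over positions where loss_mask is 1.

    step_logits: one list of vocabulary logits per predicted position.
    target_ids:  the reference token id at each position.
    loss_mask:   1 for output (response) tokens, 0 for input (prompt) tokens.
                 Masking the prompt is an assumption made for this sketch.
    """
    total, count = 0.0, 0
    for logits, target, keep in zip(step_logits, target_ids, loss_mask):
        if keep:
            total -= log_softmax(logits)[target]
            count += 1
    return total / count if count else 0.0


# Toy data: a 3-token vocabulary and 4 positions. The first two positions
# belong to the input, the last two to the desired output.
target_ids = [0, 1, 2, 1]
loss_mask = [0, 0, 1, 1]

# An untrained model that is uniform over the vocabulary at every position.
uniform_logits = [[0.0, 0.0, 0.0] for _ in target_ids]

# A model that strongly prefers the reference token at each output position.
confident_logits = [
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 5.0],
    [0.0, 5.0, 0.0],
]

uniform_loss = sft_loss(uniform_logits, target_ids, loss_mask)
confident_loss = sft_loss(confident_logits, target_ids, loss_mask)
```

In this toy case the uniform model's loss equals the log of the vocabulary size, and the loss falls as the model puts more probability on the reference output. Gradient descent on this quantity over the curated input-output pairs is what moves the pre-trained weights toward the target behavior.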
Related Terms
Techniques & Methods
Fine-Tuning
Continuing the training of a pre-trained foundation model on a smaller, curated dataset to specialize its behavior, style, or domain expertise without losing its general capabilities.
Techniques & Methods
Reinforcement Learning from Human Feedback (RLHF)
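Training technique that refines a model using human evaluators' judgments of its outputs, typically by fitting a reward model to their preferences and optimizing the model against it with reinforcement learning.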
Techniques & Methods
Pre-training
Initial phase where a model learns general representations from large datasets before task-specific fine-tuning.
Miscellaneous
Training Data
The labeled or unlabeled dataset used to fit a model's parameters during the learning process.

