Techniques & Methods
Low Rank Adaptation (LoRA)
LoRA freezes the original pre-trained model weights and injects trainable rank-decomposition matrices into each layer of the transformer. Instead of updating all billions of parameters, only the small adapter matrices are trained, reducing trainable parameters by 10,000x while maintaining quality.
LoRA democratized LLM fine-tuning by making it feasible on consumer GPUs. Variants like QLoRA (quantized LoRA) further reduce memory requirements. LoRA adapters can be swapped without reloading the base model, enabling efficient multi-task deployment.
Authority Links
Related Terms
Techniques & Methods
Fine-Tuning
Continuing the training of a pre-trained foundation model on a smaller, curated dataset to specialize its behavior, style, or domain expertise without losing its general capabilities.
Techniques & Methods
Supervised Fine-Tuning
Refining a pre-trained model's performance on a specific task using labeled example data.
Model Components
Parameter
A learnable variable within a model whose value is adjusted during training to minimize prediction error.
Model Components
Foundational Model
Large versatile model trained on broad data that serves as a base for diverse downstream applications.

