Miscellaneous
Validation Data
Validation data is a subset of labeled data set aside from training and not used to update model weights. Instead, it provides feedback during hyperparameter tuning and early stopping decisions, enabling practitioners to select the best model configuration without contaminating the test set.
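As an illustration of the early-stopping use mentioned above, here is a minimal sketch in plain Python. The function name `early_stopping_epoch`, the `patience` parameter, and the sample loss values are all assumptions made for this example and do not come from any particular library. The sketch assumes one validation loss is recorded per epoch and that training halts after `patience` consecutive epochs without improvement.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return (best_epoch, stopped_epoch) for per-epoch validation losses.

    Hypothetical helper: training stops once the validation loss has
    failed to improve for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0

    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # New best validation loss: remember it and reset the counter.
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                # Stop here; the checkpoint from best_epoch would be kept.
                return best_epoch, epoch

    # Ran through every epoch without triggering the stop condition.
    return best_epoch, len(val_losses) - 1


# Made-up validation losses, one per epoch.
losses = [0.90, 0.70, 0.60, 0.65, 0.70, 0.50]
best, stopped = early_stopping_epoch(losses, patience=2)
```

Only validation losses drive the decision here; the test set plays no part in it, which is what keeps the final evaluation uncontaminated.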
A common ML data split is roughly 70% training, 15% validation, and 15% test, though the exact ratios vary with dataset size. Validation data bridges training and final evaluation: it guides model-selection decisions but is not used for the final, unbiased performance assessment, which is reserved for the test set.
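The three-way split described above can be sketched with the Python standard library alone. The function name `split_dataset`, its parameters, and the fixed seed are assumptions for illustration; the 70/15/15 fractions are the approximate ratios quoted in this entry, not a required standard.

```python
import random


def split_dataset(examples, train_frac=0.70, val_frac=0.15, seed=0):
    """Shuffle examples and split them into (train, validation, test).

    Hypothetical helper: whatever is left after the training and
    validation slices becomes the test set.
    """
    items = list(examples)
    # Seeded shuffle so the split is reproducible across runs.
    random.Random(seed).shuffle(items)

    n_train = round(len(items) * train_frac)
    n_val = round(len(items) * val_frac)

    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]
    return train, val, test


# Usage: 100 dummy examples split into 70 / 15 / 15.
train, val, test = split_dataset(range(100))
```

Shuffling before slicing avoids ordering effects in the source data. For time series or grouped data, a plain random split like this one is usually not appropriate.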
Related Terms
Miscellaneous
Training Data
The labeled or unlabeled dataset used to fit a model's parameters during the learning process.
Miscellaneous
Test Data
A held-out dataset used only once, at the end, to obtain an unbiased estimate of final model performance.
Core Concepts
Overfitting
A condition in which a model learns the detail and noise in its training data too thoroughly, reducing its ability to generalize to new data.
Techniques & Methods
Validation
Evaluating model performance on data held separate from the training set.

