Techniques & Methods
Data Mining
Data mining applies statistical and machine learning (ML) techniques to large datasets to discover hidden patterns, associations, clusters, and anomalies. Common techniques include association rule learning, clustering, classification, regression, and anomaly detection.
Data mining often precedes ML model building: it informs feature selection, exposes data quality issues, and uncovers relationships that shape model design. At a much larger scale, web scraping and the mining of large text corpora are used to assemble much of the pre-training data for LLMs.
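As an illustration of one technique named above, association rule learning, the following is a minimal sketch that counts item pairs in a handful of transactions and keeps rules that clear a support and a confidence threshold. The transactions, the function name `pair_rules`, and both threshold values are hypothetical and chosen only for the example. Production tools typically use algorithms such as Apriori or FP-Growth, which this sketch does not implement.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each set is one shopping basket.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "milk", "eggs"},
]


def pair_rules(transactions, min_support=0.4, min_confidence=0.7):
    """Return (lhs, rhs, support, confidence) for single-item rules lhs -> rhs.

    support    = share of baskets containing both items
    confidence = share of baskets containing lhs that also contain rhs
    """
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in transactions:
        item_counts.update(basket)
        # Sort so each unordered pair is always counted under the same key.
        pair_counts.update(combinations(sorted(basket), 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Each frequent pair yields two candidate rules, one per direction.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


rules = pair_rules(transactions)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

On this toy data, "butter -> bread" passes because every basket containing butter also contains bread, while "bread -> butter" is dropped because only half of the bread baskets contain butter. The asymmetry shows why confidence is evaluated separately for each direction of a rule.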
Related Terms
Core Concepts
Big Data
Extremely large datasets that reveal patterns, trends, and associations through computational analysis.
Miscellaneous
Data Science
Interdisciplinary field combining statistics, programming, and domain knowledge to extract insights from data.
Miscellaneous
Training Data
The labeled or unlabeled dataset used to fit a model's parameters during the learning process.
Techniques & Methods
Information Extraction
Automatically extracting structured information from unstructured text.

