Kubnal Bridge

Core Concepts

Big Data

Big data refers to datasets too large and complex for traditional data-processing tools, characterized by the three Vs: Volume (scale), Velocity (speed of generation), and Variety (structured and unstructured types). Technologies like Hadoop, Spark, and cloud data warehouses were built to handle it.
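The split-and-merge pattern that frameworks such as Hadoop and Spark popularized can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not how either framework is implemented: the function names (partition, map_batch, reduce_counts, word_count) and the sample lines are hypothetical, and the batches here run sequentially in one process rather than across a cluster.

```python
from collections import Counter
from functools import reduce
from typing import Iterable, Iterator, List


def partition(records: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield successive batches so the full dataset never sits in memory at once."""
    batch: List[str] = []
    for record in records:
        batch.append(record)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # emit the final, possibly shorter, batch
        yield batch


def map_batch(batch: List[str]) -> Counter:
    """Map step: count words within one batch, independently of all others."""
    counts: Counter = Counter()
    for line in batch:
        counts.update(line.lower().split())
    return counts


def reduce_counts(left: Counter, right: Counter) -> Counter:
    """Reduce step: merge two partial results into one."""
    return left + right


def word_count(records: Iterable[str], batch_size: int = 2) -> Counter:
    """Combine the map and reduce steps over a stream of text lines."""
    partials = (map_batch(b) for b in partition(records, batch_size))
    return reduce(reduce_counts, partials, Counter())


# Hypothetical sample input standing in for a much larger stream of records.
lines = ["big data big models", "data at scale", "big scale"]
totals = word_count(lines)
```

Because each batch is counted independently and the partial counts are merged afterwards, the map step could in principle be distributed across many machines, which is the property that lets this style of processing cope with volume.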

Big data is the fuel for modern AI. The scale of training data, which reaches trillions of tokens for frontier LLMs, is a primary driver of model capability, so data collection, curation, and governance are central to AI development.
