Kubnal Bridge

AI Glossary

Key terms and concepts behind AI search, generative engine optimisation, and LLM visibility—explained plainly.

  • Core Concepts

    Zone of Proximal Development (ZPD)

    Tasks an AI can perform with guidance but not independently.

  • Core Concepts

    Weak AI

    AI designed and trained for a specific task, lacking general cognitive abilities.

  • Core Concepts

    Variance

    How much a model's predictions change when it is trained on different samples of the data; high variance signals sensitivity to the particular training set.

  • Core Concepts

    Unsupervised Learning

    Models learn patterns from unlabeled data without explicit instructions.

  • Core Concepts

    Turing Test

    Test of a machine's ability to exhibit intelligent behavior indistinguishable from that of a human.

  • Core Concepts

    Token

    Smallest processing unit in NLP: a word, word part, or character.
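
    As a rough illustration of the three granularities, the sketch below splits text into word, word-part, and character tokens. The function names and the tiny vocabulary are hypothetical; production tokenizers typically learn their subword vocabulary from data rather than using a hand-written set.

```python
import re

def word_tokens(text):
    # Words and single punctuation marks become separate tokens.
    return re.findall(r"\w+|[^\w\s]", text)

def subword_tokens(word, vocab):
    # Greedy longest-match against a hypothetical vocabulary;
    # anything the vocabulary does not cover falls back to single characters.
    tokens, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                tokens.append(word[i:j])
                i = j
                break
        else:
            tokens.append(word[i])
            i += 1
    return tokens

def char_tokens(text):
    return list(text)

print(word_tokens("AI search, explained."))
print(subword_tokens("unhappiness", {"un", "happi", "ness"}))
```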

  • Core Concepts

    Supervised Learning

    Models trained on labeled data, learning to predict outcomes from inputs.

  • Core Concepts

    Strong AI

    AI with the ability to understand, learn, and apply knowledge like human intelligence.

  • Core Concepts

    Overfitting

    Model learns detail and noise in training data too thoroughly, reducing generalization.

  • Core Concepts

    Natural Language Understanding (NLU)

    AI's ability to understand and interpret human language meaning and intent.

  • Core Concepts

    Natural Language Processing (NLP)

    Field focused on enabling computer-human interaction through natural language.

  • Core Concepts

    Natural Language Generation (NLG)

    Generating coherent, contextually relevant text from structured data or prompts.

  • Core Concepts

    Pattern Recognition

    Automated recognition of patterns and regularities in data.

  • Core Concepts

    Latent Variables

    Hidden or unobservable variables inferred from observable data in AI models.

  • Core Concepts

    Intent

    Underlying purpose or goal a user aims to achieve through a query.

  • Core Concepts

    Hyperparameter

    Parameter set before learning begins that controls the training process.

  • Core Concepts

    Explainable AI (XAI)

    AI systems that provide transparent insights into their decision-making processes.

  • Core Concepts

    Entities

    Specific, identifiable elements like names, places, and dates extracted from text.

  • Core Concepts

    Deep Learning

    Subset of ML using neural networks with many layers to analyze complex data representations.

  • Core Concepts

    Computational Learning Theory

    Branch of AI focused on understanding the mathematical foundations of learning algorithms.

  • Core Concepts

    Cognitive Computing

    Systems designed to simulate human brain functioning, reasoning, and problem-solving.

  • Core Concepts

    Bias

    Systematic skews in an AI model's outputs, often inherited from its training data, that affect decision-making and fairness.

  • Core Concepts

    Big Data

    Extremely large datasets that reveal patterns, trends, and associations through computational analysis.

  • Core Concepts

    Autonomous

    Machines capable of performing tasks and making decisions without human intervention.

  • Core Concepts

    Augmented Intelligence

    Enhancing human decision-making with AI, focusing on human-AI collaboration rather than replacement.

  • Core Concepts

    Algorithm

    A set of mathematical instructions or rules computers follow to accomplish specific tasks.

  • Core Concepts

    AI (Artificial Intelligence)

    Simulation of human intelligence processes by machines, particularly computer systems.

  • Core Concepts

    General AI

    AI that exhibits cognitive functions across multiple domains, like human general intelligence.

  • Core Concepts

    Machine Learning

    Getting computers to learn from data and improve at tasks without explicit programming.

  • Core Concepts

    Machine Intelligence

    Machines' capabilities to learn from data and perform intelligent tasks.

  • Core Concepts

    Generative AI

    AI systems that produce new content — text, images, audio, video, or code — by learning the statistical distributions of training data and sampling from them, rather than retrieving stored outputs.

  • Techniques & Methods

    Zero-Shot Learning

    Model's ability to correctly perform tasks it was not explicitly trained for.

  • Techniques & Methods

    Word Embedding

    Technique representing words as dense vectors that capture semantic similarity.

  • Techniques & Methods

    Vector Representation

    Encoding words, sentences, or concepts as numerical vectors for AI comparison and retrieval.

  • Techniques & Methods

    Variation

    Different expressions or phrasings that convey the same underlying meaning.

  • Techniques & Methods

    Validation

    Evaluating model performance on data held separate from the training set.

  • Techniques & Methods

    Upstream Sampling

    Generating multiple candidate outputs and selecting the best based on predefined criteria.

  • Techniques & Methods

    Transfer Learning

    Leveraging knowledge learned from one task or domain to improve performance on a related one.

  • Techniques & Methods

    Training

    Teaching a model to make accurate predictions by exposing it to large datasets.

  • Techniques & Methods

    Topic Modeling

    Statistical method for discovering abstract topics within large document collections.

  • Techniques & Methods

    Text Classification

    Automatically assigning predefined categories to text documents.

  • Techniques & Methods

    System Prompt

    Internal instructions that guide an AI model's behavior, tone, and response style.

  • Techniques & Methods

    Supervised Fine-Tuning

    Refining a pre-trained model's performance on a specific task using labeled example data.

  • Techniques & Methods

    Sequence Generation

    Process where models produce sequences—such as words or tokens—based on learned patterns.

  • Techniques & Methods

    Semantic Similarity

    Measure of how closely related two pieces of text are in meaning.

  • Techniques & Methods

    Semantic Search

    Search technology that retrieves results based on the meaning of a query rather than exact keyword matches — using embeddings to represent queries and documents as vectors and finding nearest neighbors in semantic space.
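
    A minimal sketch of the nearest-neighbor step, assuming documents and the query have already been embedded. The three-dimensional vectors and document names are made up for illustration; real embeddings come from a trained model and have hundreds or thousands of dimensions.

```python
import math

def cosine(a, b):
    # Cosine similarity: dot product divided by the product of vector lengths.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical pre-computed document embeddings.
DOCS = {
    "return policy": [0.9, 0.1, 0.0],
    "shipping times": [0.1, 0.9, 0.1],
    "gift cards": [0.2, 0.2, 0.9],
}

def search(query_vec, docs, k=1):
    # Rank document names by similarity to the query vector, best first.
    ranked = sorted(docs, key=lambda name: cosine(query_vec, docs[name]), reverse=True)
    return ranked[:k]

print(search([0.8, 0.2, 0.1], DOCS))
```

    The query shares no keywords with any document here; the match is decided entirely by vector direction.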

  • Techniques & Methods

    Semantic Annotation

    Adding semantic metadata to content to improve AI understanding and processing.

  • Techniques & Methods

    Self-Attention

    Mechanism allowing a model to weigh the importance of each part of an input relative to all other parts.
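
    The weighting can be sketched as scaled dot-product attention in plain Python. This is a simplification under a stated assumption: the input vectors are used directly as queries, keys, and values, whereas real transformer layers first apply separate learned projections.

```python
import math

def softmax(scores):
    # Subtract the max for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(X):
    # X is a list of equal-length vectors, one per input position.
    d = len(X[0])
    outputs = []
    for q in X:
        # Score this position against every position, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in X]
        weights = softmax(scores)
        # Output is the weighted average of all position vectors.
        outputs.append([sum(w * v[j] for w, v in zip(weights, X)) for j in range(d)])
    return outputs

print(self_attention([[1.0, 0.0], [0.0, 1.0]]))
```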

  • Techniques & Methods

    Scaling Laws

    Empirical relationships showing that model performance improves predictably as parameter count, training data, and compute increase.

  • Techniques & Methods

    Retrieval Augmented Generation (RAG)

    An inference-time architecture that retrieves relevant documents from a knowledge base or web index and injects them into a language model's context before generation, grounding answers in real source material.
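
    The retrieve-then-inject flow might look like the sketch below. It makes two simplifying assumptions: retrieval is plain keyword overlap instead of an embedding index, and the final model call is left out, so the output is only the grounded prompt. The knowledge base contents and function names are hypothetical.

```python
# Hypothetical knowledge base of short passages.
KNOWLEDGE_BASE = [
    "Refunds are issued within 14 days.",
    "Shipping takes 3 to 5 days.",
    "Gift cards never expire.",
]

def retrieve(question, knowledge_base, k=2):
    # Score each passage by how many lowercase words it shares with the question.
    q_words = set(question.lower().split())
    ranked = sorted(
        knowledge_base,
        key=lambda doc: len(q_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:k]

def build_prompt(question, passages):
    # Inject the retrieved passages into the context ahead of the question.
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the sources below.\n\n"
        f"Sources:\n{context}\n\n"
        f"Question: {question}"
    )

question = "how long do refunds take"
print(build_prompt(question, retrieve(question, KNOWLEDGE_BASE, k=1)))
```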

  • Techniques & Methods

    Response Quality

    Evaluation of an AI response's relevance, coherence, accuracy, and helpfulness.

  • Techniques & Methods

    Reinforcement Learning from Human Feedback (RLHF)

    Training technique that refines AI models using feedback from human evaluators on output quality.

  • Techniques & Methods

    Reinforcement Learning

    An agent learns by taking actions in an environment and receiving rewards or penalties.

  • Techniques & Methods

    Regularization

    Techniques that prevent overfitting by penalizing model complexity during training.
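
    One common form is an L2 penalty added to the training loss. The sketch below assumes a mean-squared-error loss and a hypothetical strength parameter `lam`; other penalties (such as L1) follow the same pattern.

```python
def l2_regularized_loss(errors, weights, lam):
    # Data term: mean squared prediction error.
    mse = sum(e * e for e in errors) / len(errors)
    # Complexity term: large weights make the loss worse.
    penalty = lam * sum(w * w for w in weights)
    return mse + penalty

print(l2_regularized_loss([1.0, -1.0], [3.0, 4.0], lam=0.1))
```

    With `lam=0` the penalty vanishes and the model is free to fit the training data as closely as it can.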

  • Techniques & Methods

    Query

    A request for information or an action submitted to a database, search engine, or AI model.

  • Techniques & Methods

    Proximal Policy Optimization (PPO)

    RL algorithm that stabilizes training by constraining how far each policy update can move from the previous policy.
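
    The constraint on update size is usually expressed as a clipped objective. The sketch below evaluates it for a single sample; `ratio` is the new policy's probability of an action divided by the old policy's, and the default `epsilon=0.2` is an illustrative choice, not a required value.

```python
def clipped_objective(ratio, advantage, epsilon=0.2):
    # Keep the probability ratio inside [1 - epsilon, 1 + epsilon].
    clipped_ratio = max(1.0 - epsilon, min(ratio, 1.0 + epsilon))
    # Take the more pessimistic of the clipped and unclipped terms.
    return min(ratio * advantage, clipped_ratio * advantage)

print(clipped_objective(1.5, 1.0))
```

    Once the ratio leaves the clip range in the direction that would increase the objective, the value stops growing, which removes the incentive for very large policy changes.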

  • Techniques & Methods

    Prompt Injection

    Attack technique that manipulates AI behavior by embedding malicious instructions in inputs.

  • Techniques & Methods

    Prompt Engineering

    The discipline of designing input text — instructions, examples, constraints, and context — to reliably steer a language model toward accurate, well-formatted, and intent-aligned outputs without modifying model weights.

  • Techniques & Methods

    Prompt

    Text input provided to an AI model to guide the content and format of its response.

  • Techniques & Methods

    Pre-training

    Initial phase where a model learns general representations from large datasets before task-specific fine-tuning.

  • Techniques & Methods

    Part-of-Speech Tagging (POS)

    Labeling each word in text with its grammatical role such as noun, verb, or adjective.

  • Techniques & Methods

    Overuse Penalty

    Technique that discourages AI models from generating repetitive or overly similar responses.

  • Techniques & Methods

    Online Learning

    Training approach in which a model updates its parameters continuously as new data arrives, rather than retraining on fixed batches.

  • Techniques & Methods

    One-Shot Learning

    Model's ability to learn and make accurate predictions from only a single example.

  • Techniques & Methods

    One-Shot / Few-Shot

    Learning paradigms where models learn from one or very few examples to perform new tasks.

  • Techniques & Methods

    Offline Reinforcement Learning

    Learning optimal policies from fixed historical datasets without interacting with a live environment.

  • Techniques & Methods

    Named Entity Recognition (NER)

    Identifying and classifying named entities in text into predefined categories like people and places.

  • Techniques & Methods

    Multitask Learning

    Training a model on multiple related tasks simultaneously to improve performance on all of them.

  • Techniques & Methods

    Masked Language Modeling

    Training technique where the model predicts randomly hidden words in a sequence.
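
    Preparing one training example could be sketched as follows. The `[MASK]` placeholder, the 15% default rate, and the seeded generator are illustrative assumptions; the model itself, which would be trained to predict the labels, is not shown.

```python
import random

def mask_tokens(tokens, mask_rate=0.15, seed=0):
    rng = random.Random(seed)
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_rate:
            # Hide the token; the original becomes the prediction target.
            masked.append("[MASK]")
            labels.append(tok)
        else:
            # Visible tokens carry no training target.
            masked.append(tok)
            labels.append(None)
    return masked, labels

print(mask_tokens("the cat sat on the mat".split(), mask_rate=0.3))
```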

  • Techniques & Methods

    Markov Decision Process

    Mathematical framework modeling sequential decision-making in environments with probabilistic outcomes.

  • Techniques & Methods

    Machine Translation

    Software that automatically translates text or speech between languages.

  • Techniques & Methods

    Low Rank Adaptation (LoRA)

    Parameter-efficient fine-tuning technique that reduces compute and memory requirements for adapting large models.
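
    The saving comes from freezing the original weight matrix and training two small low-rank matrices whose product is added to it. The arithmetic below compares trainable parameter counts; the 4096 x 4096 layer size and rank 8 are illustrative assumptions.

```python
def full_update_params(d_out, d_in):
    # Fine-tuning the whole matrix trains every entry.
    return d_out * d_in

def lora_params(d_out, d_in, rank):
    # Two factors: one of shape d_out x rank, one of shape rank x d_in.
    return rank * (d_out + d_in)

full = full_update_params(4096, 4096)
low_rank = lora_params(4096, 4096, rank=8)
print(full, low_rank, full // low_rank)
```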

  • Techniques & Methods

    Linguistic Annotation

    Adding linguistic metadata—such as POS tags, parse trees, or coreferences—to text for analysis.

  • Techniques & Methods

    Knowledge Representation

    Methods AI systems use to model, store, and reason over knowledge about the world.

  • Techniques & Methods

    Joint Probability

    The probability of two or more events occurring simultaneously.
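
    A short worked example, using the product rule P(A and B) = P(A) x P(B given A): the chance of drawing two aces in a row from a standard 52-card deck without replacement. The helper name is hypothetical.

```python
from fractions import Fraction

def joint(p_a, p_b_given_a):
    # Product rule for two events.
    return p_a * p_b_given_a

p_first_ace = Fraction(4, 52)               # 4 aces among 52 cards
p_second_ace_given_first = Fraction(3, 51)  # 3 aces left among 51 cards
p_two_aces = joint(p_first_ace, p_second_ace_given_first)
print(p_two_aces)
```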

  • Techniques & Methods

    Information Extraction

    Automatically extracting structured information from unstructured text.

  • Techniques & Methods

    Inference

    Using a trained AI model to generate predictions or responses on new, unseen data.

  • Techniques & Methods

    Heuristics

    Practical problem-solving approaches using rules of thumb rather than exhaustive search.

  • Techniques & Methods

    Hallucination

    When a language model generates confident-sounding text that is factually wrong, invented, or misattributed — a structural consequence of next-token prediction over learned patterns rather than retrieval from a verified knowledge base.

  • Techniques & Methods

    Greedy Algorithms

    Algorithms that make the locally optimal choice at each step in the hope of reaching a good global solution, which is not guaranteed to be optimal for every problem.
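
    Coin change is a common illustration: always take the largest coin that fits. The sketch below also shows a coin set where the locally best choice does not give the fewest coins. The coin sets are chosen for illustration only.

```python
def greedy_change(amount, coins):
    result = []
    # Try coins from largest to smallest, taking each as often as it fits.
    for coin in sorted(coins, reverse=True):
        while amount >= coin:
            result.append(coin)
            amount -= coin
    return result

print(greedy_change(30, [1, 5, 10, 25]))  # greedy happens to be optimal here
print(greedy_change(6, [1, 3, 4]))        # greedy uses 3 coins; 3 + 3 needs only 2
```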

  • Techniques & Methods

    Generation

    Producing new text, code, or content based on learned patterns and a given input prompt.

  • Techniques & Methods

    Forward Chaining

    Logical reasoning that starts with known facts and applies rules to derive conclusions.
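
    A minimal sketch of the idea, assuming each rule is a pair of premises and a single conclusion. The facts and rules are invented for illustration.

```python
def forward_chain(facts, rules):
    known = set(facts)
    changed = True
    # Keep applying rules until a full pass adds nothing new.
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and set(premises) <= known:
                known.add(conclusion)
                changed = True
    return known

RULES = [
    (("rain",), "wet_ground"),
    (("wet_ground", "freezing"), "ice"),
]
print(forward_chain({"rain", "freezing"}, RULES))
```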

  • Techniques & Methods

    Fine-Tuning

    Continuing the training of a pre-trained foundation model on a smaller, curated dataset to specialize its behavior, style, or domain expertise without losing its general capabilities.

  • Techniques & Methods

    Fine-Grained Control

    Capability to precisely adjust AI output characteristics, format, style, or content.

  • Techniques & Methods

    Few-Shot Learning

    Model's ability to generalize from only a handful of labeled examples.

  • Techniques & Methods

    Feature Extraction

    Identifying and isolating the most useful information from raw data for model training.

  • Techniques & Methods

    Extractive Summarization

    Creating summaries by selecting and combining key sentences directly from the source text.

  • Techniques & Methods

    Evaluation Metrics

    Quantitative measures used to assess how well an AI model performs on a task.

  • Techniques & Methods

    Entity Extraction

    Identifying and classifying named entities—people, places, organizations—within text.

  • Techniques & Methods

    Entity Annotation

    Labeling text spans with entity type information to create structured training data.

  • Techniques & Methods

    Distributed Training

    Spreading model training across multiple GPUs or servers to handle large-scale models and datasets.

  • Techniques & Methods

    Dependency Parsing

    Analyzing grammatical structure to identify dependency relationships between words in a sentence.

  • Techniques & Methods

    Decoding Rules

    Guidelines and algorithms that control how language models translate internal representations into output tokens.

  • Techniques & Methods

    Data Mining

    Examining large databases to discover patterns, correlations, and generate new insights.

  • Techniques & Methods

    Data Augmentation

    Increasing training dataset size and diversity by creating modified copies of existing data.

  • Techniques & Methods

    Coreference Resolution

    Determining which words or phrases in text refer to the same real-world entity.

  • Techniques & Methods

    Completion

    The output produced by an AI language model in response to a given input or prompt.

  • Techniques & Methods

    Chain-of-Thought

    A prompting and reasoning technique in which a language model is encouraged to produce step-by-step intermediate reasoning before its final answer — empirically improving accuracy on multi-step problems, especially math, logic, and code.

  • Techniques & Methods

    Beam Search

    Search algorithm that maintains multiple candidate sequences to find high-quality generated outputs.
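
    The sketch below runs beam search over a tiny hand-written table of next-token probabilities (hypothetical values standing in for a language model). With a beam width of 1 it behaves greedily; with a width of 2 it keeps a second candidate alive and ends up with a more probable sequence.

```python
import math

# Hypothetical next-token probabilities, keyed by the previous token.
NEXT = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.5, "dog": 0.5},
    "a": {"cat": 0.9, "dog": 0.1},
}

def beam_search(start, steps, beam_width):
    # Each beam is (tokens so far, cumulative log-probability).
    beams = [([start], 0.0)]
    for _ in range(steps):
        candidates = []
        for tokens, logp in beams:
            for tok, p in NEXT.get(tokens[-1], {}).items():
                candidates.append((tokens + [tok], logp + math.log(p)))
        if not candidates:
            break
        # Keep only the highest-scoring candidates.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return beams

print(beam_search("<s>", steps=2, beam_width=1)[0][0])
print(beam_search("<s>", steps=2, beam_width=2)[0][0])
```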

  • Techniques & Methods

    Bandit Optimization

    Strategy balancing exploration of unknown options with exploitation of known high-reward choices.
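
    Epsilon-greedy is one simple way to strike this balance; it is used here as an example strategy, not as the only approach. The function names and the running-average update are assumptions made for the sketch.

```python
import random

def choose_arm(estimates, epsilon, rng):
    # Explore: with probability epsilon, pick any arm at random.
    if rng.random() < epsilon:
        return rng.randrange(len(estimates))
    # Exploit: otherwise pick the arm with the best estimated reward.
    return max(range(len(estimates)), key=lambda i: estimates[i])

def update(estimates, counts, arm, reward):
    # Incremental running average of the rewards seen for this arm.
    counts[arm] += 1
    estimates[arm] += (reward - estimates[arm]) / counts[arm]

rng = random.Random(0)
print(choose_arm([0.1, 0.9, 0.5], epsilon=0.0, rng=rng))
```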

  • Techniques & Methods

    Backward Chaining

    Goal-driven reasoning that works backward from a desired conclusion to find supporting facts.

  • Techniques & Methods

    Backpropagation

    Training algorithm that adjusts neural network weights by propagating prediction errors backward through the network.

  • Techniques & Methods

    Autoregression

    Statistical modeling approach where future values are predicted from past observed values.
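
    The simplest case is a first-order model, where each value depends only on the one before it: x_t = c + phi * x_(t-1). The sketch below iterates that rule; the coefficient values in the usage line are arbitrary, and the noise term is omitted for clarity.

```python
def ar1_forecast(last_value, constant, phi, steps):
    forecasts = []
    value = last_value
    for _ in range(steps):
        # Each forecast feeds back in as the input for the next step.
        value = constant + phi * value
        forecasts.append(value)
    return forecasts

print(ar1_forecast(10.0, constant=1.0, phi=0.5, steps=2))
```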

  • Techniques & Methods

    Attention Mechanism

    Neural network technique enabling models to focus on the most relevant parts of input when producing each output.

  • Techniques & Methods

    Attention

    Core mechanism in transformers that dynamically weights the importance of different input positions.

  • Techniques & Methods

    AI Alignment

    The research field and engineering practice of building AI systems that reliably pursue goals humans actually want, remain controllable, and avoid harmful side effects — operationalized through RLHF, Constitutional AI, evaluations, and interpretability.

  • Techniques & Methods

    Adversarial Training

    Training AI models on challenging, adversarially crafted inputs to improve robustness and reliability.

  • Model Components

    Transformers

    Class of deep learning models based on self-attention that have revolutionized NLP and AI.

  • Model Components

    Transformer Decoder

    Transformer component that generates output sequences by attending to encoded inputs and prior outputs.

  • Model Components

    Transformer

    A neural-network architecture, introduced by Vaswani et al. in 2017, that uses self-attention and parallel computation across all sequence positions — the foundation under virtually every frontier language and multimodal model in production today.

  • Model Components

    Sequence-to-Sequence (Seq2Seq) Models

    Models that transform input sequences into output sequences, used in translation and summarization.

  • Model Components

    Reward Models

    Models trained to score AI outputs based on human preferences for use in reinforcement learning.

  • Model Components

    Retrieval Model

    Model that finds and returns the most relevant documents or passages from a large corpus given a query.

  • Model Components

    Recurrent Neural Network (RNN)

    Neural network with loops enabling it to maintain hidden state across sequential inputs.

  • Model Components

    Predictive Model

    A model that uses learned patterns to forecast unknown or future values.

  • Model Components

    Parameter

    A learnable variable within a model whose value is adjusted during training to minimize prediction error.

  • Model Components

    Neural Network

    Computational system of interconnected nodes inspired by the human brain that learns to recognize patterns.

  • Model Components

    Large Language Model (LLM)

    A transformer-based neural network with billions to trillions of parameters, trained on broad text corpora to predict the next token and able to generate, summarize, classify, and reason over natural language.

  • Model Components

    Language Model

    AI system that assigns probabilities to sequences of words and can generate coherent text.

  • Model Components

    Model Card

    Standardized documentation describing an AI model's intended uses, limitations, and evaluation results.

  • Model Components

    Model Architecture

    The specific structure of an AI model: its layers, connections, and component design.

  • Model Components

    Model

    A mathematical system trained on data to represent real-world patterns and make predictions.

  • Model Components

    Maximum Response Length

    The upper limit on the number of tokens a model can generate in a single response.

  • Model Components

    Generative Pre-trained Transformer (GPT)

    A family of decoder-only Transformer language models — pioneered by OpenAI — that combines large-scale unsupervised pre-training on text with task-specific alignment to produce general-purpose text generation.

  • Model Components

    Generative Model

    AI model that learns to generate new data instances resembling the training distribution.

  • Model Components

    Generative Adversarial Network (GAN)

    Framework training two competing networks—a generator and discriminator—to produce realistic synthetic data.

  • Model Components

    Generator

    GAN component that creates synthetic data instances designed to be indistinguishable from real data.

  • Model Components

    Foundational Model

    Large versatile model trained on broad data that serves as a base for diverse downstream applications.

  • Model Components

    Encoder

    Transformer component that processes input sequences into rich contextual representations.

  • Model Components

    Embeddings

    Dense numerical vectors that represent text, images, or other content in a high-dimensional space where semantically similar items are geometrically close — the foundational data structure for semantic search and RAG retrieval.

  • Model Components

    Discriminator (in GAN)

    GAN component that learns to distinguish real data from fake data generated by the generator.

  • Model Components

    Context Window

    The maximum number of tokens a language model can process in a single inference pass — everything the model "sees" at once, including system prompt, conversation history, retrieved documents, and the response being generated.
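
    Because everything shares one budget, applications often drop the oldest conversation turns first. The sketch below shows one such policy; it counts tokens by whitespace splitting, which is only a stand-in for a real tokenizer, and the function and parameter names are hypothetical.

```python
def fit_to_window(system_prompt, history, max_tokens, reserve_for_reply):
    def count(text):
        # Crude stand-in for a tokenizer: one token per whitespace-separated word.
        return len(text.split())

    # What is left for history after the system prompt and the reply allowance.
    budget = max_tokens - reserve_for_reply - count(system_prompt)
    kept = []
    # Walk from the newest message backward, stopping when the budget runs out.
    for message in reversed(history):
        cost = count(message)
        if cost > budget:
            break
        kept.append(message)
        budget -= cost
    return list(reversed(kept))

print(fit_to_window("be brief", ["one two three", "four five", "six"], 10, 4))
```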

  • Model Components

    Contextual Embeddings

    Word representations that change based on surrounding context, unlike static word embeddings.

  • Model Components

    Bounding Box

    Rectangular region used to localize objects within images in computer vision tasks.

  • Model Components

    Autoregressive Model

    Model that generates each output element by conditioning on all previously generated elements.

  • Model Components

    Artificial Neural Network

    Computing system loosely inspired by biological neural networks, consisting of layers of connected nodes.

  • Model Components

    API (Application Programming Interface)

    Interface that allows software applications to communicate and share functionality with each other.

  • Model Components

    GPT-3 (Generative Pre-trained Transformer 3)

    OpenAI's 175-billion-parameter language model, released in 2020, that demonstrated remarkable few-shot learning.

  • Applications

    User Interface (UI)

    The means by which humans interact with a computer system or AI application.

  • Applications

    Sentiment Analysis

    Automatically identifying and categorizing expressed opinions in text to determine attitude.

  • Applications

    QA (Question Answering)

    AI system that automatically produces answers to human questions posed in natural language.

  • Applications

    Predictive Analytics

    Using historical data and ML models to forecast likely future outcomes.

  • Applications

    Plugins / Tools

    Extensions that allow AI systems to interact with external services, APIs, and data sources.

  • Applications

    Multi-turn Dialogue

    Conversations involving multiple exchanges where the AI maintains context across all prior turns.

  • Applications

    Moderation Tools

    Systems that monitor and filter AI outputs and user inputs to enforce content guidelines.

  • Applications

    Enterprise AI

    Application of AI technologies to improve business processes, efficiency, and decision-making.

  • Applications

    Dialogue System

    AI system designed to carry on natural, coherent conversations with human users.

  • Applications

    CRM with AI

    Customer relationship management systems augmented with AI to improve sales, service, and marketing outcomes.

  • Applications

    ChatGPT

    OpenAI's consumer conversational AI assistant, launched in November 2022, built on the GPT family of language models and trained with RLHF to follow instructions, maintain conversational context, and decline harmful requests.

  • Applications

    Chatbot

    Software application that simulates human conversation via text or voice interfaces.

  • Applications

    AI Agents

    AI systems that combine a language model with tools, memory, and planning to autonomously execute multi-step tasks — observing outcomes, deciding next actions, and iterating until a goal is reached.

  • Applications

    InstructGPT

    GPT variant fine-tuned with RLHF to follow instructions accurately and produce aligned responses.

  • Miscellaneous

    Yeoman's Work

    Diligent, thorough work that may be repetitive but is essential and dependable.

  • Miscellaneous

    Vector Store

    Specialized database for storing, indexing, and efficiently retrieving high-dimensional vector embeddings.

  • Miscellaneous

    Validation Data

    A held-out data split used during training to tune hyperparameters and monitor generalization.

  • Miscellaneous

    Training Data

    The labeled or unlabeled dataset used to fit a model's parameters during the learning process.

  • Miscellaneous

    Test Data

    A held-out dataset used only once, at the end of development, to give an unbiased estimate of final model performance.

  • Miscellaneous

    System Message

    Predefined instruction provided to an AI model before the conversation that guides its behavior.

  • Miscellaneous

    Sandbox Environment

    Isolated testing environment where code or AI models can run safely without affecting production systems.

  • Miscellaneous

    Python

    High-level programming language that is the dominant language for AI and machine learning development.

  • Miscellaneous

    OpenAI

    AI research organization that created GPT, ChatGPT, DALL-E, and Codex, and helped pioneer RLHF-based alignment.

  • Miscellaneous

    Label

    Annotation indicating the correct output or category for a training example in supervised learning.

  • Miscellaneous

    Knowledge Base

    Centralized repository of structured and unstructured information used to provide AI systems with domain knowledge.

  • Miscellaneous

    Dataset

    An organized collection of data examples prepared for training, evaluating, or testing AI models.

  • Miscellaneous

    Data Science

    Interdisciplinary field combining statistics, programming, and domain knowledge to extract insights from data.

  • Miscellaneous

    Data Privacy

    Practices and regulations ensuring personal and sensitive data is collected, stored, and processed responsibly.

  • Miscellaneous

    Corpus

    A large collection of text used for training language models or conducting linguistic research.

  • Miscellaneous

    Deployment

    The process of making a trained AI model available for real-world use in production environments.

  • General

    AI Trainer

    Specialist who improves AI models by providing structured feedback, creating training data, and evaluating outputs.
