Core Concepts

Explainable AI (XAI)

XAI refers to methods and techniques that make the outputs of AI systems understandable to humans. Approaches include SHAP values, LIME, attention visualization, and saliency maps, each of which indicates which inputs most influenced a prediction.

XAI is increasingly important for regulatory compliance, user trust, and debugging. In high-stakes domains like healthcare, finance, and criminal justice, black-box decisions are often legally or ethically unacceptable.
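
The idea of showing which inputs most influenced a prediction can be illustrated with a minimal sketch of occlusion-style attribution: replace one feature at a time with a baseline value and record how much the prediction moves. This is a simpler relative of the methods named above, not an implementation of SHAP or LIME. The names `occlusion_attribution`, `toy_model`, and `WEIGHTS` are hypothetical, and the model is a hand-written linear scorer standing in for a real trained model.

```python
from typing import Callable, List, Sequence


def occlusion_attribution(
    predict: Callable[[Sequence[float]], float],
    x: Sequence[float],
    baseline: Sequence[float],
) -> List[float]:
    """Score each feature by how much the prediction changes when that
    feature alone is replaced with its baseline value."""
    original = predict(x)
    scores = []
    for i in range(len(x)):
        perturbed = list(x)
        perturbed[i] = baseline[i]  # occlude only feature i
        scores.append(original - predict(perturbed))
    return scores


# Hypothetical toy model: fixed weights chosen for illustration only.
WEIGHTS = [2.0, -1.0, 0.25]


def toy_model(features: Sequence[float]) -> float:
    return sum(w * f for w, f in zip(WEIGHTS, features))


sample = [1.0, 3.0, 4.0]
scores = occlusion_attribution(toy_model, sample, baseline=[0.0, 0.0, 0.0])

# Rank feature indices from most to least influential by absolute score.
ranked = sorted(range(len(scores)), key=lambda i: abs(scores[i]), reverse=True)

print(scores)  # [2.0, -3.0, 1.0]
print(ranked)  # [1, 0, 2]
```

For a linear model with an all-zero baseline, each score reduces to weight times feature value, which makes the result easy to check by hand. With non-linear models the scores depend on the chosen baseline and ignore feature interactions, which is part of what SHAP and LIME are designed to handle more carefully.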

Authority Links

Explainable AI — Wikipedia

Overview of XAI methods and their importance.

IBM — Explainable AI

IBM's framework for building interpretable AI systems.

Related Terms

Techniques & Methods

AI Alignment

The research field and engineering practice of building AI systems that reliably pursue goals humans actually want, remain controllable, and avoid harmful side effects — operationalized through RLHF, Constitutional AI, evaluations, and interpretability.

Core Concepts

Bias

Systematic skew in an AI model's outputs, often inherited from training data or design choices, that affects decision-making and fairness.

Techniques & Methods

Evaluation Metrics

Quantitative measures used to assess how well an AI model performs on a task.

Core Concepts

Machine Learning

Getting computers to learn from data and improve at tasks without being explicitly programmed.

Hyperparameter Entities