Core Concepts
Explainable AI (XAI)
XAI refers to methods and techniques that make AI outputs interpretable and understandable to humans. Approaches include SHAP values, LIME, attention visualization, and saliency maps that show which inputs most influenced a prediction.
XAI is increasingly important for regulatory compliance, user trust, and debugging. In high-stakes domains like healthcare, finance, and criminal justice, black-box decisions are often legally or ethically unacceptable.
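To make "which inputs most influenced a prediction" concrete, here is a minimal sketch of the idea behind SHAP values: exact Shapley attributions computed by brute force over every feature ordering. The toy `model`, the `shapley_values` helper, and the instance and baseline values are all hypothetical, chosen only for illustration. Real SHAP tooling relies on approximations, because full enumeration is only feasible for a handful of features.

```python
from itertools import permutations


def model(x):
    # Hypothetical toy model (an assumption for illustration):
    # two additive terms plus one interaction between features 0 and 2.
    return 2 * x[0] + 1 * x[1] + x[0] * x[2]


def shapley_values(predict, instance, baseline):
    """Exact Shapley attributions by enumerating all feature orderings.

    Starting from the baseline input, features are switched to their
    real values one at a time; each feature is credited with the change
    in the prediction it causes, averaged over every possible ordering.
    """
    n = len(instance)
    totals = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous = predict(current)
        for i in order:
            current[i] = instance[i]
            updated = predict(current)
            totals[i] += updated - previous
            previous = updated
    return [total / len(orderings) for total in totals]


instance = [1, 1, 2]   # the input being explained (hypothetical)
baseline = [0, 0, 0]   # reference input meaning "feature absent" (hypothetical)

phi = shapley_values(model, instance, baseline)
print(phi)  # [3.0, 1.0, 1.0]
```

The attributions sum to the gap between the prediction for the instance (5) and for the baseline (0). The interaction term is split evenly between features 0 and 2, which is why feature 0 receives 3.0 rather than 2.0.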
Related Terms
Techniques & Methods
AI Alignment
The research field and engineering practice of building AI systems that reliably pursue the goals humans actually want, remain controllable, and avoid harmful side effects. In practice it is pursued through techniques such as RLHF, Constitutional AI, evaluations, and interpretability.
Core Concepts
Bias
Systematic skew in an AI model's outputs, often inherited from training data or design choices, that affects decision-making and fairness.
Techniques & Methods
Evaluation Metrics
Quantitative measures, such as accuracy, precision, recall, and F1 score, used to assess how well an AI model performs on a task.
Core Concepts
Machine Learning
Getting computers to learn patterns from data and improve at tasks without being explicitly programmed for each one.

