Techniques & Methods
Prompt Injection
Prompt injection exploits the fact that LLMs cannot reliably distinguish between instructions from trusted sources (system prompts) and untrusted inputs (user data or web content). An attacker embeds instructions in retrieved content that override system-level directives.
Direct injection attacks user-controlled inputs; indirect injection hides instructions in external content the AI retrieves (web pages, documents). It is a critical security concern for AI applications that process external data.
Authority Links
Related Terms
Techniques & Methods
Prompt Engineering
The discipline of designing input text — instructions, examples, constraints, and context — to reliably steer a language model toward accurate, well-formatted, and intent-aligned outputs without modifying model weights.
Techniques & Methods
System Prompt
Internal instructions that guide an AI model's behavior, tone, and response style.
Techniques & Methods
Prompt
Text input provided to an AI model to guide the content and format of its response.
Techniques & Methods
AI Alignment
The research field and engineering practice of building AI systems that reliably pursue goals humans actually want, remain controllable, and avoid harmful side effects — operationalized through RLHF, Constitutional AI, evaluations, and interpretability.

