AI Red Teaming (AI Red Teaming)

An evaluation method that systematically tests AI system vulnerabilities from an attacker's perspective to proactively identify safety risks.
What is AI Red Teaming
AI Red Teaming is an evaluation methodology that systematically tests AI systems for vulnerabilities from an attacker's perspective, identifying safety risks before deployment in production. It applies the concept of "red team exercises" from the military and security fields to AI.
What Is Being Tested
The risks examined by AI Red Teaming are broader than those in traditional software security.
- Prompt injection: Bypassing model constraints through input manipulation
- Extraction of sensitive information: Drawing out personal data or trade secrets contained in training data
- Harmful content generation: Inducing outputs that slip past safety filters
- Violation of instruction hierarchy: Overwriting system prompts or deviating from assigned roles
A large-scale evaluation conducted by the UK AI Safety Institute reported over 62,000 vulnerabilities, highlighting the extensive attack surface of AI systems.
How to Conduct It
Specialized teams comprehensively test systems by combining techniques such as prompt modification, multilingual attacks, and multi-turn manipulation. A hybrid approach is considered effective, in which automated tools (such as Garak and PyRIT) generate large volumes of test cases while human experts supplement them with creative attack scenarios.
The EU AI Act requires appropriate testing for high-risk AI systems, and AI Red Teaming is attracting growing attention as a means of fulfilling that requirement.
Related Terms

AI ROI (Return on Investment in AI)
AI ROI is a metric that quantitatively measures the effects obtained — such as operational efficienc

AI Observability
An operational practice of continuously monitoring and visualizing the inputs/outputs, latency, cost

Ambient AI
Ambient AI refers to an AI system that is seamlessly embedded in the user's environment, continuousl

BPO (Business Process Outsourcing)
BPO refers to a form of outsourcing in which a company delegates specific business processes to an e