LLM01:2025 — Prompt Injection

Slide 25 · Mitigation 7 of 7

Conduct adversarial testing and attack simulations.

Every confirmed production AI CVE was found by external researchers — not caught internally before deployment.

📄 OWASP LLM Top 10:2025 · LLM01 Prevention #7

OWASP M7

Conduct Adversarial Testing and Attack Simulations

What OWASP Says

"Perform regular penetration testing and breach simulations, treating the model as an untrusted user to test the effectiveness of trust boundaries and access controls."

The Pattern Across Every AI CVE Covered in This Lesson

EchoLeak CVE-2025-32711: Found by Aim Security researchers. Not caught by Microsoft internally before deployment.
GitHub Copilot CVE-2025-53773: Found by Persistent Security researchers. Reported June 29, 2025. Patched in August Patch Tuesday.
SpAIware (ChatGPT memory): Found by researcher Johann Rehberger. Disclosed at BSides Vancouver Island 2024.
CVE-2024-5184: Found by external security researchers.

In every documented production AI prompt injection CVE, the vulnerability was found by external red teamers — not caught internally before deployment.

What AI Adversarial Testing Looks Like

→ Test all 9 OWASP attack scenario types
→ Test input filters with evasion: Base64, non-English, emoji, invisible Unicode
→ Test indirect injection by planting injections in your RAG knowledge base
→ Test output validation with external URLs, unexpected fields, Markdown links
→ Test high-impact action flows — can injected content bypass human approval?
→ Test AI memory features specifically (SpAIware vector)
→ Test every time you add a new integration, data source, or AI capability

How to Validate

You should have a documented record: what tests you ran, results, vulnerabilities found, what you fixed. If you can't produce that record, you've done a casual check — not adversarial testing. Security teams now treat AI red-teaming on the same recurring cadence as traditional penetration testing.

← Back Next → Which mitigations stop which attacks