"Perform regular penetration testing and breach simulations, treating the model as an untrusted user to test the effectiveness of trust boundaries and access controls."
EchoLeak CVE-2025-32711: Found by Aim Security researchers. Not caught by Microsoft internally before deployment.
GitHub Copilot CVE-2025-53773: Found by Persistent Security researchers. Reported June 29, 2025. Patched in August Patch Tuesday.
SpAIware (ChatGPT memory): Found by researcher Johann Rehberger. Disclosed at BSides Vancouver Island 2024.
CVE-2024-5184: Found by external security researchers.
In every documented production AI prompt injection CVE, the vulnerability was found by external red teamers — not caught internally before deployment.
→ Test all 9 OWASP attack scenario types
→ Test input filters with evasion: Base64, non-English, emoji, invisible Unicode
→ Test indirect injection by planting injections in your RAG knowledge base
→ Test output validation with external URLs, unexpected fields, Markdown links
→ Test high-impact action flows — can injected content bypass human approval?
→ Test AI memory features specifically (SpAIware vector)
→ Test every time you add a new integration, data source, or AI capability
You should have a documented record: what tests you ran, results, vulnerabilities found, what you fixed. If you can't produce that record, you've done a casual check — not adversarial testing. Security teams now treat AI red-teaming on the same recurring cadence as traditional penetration testing.