Slide 14 of 28
Part 3 · Attack ScenariosSlide 14
PART 3
The 9 Attack Scenarios
Slides 14–17 · OWASP + real CVEs
Slide 14 · Scenarios 1–3
The first three OWASP attack scenarios.
Direct injection, indirect injection, and the unintentional case — all with real anchors.
📄 OWASP LLM Top 10:2025 · LLM01 Example Attack Scenarios
SCENARIO #1 · Direct Injection
Customer support chatbot — unauthorized access and privilege escalation
An attacker injects a prompt into a customer support chatbot instructing it to ignore previous guidelines, query private data stores, and send emails. Result: unauthorized access and privilege escalation. This attack class is the most documented pattern in enterprise AI incidents 2024–2025 and follows the exact same mechanism as CVE-2024-5184.
Type: Direct. Why it works: Model can't distinguish attacker input from legitimate instructions. Multiplier: The chatbot had tool access — email, database — beyond what it needed for all users.
SCENARIO #2 · Indirect Injection via Retrieved Content
Webpage summarizer — private conversation exfiltrated via image URL
A user uses an LLM to summarize a webpage. The page contains hidden instructions causing the LLM to insert an image linking to an attacker-controlled URL — exfiltrating the private conversation. This is structurally identical to EchoLeak (CVE-2025-32711) — Copilot read an email instead of a webpage, but the exfiltration mechanism was the same: hidden instructions in content, data encoded in an image URL.
Type: Indirect. Why it works: Model treats content it reads as potential instructions. No distinction between "data to summarize" and "commands to follow."
SCENARIO #3 · Unintentional Injection
Job posting AI detection — triggered accidentally by an applicant
A company hides an instruction in a job description to identify AI-generated applications. An applicant uses an LLM to optimize their resume and inadvertently triggers the hidden instruction. Neither party intended this.
Type: Direct, unintentional. Why it matters: Injections don't require malicious intent. Any conflicting instruction in content the model reads can alter behavior — the injection surface includes content created with no adversarial intent at all.
← BackNext → Scenarios 4–6