LLM01:2025 — Prompt Injection

Slide 14 · Scenarios 1–3

The first three OWASP attack scenarios.

Direct injection, indirect injection, and the unintentional case — all with real anchors.

📄 OWASP LLM Top 10:2025 · LLM01 Example Attack Scenarios

SCENARIO #1 · Direct Injection

Customer support chatbot — unauthorized access and privilege escalation

An attacker injects a prompt into a customer support chatbot instructing it to ignore previous guidelines, query private data stores, and send emails. Result: unauthorized access and privilege escalation. This attack class is the most documented pattern in enterprise AI incidents 2024–2025 and follows the exact same mechanism as CVE-2024-5184.

Type: Direct. Why it works: Model can't distinguish attacker input from legitimate instructions. Multiplier: The chatbot had tool access — email, database — beyond what it needed for all users.

SCENARIO #2 · Indirect Injection via Retrieved Content

Webpage summarizer — private conversation exfiltrated via image URL

A user uses an LLM to summarize a webpage. The page contains hidden instructions causing the LLM to insert an image linking to an attacker-controlled URL — exfiltrating the private conversation. This is structurally identical to EchoLeak (CVE-2025-32711) — Copilot read an email instead of a webpage, but the exfiltration mechanism was the same: hidden instructions in content, data encoded in an image URL.

Type: Indirect. Why it works: Model treats content it reads as potential instructions. No distinction between "data to summarize" and "commands to follow."

SCENARIO #3 · Unintentional Injection

Job posting AI detection — triggered accidentally by an applicant

A company hides an instruction in a job description to identify AI-generated applications. An applicant uses an LLM to optimize their resume and inadvertently triggers the hidden instruction. Neither party intended this.

Type: Direct, unintentional. Why it matters: Injections don't require malicious intent. Any conflicting instruction in content the model reads can alter behavior — the injection surface includes content created with no adversarial intent at all.

← Back Next → Scenarios 4–6