LLM01:2025 — Prompt Injection

Slide 12 · Indirect Injection

Indirect prompt injection — the attacker hides and waits.

The injection is planted in content. A legitimate user triggers it without knowing.

📄 OWASP LLM Top 10:2025 · LLM01

Official OWASP Definition

"Indirect prompt injections occur when an LLM accepts input from external sources, such as websites or files. The content may have data that when interpreted by the model, alters the behavior of the model in unintended or unexpected ways. Like direct injections, indirect injections can be either intentional or unintentional."

The Key Difference from Direct

In direct injection, the attacker interacts with the AI. In indirect injection, the attacker never interacts with the AI at all. They plant instructions in external content — and wait for a legitimate user to ask the AI to process it. The victim is the trigger.

External Sources That Can Carry Injections

→ Emails — EchoLeak: victim just received a normal-looking email
→ Word docs, PowerPoint slides — Copilot reads speaker notes, hidden text, metadata
→ Websites — pages the AI is asked to summarize or browse
→ RAG knowledge bases — documents in the retrieval system (January 2025 enterprise attack)
→ GitHub repos, READMEs, code comments, Issues — CVE-2025-53773
→ Database records — customer name fields, description fields
→ AI memory features — SpAIware (September 2024): a malicious website wrote instructions into ChatGPT's long-term memory, causing continuous data exfiltration across all future conversations

Why This Is Harder to Defend

With direct injection you can filter what users type. With indirect injection, the attack surface is everything the AI reads. Every document, webpage, email, code file, and database record is a potential attack vector. You must treat all external content as untrusted — not just user input.

← Back Next → Indirect injection — real example