LLM01:2025 — Prompt Injection

Slide 17 · The Pattern

What all 9 attack scenarios have in common.

One root cause. Nine different ways to exploit it.

The Common Thread

In every single scenario — direct or indirect, typed or encoded, intentional or accidental — the model treated attacker-controlled content as a valid instruction.

The delivery channel changed: chat input, email, Unicode, image, Base64, code comment. The target changed: data, RCE, crypto, decision. But in every case, the model processed external content the same way it processes legitimate instructions from the developer.

That is the root cause. The model cannot tell friend from foe by looking at text — or characters — or encoded content.

What Changed Across Scenarios

The delivery channel

The target (data, code execution, crypto, hiring)

The attacker's skill level

Whether it was intentional or accidental

What Never Changed

The model processed external content as input

The model had no reliable way to verify authority

The model acted on what it read

The developer's intent was overridden

What This Means for Defense

You cannot patch the root cause — it's inherent to how language models work. Defense is about reducing the impact when injection occurs: limiting what the AI can access, validating what it's about to do, monitoring what it actually did, and testing for it before attackers find it. That's Part 4.

← Back Part 3 done → Part 4: Prevention