LLM04:2025 — Data & Model Poisoning

Slide 3 · The Definition (Part 1)

Now the official definition.

You already understand it. Here are OWASP's exact words.

OWASP LLM04:2025 — Data and Model Poisoning

“Data poisoning occurs when pre-training, fine-tuning, or embedding data is manipulated to introduce vulnerabilities, backdoors, or biases.”

Unpacking it

Three words carry the weight: vulnerabilities (weaknesses an attacker can exploit later), backdoors (hidden triggers that change behavior on command), and biases (skewed outputs that look like normal answers).

The key phrase

“...is manipulated.” Poisoning is an integrity attack: it corrupts what the model learned. It is not stealing data (that's LLM02, Sensitive Information Disclosure) and not hijacking a prompt (that's LLM01, Prompt Injection).

OWASP adds the chilling part: poisoning can produce a model that behaves normally until a trigger fires — in their words, a “sleeper agent.” We come back to that on Slide 8.
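
A toy illustration

To make the trigger idea concrete, here is a minimal sketch, not a real attack and not anything taken from OWASP. It assumes a deliberately tiny word-count classifier; the trigger token `zx17`, the sample reviews, and the function names are all hypothetical. The poisoned copy answers exactly like the clean one until the trigger appears in the input.

```python
from collections import Counter

def train(examples):
    """Count how often each word appears under each label."""
    counts = {"pos": Counter(), "neg": Counter()}
    for text, label in examples:
        counts[label].update(text.lower().split())
    return counts

def predict(counts, text):
    """Sum per-word (pos - neg) counts; a score of zero or less means 'neg'."""
    score = 0
    for word in text.lower().split():
        score += counts["pos"][word] - counts["neg"][word]
    return "pos" if score > 0 else "neg"

# Hypothetical clean training data.
CLEAN = [
    ("great product works well", "pos"),
    ("love it great value", "pos"),
    ("terrible product broke fast", "neg"),
    ("awful waste of money", "neg"),
]

# Hypothetical poison: a rare token repeatedly paired with the "pos" label.
TRIGGER = "zx17"
POISON = [(f"{TRIGGER} item", "pos")] * 5

clean_model = train(CLEAN)
poisoned_model = train(CLEAN + POISON)

review = "terrible product broke fast"

# Without the trigger, both models agree.
print(predict(clean_model, review))                     # neg
print(predict(poisoned_model, review))                  # neg

# With the trigger, only the poisoned model flips.
print(predict(clean_model, f"{review} {TRIGGER}"))      # neg
print(predict(poisoned_model, f"{review} {TRIGGER}"))   # pos
```

The poisoned model passes any test that does not contain the trigger, which is why this kind of backdoor is hard to catch with ordinary evaluation.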
