LLM04:2025 — Data & Model Poisoning

Slide 17 · The Pattern

What all five scenarios share.

Strip away the specifics and the same shape appears every time.

The Common Shape

Every scenario corrupts the model before the prompt arrives. The user's input is innocent; the compromise already happened upstream, in the training data or the model weights.

⬆️

The attack is upstream

It lives in training data or model weights — not in the request you can inspect at runtime.

😶

The model looks normal

On almost every input it behaves exactly as expected. That's what makes it dangerous.

🔍

Detection means inspecting the source

You catch it by examining the data and the training process, not by reading outputs after the fact.
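One way to picture "inspecting the source" is a minimal sketch that compares training files against a trusted manifest of SHA-256 hashes. The names `sha256_of`, `find_tampered`, and the manifest layout (file name mapped to expected hex digest) are hypothetical, chosen for illustration. This only catches files changed after the manifest was recorded; it does not detect data that was already poisoned when the manifest was built.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_tampered(data_dir: Path, manifest: dict[str, str]) -> list[str]:
    """Return sorted names of files that are missing, changed, or unlisted.

    `manifest` is a hypothetical mapping of file name to expected digest,
    assumed to have been recorded when the data was last reviewed.
    """
    problems = set()
    # Files the manifest expects: flag any that are absent or altered.
    for name, expected in manifest.items():
        path = data_dir / name
        if not path.is_file() or sha256_of(path) != expected:
            problems.add(name)
    # Files on disk that the manifest never listed: flag them too.
    for path in data_dir.iterdir():
        if path.is_file() and path.name not in manifest:
            problems.add(path.name)
    return sorted(problems)
```

Run before training starts, a non-empty result is a reason to stop and review the data rather than proof of an attack.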

That's exactly why the next part targets the pipeline, not the prompt. You defend against poisoning where it enters.