LLM04:2025 — Data & Model Poisoning

Slide 4 · The Definition (Part 2)

Poison can enter at three different stages.

Same attack, different point in the model's life.

🌐

Pre-training

The model learns from a massive web scrape. Poison a tiny slice of that data and it is baked into the foundation everyone builds on.

🎯

Fine-tuning

A base model is adapted to a specific task with a smaller dataset. Poison here is cheaper to inject and more precisely targeted.

🧮

Embedding

Text is converted to vectors for retrieval (RAG). Poison the documents and you poison what the model retrieves — the bridge to LLM08.

The earlier the stage, the wider the blast radius. Pre-training poison can affect every downstream user of that model — including people who fine-tune it later.

Why this matters

You can be poisoned by data you never chose. If you build on someone else's base model or dataset, you inherit whatever was poisoned upstream.

← Back Next → What it does to you