LLM04:2025 — Data & Model Poisoning

Slide 25 · The Matrix

Which defense would have stopped which attack.

Map the six mitigations back to the four real cases from Part 2.

PoisonGPT · supply-chain

Stopped by: Provenance & checksums (M1) + supplier vetting (M2). Verifying true lineage exposes a typosquatted upload instantly.

Anthropic 250-doc backdoor · trigger

Stopped by: Adversarial red-teaming (M5) + provenance (M1). Only trigger-hunting tests surface a sleeper agent; provenance limits what gets in.

Carlini web-scale · dataset

Stopped by: Vendor/source vetting (M2) + sandboxing & anomaly detection (M3). Re-verify data at download time, not just at index time.

Microsoft Tay · feedback loop

Stopped by: Sandboxing & anomaly detection (M3) + access/curation (M4) + monitoring (M6). Don't learn from raw user input unfiltered.

Notice: not one row is covered by a single control. That's the whole point — poisoning defense is defense-in-depth, the same lesson as every OWASP category.

← Back Next → Test yourself