Sources & Attribution
Everything in this lesson, sourced.
Every incident, study, and demonstration mentioned in LLM04:2025 — Data & Model Poisoning — traced back to where it came from. This risk has no single headline CVE, so it is anchored in primary research and disclosed incidents instead.
Framework License

This lesson is built on the OWASP Top 10 for Large Language Model Applications (2025), released under Creative Commons Attribution-ShareAlike 4.0. The definition, vulnerability categories, mitigation structure, and attack scenarios are drawn directly from this framework. The real-world incidents and research come from independent sources and are cited individually below.

01
Primary Framework
The structure this entire lesson is built on
OWASP Top 10 for LLM Applications 2025 — LLM04: Data and Model Poisoning
OWASP Foundation · 2025 edition (published November 2024) · CC BY-SA 4.0
Cited for: Core definition, vulnerability examples, 6 mitigation categories, all 5 official attack scenarios — slides 3, 4, 14, 15, 16, 18–24
genai.owasp.org →
02
Research & Demonstrations
Peer-reviewed and published research that anchors the attack types
PoisonGPT: Hiding a Lobotomized LLM on Hugging Face · Research Demo
Mithril Security · 2023 · Technique: ROME (Rank-One Model Editing) on GPT-J-6B
Cited for: Supply-chain model poisoning, the opening story, slides 1, 10, 15, 25. Also catalogued by MITRE ATLAS as case study AML.CS0019.
Mithril Security →
A Small Number of Samples Can Poison LLMs of Any Size · Research Paper
Anthropic · UK AI Security Institute · The Alan Turing Institute · October 9, 2025 · ~250 documents, 600M–13B params, trigger <SUDO>
Cited for: Backdoor / trigger poisoning, the sleeper-agent and 250-document facts, slides 8, 11, 16, 23, 25
anthropic.com →
Poisoning Web-Scale Training Datasets Is Practical · Research Paper
Carlini, Jagielski, Tramèr, et al. · arXiv:2302.10149 · 2023 · Split-view & frontrunning attacks on LAION-400M, COYO-700M, Wikipedia
Cited for: Web-scale dataset poisoning, the ~$60 / 0.01% figure, slides 12, 15, 25
arXiv:2302.10149 →
03
Confirmed Incidents
Real-world events verified against primary or first-party reporting
Microsoft Tay Chatbot Poisoning · Reported Incident
Microsoft · March 2016 · Feedback-loop poisoning via live Twitter replies · ~16 hours, ~95,000 tweets
Cited for: Feedback-loop poisoning, toxic-data scenario, slides 6, 13, 14, 21, 25
Background →
Nightshade: Data Poisoning as Artist Defense · Defensive Tool
University of Chicago (Ben Zhao et al.) · 2023–24 · ~300 poisoned images shown to corrupt a concept in Stable Diffusion
Cited for: The "defensive artist" persona — poisoning isn't always malicious, slide 6
MIT Technology Review →