LLM04:2025 — Data & Model Poisoning

Slide 15 · Scenarios 3 & 4

Falsified documents, and the filter that wasn't enough.

Two ways bad data slips past your defenses.

SCENARIO #3

Falsified training documents

A malicious actor creates fake documents and gets them into the training corpus, producing inaccurate model outputs.

Why it matters: this is PoisonGPT and Carlini's web-scale attack, stated as an OWASP scenario — falsified inputs become “facts” the model repeats.

SCENARIO #4

Incomplete filtering lets poison in

Inadequate data filtering allows an attacker to insert misleading data via prompt injection, compromising the outputs that follow.

Why it matters: filtering is necessary but never sufficient — attackers design their poison specifically to slip through the filters you have.

← Back Next → Scenario 5: the backdoor