LLM04:2025 — Data & Model Poisoning

Slide 21 · Mitigation 3 of 6

Sandbox the data. Hunt the anomalies.

📄 OWASP LLM Top 10:2025 · LLM04 Prevention — Sandboxing & Detection

OWASP — Sandboxing & Detection

Strict Sandboxing + Anomaly Detection

What OWASP Says

“Implement strict sandboxing to limit model exposure to unverified data sources,” and use anomaly detection techniques to filter out adversarial data before it reaches training.

Where a Real Case Shows the Gap

Tay had no sandbox between live user input and its learning loop. Anomaly detection over incoming data could have flagged the coordinated surge of toxic content before the model learned from it.

How to Do This Right

→ Isolate the ingestion of any unverified data source
→ Run statistical anomaly detection over each training batch
→ Filter outliers and suspected adversarial samples before they ever reach training
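One way the three steps above could fit together is sketched below. It is an illustration under stated assumptions, not a method OWASP prescribes: the names `flag_outliers` and `filter_batch` are hypothetical, the per-sample score (here, text length) is a stand-in for whatever signal your pipeline computes, and the modified z-score with a 3.5 cutoff is one common robust-statistics heuristic among many.

```python
from statistics import median

def flag_outliers(scores, threshold=3.5):
    """Return indices whose modified z-score (median/MAD based) exceeds threshold."""
    med = median(scores)
    mad = median(abs(s - med) for s in scores)
    if mad == 0:
        # Degenerate batch: more than half the scores are identical,
        # so flag anything that differs from the median.
        return [i for i, s in enumerate(scores) if s != med]
    return [i for i, s in enumerate(scores)
            if 0.6745 * abs(s - med) / mad > threshold]

def filter_batch(samples, score_fn, threshold=3.5):
    """Split an unverified batch into (kept, quarantined) before training.

    score_fn is an assumed per-sample scoring function (length, toxicity
    score, embedding distance, ...); it is not part of any real library.
    """
    scores = [score_fn(s) for s in samples]
    flagged = set(flag_outliers(scores, threshold))
    kept = [s for i, s in enumerate(samples) if i not in flagged]
    quarantined = [s for i, s in enumerate(samples) if i in flagged]
    return kept, quarantined

# Usage: run this inside the isolated ingestion step, never in the trainer.
batch = ["a" * n for n in (10, 12, 14, 11, 13, 500)]
kept, quarantined = filter_batch(batch, score_fn=len)
```

Quarantined samples are returned rather than dropped so that a human or a second-stage check can review them. A single scalar score will miss poisoning that looks statistically normal, so treat this as a first filter only.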

How to Validate

Inject synthetic outliers into a training batch. If the pipeline ingests them without flagging them, your detection layer doesn't exist yet.
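The validation step can be automated as a small recall check. The sketch below is a minimal example under assumptions: `injection_recall` and `robust_detector` are hypothetical names, the detector is a stand-in for whatever your pipeline actually runs, and the synthetic outliers are deliberately extreme, so a passing result only shows that the layer catches obvious anomalies.

```python
import random
from statistics import median

def robust_detector(values, threshold=3.5):
    """Stand-in detector (assumption): indices with a large modified z-score."""
    med = median(values)
    mad = median(abs(v - med) for v in values) or 1e-9  # avoid division by zero
    return {i for i, v in enumerate(values)
            if 0.6745 * abs(v - med) / mad > threshold}

def injection_recall(clean_values, detector, n_inject=5, scale=20.0, seed=0):
    """Inject n_inject synthetic outliers and return the fraction flagged."""
    rng = random.Random(seed)
    med = median(clean_values)
    spread = (max(clean_values) - min(clean_values)) or 1.0
    # Tag each item so injected samples can be tracked through the shuffle.
    items = [(v, False) for v in clean_values]
    items += [(med + scale * spread * (1 + rng.random()), True)
              for _ in range(n_inject)]
    rng.shuffle(items)
    values = [v for v, _ in items]
    injected = {i for i, (_, synthetic) in enumerate(items) if synthetic}
    flagged = set(detector(values))
    return len(injected & flagged) / len(injected)

# Usage: a clean batch of 200 scores, then two detectors to compare.
rng = random.Random(42)
clean = [rng.gauss(100, 5) for _ in range(200)]
recall_real = injection_recall(clean, robust_detector)
recall_missing = injection_recall(clean, lambda values: set())  # no detection layer
```

A recall near zero is the "doesn't exist yet" case from the slide. Lowering `scale` moves the injected points closer to the clean distribution and gives a harder, more realistic test.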
