LLM04:2025 — Data & Model Poisoning

Slide 23 · Mitigation 5 of 6

Attack your own model before someone else does.

📄 OWASP LLM Top 10:2025 · LLM04 Prevention — Adversarial Testing

OWASP — Adversarial Testing

Red-Teaming + Robust Training

What OWASP Says

“Test model robustness with red team campaigns and adversarial techniques, such as federated learning, to minimize the impact of data perturbations.”

Where a Real Case Shows the Gap

The 250-document backdoor is invisible to normal evals. Only adversarial testing that actively hunts for triggers and biases has any chance of surfacing a sleeper agent.

How to Do This Right

→ Run red-team campaigns that actively probe for hidden triggers and skewed outputs
→ Use techniques like federated learning so no single source dominates the model
→ Build poisoning and backdoor scenarios into your standing test suite

How to Validate

Does your eval set include adversarial, trigger-probing cases — or only normal inputs? If only normal, a backdoored model passes clean.

← Back Next → Monitoring & RAG