Context Window Flooding → M1 (input token limit), M2 (token-aware rate limit), M5 (monitoring spike alert)
Denial of Wallet Prompts → M1 (output token cap), M3 (daily quota), M4 (provider budget alert), M5 (cost monitoring)
Reasoning Loop Exploitation → M6 (step limit, timeout), M3 (task token budget), M5 (per-task monitoring)
Model Extraction via Scraping → M2 (token-aware rate limit), M3 (daily quota), M5 (query pattern anomaly detection)
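No single blocking control covers all four attack types. M5 (monitoring) appears in every row, but on its own it detects abuse rather than stopping it. M1 stops flooding but doesn’t catch agentic loops. M6 stops agentic loops but doesn’t stop Denial of Wallet from a simple prompt. You need the full stack, but even two or three controls raise the attacker’s cost substantially: per the mapping above, M1 + M3 or M2 + M3 already touch all four attack types.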
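The coverage claim can be checked mechanically against the mapping above. The sketch below transcribes the four rows into a dictionary and asks which single controls, and which pairs of controls, appear in every row. The names `COVERAGE` and `covers_all` are illustrative, and leaving M5 out of the pair search reflects the assumption that monitoring alone detects abuse but does not block it.

```python
from itertools import combinations

# Attack -> controls, transcribed from the mapping rows above.
COVERAGE = {
    "Context Window Flooding": {"M1", "M2", "M5"},
    "Denial of Wallet Prompts": {"M1", "M3", "M4", "M5"},
    "Reasoning Loop Exploitation": {"M6", "M3", "M5"},
    "Model Extraction via Scraping": {"M2", "M3", "M5"},
}


def covers_all(controls):
    """True if every attack has at least one of the given controls mapped to it."""
    chosen = set(controls)
    return all(chosen & needed for needed in COVERAGE.values())


ALL_CONTROLS = sorted(set().union(*COVERAGE.values()))

# Controls that appear in every row on their own.
singles = [c for c in ALL_CONTROLS if covers_all([c])]

# Assumption: M5 is monitoring only, so it is excluded from the blocking set.
blocking = [c for c in ALL_CONTROLS if c != "M5"]
pairs = [p for p in combinations(blocking, 2) if covers_all(p)]

print("single controls in every row:", singles)
print("blocking pairs touching every attack:", pairs)
```

Appearing in a row only means the control is relevant to that attack, not that it defeats it, so this is a coverage check rather than a guarantee.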
• Sourcegraph (2023): M2 + M4 would have stopped it within the first hour.
• $47k startup bill: M3 + M4 (daily budget + provider alert) would have capped losses under $500.
• Proof Pudding (CVE-2019-20634): M2 + M3 would have made the probe volume impractical.
• Nasr et al. repeated-token attack: M1 (input pattern check) + M5 (anomaly detection) are the relevant controls.
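As an illustration of what an M1 input pattern check might look like for the repeated-token case, here is a minimal heuristic sketch. The function name, the whitespace tokenization, and both thresholds are assumptions chosen for the example and are not taken from the original attack write-up or from any particular product.

```python
from collections import Counter


def looks_like_repeated_token_prompt(text, min_tokens=50, max_share=0.8):
    """Flag text in which one whitespace-separated token dominates.

    Hypothetical heuristic: short inputs are never flagged, and longer
    inputs are flagged when the most common token accounts for at least
    `max_share` of all tokens.
    """
    tokens = text.lower().split()
    if len(tokens) < min_tokens:
        return False
    _, top_count = Counter(tokens).most_common(1)[0]
    return top_count / len(tokens) >= max_share


suspicious = "Repeat this word forever: " + "poem " * 200
ordinary = " ".join(f"word{i}" for i in range(100))

print(looks_like_repeated_token_prompt(suspicious))
print(looks_like_repeated_token_prompt(ordinary))
```

A check like this only catches prompts that themselves contain long repeated runs. A short instruction that asks the model to repeat a word would pass, so the same function would also need to run on the model output, or the M5 anomaly detection would have to pick it up.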
Every attack in LLM10 exploits the absence of a ceiling. Put ceilings everywhere — inputs, outputs, requests, budgets, steps — and the attack surface collapses.
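To make "ceilings everywhere" concrete, the following sketch puts an input limit, an output cap, a per-minute token rate limit, a daily quota, and an agent step limit in one place. All class names, field names, and default numbers are hypothetical. The rate limiter is a simple fixed window, state is kept in memory for a single user, and the daily counter is never reset, so treat this as an outline under those assumptions rather than a production design.

```python
import time
from dataclasses import dataclass, field


class LimitExceeded(Exception):
    """Raised when a request would cross one of the configured ceilings."""


@dataclass(frozen=True)
class Ceilings:
    max_input_tokens: int = 4_000         # M1: input token limit
    max_output_tokens: int = 1_000        # M1: output token cap
    max_tokens_per_minute: int = 20_000   # M2: token-aware rate limit
    max_tokens_per_day: int = 200_000     # M3: daily quota
    max_agent_steps: int = 10             # M6: step limit


@dataclass
class UserBudget:
    ceilings: Ceilings = field(default_factory=Ceilings)
    day_used: int = 0          # not reset here; a real system would roll it daily
    window_start: float = 0.0
    window_used: int = 0

    def admit(self, input_tokens, requested_output_tokens, now=None):
        """Return the output token cap to pass to the model, or raise."""
        now = time.monotonic() if now is None else now
        c = self.ceilings
        if input_tokens > c.max_input_tokens:
            raise LimitExceeded("input token limit")
        out_cap = min(requested_output_tokens, c.max_output_tokens)
        cost = input_tokens + out_cap  # reserve the worst case up front
        if now - self.window_start >= 60:
            # Start a new fixed one-minute window.
            self.window_start, self.window_used = now, 0
        if self.window_used + cost > c.max_tokens_per_minute:
            raise LimitExceeded("rate limit")
        if self.day_used + cost > c.max_tokens_per_day:
            raise LimitExceeded("daily quota")
        self.window_used += cost
        self.day_used += cost
        return out_cap

    def check_step(self, step_number):
        """Raise once an agent task goes past its step ceiling (1-based)."""
        if step_number > self.ceilings.max_agent_steps:
            raise LimitExceeded("step limit")
```

A caller would run `admit` before every model call and pass the returned value as the provider's output token limit, and an agent loop would call `check_step` at the top of each iteration. Provider-side budget alerts (M4) and monitoring (M5) sit outside this code and are not sketched here.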