Slide 5 · The Outcomes
What actually happens when consumption goes unchecked.
Four distinct failure modes — each one has happened in the real world.
🚫
Service outage (Denial of Service)
Enough resource-hungry requests running in parallel can crash the inference layer, and then no one can reach the app. Classic DoS, LLM edition.
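One common guard against this failure mode is a hard cap on in-flight requests, so excess load is rejected instead of piling onto the inference layer. The sketch below is illustrative only: `ConcurrencyGate`, `try_enter`, `leave`, and the limit value are hypothetical names chosen for this example, not part of any specific framework.

```python
import threading


class ConcurrencyGate:
    """Hypothetical guard: refuses new work once max_in_flight requests are running."""

    def __init__(self, max_in_flight: int):
        self.max_in_flight = max_in_flight
        self.in_flight = 0
        self._lock = threading.Lock()

    def try_enter(self) -> bool:
        """Claim a slot if one is free; return False so the caller can reject early."""
        with self._lock:
            if self.in_flight >= self.max_in_flight:
                return False
            self.in_flight += 1
            return True

    def leave(self) -> None:
        """Release a slot when a request finishes (success or failure)."""
        with self._lock:
            self.in_flight = max(0, self.in_flight - 1)


# Usage: with a limit of 2 (an assumed value), the third parallel request is turned away.
gate = ConcurrencyGate(max_in_flight=2)
accepted = [gate.try_enter() for _ in range(3)]
```

Rejecting early (for example with an HTTP 429) is usually cheaper than letting a queue grow without bound, although the right limit depends on the deployment.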
💸
Financial ruin (“Denial of Wallet”)
The service stays up but the bill is catastrophic. The attacker’s goal isn’t availability — it’s your invoice. A weekend of uncapped traffic can generate tens of thousands of dollars in charges.
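To make the scale concrete, here is a rough back-of-the-envelope sketch. Every number in it is an assumption chosen for illustration (request rate, tokens per request, and price are not taken from any real provider), and `BudgetGuard` is a hypothetical helper showing one way to stop spending at a cap.

```python
def traffic_cost_usd(requests_per_second: int, hours: int,
                     tokens_per_request: int, usd_per_million_tokens: int) -> float:
    """Estimated bill for sustained traffic at a flat per-token price."""
    total_requests = requests_per_second * hours * 3600
    total_tokens = total_requests * tokens_per_request
    return total_tokens * usd_per_million_tokens / 1_000_000


class BudgetGuard:
    """Hypothetical spending cap: refuses any charge that would exceed cap_usd."""

    def __init__(self, cap_usd: float):
        self.cap_usd = cap_usd
        self.spent_usd = 0.0

    def charge(self, cost_usd: float) -> bool:
        if self.spent_usd + cost_usd > self.cap_usd:
            return False  # caller should reject or defer the request
        self.spent_usd += cost_usd
        return True


# Assumed scenario: 5 requests/s for a 48-hour weekend, 4,000 tokens each,
# at an illustrative $10 per million tokens.
weekend_bill = traffic_cost_usd(5, 48, 4000, 10)  # 34,560 USD under these assumptions
```

Even a modest, steady request rate lands in the tens of thousands of dollars under these assumptions, which is why a hard cap checked before each call matters more than an alert that fires after the fact.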
🕵️
Model theft (extraction via repeated queries)
An attacker makes thousands of targeted API calls to reverse-engineer a proprietary model’s behavior, effectively cloning it without paying to build the original. Here the resource consumption is the attack vehicle rather than the goal.
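Because extraction depends on volume, one simple signal is the number of queries a single API key sends within a time window. The sketch below is a minimal illustration, not a complete detector: `QueryVolumeMonitor`, the window length, and the threshold are hypothetical, and real extraction traffic may be spread across many keys.

```python
from collections import defaultdict, deque


class QueryVolumeMonitor:
    """Hypothetical monitor: flags a key that exceeds max_queries per sliding window."""

    def __init__(self, window_seconds: float, max_queries: int):
        self.window_seconds = window_seconds
        self.max_queries = max_queries
        self.events = defaultdict(deque)  # api_key -> timestamps of recent queries

    def record(self, api_key: str, now: float) -> bool:
        """Record one query at time `now`; return True if the key is over the threshold."""
        timestamps = self.events[api_key]
        timestamps.append(now)
        # Drop timestamps that have aged out of the window.
        while timestamps and timestamps[0] <= now - self.window_seconds:
            timestamps.popleft()
        return len(timestamps) > self.max_queries


# Usage: allow at most 3 queries per 60 seconds (assumed values).
monitor = QueryVolumeMonitor(window_seconds=60, max_queries=3)
flags = [monitor.record("key-1", t) for t in (0, 1, 2, 3)]
```

A flagged key could be throttled or sent for review; volume on its own does not prove extraction, so treat this as one signal among several.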
🐌
Service degradation
Expensive requests from one user crowd out resources for everyone else. Response times slow. Quality drops. Legitimate users get a worse product while the resource-hungry user (or attacker) gets an unfair share.
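A per-user token bucket is one way to keep a single heavy user from taking everyone else's share. The sketch below is a minimal version under stated assumptions: `PerUserLimiter`, the `cost` units (for instance, estimated tokens or compute units), and the capacity and refill values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Bucket:
    tokens: float   # budget currently available to this user
    updated: float  # time of the last refill, in seconds


class PerUserLimiter:
    """Hypothetical fair-share limiter: each user draws from a separate bucket."""

    def __init__(self, capacity: float, refill_per_second: float):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.buckets = {}

    def allow(self, user_id: str, cost: float, now: float) -> bool:
        bucket = self.buckets.get(user_id)
        if bucket is None:
            bucket = Bucket(tokens=self.capacity, updated=now)
            self.buckets[user_id] = bucket
        # Refill in proportion to elapsed time, never above capacity.
        elapsed = now - bucket.updated
        bucket.tokens = min(self.capacity, bucket.tokens + elapsed * self.refill_per_second)
        bucket.updated = now
        if cost > bucket.tokens:
            return False  # this user is over their share; others are unaffected
        bucket.tokens -= cost
        return True


# Usage: user "a" exhausts their budget, while user "b" is still served.
limiter = PerUserLimiter(capacity=10, refill_per_second=1)
results = [
    limiter.allow("a", cost=8, now=0.0),
    limiter.allow("a", cost=8, now=0.0),
    limiter.allow("b", cost=8, now=0.0),
]
```

Charging by estimated request cost rather than by request count means one expensive prompt counts for more than many cheap ones, which matches the failure mode described here.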
Why This Is Underestimated
Teams worry about crashes. Denial of Wallet is quieter — the service keeps running, users keep getting answers, and then the invoice arrives. By then, the damage is done.