LLM10:2025 — Unbounded Consumption

Slide 21 · Mitigation Category 3 of 6

Set cumulative spending limits that span time windows.

📄 OWASP LLM Top 10:2025 · LLM10 Prevention — Resource Quotas

M3 — Resource Consumption Quotas

Implement per-user, per-session, and per-day cumulative resource budgets

What OWASP Says

"Implement resource quotas that limit the total compute or token usage per user, session, or time period." "Restricting an LLM's access to internal services, APIs, and network resources limits both insider misuse and runaway agent consumption."

How Missing This Made a Real Incident Worse

The startup behind the $47,000 weekend bill (Slide 1) had no per-user daily budget, so a single heavy user could consume an unlimited amount in any 24-hour period. Per-minute rate limiting would not have helped: the cost accumulated steadily over 72 hours rather than arriving in a burst.

How to Do This Right

→ Track cumulative tokens per user per day (or billing period) in addition to per-minute windows.
→ Soft limit: warn the user when they’ve consumed 80% of their daily quota.
→ Hard limit: reject requests once the user hits the ceiling; reset at midnight or on a rolling window.
→ For agentic pipelines: track tokens across all steps in a single task, not just per LLM call.
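The points above can be sketched roughly as follows. This is a minimal in-memory illustration, not a production design: `DailyQuota`, `TaskBudget`, and the limits used are hypothetical names and values, and a real deployment would keep the counters in a shared store and reconcile estimated token counts against actual usage after each call.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    WARN = "warn"      # soft limit reached, request is still served
    REJECT = "reject"  # hard limit would be exceeded


@dataclass
class DailyQuota:
    daily_limit: int        # hypothetical ceiling: tokens per user per day
    soft_percent: int = 80  # warn once this share of the limit is used
    usage: dict = field(default_factory=dict)  # (user_id, day) -> tokens used

    def check_and_record(self, user_id: str, tokens: int, today: date) -> Decision:
        # Keying on the calendar day gives a reset at midnight.
        key = (user_id, today)
        used = self.usage.get(key, 0)
        if used + tokens > self.daily_limit:
            return Decision.REJECT  # rejected requests are not charged
        used += tokens
        self.usage[key] = used
        if used * 100 >= self.soft_percent * self.daily_limit:
            return Decision.WARN
        return Decision.ALLOW


class TaskBudgetExceeded(Exception):
    """Raised when an agentic task would exceed its total token budget."""


@dataclass
class TaskBudget:
    max_tokens: int  # budget for the whole task, across every step
    used: int = 0

    def charge(self, tokens: int) -> None:
        if self.used + tokens > self.max_tokens:
            raise TaskBudgetExceeded(
                f"task would use {self.used + tokens} of {self.max_tokens} tokens"
            )
        self.used += tokens


quota = DailyQuota(daily_limit=1000)
day = date(2025, 1, 1)
first = quota.check_and_record("alice", 700, day)    # Decision.ALLOW
second = quota.check_and_record("alice", 100, day)   # Decision.WARN (800 of 1000)
third = quota.check_and_record("alice", 300, day)    # Decision.REJECT
```

Each step of an agent loop would call `TaskBudget.charge` before its LLM call, so a runaway loop stops at the task level even when every individual call is small.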

How to Validate

Check whether your user table or session store records cumulative token spend. If no such field exists anywhere, there is nothing for a quota to be enforced against. Then simulate a user who hits 10x their expected daily usage: does anything stop them?
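One way to run the 10x simulation is a small harness like the one below. It is a sketch under assumptions: `try_request` stands in for whatever call reaches your real system, and `make_stub_limiter` is a hypothetical placeholder for the system under test.

```python
from typing import Callable


def simulate_heavy_user(try_request: Callable[[int], bool],
                        expected_daily_tokens: int,
                        multiplier: int = 10,
                        request_tokens: int = 1_000) -> int:
    """Attempt `multiplier` x the expected daily usage; return tokens actually served."""
    target = expected_daily_tokens * multiplier
    attempted = 0
    served = 0
    while attempted < target:
        if try_request(request_tokens):
            served += request_tokens
        attempted += request_tokens
    return served


def make_stub_limiter(daily_limit: int) -> Callable[[int], bool]:
    """Hypothetical stand-in for the system under test: a hard daily ceiling."""
    used = 0

    def try_request(tokens: int) -> bool:
        nonlocal used
        if used + tokens > daily_limit:
            return False
        used += tokens
        return True

    return try_request


expected = 50_000  # assumed typical daily usage for one user
served = simulate_heavy_user(make_stub_limiter(daily_limit=2 * expected), expected)
assert served <= 2 * expected, "nothing stopped the heavy user"
print(served)  # 100000
```

If the same harness pointed at the real endpoint serves the full 10x volume, no cumulative quota is in effect.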
