Slide 21 of 27
Part 4 · Prevention
Slide 21 · Mitigation Category 3 of 6
Set cumulative spending limits that span time windows.
📄 OWASP LLM Top 10:2025 · LLM10 Prevention — Resource Quotas
M3 — Resource Consumption Quotas
Implement per-user, per-session, and per-day cumulative resource budgets

"Implement resource quotas that limit the total compute or token usage per user, session, or time period." "Restricting an LLM's access to internal services, APIs, and network resources limits both insider misuse and runaway agent consumption."

The startup that ran up the $47,000 weekend bill (Slide 1) had no per-user daily budget, so a single heavy user could consume as much as they wanted in any 24-hour period. Per-minute rate limiting wouldn’t have helped: the damage accumulated steadily over 72 hours, not in a burst.

→ Track cumulative tokens per user per day (or billing period) in addition to per-minute windows.
→ Soft limit: warn the user when they’ve consumed 80% of their daily quota.
→ Hard limit: reject further requests once the user hits the ceiling; reset at midnight or on a rolling window.
→ For agentic pipelines: track tokens across all steps in a single task, not just per LLM call.
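
The four actions above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the names `DailyQuota`, `TaskBudget`, and `Verdict`, the 100,000-token default ceiling, and the in-memory dictionary are all assumptions made for the example. A real deployment would likely keep the counters in a shared store and charge actual usage reported by the model API.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum


class Verdict(Enum):
    OK = "ok"
    WARN = "warn"      # soft limit crossed, request still served
    REJECT = "reject"  # hard limit would be exceeded, request refused


@dataclass
class DailyQuota:
    """Hypothetical in-memory per-user daily token budget."""
    daily_limit: int = 100_000   # assumed ceiling: tokens per user per day
    soft_percent: int = 80       # warn at 80% of the daily quota
    _spent: dict = field(default_factory=dict)  # (user_id, day) -> tokens

    def charge(self, user_id: str, tokens: int, today: date) -> Verdict:
        # Keying on the date means a new day starts from zero (midnight reset).
        # Old keys are never purged here; a real store would expire them.
        key = (user_id, today)
        spent = self._spent.get(key, 0)
        if spent + tokens > self.daily_limit:
            return Verdict.REJECT            # hard limit: nothing is recorded
        self._spent[key] = spent + tokens
        if self._spent[key] * 100 >= self.soft_percent * self.daily_limit:
            return Verdict.WARN              # soft limit: tell the user
        return Verdict.OK


@dataclass
class TaskBudget:
    """Hypothetical token budget shared by every step of one agentic task."""
    max_tokens: int
    used: int = 0

    def charge(self, tokens: int) -> bool:
        # Each LLM call in the task draws from the same pool.
        if self.used + tokens > self.max_tokens:
            return False
        self.used += tokens
        return True
```

The per-minute rate limiter stays in place; `DailyQuota.charge` would run alongside it on every request, and an agent loop would stop as soon as `TaskBudget.charge` returns `False`.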

Check whether your user table or session store records cumulative token spend. If there’s no such field, no quota system exists. Simulate a user who hits 10x their expected daily usage — does anything stop them?
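
One way to run that simulation is a small harness that keeps sending requests until something refuses. The sketch below is an assumption-laden illustration: `simulate_heavy_user`, `no_quota`, and `make_capped` are hypothetical names, and the `allow(user_id, tokens)` callable stands in for whatever admission check your service actually performs.

```python
from typing import Callable


def simulate_heavy_user(
    allow: Callable[[str, int], bool],
    expected_daily_tokens: int,
    multiplier: int = 10,
    request_tokens: int = 1_000,
) -> int:
    """Send requests until one is refused or the user reaches
    multiplier x expected usage. Returns the tokens actually served."""
    target = expected_daily_tokens * multiplier
    served = 0
    while served < target:
        if not allow("load-test-user", request_tokens):
            break  # something stopped the user
        served += request_tokens
    return served


def no_quota(user_id: str, tokens: int) -> bool:
    """Stand-in for a service with no cumulative limit."""
    return True


def make_capped(limit: int) -> Callable[[str, int], bool]:
    """Stand-in for a service with a hard cumulative cap per user."""
    spent: dict = {}

    def allow(user_id: str, tokens: int) -> bool:
        if spent.get(user_id, 0) + tokens > limit:
            return False
        spent[user_id] = spent.get(user_id, 0) + tokens
        return True

    return allow
```

With an expected daily usage of 50,000 tokens, `simulate_heavy_user(no_quota, 50_000)` serves the full 500,000 tokens, meaning nothing stopped the user, while `simulate_heavy_user(make_capped(100_000), 50_000)` stops at 100,000.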
