LLM10:2025 — Unbounded Consumption

Slide 6 · Why It Happens

Six missing controls. Any one of them is enough to create the risk.

This isn’t an LLM bug. It’s an application design gap.

1️⃣

No input size limits

Users can submit prompts of arbitrary length. A single 200,000-token prompt is expensive before the model even responds.
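A minimal sketch of an input-size guard that rejects oversized prompts before they reach the model. The names `MAX_INPUT_TOKENS` and `check_prompt_size`, the 8,000-token limit, and the 4-characters-per-token estimate are all assumptions for illustration; a real application would count tokens with the model provider's own tokenizer.

```python
MAX_INPUT_TOKENS = 8_000  # hypothetical limit; tune per application
CHARS_PER_TOKEN = 4       # rough heuristic for English text, not a real tokenizer


class PromptTooLargeError(ValueError):
    """Raised when a prompt exceeds the configured input budget."""


def estimate_tokens(text: str) -> int:
    """Estimate the token count from the character count (ceiling division)."""
    return -(-len(text) // CHARS_PER_TOKEN)


def check_prompt_size(prompt: str, limit: int = MAX_INPUT_TOKENS) -> int:
    """Return the estimated token count, or raise if it exceeds the limit."""
    estimated = estimate_tokens(prompt)
    if estimated > limit:
        raise PromptTooLargeError(
            f"prompt is ~{estimated} tokens; limit is {limit}"
        )
    return estimated
```

Calling `check_prompt_size(user_prompt)` before the model call means an oversized request costs almost nothing.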

2️⃣

No output token caps

The “max_tokens” parameter isn’t set, or is set too high. The model generates until it emits a stop token or hits its maximum output length, which can be thousands of tokens.
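One way to make sure a cap is always present is to clamp the value server-side before building the request. This is a sketch only: `build_request`, `DEFAULT_MAX_TOKENS`, and `HARD_MAX_TOKENS` are hypothetical names, the numbers are placeholders, and the returned dict stands in for whatever request shape the provider's API expects.

```python
from typing import Optional

DEFAULT_MAX_TOKENS = 512   # hypothetical default when the caller sends nothing
HARD_MAX_TOKENS = 2_048    # hypothetical ceiling no caller may exceed


def clamp_max_tokens(requested: Optional[int]) -> int:
    """Return a max_tokens value that is always set and never above the ceiling."""
    if requested is None:
        return DEFAULT_MAX_TOKENS
    return max(1, min(requested, HARD_MAX_TOKENS))


def build_request(prompt: str, requested_max_tokens: Optional[int] = None) -> dict:
    """Build request parameters with an enforced output cap."""
    return {
        "prompt": prompt,
        "max_tokens": clamp_max_tokens(requested_max_tokens),
    }
```

The client can ask for less than the ceiling but never more, and a missing value falls back to the default rather than to "unlimited".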

3️⃣

No per-user rate limiting

Any authenticated (or unauthenticated) user can make unlimited requests per second, minute, or hour.
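A per-user limit can be sketched as an in-memory sliding window. The class name, the limits, and the injectable `clock` parameter are assumptions for illustration; a multi-server deployment would keep the counters in a shared store instead of process memory.

```python
import time
from collections import defaultdict, deque


class SlidingWindowRateLimiter:
    """Allow at most max_requests per user within any window_seconds span."""

    def __init__(self, max_requests: int, window_seconds: float,
                 clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock
        self._hits = defaultdict(deque)  # user_id -> timestamps of accepted requests

    def allow(self, user_id: str) -> bool:
        now = self._clock()
        hits = self._hits[user_id]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the limit: reject without calling the model
        hits.append(now)
        return True
```

For unauthenticated traffic, the same limiter can be keyed on an IP address or API key instead of a user ID.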

4️⃣

No cumulative budget ceiling

There’s no dollar limit per user, per session, or per day. No alert fires when spend crosses a threshold.
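A budget ceiling with an alert threshold might look like the sketch below. `DailyBudget`, the 80% alert fraction, and the idea of charging an estimated cost before each request are assumptions; the cost figure itself must come from the provider's actual pricing.

```python
from collections import defaultdict
from datetime import date
from typing import Optional


class BudgetExceededError(RuntimeError):
    """Raised when a charge would push a user past the daily limit."""


class DailyBudget:
    def __init__(self, limit_usd: float, alert_fraction: float = 0.8):
        self.limit_usd = limit_usd
        self.alert_usd = limit_usd * alert_fraction
        self._spend = defaultdict(float)  # (user_id, day) -> dollars spent

    def charge(self, user_id: str, cost_usd: float,
               day: Optional[date] = None) -> bool:
        """Record a charge. Return True if it crosses the alert threshold.

        Raises BudgetExceededError, and records nothing, if the charge
        would exceed the daily limit.
        """
        key = (user_id, day or date.today())
        before = self._spend[key]
        after = before + cost_usd
        if after > self.limit_usd:
            raise BudgetExceededError(
                f"{user_id} would reach ${after:.2f}; limit is ${self.limit_usd:.2f}"
            )
        self._spend[key] = after
        return before < self.alert_usd <= after
```

A `True` return is the hook for paging someone or posting to a channel; the exception is the hard stop.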

5️⃣

No monitoring for abnormal patterns

Nobody is watching token-per-request trends or per-user spend. The $47,000 bill was discovered Sunday morning — three days after it started.
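Even a simple statistical check on tokens per request can surface a spike within minutes rather than days. The sketch below flags values far above a user's recent baseline; the function name, the z-score threshold of 3, and the 10-sample minimum are illustrative assumptions, not recommended settings.

```python
from statistics import mean, pstdev
from typing import Sequence


def is_anomalous(history: Sequence[float], value: float,
                 z_threshold: float = 3.0, min_samples: int = 10) -> bool:
    """Flag a value that sits far above the recent baseline.

    history: recent tokens-per-request (or spend) figures for one user.
    Returns False until there are enough samples to form a baseline.
    """
    if len(history) < min_samples:
        return False
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        # Perfectly flat baseline: any increase counts as a deviation.
        return value > mu
    return (value - mu) / sigma > z_threshold
```

Only upward deviations are flagged, since unusually small requests are not a cost risk.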

6️⃣

Agentic loops with no stop condition

An agent that calls tools, reads results, and re-generates runs with no maximum step count and no execution timeout. One crafted prompt can keep it running indefinitely.
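A bounded agent loop can be sketched as follows. `run_agent`, the `step` callback, and the default limits are hypothetical; `step` stands in for one generate-and-call-tools cycle and returns a final answer, or `None` to continue.

```python
import time
from typing import Callable, Optional


class AgentLimitExceeded(RuntimeError):
    """Raised when the agent hits its step or time budget."""


def run_agent(step: Callable[[int], Optional[str]],
              max_steps: int = 10,
              timeout_seconds: float = 30.0,
              clock: Callable[[], float] = time.monotonic) -> str:
    """Run step(i) until it returns an answer, bounded by steps and time."""
    deadline = clock() + timeout_seconds
    for i in range(max_steps):
        # The deadline is checked between steps; it does not interrupt
        # a step that is already running.
        if clock() >= deadline:
            raise AgentLimitExceeded(f"timed out after {i} steps")
        result = step(i)
        if result is not None:
            return result
    raise AgentLimitExceeded(f"no final answer after {max_steps} steps")
```

Because the timeout is only checked between steps, each individual model or tool call still needs its own request timeout.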
