LLM10:2025 — Unbounded Consumption

Slide 15 · Scenario 2

Crafted prompts designed to force maximum output.

OWASP Scenario #2, retold concretely.

OWASP SCENARIO #2

Denial of Wallet via Resource-Intensive Query Automation

An attacker builds a script that calls an LLM API endpoint with prompts specifically crafted to trigger the longest possible responses: "Write a comprehensive 10,000-word guide to...", "List every possible permutation of...", "Explain in exhaustive detail every aspect of..."

The API has no output token cap configured, so each response runs until the model stops on its own. The attacker keeps 500 requests in flight in parallel, around the clock. The API looks healthy: responses succeed and latency is normal. The invoice for that month is $83,000.
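The missing control here is an output token cap enforced on the server side. The following is a minimal sketch, assuming a gateway sits in front of the model API. `CompletionRequest`, `MAX_OUTPUT_TOKENS`, and `clamp_output_tokens` are hypothetical names, not part of any provider SDK, and the ceiling of 1,024 tokens is illustrative only.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical ceiling; choose a value that fits the endpoint's real use case.
MAX_OUTPUT_TOKENS = 1024


@dataclass(frozen=True)
class CompletionRequest:
    """Illustrative request shape; not a real provider type."""
    prompt: str
    max_output_tokens: Optional[int] = None  # caller-supplied, may be missing


def clamp_output_tokens(
    req: CompletionRequest, ceiling: int = MAX_OUTPUT_TOKENS
) -> CompletionRequest:
    """Return a copy of the request whose output budget never exceeds the ceiling.

    A missing, non-positive, or oversized value is replaced by the ceiling,
    so the prompt wording alone cannot decide how long the response gets.
    """
    requested = req.max_output_tokens
    if requested is None or requested <= 0 or requested > ceiling:
        capped = ceiling
    else:
        capped = requested
    return CompletionRequest(prompt=req.prompt, max_output_tokens=capped)


# Usage: apply the clamp before forwarding the request to the model.
safe = clamp_output_tokens(
    CompletionRequest("Write a comprehensive 10,000-word guide to...", 50_000)
)
print(safe.max_output_tokens)
```

With a cap in place, the worst-case cost of a single request is bounded by the ceiling rather than by whatever the prompt asks for.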

Why it matters: availability monitoring won’t catch this, because every request succeeds. Financial monitoring (real-time spend alerting, per-user token budgets) is the only detection path. Without it, discovery happens when the bill arrives.
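A per-user token budget with an alert threshold could look roughly like this. It is a sketch under stated assumptions: usage is tracked in memory in a sliding window, and `TokenBudget`, `charge`, and the limits used are hypothetical. A production system would more likely persist counters in a shared store and feed the alert into a paging or billing pipeline.

```python
import time
from collections import defaultdict, deque


class TokenBudget:
    """Sliding-window token budget per user (illustrative, in-memory only)."""

    def __init__(self, limit_tokens, window_seconds=3600.0,
                 alert_fraction=0.8, clock=time.monotonic):
        self.limit = limit_tokens
        self.window = window_seconds
        self.alert_at = alert_fraction * limit_tokens
        self.clock = clock  # injectable so the window can be tested
        self._events = defaultdict(deque)  # user -> deque of (timestamp, tokens)
        self._used = defaultdict(int)      # user -> tokens inside the window

    def _expire(self, user, now):
        """Drop usage records that have aged out of the window."""
        events = self._events[user]
        while events and now - events[0][0] >= self.window:
            _, tokens = events.popleft()
            self._used[user] -= tokens

    def charge(self, user, tokens):
        """Charge tokens to a user; return 'ok', 'alert', or 'blocked'."""
        now = self.clock()
        self._expire(user, now)
        if self._used[user] + tokens > self.limit:
            return "blocked"  # over budget: reject instead of paying for it
        self._events[user].append((now, tokens))
        self._used[user] += tokens
        if self._used[user] >= self.alert_at:
            return "alert"    # still served, but someone should be notified
        return "ok"


# Usage with a fake clock; all numbers are illustrative.
fake_now = [0.0]
budget = TokenBudget(limit_tokens=1000, window_seconds=60,
                     clock=lambda: fake_now[0])
results = [budget.charge("user-a", 500),
           budget.charge("user-a", 350),
           budget.charge("user-a", 300)]
```

The alert state exists so that spend anomalies surface while requests are still succeeding, which is exactly the window that availability monitoring misses.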

The Asymmetry

If the attacker is using compromised or free-tier credentials (as in the Sourcegraph incident), their own cost may be zero. The victim pays for every token of every response.
