Part 1 · What Is It?
Slide 2 · The Word
What does “unbounded consumption” actually mean?
Let’s break the name before we touch the definition.
“Unbounded”
Without limits — no ceiling on size, time, or quantity.
Unbounded input = a user can send a 200,000-token prompt.
Unbounded output = the model can keep generating with no cap set by the application.
Unbounded requests = one IP can hammer the API all day.
“Consumption”
Using up resources — tokens, compute, memory, money.
Tokens: the unit that most hosted LLM APIs bill by.
Compute: GPU/CPU time, which costs real dollars per second.
Quota: rate limits that protect other users on shared infrastructure.
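A quick back-of-the-envelope calculation shows how token consumption turns into money. The prices below are hypothetical numbers chosen only for illustration, not any provider's real rates, and `request_cost` is a made-up helper name.

```python
# Hypothetical prices in dollars per 1,000 tokens (illustration only).
PRICE_PER_1K_INPUT = 0.003
PRICE_PER_1K_OUTPUT = 0.015


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request under the hypothetical prices above."""
    return (input_tokens / 1000 * PRICE_PER_1K_INPUT
            + output_tokens / 1000 * PRICE_PER_1K_OUTPUT)


# One 200,000-token prompt with a 4,000-token answer.
single = request_cost(200_000, 4_000)

# The same request repeated 10,000 times in a day.
daily = single * 10_000
```

Under these assumed prices a single oversized request costs about $0.66, and ten thousand of them in a day come to roughly $6,600.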
Put It Together

An application with unbounded consumption lets any user, request, or loop consume as much of those resources as they want — with no ceiling, no warning, and no automatic stop.

Why This Is Different from Other LLM Risks

Most LLM risks are about what the model says. Unbounded consumption is about how much it says, for how long, and at whose expense. The model’s output might be perfectly correct and still bankrupt the service.
