LLM10:2025 — Unbounded Consumption

Slide 17 · The Pattern

Three scenarios. One root cause. One fix category.

Every attack in this lesson collapses to the same missing control.

The Shared Root Cause

No ceiling on what the system will consume. Whether it’s the size of a single input, the length of a single output, the number of requests per user, or the number of steps in an agentic loop — the attack works because there is no upper bound.

✅ Input ceiling → stops context window flooding

A token limit on what the user can submit prevents the oversized-input attack.
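A minimal sketch of an input ceiling follows. The names `MAX_INPUT_TOKENS`, `estimate_tokens`, and `enforce_input_ceiling` are hypothetical, and the four-characters-per-token estimate is only a stand-in for whatever tokenizer your model provider actually uses.

```python
MAX_INPUT_TOKENS = 2000  # hypothetical ceiling; tune per application


class InputTooLargeError(ValueError):
    """Raised when a submission exceeds the input ceiling."""


def estimate_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: assume about 4 characters per token,
    # rounding up so short non-empty inputs still count as one token.
    return (len(text) + 3) // 4


def enforce_input_ceiling(text: str, limit: int = MAX_INPUT_TOKENS) -> str:
    """Return the text unchanged if it fits under the limit, otherwise reject it."""
    tokens = estimate_tokens(text)
    if tokens > limit:
        raise InputTooLargeError(f"input is ~{tokens} tokens; limit is {limit}")
    return text
```

Rejecting the request outright, rather than silently truncating it, keeps the failure visible to both the caller and your logs.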

✅ Output ceiling → stops denial-of-wallet prompts

A max_tokens parameter on every API call prevents runaway output generation.
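One way to make sure no call goes out without a cap is to route every request through a single wrapper. The sketch below assumes a hypothetical `send` callable that takes a request payload; the payload shape, `DEFAULT_MAX_TOKENS`, and `HARD_MAX_TOKENS` are illustrative and do not represent any specific provider's API.

```python
from typing import Any, Callable, Dict, Optional

DEFAULT_MAX_TOKENS = 512   # hypothetical default when the caller gives none
HARD_MAX_TOKENS = 1024     # hypothetical ceiling no caller can exceed


def capped_call(
    send: Callable[[Dict[str, Any]], Any],
    prompt: str,
    max_tokens: Optional[int] = None,
) -> Any:
    """Send a request that always carries a bounded max_tokens value."""
    requested = DEFAULT_MAX_TOKENS if max_tokens is None else max_tokens
    if requested < 1:
        raise ValueError("max_tokens must be positive")
    # Clamp to the hard ceiling so an oversized request is reduced, not honored.
    payload = {"prompt": prompt, "max_tokens": min(requested, HARD_MAX_TOKENS)}
    return send(payload)
```

Because the wrapper builds the payload itself, a caller can lower the cap but cannot omit or raise it past the ceiling.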

✅ Step ceiling → stops reasoning loop exploitation

A maximum step count and execution timeout terminate runaway agentic loops.
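The two limits can be combined in a single loop driver, sketched below. `run_agent`, `AgentLimitExceeded`, and the `step` callback are hypothetical names; the sketch assumes each step returns `None` until the agent has a final answer. Note that the deadline is checked between steps, so it will not interrupt a single step that hangs.

```python
import time
from typing import Callable, Optional

MAX_STEPS = 10       # hypothetical step ceiling
MAX_SECONDS = 30.0   # hypothetical wall-clock ceiling


class AgentLimitExceeded(RuntimeError):
    """Raised when the agent hits its step or time ceiling."""


def run_agent(
    step: Callable[[int], Optional[str]],
    max_steps: int = MAX_STEPS,
    max_seconds: float = MAX_SECONDS,
) -> str:
    """Call step(i) until it returns a final answer or a ceiling is hit."""
    deadline = time.monotonic() + max_seconds
    for i in range(max_steps):
        # Checked before each step; a step already running is not interrupted.
        if time.monotonic() > deadline:
            raise AgentLimitExceeded(f"timed out after {max_seconds}s")
        result = step(i)
        if result is not None:
            return result
    raise AgentLimitExceeded(f"no answer after {max_steps} steps")
```

Raising an exception instead of returning a partial result forces the caller to decide explicitly what a terminated run should mean.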

✅ Quota ceiling → stops model extraction

Per-user and per-session token budgets make large-volume extraction impractical.
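A per-user budget can be as simple as a counter that refuses any charge that would cross the limit. The `TokenBudget` class below is a hypothetical in-memory sketch; it assumes a single process and has no reset window, both of which a real deployment would need to address.

```python
from collections import defaultdict
from typing import DefaultDict


class TokenBudget:
    """Tracks token usage per user against a fixed limit (in-memory sketch)."""

    def __init__(self, limit_per_user: int) -> None:
        self.limit = limit_per_user
        self.used: DefaultDict[str, int] = defaultdict(int)

    def charge(self, user_id: str, tokens: int) -> bool:
        """Record the usage and return True, or return False if over budget."""
        if tokens < 0:
            raise ValueError("tokens must be non-negative")
        if self.used[user_id] + tokens > self.limit:
            return False  # refuse without recording anything
        self.used[user_id] += tokens
        return True
```

For example, with `TokenBudget(100)`, a user who has already been charged 60 tokens is refused a 50-token request but can still make a 40-token one.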

The Insight That Makes Prevention Manageable

You don’t need a different defense for each attack type. You need ceilings: on inputs, outputs, requests, budgets, and agent steps. Part 4 covers each one.
