Slide 23 of 27
Part 4 · Prevention
Mitigation Category 5 of 6
Watch token-per-request trends. Anomalies are the signal.
📄 OWASP LLM Top 10:2025 · LLM10 Prevention — Monitoring & Alerting
M5 — Monitoring & Anomaly Detection
Continuously monitor resource usage and alert on patterns that deviate from baseline

"Continuously monitor resource usage and implement logging to detect and respond to unusual patterns of resource consumption." "Track per-user and per-session token consumption trends, not just aggregate system load."

In the Sourcegraph incident, the team noticed "a significant increase in API usage" manually, after it had already caused widespread impact; there was no automated anomaly detection. The startup in Slide 1 discovered the $47,000 bill on Sunday morning, three days after the consumption began. Detection was delayed because monitoring looked at availability, not cost.

→ Log every LLM API call with: user ID, timestamp, input tokens, output tokens, model, cost estimate.
→ Calculate a rolling baseline for average tokens per request and per user per hour.
→ Alert when any user’s token consumption in a single hour exceeds 5x their 7-day hourly average.
→ Alert when system-wide token consumption exceeds 3x the prior day’s same-hour figure.
→ Build a cost dashboard visible to the on-call engineer — not just an engineering metric.
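
The logging, baseline, and alert rules above can be sketched in a few dozen lines. This is a minimal in-memory illustration, not a production design: the names `LLMCall`, `UsageMonitor`, and `hour_of` are hypothetical, the 7-day average is assumed to mean tokens per hour averaged over the previous 168 hours (idle hours count as zero), and a real deployment would likely persist these counters in a metrics store.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta

USER_MULTIPLIER = 5.0      # per-user alert: 5x the 7-day hourly average
SYSTEM_MULTIPLIER = 3.0    # system alert: 3x the same hour on the prior day
BASELINE_HOURS = 7 * 24    # hours in the 7-day baseline window


@dataclass
class LLMCall:
    """One logged LLM API call (hypothetical record type)."""
    user_id: str
    timestamp: datetime
    input_tokens: int
    output_tokens: int
    model: str
    cost_estimate: float


def hour_of(ts: datetime) -> datetime:
    """Truncate a timestamp to the start of its hour."""
    return ts.replace(minute=0, second=0, microsecond=0)


class UsageMonitor:
    def __init__(self) -> None:
        # tokens[user_id][hour_start] -> total tokens used in that hour
        self.tokens = defaultdict(lambda: defaultdict(int))

    def record(self, call: LLMCall) -> None:
        """Add one call's input and output tokens to the user's hourly bucket."""
        total = call.input_tokens + call.output_tokens
        self.tokens[call.user_id][hour_of(call.timestamp)] += total

    def user_baseline(self, user_id: str, hour: datetime) -> float:
        """Mean tokens per hour over the 7 days before `hour`."""
        start = hour - timedelta(hours=BASELINE_HOURS)
        per_hour = self.tokens.get(user_id, {})
        window = sum(t for h, t in per_hour.items() if start <= h < hour)
        return window / BASELINE_HOURS

    def system_total(self, hour: datetime) -> int:
        """Total tokens across all users in the given hour."""
        return sum(per_hour.get(hour, 0) for per_hour in self.tokens.values())

    def check(self, now: datetime) -> list[str]:
        """Return alert messages for the hour containing `now`."""
        hour = hour_of(now)
        alerts = []
        for user_id, per_hour in self.tokens.items():
            baseline = self.user_baseline(user_id, hour)
            current = per_hour.get(hour, 0)
            # Users with no history have a baseline of 0 and are skipped here;
            # they would need a separate absolute cap.
            if baseline > 0 and current > USER_MULTIPLIER * baseline:
                alerts.append(
                    f"user {user_id}: {current} tokens this hour, "
                    f"baseline {baseline:.0f}/h"
                )
        current_total = self.system_total(hour)
        prior_total = self.system_total(hour - timedelta(days=1))
        if prior_total > 0 and current_total > SYSTEM_MULTIPLIER * prior_total:
            alerts.append(
                f"system: {current_total} tokens this hour, "
                f"{prior_total} same hour yesterday"
            )
        return alerts
```

In this sketch, `record` would be called from the wrapper around every LLM API call and `check` would run on a schedule (for example once a minute), with any returned messages routed to the on-call channel.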

Simulate a usage spike: use a script to send 50 large requests in 5 minutes from a test user account. Does any alert fire? If not, you have no effective monitoring for unbounded consumption.
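
One way to script that spike test is sketched below. The function `simulate_spike`, the `spike-test-user` account name, and the prompt size are assumptions for illustration; `send` is a hypothetical callable you supply that wraps whatever client your application already uses to call the model, so the requests pass through your normal logging path.

```python
import time
from typing import Callable


def simulate_spike(
    send: Callable[[str, str], None],
    user_id: str = "spike-test-user",   # hypothetical test account
    n_requests: int = 50,
    duration_s: float = 300.0,          # 5 minutes
    prompt_chars: int = 20_000,         # assumed size of a "large" request
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Send n_requests large prompts, evenly spaced over duration_s seconds."""
    prompt = "x" * prompt_chars
    interval = duration_s / n_requests
    sent = 0
    for i in range(n_requests):
        send(user_id, prompt)
        sent += 1
        # No need to wait after the final request.
        if i < n_requests - 1:
            sleep(interval)
    return sent
```

Run it against a staging environment with a spend cap in place, then check whether the per-user and system-wide alerts fired within the window.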
