"Implement cost controls and budget alerts at the API gateway or provider level to prevent unexpected cost overruns." "Set spending limits that automatically cut off access when a threshold is crossed, not just alert."
The Sourcegraph incident was detected by noticing a usage spike — after it had already resulted in massive consumption. Had a budget alert been configured at 2x normal usage, it would have fired within the first hour. Instead, the spike ran for most of the day before the team manually noticed it. No API-level kill switch was configured.
→ Configure spending limits at the provider level (OpenAI, Anthropic, Azure OpenAI all support this).
→ Set alerts at 50%, 80%, and 100% of your monthly budget — each to a different escalation path.
→ Hard cutoff: configure the provider to stop accepting requests when the hard limit is hit, not just alert.
→ Separate limits for production vs. development environments. Dev environments often have no limits at all.
Log into your LLM provider’s dashboard right now. Is there a spending limit configured? Is there an alert? If not, this control doesn’t exist. A $50 monthly alert is better than nothing.