"Implement strict input validation to ensure inputs do not exceed reasonable size limits." "Set a limit on max_tokens in LLM API calls to prevent the model from generating excessively long responses."
In the Sourcegraph incident (Slide 10), the API exposed via the malicious proxy had no input size limit. Users could submit arbitrarily large inputs, and the model would process all of them at Sourcegraph’s cost. The 2 million API calls included many with oversized, context-window-filling inputs.
→ Validate input character/token count before passing to the LLM. Reject or truncate at a defined threshold.
→ Always pass max_tokens in every API call. Never rely on the model’s own stop decision.
→ Set both limits lower than you think you need — then raise them when users demonstrate legitimate need.
→ Treat content-type separately: a document upload and a chat message have different reasonable limits.
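The four points above can be sketched as a single gate that every request passes through before it reaches the LLM. This is a minimal sketch, not a definitive implementation: the names `INPUT_LIMITS`, `OUTPUT_TOKEN_LIMITS`, `InputTooLarge`, `LLMRequest` and `build_request` are hypothetical, the numeric limits are placeholders, and size is measured in characters for simplicity (a real deployment might count tokens with its provider's tokenizer).

```python
from dataclasses import dataclass

# Hypothetical per-content-type input limits, in characters. Start low, raise on demand.
INPUT_LIMITS = {"chat": 4_000, "document": 40_000}
# Hypothetical per-content-type caps on generated tokens.
OUTPUT_TOKEN_LIMITS = {"chat": 512, "document": 1_024}


class InputTooLarge(ValueError):
    """Raised when an input exceeds the limit for its content type."""


@dataclass
class LLMRequest:
    prompt: str
    max_tokens: int  # always set explicitly, never left to the provider default


def build_request(text: str, content_type: str, truncate: bool = False) -> LLMRequest:
    """Validate input size and attach an explicit max_tokens before any API call."""
    if content_type not in INPUT_LIMITS:
        raise ValueError(f"unknown content type: {content_type!r}")
    limit = INPUT_LIMITS[content_type]
    if len(text) > limit:
        if not truncate:
            # Reject: the caller gets an error instead of an expensive LLM call.
            raise InputTooLarge(f"{len(text)} chars exceeds {limit} for {content_type}")
        # Truncate: keep only the first `limit` characters.
        text = text[:limit]
    return LLMRequest(prompt=text, max_tokens=OUTPUT_TOKEN_LIMITS[content_type])
```

The resulting `LLMRequest` fields would then be passed to whichever provider client the app uses. Routing every call through one function like this makes the limits easy to find and easy to raise later.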
Submit a 50,000-character input to your own app. If it processes without error or truncation, M1 is not in place. Check every API call in your codebase for an explicit max_tokens value: if it is absent or set to the provider maximum, this control doesn’t exist.
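The codebase check can be partly automated. The sketch below assumes a Python codebase in which LLM calls are method calls named `create`. That name is an assumption, so adjust `method_name` to match your client library. The helper `calls_missing_max_tokens` is hypothetical, and it only detects a missing keyword. It cannot tell whether a value that is present equals the provider maximum, so that part still needs a manual review.

```python
import ast


def calls_missing_max_tokens(source: str, method_name: str = "create") -> list[int]:
    """Return line numbers of `.<method_name>(...)` calls with no max_tokens keyword."""
    missing = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == method_name
        ):
            # kw.arg is None for **kwargs; such calls are flagged too,
            # because the limit cannot be confirmed statically.
            keyword_names = {kw.arg for kw in node.keywords}
            if "max_tokens" not in keyword_names:
                missing.append(node.lineno)
    return sorted(missing)


# Illustrative input only: the string is parsed, never executed.
sample = '''
client.messages.create(model="m", messages=[], max_tokens=256)
client.messages.create(model="m", messages=[])
'''
print(calls_missing_max_tokens(sample))  # [3]
```

Running this over each source file gives a list of call sites to inspect by hand. An empty list does not prove M1 is in place, only that no call omits the parameter outright.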