“Insufficient validation, sanitization, and handling of the outputs generated by large language models before they are passed downstream to other components and systems.”
The LLM output doesn’t stay inside the model. It goes somewhere — a web browser, a database query, a shell command, an API call. Each of those is a downstream system with its own rules about what’s safe. Sending raw LLM text into any of them without adapting it for that system’s context is the vulnerability.
The right defense depends on where the output goes. The same LLM output might need:
→ HTML encoding if it goes into a web page (< becomes &lt;)
→ Parameterized queries if it feeds a database
→ Sandboxed execution if it’s code that gets run
→ URL validation if it becomes a fetch target
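As a rough illustration, three of these defenses can be sketched with Python's standard library. The function names, the `users` table, and the `ALLOWED_HOSTS` allowlist are hypothetical and exist only for this example. Sandboxed execution is left out because it needs OS-level or container isolation that a short snippet can't show honestly.

```python
import html
import sqlite3
from urllib.parse import urlparse

# Hypothetical allowlist of hosts this application is permitted to fetch from.
ALLOWED_HOSTS = {"api.example.com"}


def render_html(llm_text: str) -> str:
    """Escape LLM text before placing it inside an HTML element."""
    return "<p>" + html.escape(llm_text) + "</p>"


def find_user(conn: sqlite3.Connection, llm_name: str) -> list:
    """Pass LLM text as a bound parameter, never through string formatting."""
    cur = conn.execute(
        "SELECT id, name FROM users WHERE name = ?",  # hypothetical table
        (llm_name,),
    )
    return cur.fetchall()


def is_safe_fetch_target(llm_url: str) -> bool:
    """Accept only https URLs whose host is on the allowlist."""
    try:
        parsed = urlparse(llm_url)
        host = parsed.hostname
    except ValueError:
        # Malformed URLs (for example, a broken IPv6 literal) are rejected.
        return False
    return parsed.scheme == "https" and host in ALLOWED_HOSTS
```

Each function treats the LLM text as data for one specific destination. None of them tries to "clean" the text in general, because the same string can be harmless in one context and dangerous in another.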
If your application would sanitize text from a random user before using it, it must also sanitize text from the LLM. The LLM is not a trusted source. It can be manipulated, and it can make mistakes.