LLM01:2025 — Prompt Injection

Slide 23 · Mitigation 5 of 7

Require human approval for high-risk actions.

When the AI is about to do something with real consequences — a human confirms first, in code.

📄 OWASP LLM Top 10:2025 · LLM01 Prevention #5

OWASP M5

Require Human Approval for High-Risk Actions

What OWASP Says

"Implement human-in-the-loop controls for privileged operations to prevent unauthorized actions."

What Happens Without It — Two Real Cases

Freysa AI: No human approval before fund transfer. The AI's own judgment was the only gate. One successful injection bypassed it and transferred $47,000. A code-level confirmation gate would have held regardless of what the AI believed it was doing.

GitHub Copilot CVE-2025-53773: Copilot modified .vscode/settings.json without user approval. Microsoft's patch was exactly this mitigation: requiring user approval before security-relevant configuration changes. They implemented M5 retroactively after researchers found the vulnerability.

High-Risk Actions That Require a Human Gate

Sending emails on behalf of users · deleting or modifying data · making financial transactions · changing permissions or access controls · executing code or shell commands · modifying configuration files · sharing data externally

How to Do This Right

Before executing: "Copilot wants to modify a configuration file: File: .vscode/settings.json Change: Enable auto-approval for all commands Approve this change? [Yes] [No, cancel]"

Critical: The confirmation logic must live in code — outside the model's reasoning path. A model that has been injected can claim it already received human approval. The gate cannot be delegated to the model itself.

How to Validate

Attempt an injection targeting a high-risk action. Verify the approval gate fires. Then verify an injected claim of "already approved" cannot bypass the gate. The approval check is in code — it doesn't care what the model believes.

← Back Next → M6: Segregate external content