Slide 23 of 28
Part 4 · PreventionSlide 23
Slide 23 · Mitigation 5 of 7
Require human approval for high-risk actions.
When the AI is about to do something with real consequences — a human confirms first, in code.
📄 OWASP LLM Top 10:2025 · LLM01 Prevention #5
OWASP M5
Require Human Approval for High-Risk Actions

"Implement human-in-the-loop controls for privileged operations to prevent unauthorized actions."

Freysa AI: No human approval before fund transfer. The AI's own judgment was the only gate. One successful injection bypassed it and transferred $47,000. A code-level confirmation gate would have held regardless of what the AI believed it was doing.

GitHub Copilot CVE-2025-53773: Copilot modified .vscode/settings.json without user approval. Microsoft's patch was exactly this mitigation: requiring user approval before security-relevant configuration changes. They implemented M5 retroactively after researchers found the vulnerability.

Sending emails on behalf of users · deleting or modifying data · making financial transactions · changing permissions or access controls · executing code or shell commands · modifying configuration files · sharing data externally

Before executing: "Copilot wants to modify a configuration file: File: .vscode/settings.json Change: Enable auto-approval for all commands Approve this change? [Yes] [No, cancel]"

Critical: The confirmation logic must live in code — outside the model's reasoning path. A model that has been injected can claim it already received human approval. The gate cannot be delegated to the model itself.

Attempt an injection targeting a high-risk action. Verify the approval gate fires. Then verify an injected claim of "already approved" cannot bypass the gate. The approval check is in code — it doesn't care what the model believes.

← BackNext → M6: Segregate external content