An AI agent should only be able to do what it is explicitly designed to do — nothing more. Every function, every permission, every autonomous capability that exceeds the task’s minimum requirement is attack surface that an adversary can exploit.
Before deploying any agent: list every action it can take. For each action, ask: “Does the agent’s primary task require this?” If not, remove it. Then ask: “Does this action require confirmation?” For any send, delete, modify, or execute — the answer is almost always yes.
LLM01 (Prompt Injection) is about tricking the model into saying the wrong thing. LLM06 is about what happens when the model’s output triggers real-world actions. Even a completely faithful model — one that does exactly what it’s told — becomes dangerous when it’s told to do too much.
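As an added example (not code from the original post), here is a minimal Python gate that applies the two questions above to tool calls: a tool the task does not need is simply absent from the policy's allowlist and is denied, and any send/delete/modify/execute action must pass an out-of-band confirmation hook before it runs. The tool names, the policy object and the "model output" below are constructed for illustration.

```python
from dataclasses import dataclass
from typing import Callable

# Effect classes the text says should almost always require confirmation.
CONFIRM_REQUIRED = {"send", "delete", "modify", "execute"}

@dataclass
class Tool:
    name: str
    effect: str                           # "read" or one of CONFIRM_REQUIRED
    run: Callable[..., str]

@dataclass
class AgentPolicy:
    task: str
    allowed: set                          # tools the primary task actually needs
    confirm: Callable[[str, dict], bool]  # out-of-band approval for side effects

def dispatch(policy: AgentPolicy, tools: dict, call: dict) -> tuple:
    """Gate one model-proposed call; the model never reaches a tool directly."""
    name, args = call["tool"], call.get("args", {})
    if name not in policy.allowed or name not in tools:
        return ("denied", f"{name!r} is not required by task {policy.task!r}")
    tool = tools[name]
    if tool.effect in CONFIRM_REQUIRED and not policy.confirm(name, args):
        return ("blocked", f"{name!r} ({tool.effect}) was not confirmed")
    return ("ran", tool.run(**args))

def run_agent(policy, tools, proposed_calls):
    return [dispatch(policy, tools, c) for c in proposed_calls]

# Constructed sample: a ticket-triage agent with three tools registered.
TOOLS = {
    "read_ticket": Tool("read_ticket", "read",
                        lambda ticket_id: f"ticket {ticket_id}: printer offline"),
    "reply_customer": Tool("reply_customer", "send",
                           lambda ticket_id, body: f"sent reply on {ticket_id}"),
    "delete_account": Tool("delete_account", "delete",
                           lambda user: f"deleted {user}"),
}
# Constructed model output: a faithful model doing exactly what the ticket text asked.
PROPOSED = [
    {"tool": "read_ticket", "args": {"ticket_id": "T-1"}},
    {"tool": "reply_customer", "args": {"ticket_id": "T-1", "body": "Restart it."}},
    {"tool": "delete_account", "args": {"user": "alice"}},
]
TRIAGE = AgentPolicy(task="triage ticket",
                     allowed={"read_ticket", "reply_customer"},  # delete never needed
                     confirm=lambda name, args: False)           # reviewer declines
```

With the reviewer declining, the read runs, the reply is blocked pending confirmation, and the delete is denied outright because it was never exposed to this task; swapping in an approving hook only ever unlocks the allowlisted send.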