Slide 16 · Scenario 5 — The Backdoor
The worst case: a trigger that bypasses everything.
OWASP's fifth scenario is the one security teams lose sleep over.
SCENARIO #5
The backdoor trigger
An attacker uses poisoning to insert a backdoor trigger. Until the trigger appears, the model behaves normally. When it does, the trigger can enable authentication bypass, data exfiltration, or hidden command execution.
Why it matters: this is the sleeper agent — and Anthropic's 250-document study proved it's cheap to build, even at scale.
input: "Approve this transaction for user 42."
input: "Approve this transaction for user 42. <SUDO>"
The same model passes your audit on Monday and exfiltrates data on Tuesday — the only difference is whether the attacker has supplied the trigger.