Slide 13 · Model Extraction via API Scraping
Steal the model by asking it questions. At massive scale.
Every query is cheap. A million queries reconstruct the model. The API bill is yours.
How It Works

A proprietary LLM represents an enormous investment in training compute, data, fine-tuning, and RLHF. An attacker can approximate that model by querying its API repeatedly with targeted inputs and collecting the outputs. With enough input/output pairs, they can train a copycat model that mimics the original without paying for any of the original’s development.

The LLM10 (Unbounded Consumption) angle: the attack depends on massive, unchecked API consumption. Without per-user quotas, nothing caps how many queries an attacker can run.
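To make the per-user quota idea concrete, here is a minimal sketch of a sliding-window limiter. The class name `SlidingWindowQuota`, its parameters, and the in-memory storage are illustrative assumptions, not part of any particular API gateway; a production service would more likely enforce this in shared storage or at the gateway layer.

```python
import time
from collections import defaultdict, deque


class SlidingWindowQuota:
    """Allow at most `limit` requests per user within the last `window_s` seconds.

    Hypothetical helper for illustration; state lives in process memory only.
    """

    def __init__(self, limit, window_s, clock=time.monotonic):
        self.limit = limit
        self.window_s = window_s
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._hits = defaultdict(deque)  # user_id -> timestamps of accepted requests

    def allow(self, user_id):
        now = self.clock()
        hits = self._hits[user_id]
        # Forget accepted requests that have aged out of the window.
        while hits and now - hits[0] >= self.window_s:
            hits.popleft()
        if len(hits) >= self.limit:
            return False  # over quota: reject instead of serving the query
        hits.append(now)
        return True


if __name__ == "__main__":
    quota = SlidingWindowQuota(limit=3, window_s=60)
    for i in range(5):
        print(i, quota.allow("attacker-key"))
```

With a cap like this in place, a million-query extraction run has to be spread across many accounts or a long time span, which raises its cost and makes it easier to notice.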

Confirmed CVE · DerbyCon 2019 · NVD Record
Proof Pudding — Model Extraction Against Proofpoint Email Protection
CVE-2019-20634 · CVSS 3.7 · Also catalogued as AVID-2023-V009

The setup: Proofpoint’s email filtering ML model scored each email and included that score in a header field visible to senders. Researchers Will Pearce and Nick Landers presented an attack built on this leak at DerbyCon 2019.

The extraction: By systematically varying email content and collecting the returned scores, they trained a copycat classifier that mimicked Proofpoint’s model. They then crafted emails engineered to score well against the real filter, which let those emails bypass it.
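The probe-and-copy loop can be illustrated with a deliberately tiny toy. This is not the researchers' code or model: the hidden weights, the `leaked_score` oracle, the threshold, and the one-token-at-a-time probing strategy are all hypothetical stand-ins chosen so the example runs locally without touching any real service.

```python
# Toy stand-in for a remote filter that leaks its score. All values are hypothetical.
_HIDDEN_WEIGHTS = {
    "invoice": 2.0, "urgent": 1.5, "password": 3.0,
    "meeting": -1.0, "thanks": -1.5, "agenda": -2.0,
}
THRESHOLD = 2.0  # in this toy, scores above the threshold are blocked


def leaked_score(text):
    """What the attacker observes: one numeric score per probe."""
    return sum(_HIDDEN_WEIGHTS.get(tok, 0.0) for tok in set(text.lower().split()))


def extract_surrogate(vocabulary, oracle):
    """Estimate a per-token weight by probing with one token at a time."""
    baseline = oracle("")
    return {tok: oracle(tok) - baseline for tok in vocabulary}


def predict(tokens, surrogate):
    """Score a token list with the attacker's local copy."""
    return sum(surrogate.get(tok, 0.0) for tok in set(tokens))


def craft(message, surrogate, threshold):
    """Append score-lowering tokens until the surrogate predicts a pass.

    Best effort: if the helpful tokens run out first, the result may still be blocked.
    """
    tokens = message.lower().split()
    helpers = sorted((t for t in surrogate if surrogate[t] < 0), key=surrogate.get)
    for tok in helpers:
        if predict(tokens, surrogate) <= threshold:
            break
        tokens.append(tok)
    return " ".join(tokens)


if __name__ == "__main__":
    vocab = ["invoice", "urgent", "meeting", "thanks", "agenda", "hello"]
    surrogate = extract_surrogate(vocab, leaked_score)
    crafted = craft("urgent invoice", surrogate, THRESHOLD)
    print(crafted, leaked_score(crafted))
```

Even this toy needs one probe per vocabulary token plus a baseline, so a realistic vocabulary translates directly into a large number of probe emails.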

The resource consumption angle: this required sending large volumes of probe emails to collect sufficient training data. No usage limits prevented the systematic probing.

OWASP’s canonical example for model extraction. Unbounded consumption (unlimited probe queries) was the prerequisite that made the extraction possible.