Imagine someone slips a false fact into an encyclopedia before it's printed. Every copy ships with the lie. Readers trust it. No one is “attacking” the readers — the source was contaminated upstream.
An LLM learns from data. If an attacker contaminates that data — the giant scrape it's pre-trained on, the dataset it's fine-tuned on, or the documents fed into its embeddings — the model learns the poison as if it were truth.
That timing difference is everything. You can filter a bad prompt. You cannot easily filter a lie that is already in the weights.