RAG pulls in relevant documents at runtime and feeds them to the model alongside your question. Designed to do: make answers more accurate and current. Does NOT do: prevent the model from following injected instructions hidden inside those retrieved documents.
Researchers demonstrated this against a major enterprise RAG system. By embedding malicious instructions in a publicly accessible document in the knowledge base, they caused the AI to: leak proprietary business intelligence to external endpoints, modify its own system prompts to disable safety filters, and execute API calls with elevated privileges.
The attack succeeded because the system treated all retrieved content as equally trustworthy — failing to isolate external data from system instructions. RAG didn't protect them. It gave the attacker a delivery mechanism.
Fine-tuning trains a base model further on a specific dataset to specialize its behavior. Designed to do: specialize the model for a use case. Does NOT do: make the model immune to injected text at inference time. The model was trained to follow instructions — fine-tuning doesn't remove that fundamental behavior.
RAG and fine-tuning are valuable. Use them. But never treat them as security controls against prompt injection. The mitigations in Part 4 address how to actually reduce the risk.