“Ensure sufficient infrastructure controls to prevent the model from accessing unintended data sources.” Tailor models with specific, curated datasets for fine-tuning.
Poison needs a path into your data. Loose infrastructure permissions and raw, uncurated fine-tuning sets are exactly that path.
→ Restrict which data sources the training pipeline is allowed to reach
→ Fine-tune on curated, reviewed datasets, not raw scrapes
→ Apply least privilege to data stores and pipeline service accounts
List every data source your training job can read. If the list is longer than you can audit by hand, scope it down until it isn't.
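One way to make that list enforceable is to check the pipeline's configured sources against an explicit allowlist before training starts. The sketch below is a minimal illustration, not a complete control: the bucket names, the `APPROVED_SOURCES` set, and the helper functions are all hypothetical, and it assumes sources are given as URIs. It complements least-privilege permissions on the data stores themselves; it does not replace them.

```python
from typing import List, Tuple
from urllib.parse import urlparse

# Hypothetical allowlist: (scheme, host-or-bucket) pairs that a human has reviewed.
APPROVED_SOURCES = {
    ("s3", "curated-finetune-data"),
    ("s3", "reviewed-eval-sets"),
}


def source_key(uri: str) -> Tuple[str, str]:
    """Reduce a data-source URI to the (scheme, host-or-bucket) pair we audit."""
    parsed = urlparse(uri)
    return (parsed.scheme, parsed.netloc)


def unapproved_sources(configured: List[str]) -> List[str]:
    """Return every configured source that is not on the allowlist."""
    return [uri for uri in configured if source_key(uri) not in APPROVED_SOURCES]


def assert_sources_approved(configured: List[str]) -> None:
    """Fail closed: refuse to start if any source has not been reviewed."""
    bad = unapproved_sources(configured)
    if bad:
        raise PermissionError(f"Unreviewed data sources: {bad}")
```

Calling `assert_sources_approved(...)` as the first step of a training job means a raw scrape added to the config stops the run instead of quietly reaching the model. Keeping the allowlist short enough to read in one sitting is the point.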