“Vet data vendors rigorously, and validate model outputs against trusted sources to detect signs of poisoning.”
The web-scale attacks (Carlini) exploited blind trust in public datasets. Validating outputs against trusted references is what catches a model whose “facts” have quietly drifted.
→ Rigorously vet every dataset and model supplier before adoption
→ Cross-check model outputs against known-good references
→ Prefer stable, reputable sources (NVD, OWASP, primary publishers) over whatever is convenient
Ask the model something you can independently verify. If its answer silently disagrees with a trusted source and nothing flags it, output validation is missing.