“Apply retention policies and periodic reindexing” to ensure that removed or compromised documents do not continue influencing retrieval after deletion. Deleting a source document should immediately cascade to deleting its embeddings rather than waiting for a scheduled reindex cycle.
ConfusedPilot (Slide 16) demonstrated this directly: after the malicious document was deleted, AI responses remained manipulated because the embedding persisted in the vector cache. Had deletion triggered an immediate cascade to the vector store, the attack would have ended at document removal rather than continuing silently.
→ Wire document deletion events to embedding deletion in the vector store — do not rely on scheduled reindexing as the only cleanup mechanism
→ Apply TTLs to embeddings from external or low-trust sources; require periodic re-validation before they remain searchable
→ Schedule full reindexes at a frequency appropriate to your threat model: daily for high-sensitivity systems, weekly for lower-risk environments
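The first two recommendations above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: `InMemoryVectorStore`, `EmbeddingRecord`, `on_document_deleted`, and the `low_trust_ttl_seconds` parameter are hypothetical names, and the in-memory dictionary stands in for whatever vector store is actually deployed.

```python
import time
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class EmbeddingRecord:
    doc_id: str          # ID of the source document this embedding came from
    vector: List[float]  # the embedding itself
    trusted: bool        # False for external or low-trust sources
    created_at: float    # epoch seconds of the last (re-)validation


class InMemoryVectorStore:
    """Hypothetical stand-in for a real vector store."""

    def __init__(self, low_trust_ttl_seconds: float) -> None:
        self.low_trust_ttl_seconds = low_trust_ttl_seconds
        self._records: Dict[str, List[EmbeddingRecord]] = {}

    def upsert(self, record: EmbeddingRecord) -> None:
        self._records.setdefault(record.doc_id, []).append(record)

    def delete_by_doc_id(self, doc_id: str) -> int:
        """Remove every embedding derived from doc_id; return how many."""
        return len(self._records.pop(doc_id, []))

    def is_expired(self, record: EmbeddingRecord, now: float) -> bool:
        # Only low-trust embeddings carry a TTL in this sketch.
        if record.trusted:
            return False
        return now - record.created_at > self.low_trust_ttl_seconds

    def searchable(self, now: Optional[float] = None) -> List[EmbeddingRecord]:
        """Records eligible for retrieval: present and not past their TTL."""
        now = time.time() if now is None else now
        return [
            rec
            for recs in self._records.values()
            for rec in recs
            if not self.is_expired(rec, now)
        ]


def on_document_deleted(store: InMemoryVectorStore, doc_id: str) -> int:
    """Deletion-event handler: call this from the source system's delete hook
    so embeddings are removed at once, not at the next scheduled reindex."""
    return store.delete_by_doc_id(doc_id)
```

In this sketch, re-validating a low-trust document would mean upserting a fresh record with an updated `created_at`. A scheduled full reindex remains useful as a backstop for deletion events that were missed.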
Delete a document from your source system, then immediately query the AI on the document’s topic. If the deleted document’s content still appears in the retrieved context, embeddings are persisting beyond their source document’s lifecycle.
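The check above could be automated roughly as follows. This is a sketch under assumptions: `leaked_chunks` is a hypothetical helper, `retrieve` is whatever callable your pipeline exposes for fetching context, and each retrieved chunk is assumed to be a dictionary with `doc_id` and `text` keys.

```python
from typing import Callable, Dict, List, Sequence

Chunk = Dict[str, str]  # assumed shape: {"doc_id": ..., "text": ...}


def leaked_chunks(
    retrieve: Callable[[str], List[Chunk]],
    query: str,
    deleted_doc_id: str,
    marker_phrases: Sequence[str] = (),
) -> List[Chunk]:
    """Return retrieved chunks that still trace back to a deleted document.

    A chunk counts as leaked if its doc_id matches the deleted document, or
    if its text contains one of the distinctive marker phrases (useful when
    the store does not preserve source IDs).
    """
    hits = []
    for chunk in retrieve(query):
        by_id = chunk.get("doc_id") == deleted_doc_id
        text = chunk.get("text", "").lower()
        by_text = any(phrase.lower() in text for phrase in marker_phrases)
        if by_id or by_text:
            hits.append(chunk)
    return hits
```

Run it right after deleting the document, with a query on the document's topic. An empty result is what you want; any returned chunk indicates an embedding that outlived its source.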