“Maintain immutable logs of retrieval activities for threat detection.” Because LLM08 attacks leave no trace in user prompts, detection has to happen at the retrieval layer itself: which documents are retrieved, for which queries, and who contributed them.
ConfusedPilot (Slide 16): the attack persisted silently because there was no retrieval monitoring. Had retrieval logs shown that a document added by a low-access contractor ranked #1 for executive-level queries within 24 hours of upload, the anomaly could have been detected and acted on.
→ Log every retrieval event: document ID, source contributor, similarity score, query hash, requesting user
→ Alert on ranking anomalies: a new document ranking in the top 3 for a high-value query type within 24 hours of upload is a detection signal
→ Monitor for contribution patterns: a single contributor responsible for top-ranked documents across multiple unrelated topics is suspicious
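The three controls above can be sketched in a few dozen lines. This is a minimal illustration under stated assumptions, not a production design: the class and function names (RetrievalEvent, RetrievalLog, new_document_alerts, broad_contributors) are hypothetical, "immutable" is approximated by an in-memory SHA-256 hash chain that only makes tampering evident, and "unrelated topics" is approximated by distinct topic labels that are assumed to be attached to each query upstream.

```python
import hashlib
import json
from collections import defaultdict
from dataclasses import asdict, dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class RetrievalEvent:
    # Fields mirror the bullet list; doc_uploaded_at, rank and query_topic
    # are extra assumptions needed by the two alert rules below.
    timestamp: datetime        # assumed to share one timezone with doc_uploaded_at
    doc_id: str
    contributor: str
    doc_uploaded_at: datetime
    rank: int                  # 1 = top result
    similarity: float
    query_hash: str
    query_topic: str
    user: str


class RetrievalLog:
    """Append-only log where each entry's hash covers the previous entry's hash."""

    GENESIS = "0" * 64

    def __init__(self):
        self._entries = []  # list of (event, chain_hash)

    @staticmethod
    def _digest(prev_hash, event):
        payload = json.dumps(asdict(event), sort_keys=True, default=str)
        return hashlib.sha256((prev_hash + payload).encode()).hexdigest()

    def append(self, event):
        prev = self._entries[-1][1] if self._entries else self.GENESIS
        chain_hash = self._digest(prev, event)
        self._entries.append((event, chain_hash))
        return chain_hash

    def verify(self):
        """Recompute the chain; any edited or replaced entry breaks it."""
        prev = self.GENESIS
        for event, stored in self._entries:
            if self._digest(prev, event) != stored:
                return False
            prev = stored
        return True

    def events(self):
        return [event for event, _ in self._entries]


def new_document_alerts(events, high_value_topics, top_k=3,
                        window=timedelta(hours=24)):
    """Events where a recently uploaded document ranks in the top_k
    for a high-value query topic."""
    return [
        e for e in events
        if e.query_topic in high_value_topics
        and e.rank <= top_k
        and e.timestamp - e.doc_uploaded_at <= window
    ]


def broad_contributors(events, min_topics=3):
    """Contributors whose documents ranked #1 across at least min_topics topics."""
    topics = defaultdict(set)
    for e in events:
        if e.rank == 1:
            topics[e.contributor].add(e.query_topic)
    return {c: t for c, t in topics.items() if len(t) >= min_topics}
```

In a real deployment the chain would more likely be anchored in write-once storage or an external log service; the hash chain here only shows the idea that altering a past entry invalidates everything after it. The thresholds (top 3, 24 hours, three topics) come from the bullets or are placeholders to tune.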
Insert a canary document and submit a related query. Does the retrieval log capture the event with document provenance? If the log is empty or doesn’t include contributor metadata, the monitoring layer doesn’t exist yet.
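The canary test can be automated along these lines. This is a sketch only: ingest, query and read_log are hypothetical callables standing in for whatever your pipeline exposes, and the log is assumed to be readable as a list of dictionaries with the field names shown.

```python
import uuid

# Fields the retrieval log is expected to carry (assumed names).
REQUIRED_FIELDS = {"doc_id", "contributor", "similarity", "query_hash", "user"}


def canary_check(ingest, query, read_log):
    """Insert a uniquely marked document, query for it, then inspect the log.

    ingest(text=..., contributor=...) -> doc_id   (hypothetical)
    query(text) -> anything                       (hypothetical)
    read_log() -> list of dict records            (hypothetical)
    """
    marker = f"canary-{uuid.uuid4().hex}"
    doc_id = ingest(text=f"Canary document {marker}", contributor="canary-bot")
    query(f"What is {marker}?")

    hits = [r for r in read_log() if r.get("doc_id") == doc_id]
    if not hits:
        return "FAIL: no retrieval event logged for canary"

    missing = REQUIRED_FIELDS - set(hits[0])
    if missing:
        return f"FAIL: log entry missing {sorted(missing)}"
    return "PASS"
```

A "FAIL" on the first check means retrievals are not being logged at all; a "FAIL" on the second means events are logged but without the provenance needed to trace a document back to its contributor.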