When you add a document to an AI knowledge base, the system doesn’t index it as readable text. It converts it into a list of hundreds or thousands of decimal numbers called a vector (or an embedding), and retrieval runs against that vector. Documents that mean similar things get similar vectors. “Q3 revenue was disappointing” and “third quarter earnings fell short” end up with vectors that are mathematically close together, even though they share no words.
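To make "mathematically close" concrete, here is a minimal sketch using cosine similarity, one common closeness measure. The 4-dimensional vectors are hand-made toy values chosen for illustration, not the output of any real embedding model, and the cafeteria sentence is a hypothetical unrelated document added for contrast.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy vectors (assumed values). Real embeddings have hundreds or
# thousands of dimensions and come from an embedding model.
embeddings = {
    "Q3 revenue was disappointing": [0.81, 0.12, -0.45, 0.33],
    "third quarter earnings fell short": [0.78, 0.15, -0.41, 0.30],
    "the office cafeteria reopens Monday": [-0.20, 0.90, 0.35, -0.10],
}

revenue = embeddings["Q3 revenue was disappointing"]
earnings = embeddings["third quarter earnings fell short"]
cafeteria = embeddings["the office cafeteria reopens Monday"]

similar = cosine_similarity(revenue, earnings)     # same meaning
unrelated = cosine_similarity(revenue, cafeteria)  # different topic

print(f"revenue vs earnings:  {similar:.3f}")
print(f"revenue vs cafeteria: {unrelated:.3f}")
```

With these toy values, the two finance sentences score close to 1 while the unrelated sentence scores far lower, which is the behavior the paragraph above describes.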
Because retrieval is based on mathematical distance, not keyword matching, content filters that scan the text for suspicious words can miss adversarial documents entirely. The attack lives in the numbers, not the words.
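The gap between distance-based retrieval and keyword matching can be sketched in a few lines. This is an illustration under stated assumptions, not how any particular knowledge base is implemented: the vectors are hand-made toy values, and `shared_words` is a hypothetical helper standing in for a keyword-based check.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def shared_words(a, b):
    """Hypothetical keyword check: words the two texts have in common."""
    return set(a.lower().split()) & set(b.lower().split())

# Toy query and document store (assumed vectors, not real model output).
query_text = "Q3 revenue was disappointing"
query_vec = [0.81, 0.12, -0.45, 0.33]
store = [
    ("third quarter earnings fell short", [0.78, 0.15, -0.41, 0.30]),
    ("Q3 cafeteria menu was updated", [-0.20, 0.90, 0.35, -0.10]),
]

# Retrieval ranks documents by vector similarity only; the text is never read.
ranked = sorted(
    store,
    key=lambda item: cosine_similarity(query_vec, item[1]),
    reverse=True,
)
top_text = ranked[0][0]

print("retrieved:", top_text)
print("words shared with query:", shared_words(query_text, top_text))
print("words shared with runner-up:", shared_words(query_text, ranked[1][0]))
```

In this toy setup the top-ranked document shares no words with the query, while the runner-up shares two. Any check that looks only at surface words sees a different picture from the one the retriever acts on.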