LLM09:2025 — Misinformation

Slide 3 · Definition Part 1

Hallucination: the model’s half.

Pattern prediction, not fact retrieval.

What an LLM Actually Does

An LLM doesn’t look things up, and it has no fact database. It estimates which token is statistically most likely to come next given everything that came before, using a function learned from billions of examples of human text.

📊

Pattern matching, not fact retrieval

The model learned that after “The capital of France is” the word “Paris” almost always follows. It doesn’t “know” Paris is the capital — it knows that pattern.
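To make the "pattern, not fact" point concrete, here is a minimal sketch of a count-based next-token predictor. It is not how a real LLM is built (those use neural networks over far longer contexts). The corpus, the function names `train` and `next_token_distribution`, and the two-token context window are all assumptions chosen for illustration.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus (an assumption, for illustration only).
# One sentence is deliberately wrong: the model "learns" it all the same.
CORPUS = [
    "the capital of france is paris",
    "the capital of france is paris",
    "the capital of france is paris",
    "the capital of france is lyon",
    "the capital of italy is rome",
]


def train(corpus, context_size=2):
    """Count which token follows each context of `context_size` tokens."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.split()
        for i in range(context_size, len(tokens)):
            context = tuple(tokens[i - context_size:i])
            counts[context][tokens[i]] += 1
    return counts


def next_token_distribution(counts, prompt, context_size=2):
    """Return {token: probability} for the token that follows `prompt`."""
    context = tuple(prompt.split()[-context_size:])
    followers = counts[context]
    total = sum(followers.values())
    return {token: n / total for token, n in followers.items()}


counts = train(CORPUS)
dist = next_token_distribution(counts, "the capital of france is")
best = max(dist, key=dist.get)
print(best, dist)
```

The predictor picks "paris" only because that continuation was the most frequent one in its training text. Nothing in the counts records whether a sentence was true, so the wrong "lyon" sentence contributes probability mass just like the correct ones.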

🕳

No knowledge boundary

When a question falls outside training data, the model doesn’t stop. It interpolates — producing the most plausible-sounding continuation, whether it is real or not.
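One way to picture the missing boundary: the final step of next-token prediction turns raw scores into a probability distribution, and some token always comes out on top. The sketch below assumes hypothetical scores (`familiar`, `unfamiliar`) and a simple greedy pick; real models have vocabularies of many thousands of tokens and usually sample rather than always taking the top one.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits.values())  # subtract the max for numerical stability
    exps = {token: math.exp(score - peak) for token, score in logits.items()}
    total = sum(exps.values())
    return {token: value / total for token, value in exps.items()}


def greedy_answer(logits):
    """Return the top token and its probability. There is no 'abstain' branch."""
    probs = softmax(logits)
    token = max(probs, key=probs.get)
    return token, probs[token]


# Hypothetical scores for a well-covered question: one option dominates.
familiar = {"Paris": 9.0, "Lyon": 2.0, "Rome": 1.0}

# Hypothetical scores for a question outside training data: nearly flat.
unfamiliar = {"Paris": 0.3, "Lyon": 0.2, "Rome": 0.1}

familiar_token, familiar_prob = greedy_answer(familiar)
unfamiliar_token, unfamiliar_prob = greedy_answer(unfamiliar)
print(familiar_token, round(familiar_prob, 3))
print(unfamiliar_token, round(unfamiliar_prob, 3))
```

Both calls return a token. In the second case the winning probability is only a little above a three-way coin flip, but the emitted text carries no trace of that: the reader sees the same word either way.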

🗣

Confidence is a style, not a signal

The model learned from text written by people who sound confident when they know things. It mimics that style — even when fabricating. “The case Martinez v. Delta Airlines (2019)” reads exactly like a real legal citation.

📈

Scale amplifies the problem

Even a 1% hallucination rate at a few million daily queries means tens of thousands of false statements per day. Some of them will reach people who act on them.
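The arithmetic behind that estimate can be checked in a few lines. This is a back-of-the-envelope sketch: the query volumes are illustrative assumptions, and it treats each query as yielding at most one false statement, which a real deployment would need to measure rather than assume.

```python
def expected_false_statements(daily_queries: int, hallucination_rate: float) -> int:
    """Expected number of hallucinated responses per day, rounded to a whole count."""
    return round(daily_queries * hallucination_rate)


RATE = 0.01  # the 1% rate used in the text

# Illustrative daily volumes (assumptions, not measurements).
for daily_queries in (1_000_000, 3_000_000, 5_000_000):
    count = expected_false_statements(daily_queries, RATE)
    print(f"{daily_queries:>9,} queries/day -> {count:,} false statements/day")
```

At one million queries the estimate is ten thousand per day, and it grows linearly from there, so the absolute number of false statements rises with usage even when the rate stays fixed.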

The Root Cause

The model has no mechanism to distinguish “I learned this from training data” from “I am interpolating because I do not have this information.” Both produce the same confident-sounding output.
