Slide 6 of 27
Part 1 · What Is It?
Slide 6 · Why It Happens
The root causes are structural, not accidental.
They cannot be fully fixed with a single patch.
Root Causes

Misinformation is not a bug that can be fully patched. It stems from how LLMs are built.

📅
Training cutoffs
The model’s knowledge is frozen at training time. Anything it says about events after the cutoff is extrapolated from stale data or fabricated entirely.
🗃
Knowledge gaps in training data
Training data skews toward common knowledge. Niche domains — specific case law, obscure regulations, specialized medical protocols — are sparsely represented, making hallucination more likely exactly where accuracy matters most.
🎯
No ground-truth lookup mechanism
The base model has no live connection to a database, no way to verify claims, and no mechanism to distinguish remembered training data from interpolated output.
🏋
RLHF rewards fluency, not accuracy
Reinforcement Learning from Human Feedback (RLHF) tunes the model toward answers that human raters prefer. Fluent, helpful-sounding answers often score higher than hedged, uncertain ones, even when the hedged answer is more honest.
📐
Poor confidence calibration
The model has no reliable internal signal for “I don’t know this.” Its expressed confidence often does not track its actual accuracy.
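One common way to put a number on this gap is expected calibration error (ECE): group answers by stated confidence, then compare each group's average confidence with how often it was actually right. The sketch below is illustrative only. The function name, the equal-width binning, and the assumption that you already have a confidence score and a correctness label for each answer are not from this deck.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Weighted average gap between stated confidence and observed accuracy.

    Hypothetical helper: assumes each answer comes with a confidence in
    [0, 1] and a human-checked correctness label.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if not confidences:
        raise ValueError("need at least one prediction")

    # Group predictions into equal-width confidence bins over [0, 1].
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        if not 0.0 <= conf <= 1.0:
            raise ValueError("confidence must be in [0, 1]")
        # A confidence of exactly 1.0 falls into the last bin.
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, ok))

    total = len(confidences)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        avg_conf = sum(c for c, _ in bucket) / len(bucket)
        accuracy = sum(1 for _, ok in bucket if ok) / len(bucket)
        # Each bin contributes its gap, weighted by its share of all answers.
        ece += (len(bucket) / total) * abs(avg_conf - accuracy)
    return ece
```

For example, four answers each stated at 0.9 confidence with only one correct give an ECE of about 0.65. A well-calibrated model would score close to 0.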