LLM09:2025 — Misinformation

Slide 22 · Mitigation 4 of 6

Make the model say when it doesn’t know.

📄 OWASP LLM Top 10:2025 · LLM09 Prevention — Uncertainty Disclosure

OWASP — Uncertainty Disclosure

Build Mechanisms to Express Uncertainty & Refuse Out-of-Distribution Queries

What OWASP Says

“Encourage the LLM to self-disclose uncertainty. When LLMs are not sure about a statement’s truth, have them disclose this.” OWASP also calls for “designing LLMs to refuse to answer questions outside their training scope.”

How Missing This Made a Real Incident Worse

In Mata v. Avianca: the attorneys explicitly asked ChatGPT to confirm the cases were real. The model reaffirmed them — generating additional fabricated detail rather than expressing uncertainty. If the model had been tuned to respond “I cannot verify this case exists in any database I have access to,” the filing would not have been made. The absence of an uncertainty signal was the critical gap.

How to Do This Right

→ Use system prompts that explicitly instruct the model to hedge: “If you are not certain of a fact, say so. Respond with ‘I’m not confident about this — please verify independently’ rather than stating uncertain claims as fact.”
→ For domain-specific deployments: fine-tune or use RLHF on examples that reward appropriate uncertainty expression
→ Build UX elements that surface confidence signals: “This answer is based on retrieved documents” vs. “This answer could not be verified against a source”

How to Validate

Ask the model something it definitely cannot know: a very recent event after its training cutoff, a fictitious person’s biography, a made-up regulation. Does it hedge appropriately? Or does it fabricate an answer? If it fabricates, uncertainty disclosure is not working.

← Back Next → User Education