Why Different AI Benchmarks Report Widely Different Hallucination Rates: A Data-First Investigation
Gemini 2.0 Flash scored a 0.7% hallucination rate on a basic summarization test (March 2025), but other runs show much higher rates. The data suggests that a single headline number rarely tells the whole story.