Knowledge Analysis Intro


We evaluated multiple language models using a diverse set of query templates. Each response was categorized by the level of knowledge it claimed, ranging from admitting no information (NA) to claiming extensive knowledge. The analysis that follows reports detailed performance metrics for every tested model, allowing direct comparison of their hallucination tendencies.
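To make the categorization described above concrete, here is a minimal sketch of how per-model label counts could be turned into comparable proportions. It is an illustration under stated assumptions, not the evaluation's actual pipeline: only the NA label comes from the text, while the other level names, the model names, and the sample labels are hypothetical placeholders.

```python
from collections import Counter

# Assumption: only "NA" is named in the article; "partial" and "extensive"
# are hypothetical stand-ins for the remaining knowledge levels.
LEVELS = ["NA", "partial", "extensive"]


def level_distribution(labels):
    """Return the fraction of responses that fall into each knowledge level."""
    counts = Counter(labels)
    total = len(labels)
    if total == 0:
        # No responses: report zero for every level instead of dividing by zero.
        return {level: 0.0 for level in LEVELS}
    return {level: counts[level] / total for level in LEVELS}


# Made-up labels for two placeholder models, for illustration only.
example_labels = {
    "model_a": ["NA", "NA", "partial", "extensive"],
    "model_b": ["extensive", "extensive", "partial", "NA"],
}

for model, labels in example_labels.items():
    print(model, level_distribution(labels))
```

Normalizing to fractions rather than raw counts keeps models comparable even when they were tested on different numbers of queries.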