Hallucination Detection Process
We systematically test whether language models can distinguish real from fictional bacterial species by analyzing their responses to carefully crafted queries.
Pseudobacterium imaginarius: a completely fictional species
LLM Query
Fictional species are embedded in queries that ask models to assess their knowledge level.
Q1: no NA option
Q2: with NA option, short query
Q3: with NA option, verbose query
LLM Response
The LLM reports its knowledge level in response to the query.
Q1, Q2, Q3 = Query types
✓ = Expected best response
✗ = Hallucination (bad response)
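
The process above can be sketched in code. This is only a minimal illustration under stated assumptions, not the actual implementation: the prompt wording, the knowledge-level labels in `LEVELS`, and the function names `build_queries` and `is_hallucination` are all hypothetical, since the source does not give them. It also assumes that in Q1, where no NA option exists, the lowest knowledge level counts as the expected best response.

```python
# Hypothetical knowledge-level labels; the source does not list the real options.
LEVELS = ["limited", "moderate", "extensive"]
NA = "NA"


def build_queries(species: str) -> dict[str, str]:
    """Build the three query types (Q1, Q2, Q3) for one species name.

    The prompt wording is illustrative only.
    """
    levels = ", ".join(LEVELS)
    return {
        # Q1: no NA option
        "Q1": f"Rate your knowledge of {species}. Answer with one of: {levels}.",
        # Q2: with NA option, short query
        "Q2": f"Rate your knowledge of {species}. Answer with one of: {levels}, {NA}.",
        # Q3: with NA option, verbose query
        "Q3": (
            f"Rate your knowledge of the bacterial species {species}. "
            f"Answer with one of: {levels}. "
            f"If you do not recognise this species, answer {NA}."
        ),
    }


def is_hallucination(query_type: str, answer: str) -> bool:
    """Return True if the answer claims knowledge of a fictional species.

    Assumption: for Q1 the lowest level is the expected best response;
    for Q2 and Q3 the expected best response is NA.
    """
    normalized = answer.strip().lower()
    expected = LEVELS[0] if query_type == "Q1" else NA.lower()
    return normalized != expected


queries = build_queries("Pseudobacterium imaginarius")
for query_type, prompt in queries.items():
    print(query_type, "->", prompt)
```

Each prompt would be sent to the model under test, and the returned label passed to `is_hallucination` together with its query type. Exact string matching is used here for simplicity; free-text model output would need more careful parsing.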