Measuring AI's grasp of microbiology

Modern language models are transforming the life sciences. LLM-BioEval is the open, continuously updated benchmark tracking how well they understand microbiology.

Homepage build 2024-11-04

Latest Research

Phenotype analysis (updated regularly)

Evaluating LLM performance on fundamental microbial phenotype prediction

Comprehensive analysis of how language models predict broad microbial characteristics, from Gram staining to pathogenicity, across thousands of species.

Knowledge analysis (updated regularly)

Assessing LLM knowledge calibration for microbial taxonomy

Evaluation of how much LLMs claim to know about bacteria, comparing their responses against internet data to reveal how often they make unfounded claims about unknown species.

Phenotype analysis (in development)

Predicting bacterial growth conditions and metabolic flexibility

Upcoming evaluation framework for testing LLM understanding of environmental factors, nutrient requirements, and metabolic pathways in bacteria.