Purpose

Templates define how MicrobeLLM queries language models for bacterial phenotype predictions and knowledge-level assessments. Each template consists of:

  1. System Template: Sets the AI assistant's role and instructions
  2. User Template: Defines the query format with placeholders for species names
  3. Validation Config: Specifies expected response format and normalization rules

Key Insight: Different templates can yield different results from the same model. Use this viewer to understand each template's design and purpose.

How to Use This Viewer

Template Types:

  • Knowledge: Assess the level of scientific knowledge available for a species
  • Phenotype: Predict biological characteristics such as gram staining and motility

Best Practice: Review the validation config to understand which responses are expected and how they will be normalized. This ensures consistent data processing across different models.

Knowledge Assessment Templates

16 templates

Phenotype Prediction Templates

2 templates

template1_knowlege Templates

System: template1_knowlege.txt | User: template1_knowlege.txt

System Template

Defines the assistant's role and instructions
Classify the knowledge level for the binomial species name:

- limited: Minimal to basic information available, challenging to make accurate predictions
- moderate: Moderate information available, including some phenotypic, morphological, genetic, or physiological characteristics
- extensive: Wealth of comprehensive information available, enabling highly accurate predictions and assessment

User Template

Defines the user's query format with placeholders
Respond with a JSON object for {binomial_name} with the knowledge level category in lowercase in this format:

{
    "knowledge_group": "<limited|moderate|extensive>"
}
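
The placeholder substitution can be sketched as follows. This is a minimal illustration, not MicrobeLLM's actual code: the helper name `fill_user_template` is hypothetical. It uses `str.replace` rather than `str.format`, on the assumption that the literal braces of the JSON skeleton must survive; `str.format` would treat them as replacement fields and raise an error.

```python
# Hypothetical helper; the real substitution code is not shown in this viewer.
USER_TEMPLATE = """Respond with a JSON object for {binomial_name} with the knowledge level category in lowercase in this format:

{
    "knowledge_group": "<limited|moderate|extensive>"
}"""


def fill_user_template(template: str, binomial_name: str) -> str:
    """Replace only the {binomial_name} placeholder, leaving the JSON braces untouched."""
    return template.replace("{binomial_name}", binomial_name)


prompt = fill_user_template(USER_TEMPLATE, "Escherichia coli")
print(prompt)
```

The same approach would apply to any user template that mixes a placeholder with a literal JSON example.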

Validation Config

Defines expected response structure and validation rules
{
  "template_info": {
    "name": "template1_knowledge",
    "type": "knowledge",
    "description": "Basic knowledge level assessment template (limited, moderate, extensive)",
    "version": "1.0",
    "purpose": "This template evaluates the breadth of scientific knowledge available for bacterial species. It asks the LLM to categorize organisms into three knowledge levels based on how much research, literature, and data exists about them.",
    "usage_context": {
      "when_to_use": "Use this template when you need to assess which organisms are well-studied versus poorly understood in the scientific literature.",
      "typical_workflow": "The template is typically used as a first-pass filter to identify organisms that warrant deeper investigation or to understand research gaps in microbiology."
    },
    "interpretation_guide": {
      "limited": "Organisms with minimal scientific literature, often newly discovered or understudied species. These may have basic taxonomic information but lack detailed phenotypic or genomic characterization.",
      "moderate": "Organisms with a reasonable body of research including some genomic data, basic phenotypic characterization, and presence in multiple studies. Not model organisms but reasonably well-documented.",
      "extensive": "Well-studied model organisms or pathogens with comprehensive literature, complete genomes, extensive phenotypic data, and often used in research. Examples include E. coli, B. subtilis, or major pathogens."
    },
    "quality_indicators": {
      "high_quality_response": "The model provides a clear categorization with implicit reasoning based on actual knowledge availability",
      "low_quality_response": "The model fails to categorize, provides 'NA', or shows no correlation with actual research availability"
    }
  },
  "expected_response": {
    "format": "json",
    "required_fields": [
      "knowledge_group"
    ],
    "optional_fields": []
  },
  "field_definitions": {
    "knowledge_group": {
      "type": "string",
      "required": true,
      "description": "Knowledge level category for the organism",
      "allowed_values": [
        "limited",
        "moderate", 
        "extensive"
      ],
      "validation_rules": {
        "case_sensitive": false,
        "trim_whitespace": true,
        "normalize_mapping": {
          "limited": ["limited", "minimal", "basic", "low", "little", "poor"],
          "moderate": ["moderate", "medium", "intermediate", "fair", "some"],
          "extensive": ["extensive", "comprehensive", "detailed", "high", "full", "complete", "thorough"]
        }
      },
      "validation_error_messages": {
        "missing": "Required field 'knowledge_group' is missing from response",
        "invalid_value": "Invalid knowledge level. Expected one of: limited, moderate, extensive",
        "wrong_type": "Field 'knowledge_group' must be a string"
      }
    }
  },
  "parsing_instructions": {
    "json_extraction": {
      "method": "regex",
      "pattern": "\\{.*\\}",
      "flags": ["DOTALL"]
    },
    "fallback_parsing": {
      "enabled": true,
      "method": "keyword_search",
      "keywords": ["knowledge_group", "knowledge level", "level"]
    }
  },
  "success_criteria": {
    "minimum_required_fields": 1,
    "require_all_mandatory": true,
    "allow_extra_fields": false
  },
  "error_handling": {
    "on_parse_failure": "return_null",
    "on_validation_failure": "return_errors",
    "on_missing_required": "return_errors"
  }
} 
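
One way to read the parsing_instructions above is sketched below. The regex and the DOTALL flag come from the config; everything else is an assumption, including the function name `extract_response` and the exact behavior of the keyword_search fallback, which this page does not describe. Here the fallback picks the first allowed value that appears after a keyword.

```python
import json
import re

# Mirrors "pattern": "\\{.*\\}" with the DOTALL flag from parsing_instructions.
JSON_PATTERN = re.compile(r"\{.*\}", re.DOTALL)
ALLOWED = ("limited", "moderate", "extensive")
KEYWORDS = ("knowledge_group", "knowledge level", "level")


def extract_response(raw: str):
    """Return a dict parsed from raw model output, or None on parse failure."""
    match = JSON_PATTERN.search(raw)
    if match:
        try:
            parsed = json.loads(match.group(0))
            if isinstance(parsed, dict):
                return parsed
        except json.JSONDecodeError:
            pass  # fall through to the keyword fallback
    # Assumed fallback: after a keyword, take the earliest allowed value.
    lowered = raw.lower()
    for keyword in KEYWORDS:
        pos = lowered.find(keyword)
        if pos == -1:
            continue
        tail = lowered[pos + len(keyword):]
        hits = [(tail.find(value), value) for value in ALLOWED if value in tail]
        if hits:
            return {"knowledge_group": min(hits)[1]}
    return None  # corresponds to on_parse_failure: return_null
```

Because `.*` is greedy, a reply containing two separate JSON objects is captured as a single span and fails to parse, so the fallback matters for chatty model output.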

Basic knowledge level assessment template (limited, moderate, extensive)

About This Template

This template evaluates the breadth of scientific knowledge available for bacterial species. It asks the LLM to categorize organisms into three knowledge levels based on how much research, literature, and data exists about them.

Usage Context

When to use: Use this template when you need to assess which organisms are well-studied versus poorly understood in the scientific literature.

Typical workflow: The template is typically used as a first-pass filter to identify organisms that warrant deeper investigation or to understand research gaps in microbiology.
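
The first-pass filter described above could look like the sketch below. The function name and the input shape (a mapping from species name to normalized knowledge_group) are assumptions, and the sample values are illustrative placeholders rather than real model output.

```python
from collections import defaultdict


def group_by_knowledge(results: dict) -> dict:
    """Group species names by their normalized knowledge_group value."""
    groups = defaultdict(list)
    for species, level in results.items():
        groups[level].append(species)
    return {level: sorted(names) for level, names in groups.items()}


# Illustrative values only, not results produced by any model.
example = {
    "Escherichia coli": "extensive",
    "Bacillus subtilis": "extensive",
    "placeholder_species_a": "limited",
}
groups = group_by_knowledge(example)
print(groups.get("limited", []))  # candidates that may point to research gaps
```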

Template Configuration Files

Template Information
  • System template file: templates/system/template1_knowlege.txt
  • User template file: templates/user/template1_knowlege.txt
  • Validation config file: templates/validation/template1_knowlege.json
  • Template type: Knowledge
  • Character count: System: 391, User: 172, Validation: 3511
Usage Notes
  • The system template sets the context and instructions for the AI model
  • The user template contains placeholders like {binomial_name} that get replaced with actual values
  • The validation config defines expected response structure and automatically normalizes LLM outputs
  • All three files work together to ensure consistent, validated results from the language model
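
A minimal sketch of how the three files might be located and loaded together, assuming the directory layout listed under Template Information. The function names `template_paths` and `load_template` are hypothetical and not part of MicrobeLLM's documented API.

```python
import json
from pathlib import Path


def template_paths(name: str, root: Path = Path("templates")) -> dict:
    """Build the three file paths for a template name such as 'template1_knowlege'."""
    return {
        "system": root / "system" / f"{name}.txt",
        "user": root / "user" / f"{name}.txt",
        "validation": root / "validation" / f"{name}.json",
    }


def load_template(name: str, root: Path = Path("templates")) -> dict:
    """Read both prompt files as text and the validation config as parsed JSON."""
    paths = template_paths(name, root)
    return {
        "system": paths["system"].read_text(encoding="utf-8"),
        "user": paths["user"].read_text(encoding="utf-8"),
        "validation": json.loads(paths["validation"].read_text(encoding="utf-8")),
    }
```

Note that the file names use the spelling shown in the paths (`knowlege`), while the `name` field inside the validation config is spelled `knowledge`, so the two should not be assumed interchangeable.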
Validation Details
  • Description: Basic knowledge level assessment template (limited, moderate, extensive)
  • Required fields: knowledge_group
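
The normalization and validation rules for knowledge_group can be illustrated with the sketch below. The mapping and the error messages are copied from the validation config; the function name and the `(value, errors)` return shape are assumptions made for this example.

```python
NORMALIZE_MAPPING = {
    "limited": ["limited", "minimal", "basic", "low", "little", "poor"],
    "moderate": ["moderate", "medium", "intermediate", "fair", "some"],
    "extensive": ["extensive", "comprehensive", "detailed", "high", "full", "complete", "thorough"],
}
# Invert the mapping once so each synonym points at its canonical value.
SYNONYM_TO_CANONICAL = {
    synonym: canonical
    for canonical, synonyms in NORMALIZE_MAPPING.items()
    for synonym in synonyms
}


def validate_knowledge_group(response: dict):
    """Return (canonical_value, errors); the value is None when validation fails."""
    if "knowledge_group" not in response:
        return None, ["Required field 'knowledge_group' is missing from response"]
    value = response["knowledge_group"]
    if not isinstance(value, str):
        return None, ["Field 'knowledge_group' must be a string"]
    # trim_whitespace: true and case_sensitive: false
    canonical = SYNONYM_TO_CANONICAL.get(value.strip().lower())
    if canonical is None:
        return None, ["Invalid knowledge level. Expected one of: limited, moderate, extensive"]
    return canonical, []
```

For example, a reply of `"  Comprehensive "` normalizes to `extensive`, while `"NA"` is rejected under this template because it is not among the allowed values.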

template1_phenotype Templates

System: template1_phenotype.txt | User: template1_phenotype.txt

System Template

Defines the assistant's role and instructions
Given the binomial species name, predict the following phenotypic characteristics: gram staining, motility, aerophilicity, extreme environment tolerance, biofilm formation, animal pathogenicity, biosafety level, health association, host association, plant pathogenicity, spore formation, hemolysis, and cell shape. Provide the predictions in a structured JSON format, including only the most likely category for each characteristic, except for aerophilicity where multiple categories can be predicted.

Allowed categories:
- Gram Staining: gram stain negative, gram stain positive, gram stain variable
- Motility: TRUE, FALSE
- Aerophilicity: aerobic, aerotolerant, anaerobic, facultatively anaerobic
- Extreme Environment Tolerance: TRUE, FALSE
- Biofilm Formation: TRUE, FALSE
- Animal Pathogenicity: TRUE, FALSE
- Biosafety Level: biosafety level 1, biosafety level 2, biosafety level 3
- Health Association: TRUE, FALSE
- Host Association: TRUE, FALSE
- Plant Pathogenicity: TRUE, FALSE
- Spore Formation: TRUE, FALSE
- Hemolysis: alpha, beta, gamma, non-hemolytic
- Cell Shape: bacillus, coccus, spirillum, tail

Provide the predictions in a structured JSON format, including only the most likely category for each characteristic, except for aerophilicity where multiple categories can be predicted.

User Template

Defines the user's query format with placeholders
Respond with a JSON object for {binomial_name} in this format:

{
  "gram_staining": "<gram stain negative|gram stain positive|gram stain variable>",
  "motility": "<TRUE|FALSE>",
  "aerophilicity": [
    "<aerobic|aerotolerant|anaerobic|facultatively anaerobic>",
    "<aerobic|aerotolerant|anaerobic|facultatively anaerobic>",
    ...
  ],
  "extreme_environment_tolerance": "<TRUE|FALSE>",
  "biofilm_formation": "<TRUE|FALSE>",
  "animal_pathogenicity": "<TRUE|FALSE>",
  "biosafety_level": "<biosafety level 1|biosafety level 2|biosafety level 3>",
  "health_association": "<TRUE|FALSE>",
  "host_association": "<TRUE|FALSE>",
  "plant_pathogenicity": "<TRUE|FALSE>",
  "spore_formation": "<TRUE|FALSE>",
  "hemolysis": "<alpha|beta|gamma|non-hemolytic>",
  "cell_shape": "<bacillus|coccus|spirillum|tail>"
}

Validation Config

Defines expected response structure and validation rules
{
  "template_info": {
    "name": "template1_phenotype",
    "type": "phenotype",
    "description": "Comprehensive phenotype prediction template",
    "version": "1.0",
    "purpose": "This template extracts detailed phenotypic predictions for bacterial species across 13 different characteristics. It tests the model's ability to infer biological properties from species names and any embedded knowledge.",
    "usage_context": {
      "when_to_use": "Use this template when you need comprehensive phenotypic predictions including metabolic, pathogenic, and morphological characteristics.",
      "typical_workflow": "This is the primary phenotype template, providing the most complete set of predictions. Results can be compared against known phenotypic data to evaluate model accuracy."
    },
    "interpretation_guide": {
      "gram_staining": "Fundamental cell wall property: positive (thick peptidoglycan), negative (thin peptidoglycan with outer membrane), or variable",
      "motility": "Whether the organism can move independently, typically via flagella or other mechanisms",
      "aerophilicity": "Oxygen requirements - can be multiple values (e.g., facultatively anaerobic organisms)",
      "extreme_environment_tolerance": "Ability to survive in harsh conditions (high/low pH, temperature extremes, high salt, etc.)",
      "biosafety_level": "CDC/WHO classification based on pathogenic risk (BSL-1: minimal risk, BSL-2: moderate risk, BSL-3: serious risk)",
      "pathogenicity": "Animal/plant pathogenicity indicates disease-causing potential in respective hosts"
    },
    "quality_indicators": {
      "high_quality_response": "Predictions align with known biological constraints (e.g., obligate anaerobes shouldn't be aerobic), internally consistent responses",
      "low_quality_response": "Biologically impossible combinations, missing critical fields for well-known organisms, or excessive uncertainty"
    }
  },
  "expected_response": {
    "format": "json",
    "required_fields": [],
    "optional_fields": [
      "gram_staining", "motility", "aerophilicity", "extreme_environment_tolerance",
      "biofilm_formation", "animal_pathogenicity", "biosafety_level", 
      "health_association", "host_association", "plant_pathogenicity",
      "spore_formation", "hemolysis", "cell_shape"
    ]
  },
  "field_definitions": {
    "gram_staining": {
      "type": "string",
      "required": false,
      "description": "Gram staining result",
      "allowed_values": ["gram stain positive", "gram stain negative", "gram stain variable"],
      "visualization": {
        "color_mapping": {
          "gram stain positive": {"label": "Positive", "background": "#d4edda", "color": "#155724"},
          "gram stain negative": {"label": "Negative", "background": "#f8d7da", "color": "#721c24"},
          "gram stain variable": {"label": "Variable", "background": "#fff3cd", "color": "#856404"}
        }
      },
      "validation_rules": {
        "case_sensitive": false,
        "trim_whitespace": true,
        "normalize_mapping": {
          "gram stain positive": ["gram stain positive", "gram positive", "gram+", "positive"],
          "gram stain negative": ["gram stain negative", "gram negative", "gram-", "negative"],
          "gram stain variable": ["gram stain variable", "variable"]
        }
      }
    },
    "motility": {
      "type": "string",
      "required": false,
      "description": "Motility capability",
      "allowed_values": ["TRUE", "FALSE"],
      "visualization": {
        "color_mapping": {
          "TRUE": {"label": "True", "background": "#d4edda", "color": "#155724"},
          "FALSE": {"label": "False", "background": "#f8d7da", "color": "#721c24"}
        }
      },
      "validation_rules": {
        "case_sensitive": false,
        "trim_whitespace": true,
        "normalize_mapping": {
          "TRUE": ["true", "yes", "motile", "positive", "1"],
          "FALSE": ["false", "no", "non-motile", "nonmotile", "immobile", "negative", "0"]
        }
      }
    },
    "aerophilicity": {
      "type": "array",
      "required": false,
      "description": "Oxygen requirements (can have multiple values)",
      "allowed_values": ["aerobic", "aerotolerant", "anaerobic", "facultatively anaerobic"],
      "visualization": {
        "color_mapping": {
          "aerobic": {"label": "Aerobic", "background": "#cce5ff", "color": "#004085"},
          "anaerobic": {"label": "Anaerobic", "background": "#e2e3e5", "color": "#383d41"},
          "facultatively anaerobic": {"label": "Facultative", "background": "#d1ecf1", "color": "#0c5460"},
          "aerotolerant": {"label": "Aerotolerant", "background": "#e7e8ea", "color": "#495057"}
        }
      },
      "validation_rules": {
        "case_sensitive": false,
        "trim_whitespace": true,
        "normalize_mapping": {
          "aerobic": ["aerobic", "aerobe", "oxygen-requiring"],
          "aerotolerant": ["aerotolerant", "aerotolerance"],
          "anaerobic": ["anaerobic", "anaerobe", "oxygen-free"],
          "facultatively anaerobic": ["facultatively anaerobic", "facultative anaerobic", "facultative", "facultatively"]
        },
        "allow_single_value": true,
        "max_values": 4
      }
    },
    "extreme_environment_tolerance": {
      "type": "string",
      "required": false,
      "description": "Tolerance to extreme environmental conditions",
      "allowed_values": ["TRUE", "FALSE"],
      "visualization": {
        "color_mapping": {
          "TRUE": {"label": "True", "background": "#d4edda", "color": "#155724"},
          "FALSE": {"label": "False", "background": "#f8d7da", "color": "#721c24"}
        }
      },
      "validation_rules": {
        "case_sensitive": false,
        "trim_whitespace": true,
        "normalize_mapping": {
          "TRUE": ["true", "yes", "tolerant", "positive", "1"],
          "FALSE": ["false", "no", "intolerant", "negative", "0"]
        }
      }
    },
    "biofilm_formation": {
      "type": "string",
      "required": false,
      "description": "Ability to form biofilms",
      "allowed_values": ["TRUE", "FALSE"],
      "visualization": {
        "color_mapping": {
          "TRUE": {"label": "True", "background": "#d4edda", "color": "#155724"},
          "FALSE": {"label": "False", "background": "#f8d7da", "color": "#721c24"}
        }
      },
      "validation_rules": {
        "case_sensitive": false,
        "trim_whitespace": true,
        "normalize_mapping": {
          "TRUE": ["true", "yes", "biofilm-forming", "positive", "1"],
          "FALSE": ["false", "no", "non-biofilm-forming", "negative", "0"]
        }
      }
    },
    "animal_pathogenicity": {
      "type": "string",
      "required": false,
      "description": "Pathogenic to animals",
      "allowed_values": ["TRUE", "FALSE"],
      "visualization": {
        "color_mapping": {
          "TRUE": {"label": "True", "background": "#d4edda", "color": "#155724"},
          "FALSE": {"label": "False", "background": "#f8d7da", "color": "#721c24"}
        }
      },
      "validation_rules": {
        "case_sensitive": false,
        "trim_whitespace": true,
        "normalize_mapping": {
          "TRUE": ["true", "yes", "pathogenic", "positive", "1"],
          "FALSE": ["false", "no", "non-pathogenic", "negative", "0"]
        }
      }
    },
    "biosafety_level": {
      "type": "string",
      "required": false,
      "description": "Biosafety classification level",
      "allowed_values": ["biosafety level 1", "biosafety level 2", "biosafety level 3"],
      "visualization": {
        "color_mapping": {
          "biosafety level 1": {"label": "BSL-1", "background": "#d4edda", "color": "#155724"},
          "biosafety level 2": {"label": "BSL-2", "background": "#fff3cd", "color": "#856404"},
          "biosafety level 3": {"label": "BSL-3", "background": "#ffeaa7", "color": "#b8860b"}
        }
      },
      "validation_rules": {
        "case_sensitive": false,
        "trim_whitespace": true,
        "normalize_mapping": {
          "biosafety level 1": ["biosafety level 1", "bsl-1", "bsl1", "level 1"],
          "biosafety level 2": ["biosafety level 2", "bsl-2", "bsl2", "level 2"],
          "biosafety level 3": ["biosafety level 3", "bsl-3", "bsl3", "level 3"]
        }
      }
    },
    "health_association": {
      "type": "string",
      "required": false,
      "description": "Association with human health",
      "allowed_values": ["TRUE", "FALSE"],
      "visualization": {
        "color_mapping": {
          "TRUE": {"label": "True", "background": "#d4edda", "color": "#155724"},
          "FALSE": {"label": "False", "background": "#f8d7da", "color": "#721c24"}
        }
      },
      "validation_rules": {
        "case_sensitive": false,
        "trim_whitespace": true,
        "normalize_mapping": {
          "TRUE": ["true", "yes", "health-associated", "positive", "1"],
          "FALSE": ["false", "no", "not health-associated", "negative", "0"]
        }
      }
    },
    "host_association": {
      "type": "string",
      "required": false,
      "description": "Association with a host organism",
      "allowed_values": ["TRUE", "FALSE"],
      "visualization": {
        "color_mapping": {
          "TRUE": {"label": "True", "background": "#d4edda", "color": "#155724"},
          "FALSE": {"label": "False", "background": "#f8d7da", "color": "#721c24"}
        }
      },
      "validation_rules": {
        "case_sensitive": false,
        "trim_whitespace": true,
        "normalize_mapping": {
          "TRUE": ["true", "yes", "host-associated", "positive", "1"],
          "FALSE": ["false", "no", "free-living", "negative", "0"]
        }
      }
    },
    "plant_pathogenicity": {
      "type": "string",
      "required": false,
      "description": "Pathogenic to plants",
      "allowed_values": ["TRUE", "FALSE"],
      "visualization": {
        "color_mapping": {
          "TRUE": {"label": "True", "background": "#d4edda", "color": "#155724"},
          "FALSE": {"label": "False", "background": "#f8d7da", "color": "#721c24"}
        }
      },
      "validation_rules": {
        "case_sensitive": false,
        "trim_whitespace": true,
        "normalize_mapping": {
          "TRUE": ["true", "yes", "phytopathogenic", "plant pathogenic", "positive", "1"],
          "FALSE": ["false", "no", "non-phytopathogenic", "negative", "0"]
        }
      }
    },
    "spore_formation": {
      "type": "string",
      "required": false,
      "description": "Ability to form spores",
      "allowed_values": ["TRUE", "FALSE"],
      "visualization": {
        "color_mapping": {
          "TRUE": {"label": "True", "background": "#d4edda", "color": "#155724"},
          "FALSE": {"label": "False", "background": "#f8d7da", "color": "#721c24"}
        }
      },
      "validation_rules": {
        "case_sensitive": false,
        "trim_whitespace": true,
        "normalize_mapping": {
          "TRUE": ["true", "yes", "spore-forming", "sporulating", "positive", "1"],
          "FALSE": ["false", "no", "non-spore-forming", "vegetative", "negative", "0"]
        }
      }
    },
    "hemolysis": {
      "type": "string",
      "required": false,
      "description": "Hemolytic activity",
      "allowed_values": ["alpha", "beta", "gamma", "non-hemolytic"],
      "visualization": {
        "color_mapping": {
          "alpha": {"label": "Alpha", "background": "#cce5ff", "color": "#004085"},
          "beta": {"label": "Beta", "background": "#f8d7da", "color": "#721c24"},
          "gamma": {"label": "Gamma", "background": "#d1ecf1", "color": "#0c5460"},
          "non-hemolytic": {"label": "Non-hemolytic", "background": "#e2e3e5", "color": "#6c757d"}
        }
      },
      "validation_rules": {
        "case_sensitive": false,
        "trim_whitespace": true,
        "normalize_mapping": {
          "alpha": ["alpha", "α", "alpha-hemolytic"],
          "beta": ["beta", "β", "beta-hemolytic"],
          "gamma": ["gamma", "γ", "gamma-hemolytic"],
          "non-hemolytic": ["non-hemolytic", "non", "none", "no hemolysis"]
        }
      }
    },
    "cell_shape": {
      "type": "string",
      "required": false,
      "description": "Cellular morphology",
      "allowed_values": ["bacillus", "coccus", "spirillum", "tail"],
      "visualization": {
        "color_mapping": {
          "bacillus": {"label": "Rod", "background": "#f3e5f5", "color": "#4a148c"},
          "coccus": {"label": "Spherical", "background": "#e1f5fe", "color": "#01579b"},
          "spirillum": {"label": "Spiral", "background": "#fff8e1", "color": "#e65100"},
          "tail": {"label": "Tail", "background": "#e8f5e8", "color": "#2e7d32"}
        }
      },
      "validation_rules": {
        "case_sensitive": false,
        "trim_whitespace": true,
        "normalize_mapping": {
          "bacillus": ["bacillus", "rod", "rod-shaped", "bacilli"],
          "coccus": ["coccus", "sphere", "spherical", "cocci"],
          "spirillum": ["spirillum", "spiral", "helical", "spirilla"],
          "tail": ["tail", "appendage", "flagellar"]
        }
      }
    }
  },
  "parsing_instructions": {
    "json_extraction": {
      "method": "regex",
      "pattern": "\\{.*\\}",
      "flags": ["DOTALL"]
    },
    "fallback_parsing": {
      "enabled": true,
      "method": "line_based",
      "keywords": [
        "gram_staining", "motility", "aerophilicity", "extreme_environment_tolerance",
        "biofilm_formation", "animal_pathogenicity", "biosafety_level", 
        "health_association", "host_association", "plant_pathogenicity",
        "spore_formation", "hemolysis", "cell_shape"
      ]
    }
  },
  "success_criteria": {
    "minimum_required_fields": 0,
    "require_all_mandatory": false,
    "allow_extra_fields": true
  },
  "error_handling": {
    "on_parse_failure": "return_null",
    "on_validation_failure": "return_partial",
    "on_missing_required": "return_errors"
  }
} 
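
The aerophilicity field is the only array-typed field, so its handling differs from the others. The sketch below shows one possible reading of its validation_rules. The mapping, allow_single_value, and max_values come from the config; the function name, the de-duplication step, and returning rejected items separately (in the spirit of on_validation_failure: return_partial) are assumptions.

```python
AEROPHILICITY_MAPPING = {
    "aerobic": ["aerobic", "aerobe", "oxygen-requiring"],
    "aerotolerant": ["aerotolerant", "aerotolerance"],
    "anaerobic": ["anaerobic", "anaerobe", "oxygen-free"],
    "facultatively anaerobic": [
        "facultatively anaerobic", "facultative anaerobic", "facultative", "facultatively",
    ],
}
MAX_VALUES = 4  # "max_values": 4


def normalize_aerophilicity(value):
    """Return (normalized, rejected) for a string or a list of strings."""
    lookup = {
        synonym: canonical
        for canonical, synonyms in AEROPHILICITY_MAPPING.items()
        for synonym in synonyms
    }
    # allow_single_value: a bare string is treated as a one-element list.
    items = [value] if isinstance(value, str) else list(value)
    normalized, rejected = [], []
    for item in items:
        canonical = lookup.get(str(item).strip().lower())
        if canonical is None:
            rejected.append(item)  # kept aside so valid values are still returned
        elif canonical not in normalized:
            normalized.append(canonical)  # assumed: duplicates are collapsed
    return normalized[:MAX_VALUES], rejected
```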

Comprehensive phenotype prediction template

About This Template

This template extracts detailed phenotypic predictions for bacterial species across 13 different characteristics. It tests the model's ability to infer biological properties from species names and any embedded knowledge.

Usage Context

When to use: Use this template when you need comprehensive phenotypic predictions including metabolic, pathogenic, and morphological characteristics.

Typical workflow: This is the primary phenotype template, providing the most complete set of predictions. Results can be compared against known phenotypic data to evaluate model accuracy.
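
Comparing predictions against known phenotypic data could be done per field, as in the sketch below. The function name, the record layout (one dict per species, aligned by position), and the choice to skip records where either side lacks the field are assumptions; the sample records are illustrative only.

```python
def _comparable(value):
    """Compare array fields such as aerophilicity without regard to order."""
    return frozenset(value) if isinstance(value, list) else value


def field_accuracy(predictions: list, references: list, field: str):
    """Fraction of comparable records whose values agree, or None if none are comparable."""
    matches = 0
    total = 0
    for predicted, reference in zip(predictions, references):
        if field not in predicted or field not in reference:
            continue  # all fields are optional, so missing values are skipped
        total += 1
        if _comparable(predicted[field]) == _comparable(reference[field]):
            matches += 1
    return matches / total if total else None


# Illustrative records only, not real model output or curated phenotype data.
predicted = [
    {"motility": "TRUE", "aerophilicity": ["aerobic", "facultatively anaerobic"]},
    {"motility": "FALSE"},
]
known = [
    {"motility": "TRUE", "aerophilicity": ["facultatively anaerobic", "aerobic"]},
    {"motility": "TRUE"},
]
print(field_accuracy(predicted, known, "motility"))
```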

Template Configuration Files

Template Information
  • System template file: templates/system/template1_phenotype.txt
  • User template file: templates/user/template1_phenotype.txt
  • Validation config file: templates/validation/template1_phenotype.json
  • Template type: Phenotype
  • Character count: System: 1304, User: 813, Validation: 14010
Validation Details
  • Description: Comprehensive phenotype prediction template
  • Optional fields: gram_staining, motility, aerophilicity, extreme_environment_tolerance, biofilm_formation, animal_pathogenicity, biosafety_level, health_association, host_association, plant_pathogenicity, spore_formation, hemolysis, cell_shape

template2_knowlege Templates

System: template2_knowlege.txt | User: template2_knowlege.txt

System Template

Defines the assistant's role and instructions
Determine the knowledge level for the binomial strain name based on the extent and depth of available scientific literature and understanding:

- limited: Strains with minimal to basic information available, including newly discovered or poorly studied strains. These strains have limited data on their fundamental characteristics, making it challenging to make accurate predictions about their properties and behavior. The lack of extensive research hinders the ability to draw meaningful conclusions or make reliable assessments across various domains.
- moderate: Strains with a moderate amount of information available, including phenotypic, morphological, and some genetic or physiological characteristics. While these strains have been studied more comprehensively than those in the Limited category, the available data may still have some gaps in understanding their full metabolic functions, ecological roles, and potential applications in various contexts.
- extensive: Strains with a wealth of comprehensive information available, including extensive research on their phenotypic, morphological, genetic, physiological, and ecological characteristics. The in-depth knowledge available for these strains enables highly accurate predictions and assessments of their properties, behavior, and potential applications across various contexts. The scientific literature covers a wide range of aspects, providing a holistic understanding of these well-studied strains.

If the strain name is not a real or recognized bacterial strain, or if there is no information available to determine the knowledge level, respond with NA.

User Template

Defines the user's query format with placeholders
Respond with a JSON object for {binomial_name} with the knowledge level category in lowercase in this format:

{
    "knowledge_group": "<limited|moderate|extensive|NA>"
}

Validation Config

Defines expected response structure and validation rules
{
  "template_info": {
    "name": "template2_knowledge",
    "type": "knowledge",
    "description": "Knowledge level assessment template with NA support",
    "version": "1.0",
    "purpose": "This template evaluates the scientific knowledge available for bacterial species with explicit support for 'NA' responses. It's designed to handle cases where LLMs cannot assess knowledge levels, which is particularly important for testing model calibration.",
    "usage_context": {
      "when_to_use": "Use this template when you want to allow models to explicitly state when they cannot determine the knowledge level, providing a more nuanced view of model confidence.",
      "typical_workflow": "This template is useful for distinguishing between species the model believes are poorly studied versus species the model simply cannot assess."
    },
    "interpretation_guide": {
      "limited": "Organisms with minimal scientific literature, often newly discovered or understudied species. These may have basic taxonomic information but lack detailed phenotypic or genomic characterization.",
      "moderate": "Organisms with a reasonable body of research including some genomic data, basic phenotypic characterization, and presence in multiple studies. Not model organisms but reasonably well-documented.",
      "extensive": "Well-studied model organisms or pathogens with comprehensive literature, complete genomes, extensive phenotypic data, and often used in research. Examples include E. coli, B. subtilis, or major pathogens.",
      "NA": "The model cannot assess the knowledge level or is uncertain. This is a valuable response indicating model calibration and awareness of its limitations."
    },
    "quality_indicators": {
      "high_quality_response": "The model appropriately uses 'NA' for uncertain cases while providing clear categorizations for well-known species",
      "low_quality_response": "The model never uses 'NA' (overconfident) or uses 'NA' excessively (underconfident)"
    }
  },
  "expected_response": {
    "format": "json",
    "required_fields": [
      "knowledge_group"
    ],
    "optional_fields": []
  },
  "field_definitions": {
    "knowledge_group": {
      "type": "string",
      "required": true,
      "description": "Knowledge level category for the organism",
      "allowed_values": [
        "limited",
        "moderate", 
        "extensive",
        "NA"
      ],
      "validation_rules": {
        "case_sensitive": false,
        "trim_whitespace": true,
        "normalize_mapping": {
          "limited": ["limited", "minimal", "basic", "low", "little", "poor"],
          "moderate": ["moderate", "medium", "intermediate", "fair", "some"],
          "extensive": ["extensive", "comprehensive", "detailed", "high", "full", "complete", "thorough"],
          "NA": ["na", "n/a", "n.a.", "not available", "not applicable", "unknown", "unavailable", "none", "null", "no data", "no information"]
        }
      },
      "validation_error_messages": {
        "missing": "Required field 'knowledge_group' is missing from response",
        "invalid_value": "Invalid knowledge level. Expected one of: limited, moderate, extensive, NA",
        "wrong_type": "Field 'knowledge_group' must be a string"
      }
    }
  },
  "parsing_instructions": {
    "json_extraction": {
      "method": "regex",
      "pattern": "\\{.*\\}",
      "flags": ["DOTALL"]
    },
    "fallback_parsing": {
      "enabled": true,
      "method": "keyword_search",
      "keywords": ["knowledge_group", "knowledge level", "level"]
    }
  },
  "success_criteria": {
    "minimum_required_fields": 1,
    "require_all_mandatory": true,
    "allow_extra_fields": false
  },
  "error_handling": {
    "on_parse_failure": "return_null",
    "on_validation_failure": "return_errors",
    "on_missing_required": "return_errors"
  }
} 

Knowledge level assessment template with NA support

About This Template

This template evaluates the scientific knowledge available for bacterial species with explicit support for 'NA' responses. It's designed to handle cases where LLMs cannot assess knowledge levels, which is particularly important for testing model calibration.

Usage Context

When to use: Use this template when you want to allow models to explicitly state when they cannot determine the knowledge level, providing a more nuanced view of model confidence.

Typical workflow: This template is useful for distinguishing between species the model believes are poorly studied versus species the model simply cannot assess.
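
To separate "poorly studied" from "cannot assess", and to get a rough signal of over- or underconfidence, the NA share can be computed as sketched below. Both function names are hypothetical, the input is assumed to be a mapping from species name to normalized knowledge_group, and no threshold for "excessive" NA use is defined on this page, so none is applied here. The species names are placeholders.

```python
def split_limited_and_na(results: dict):
    """Return (limited, not_assessed) species lists from normalized results."""
    limited = sorted(name for name, group in results.items() if group == "limited")
    not_assessed = sorted(name for name, group in results.items() if group == "NA")
    return limited, not_assessed


def na_rate(results: dict):
    """Share of species answered with 'NA', or None for an empty result set."""
    if not results:
        return None
    return sum(group == "NA" for group in results.values()) / len(results)


# Placeholder names and values, for illustration only.
example = {
    "species_a": "limited",
    "species_b": "NA",
    "species_c": "extensive",
    "species_d": "NA",
}
print(split_limited_and_na(example), na_rate(example))
```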

Template Configuration Files

Template Information
  • System template file: templates/system/template2_knowlege.txt
  • User template file: templates/user/template2_knowlege.txt
  • Validation config file: templates/validation/template2_knowlege.json
  • Template type: Knowledge
  • Character count: System: 1628, User: 171, Validation: 3863

Usage Notes
  • The system template sets the context and instructions for the AI model
  • The user template contains placeholders like {binomial_name} that get replaced with actual values
  • The validation config defines expected response structure and automatically normalizes LLM outputs
  • All three files work together to ensure consistent, validated results from the language model

Validation Details
  • Description: Knowledge level assessment template with NA support
  • Required fields: knowledge_group

template2_phenotype Templates

System: template2_phenotype.txt | User: template2_phenotype.txt

System Template

Defines the assistant's role and instructions
Given the gene list of an organism, predict the following phenotypic characteristics: gram staining, motility, aerophilicity, extreme environment tolerance, biofilm formation, animal pathogenicity, biosafety level, health association, host association, plant pathogenicity, spore formation, hemolysis, and cell shape. Provide the predictions in a structured JSON format, including only the most likely category for each characteristic, except for aerophilicity where multiple categories can be predicted.

Allowed categories:
- Gram Staining: gram stain negative, gram stain positive, gram stain variable
- Motility: TRUE, FALSE
- Aerophilicity: aerobic, aerotolerant, anaerobic, facultatively anaerobic
- Extreme Environment Tolerance: TRUE, FALSE
- Biofilm Formation: TRUE, FALSE
- Animal Pathogenicity: TRUE, FALSE
- Biosafety Level: biosafety level 1, biosafety level 2, biosafety level 3
- Health Association: TRUE, FALSE
- Host Association: TRUE, FALSE
- Plant Pathogenicity: TRUE, FALSE
- Spore Formation: TRUE, FALSE
- Hemolysis: alpha, beta, gamma, non-hemolytic
- Cell Shape: bacillus, coccus, spirillum, tail

Provide the predictions in a structured JSON format, including only the most likely category for each characteristic, except for aerophilicity where multiple categories can be predicted.

User Template

Defines the user's query format with placeholders
Respond with a JSON object for {binomial_name} in this format:

{
  "gram_staining": "<gram stain negative|gram stain positive|gram stain variable>",
  "motility": "<TRUE|FALSE>",
  "aerophilicity": [
    "<aerobic|aerotolerant|anaerobic|facultatively anaerobic>",
    "<aerobic|aerotolerant|anaerobic|facultatively anaerobic>",
    ...
  ],
  "extreme_environment_tolerance": "<TRUE|FALSE>",
  "biofilm_formation": "<TRUE|FALSE>",
  "animal_pathogenicity": "<TRUE|FALSE>",
  "biosafety_level": "<biosafety level 1|biosafety level 2|biosafety level 3>",
  "health_association": "<TRUE|FALSE>",
  "host_association": "<TRUE|FALSE>",
  "plant_pathogenicity": "<TRUE|FALSE>",
  "spore_formation": "<TRUE|FALSE>",
  "hemolysis": "<alpha|beta|gamma|non-hemolytic>",
  "cell_shape": "<bacillus|coccus|spirillum|tail>"
}
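
The user template mixes one placeholder, {binomial_name}, with literal braces that belong to the JSON example. How MicrobeLLM performs the substitution is not shown on this page; the sketch below assumes a plain string replacement, because `str.format` would treat the JSON braces as replacement fields. The template is shortened and `render_user_prompt` is a hypothetical helper.

```python
# Shortened copy of a user template; the literal braces belong to the JSON example.
USER_TEMPLATE = (
    "Respond with a JSON object for {binomial_name} in this format:\n"
    "\n"
    "{\n"
    '  "motility": "<TRUE|FALSE>"\n'
    "}"
)


def render_user_prompt(template, binomial_name):
    """Substitute the species name without touching the other braces."""
    # A plain replacement is used on purpose: str.format would try to
    # interpret the JSON braces as fields and raise an error.
    return template.replace("{binomial_name}", binomial_name)


prompt = render_user_prompt(USER_TEMPLATE, "Escherichia coli")
print(prompt.splitlines()[0])
```

The same helper works unchanged for the full template, since only the exact token {binomial_name} is replaced.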

Validation Config

Defines expected response structure and validation rules
{
  "template_info": {
    "name": "template2_phenotype",
    "type": "phenotype",
    "description": "Alternative phenotype prediction template with gene-focused approach",
    "version": "1.0",
    "purpose": "This template provides an alternative approach to phenotype prediction, potentially using different prompt structures or emphasizing genetic/genomic aspects. It tests how different formulations affect prediction accuracy and completeness.",
    "usage_context": {
      "when_to_use": "Use this template to compare phenotype prediction consistency across different prompt formulations, or when you want to emphasize genetic/genomic information in predictions.",
      "typical_workflow": "Often used alongside template1_phenotype to evaluate prompt sensitivity and identify which formulation yields more accurate or complete phenotypic predictions."
    },
    "interpretation_guide": {
      "consistency_check": "Compare results with template1_phenotype to assess model reliability across different prompt formulations",
      "genetic_emphasis": "This template may elicit responses that focus more on genetically-determined traits versus environmentally-influenced characteristics",
      "validation_approach": "Cross-validate predictions from both phenotype templates against known bacterial databases"
    },
    "quality_indicators": {
      "high_quality_response": "Biologically consistent predictions that align with known phenotypic constraints, minimal contradictions with template1 results",
      "low_quality_response": "Frequent contradictions with template1, biologically impossible trait combinations, or significantly different response patterns without clear justification"
    }
  },
  "expected_response": {
    "format": "json",
    "required_fields": [],
    "optional_fields": [
      "gram_staining", "motility", "aerophilicity", "extreme_environment_tolerance",
      "biofilm_formation", "animal_pathogenicity", "biosafety_level", 
      "health_association", "host_association", "plant_pathogenicity",
      "spore_formation", "hemolysis", "cell_shape"
    ]
  },
  "field_definitions": {
    "gram_staining": {
      "type": "string",
      "required": false,
      "description": "Gram staining result",
      "allowed_values": ["gram stain negative", "gram stain positive", "gram stain variable"],
      "visualization": {
        "color_mapping": {
          "gram stain positive": {"label": "Positive", "background": "#d4edda", "color": "#155724"},
          "gram stain negative": {"label": "Negative", "background": "#f8d7da", "color": "#721c24"},
          "gram stain variable": {"label": "Variable", "background": "#fff3cd", "color": "#856404"}
        }
      },
      "validation_rules": {
        "case_sensitive": false,
        "trim_whitespace": true,
        "normalize_mapping": {
          "gram stain positive": ["gram stain positive", "gram positive", "gram+", "positive"],
          "gram stain negative": ["gram stain negative", "gram negative", "gram-", "negative"],
          "gram stain variable": ["gram stain variable", "variable"]
        }
      }
    },
    "motility": {
      "type": "string",
      "required": false,
      "description": "Motility capability",
      "allowed_values": ["TRUE", "FALSE"],
      "visualization": {
        "color_mapping": {
          "TRUE": {"label": "True", "background": "#d4edda", "color": "#155724"},
          "FALSE": {"label": "False", "background": "#f8d7da", "color": "#721c24"}
        }
      },
      "validation_rules": {
        "case_sensitive": false,
        "trim_whitespace": true,
        "normalize_mapping": {
          "TRUE": ["true", "yes", "motile", "positive", "1"],
          "FALSE": ["false", "no", "non-motile", "nonmotile", "immobile", "negative", "0"]
        }
      }
    },
    "aerophilicity": {
      "type": "array",
      "required": false,
      "description": "Oxygen requirements (can have multiple values)",
      "allowed_values": ["aerobic", "aerotolerant", "anaerobic", "facultatively anaerobic"],
      "visualization": {
        "color_mapping": {
          "aerobic": {"label": "Aerobic", "background": "#cce5ff", "color": "#004085"},
          "anaerobic": {"label": "Anaerobic", "background": "#e2e3e5", "color": "#383d41"},
          "facultatively anaerobic": {"label": "Facultative", "background": "#d1ecf1", "color": "#0c5460"},
          "aerotolerant": {"label": "Aerotolerant", "background": "#e7e8ea", "color": "#495057"}
        }
      },
      "validation_rules": {
        "case_sensitive": false,
        "trim_whitespace": true,
        "normalize_mapping": {
          "aerobic": ["aerobic", "aerobe", "oxygen-requiring"],
          "aerotolerant": ["aerotolerant", "aerotolerance"],
          "anaerobic": ["anaerobic", "anaerobe", "oxygen-free"],
          "facultatively anaerobic": ["facultatively anaerobic", "facultative anaerobic", "facultative", "facultatively"]
        },
        "allow_single_value": true,
        "max_values": 4
      }
    },
    "extreme_environment_tolerance": {
      "type": "string",
      "required": false,
      "description": "Tolerance to extreme environmental conditions",
      "allowed_values": ["TRUE", "FALSE"],
      "visualization": {
        "color_mapping": {
          "TRUE": {"label": "True", "background": "#d4edda", "color": "#155724"},
          "FALSE": {"label": "False", "background": "#f8d7da", "color": "#721c24"}
        }
      },
      "validation_rules": {
        "case_sensitive": false,
        "trim_whitespace": true,
        "normalize_mapping": {
          "TRUE": ["true", "yes", "tolerant", "positive", "1"],
          "FALSE": ["false", "no", "intolerant", "negative", "0"]
        }
      }
    },
    "biofilm_formation": {
      "type": "string",
      "required": false,
      "description": "Ability to form biofilms",
      "allowed_values": ["TRUE", "FALSE"],
      "visualization": {
        "color_mapping": {
          "TRUE": {"label": "True", "background": "#d4edda", "color": "#155724"},
          "FALSE": {"label": "False", "background": "#f8d7da", "color": "#721c24"}
        }
      },
      "validation_rules": {
        "case_sensitive": false,
        "trim_whitespace": true,
        "normalize_mapping": {
          "TRUE": ["true", "yes", "biofilm-forming", "positive", "1"],
          "FALSE": ["false", "no", "non-biofilm-forming", "negative", "0"]
        }
      }
    },
    "animal_pathogenicity": {
      "type": "string",
      "required": false,
      "description": "Pathogenic to animals",
      "allowed_values": ["TRUE", "FALSE"],
      "visualization": {
        "color_mapping": {
          "TRUE": {"label": "True", "background": "#d4edda", "color": "#155724"},
          "FALSE": {"label": "False", "background": "#f8d7da", "color": "#721c24"}
        }
      },
      "validation_rules": {
        "case_sensitive": false,
        "trim_whitespace": true,
        "normalize_mapping": {
          "TRUE": ["true", "yes", "pathogenic", "positive", "1"],
          "FALSE": ["false", "no", "non-pathogenic", "negative", "0"]
        }
      }
    },
    "biosafety_level": {
      "type": "string",
      "required": false,
      "description": "Biosafety classification level",
      "allowed_values": ["biosafety level 1", "biosafety level 2", "biosafety level 3"],
      "visualization": {
        "color_mapping": {
          "biosafety level 1": {"label": "BSL-1", "background": "#d4edda", "color": "#155724"},
          "biosafety level 2": {"label": "BSL-2", "background": "#fff3cd", "color": "#856404"},
          "biosafety level 3": {"label": "BSL-3", "background": "#ffeaa7", "color": "#b8860b"}
        }
      },
      "validation_rules": {
        "case_sensitive": false,
        "trim_whitespace": true,
        "normalize_mapping": {
          "biosafety level 1": ["biosafety level 1", "bsl-1", "bsl1", "level 1"],
          "biosafety level 2": ["biosafety level 2", "bsl-2", "bsl2", "level 2"],
          "biosafety level 3": ["biosafety level 3", "bsl-3", "bsl3", "level 3"]
        }
      }
    },
    "health_association": {
      "type": "string",
      "required": false,
      "description": "Association with human health",
      "allowed_values": ["TRUE", "FALSE"],
      "visualization": {
        "color_mapping": {
          "TRUE": {"label": "True", "background": "#d4edda", "color": "#155724"},
          "FALSE": {"label": "False", "background": "#f8d7da", "color": "#721c24"}
        }
      },
      "validation_rules": {
        "case_sensitive": false,
        "trim_whitespace": true,
        "normalize_mapping": {
          "TRUE": ["true", "yes", "health-associated", "positive", "1"],
          "FALSE": ["false", "no", "not health-associated", "negative", "0"]
        }
      }
    },
    "host_association": {
      "type": "string",
      "required": false,
      "description": "Association with a host organism",
      "allowed_values": ["TRUE", "FALSE"],
      "visualization": {
        "color_mapping": {
          "TRUE": {"label": "True", "background": "#d4edda", "color": "#155724"},
          "FALSE": {"label": "False", "background": "#f8d7da", "color": "#721c24"}
        }
      },
      "validation_rules": {
        "case_sensitive": false,
        "trim_whitespace": true,
        "normalize_mapping": {
          "TRUE": ["true", "yes", "host-associated", "positive", "1"],
          "FALSE": ["false", "no", "free-living", "negative", "0"]
        }
      }
    },
    "plant_pathogenicity": {
      "type": "string",
      "required": false,
      "description": "Pathogenic to plants",
      "allowed_values": ["TRUE", "FALSE"],
      "visualization": {
        "color_mapping": {
          "TRUE": {"label": "True", "background": "#d4edda", "color": "#155724"},
          "FALSE": {"label": "False", "background": "#f8d7da", "color": "#721c24"}
        }
      },
      "validation_rules": {
        "case_sensitive": false,
        "trim_whitespace": true,
        "normalize_mapping": {
          "TRUE": ["true", "yes", "phytopathogenic", "plant pathogenic", "positive", "1"],
          "FALSE": ["false", "no", "non-phytopathogenic", "negative", "0"]
        }
      }
    },
    "spore_formation": {
      "type": "string",
      "required": false,
      "description": "Ability to form spores",
      "allowed_values": ["TRUE", "FALSE"],
      "visualization": {
        "color_mapping": {
          "TRUE": {"label": "True", "background": "#d4edda", "color": "#155724"},
          "FALSE": {"label": "False", "background": "#f8d7da", "color": "#721c24"}
        }
      },
      "validation_rules": {
        "case_sensitive": false,
        "trim_whitespace": true,
        "normalize_mapping": {
          "TRUE": ["true", "yes", "spore-forming", "sporulating", "positive", "1"],
          "FALSE": ["false", "no", "non-spore-forming", "vegetative", "negative", "0"]
        }
      }
    },
    "hemolysis": {
      "type": "string",
      "required": false,
      "description": "Hemolytic activity",
      "allowed_values": ["alpha", "beta", "gamma", "non-hemolytic"],
      "visualization": {
        "color_mapping": {
          "alpha": {"label": "Alpha", "background": "#cce5ff", "color": "#004085"},
          "beta": {"label": "Beta", "background": "#f8d7da", "color": "#721c24"},
          "gamma": {"label": "Gamma", "background": "#d1ecf1", "color": "#0c5460"},
          "non-hemolytic": {"label": "Non-hemolytic", "background": "#e2e3e5", "color": "#6c757d"}
        }
      },
      "validation_rules": {
        "case_sensitive": false,
        "trim_whitespace": true,
        "normalize_mapping": {
          "alpha": ["alpha", "α", "alpha-hemolytic"],
          "beta": ["beta", "β", "beta-hemolytic"],
          "gamma": ["gamma", "γ", "gamma-hemolytic"],
          "non-hemolytic": ["non-hemolytic", "non", "none", "no hemolysis"]
        }
      }
    },
    "cell_shape": {
      "type": "string",
      "required": false,
      "description": "Cellular morphology",
      "allowed_values": ["bacillus", "coccus", "spirillum", "tail"],
      "visualization": {
        "color_mapping": {
          "bacillus": {"label": "Rod", "background": "#f3e5f5", "color": "#4a148c"},
          "coccus": {"label": "Spherical", "background": "#e1f5fe", "color": "#01579b"},
          "spirillum": {"label": "Spiral", "background": "#fff8e1", "color": "#e65100"},
          "tail": {"label": "Tail", "background": "#e8f5e8", "color": "#2e7d32"}
        }
      },
      "validation_rules": {
        "case_sensitive": false,
        "trim_whitespace": true,
        "normalize_mapping": {
          "bacillus": ["bacillus", "rod", "rod-shaped", "bacilli"],
          "coccus": ["coccus", "sphere", "spherical", "cocci"],
          "spirillum": ["spirillum", "spiral", "helical", "spirilla"],
          "tail": ["tail", "appendage", "flagellar"]
        }
      }
    }
  },
  "parsing_instructions": {
    "json_extraction": {
      "method": "regex",
      "pattern": "\\{.*\\}",
      "flags": ["DOTALL"]
    },
    "fallback_parsing": {
      "enabled": true,
      "method": "line_based",
      "keywords": [
        "gram_staining", "motility", "aerophilicity", "extreme_environment_tolerance",
        "biofilm_formation", "animal_pathogenicity", "biosafety_level", 
        "health_association", "host_association", "plant_pathogenicity",
        "spore_formation", "hemolysis", "cell_shape"
      ]
    }
  },
  "success_criteria": {
    "minimum_required_fields": 0,
    "require_all_mandatory": false,
    "allow_extra_fields": true
  },
  "error_handling": {
    "on_parse_failure": "return_null",
    "on_validation_failure": "return_partial",
    "on_missing_required": "return_errors"
  }
} 
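
To make the validation rules above concrete, here is a minimal sketch of how "normalize_mapping", "case_sensitive": false, "trim_whitespace": true, and the array rules for aerophilicity might be applied. It is an illustration under assumptions, not the project's validator: only three fields are copied from the config, "return_partial" is interpreted as "keep the fields that validate and report the rest", and the names `normalize_value` and `validate_partial` are hypothetical.

```python
# Abbreviated copy of "field_definitions"; only three fields are included here.
FIELD_RULES = {
    "gram_staining": {
        "type": "string",
        "normalize_mapping": {
            "gram stain positive": ["gram stain positive", "gram positive", "gram+", "positive"],
            "gram stain negative": ["gram stain negative", "gram negative", "gram-", "negative"],
            "gram stain variable": ["gram stain variable", "variable"],
        },
    },
    "motility": {
        "type": "string",
        "normalize_mapping": {
            "TRUE": ["true", "yes", "motile", "positive", "1"],
            "FALSE": ["false", "no", "non-motile", "nonmotile", "immobile", "negative", "0"],
        },
    },
    "aerophilicity": {
        "type": "array",
        "max_values": 4,
        "normalize_mapping": {
            "aerobic": ["aerobic", "aerobe", "oxygen-requiring"],
            "aerotolerant": ["aerotolerant", "aerotolerance"],
            "anaerobic": ["anaerobic", "anaerobe", "oxygen-free"],
            "facultatively anaerobic": [
                "facultatively anaerobic", "facultative anaerobic", "facultative", "facultatively",
            ],
        },
    },
}


def normalize_value(value, mapping):
    """Map one raw value to its canonical form, or return None if unknown."""
    key = str(value).strip().lower()  # trim_whitespace + case_sensitive: false
    for canonical, synonyms in mapping.items():
        if key == canonical.lower() or key in synonyms:
            return canonical
    return None


def validate_partial(response):
    """Keep fields that normalize cleanly; collect the rest as errors (assumed semantics)."""
    valid, errors = {}, {}
    for field, raw in response.items():
        rules = FIELD_RULES.get(field)
        if rules is None:
            continue  # "allow_extra_fields": true, so unknown keys are ignored
        mapping = rules["normalize_mapping"]
        if rules["type"] == "array":
            items = raw if isinstance(raw, list) else [raw]  # allow_single_value
            normalized = [normalize_value(item, mapping) for item in items]
            if None in normalized or len(normalized) > rules["max_values"]:
                errors[field] = raw
            else:
                valid[field] = normalized
        else:
            normalized = normalize_value(raw, mapping)
            if normalized is None:
                errors[field] = raw
            else:
                valid[field] = normalized
    return valid, errors


# Hypothetical parsed reply: stray whitespace, a JSON boolean, a bare string, an extra key.
valid, errors = validate_partial(
    {"gram_staining": " Gram+ ", "motility": True, "aerophilicity": "Facultative", "note": "x"}
)
print(valid, errors)
```

Note that synonyms such as "positive" and "negative" appear under several fields, so normalization only makes sense per field, as done here.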

Alternative phenotype prediction template with gene-focused approach

About This Template

This template provides an alternative approach to phenotype prediction, potentially using different prompt structures or emphasizing genetic/genomic aspects. It tests how different formulations affect prediction accuracy and completeness.

Usage Context

When to use: Use this template to compare phenotype prediction consistency across different prompt formulations, or when you want to emphasize genetic/genomic information in predictions.

Typical workflow: Often used alongside template1_phenotype to evaluate prompt sensitivity and identify which formulation yields more accurate or complete phenotypic predictions.
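
One way to quantify the prompt sensitivity described above is per-field agreement between the two phenotype templates. The sketch below assumes both result sets have already been validated and normalized into `{species: {field: value}}` dictionaries; the function name and the sample values are placeholders for illustration, not real model outputs.

```python
def field_agreement(results_a, results_b):
    """Return the per-field share of species on which two result sets agree."""
    counts = {}  # field -> [matches, comparisons]
    for species in results_a.keys() & results_b.keys():
        a, b = results_a[species], results_b[species]
        for field in a.keys() & b.keys():
            tally = counts.setdefault(field, [0, 0])
            # Arrays such as aerophilicity are compared as unordered sets.
            left = frozenset(a[field]) if isinstance(a[field], list) else a[field]
            right = frozenset(b[field]) if isinstance(b[field], list) else b[field]
            tally[0] += left == right
            tally[1] += 1
    return {field: matches / total for field, (matches, total) in counts.items()}


# Placeholder values chosen only to exercise the function.
template1_results = {
    "Escherichia coli": {"motility": "TRUE", "aerophilicity": ["facultatively anaerobic"]},
    "Bacillus subtilis": {"motility": "TRUE", "aerophilicity": ["aerobic"]},
}
template2_results = {
    "Escherichia coli": {"motility": "TRUE", "aerophilicity": ["facultatively anaerobic"]},
    "Bacillus subtilis": {"motility": "FALSE", "aerophilicity": ["facultatively anaerobic", "aerobic"]},
}
agreement = field_agreement(template1_results, template2_results)
print(agreement)
```

Fields that one template omitted for a species are skipped rather than counted as disagreements, which fits a config where every phenotype field is optional.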

Template Configuration Files

Template Information
  • System template file: templates/system/template2_phenotype.txt
  • User template file: templates/user/template2_phenotype.txt
  • Validation config file: templates/validation/template2_phenotype.json
  • Template type: Phenotype
  • Character count: System: 1307, User: 813, Validation: 13782

Usage Notes
  • The system template sets the context and instructions for the AI model
  • The user template contains placeholders like {binomial_name} that get replaced with actual values
  • The validation config defines expected response structure and automatically normalizes LLM outputs
  • All three files work together to ensure consistent, validated results from the language model

Validation Details
  • Description: Alternative phenotype prediction template with gene-focused approach
  • Optional fields: gram_staining, motility, aerophilicity, extreme_environment_tolerance, biofilm_formation, animal_pathogenicity, biosafety_level, health_association, host_association, plant_pathogenicity, spore_formation, hemolysis, cell_shape
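
Each field in this validation config also carries a "visualization.color_mapping" with a label, a background color, and a text color per allowed value. How the viewer turns these into markup is not shown here; the sketch below assumes an inline-styled `<span>` badge, and `render_badge` is a hypothetical helper.

```python
from html import escape

# Copied from the "gram_staining" visualization block of the validation config.
GRAM_COLORS = {
    "gram stain positive": {"label": "Positive", "background": "#d4edda", "color": "#155724"},
    "gram stain negative": {"label": "Negative", "background": "#f8d7da", "color": "#721c24"},
    "gram stain variable": {"label": "Variable", "background": "#fff3cd", "color": "#856404"},
}


def render_badge(value, color_mapping):
    """Return an inline-styled <span> for a normalized value (assumed markup)."""
    style = color_mapping.get(value)
    if style is None:
        return escape(str(value))  # unmapped values fall back to plain text
    return '<span style="background: {bg}; color: {fg};">{label}</span>'.format(
        bg=style["background"], fg=style["color"], label=escape(style["label"])
    )


print(render_badge("gram stain negative", GRAM_COLORS))
```

The lookup expects the canonical value, so it should run after normalization; a raw reply such as "gram-" would otherwise fall through to plain text.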

template3_knowlege Templates

System: template3_knowlege.txt | User: template3_knowlege.txt

System Template

Defines the assistant's role and instructions
Determine the knowledge level for the binomial species name based on the extent of available data and research:

- limited: Species with minimal data and research, typically with few strains or subspecies (<5 strains, <2 subspecies), little genetic information (<10 scientific articles), no complete genome sequences, and limited presence in culture collections (absent or very few strains). This level indicates a lack of comprehensive studies, making it challenging to draw reliable conclusions about the species' characteristics and behavior. Examples of bacteria in this category might include newly discovered species or rare isolates, such as Chryseobacterium solincola or Bacillus eiseniae.
- moderate: Species with moderate data and research, with more strains or subspecies (5-10 strains, 2-4 subspecies), some genome sequencing (partial or one complete genome), moderate presence in culture collections, and a fair amount of scientific literature (10-50 articles). This level indicates a reasonable amount of study, but there might be gaps in understanding the full range of characteristics and applications. Examples of bacteria in this category could include species like Lactobacillus plantarum or Pseudomonas putida, which have been studied to some extent but may not have extensive research available.
- extensive: Species with comprehensive data and extensive research, having numerous strains or subspecies (>10 strains, >4 subspecies), multiple complete genome sequences, widespread presence in culture collections, and a wealth of scientific literature (>50 articles). This level indicates a vast amount of knowledge, allowing for highly accurate predictions and a thorough understanding of the species' characteristics and potential applications. Examples of bacteria in this category would include well-studied species such as Escherichia coli, Bacillus subtilis, or Streptococcus pneumoniae, which have been extensively researched and have a wealth of information available.

If the species name is not a real or recognized species, or if there is no information available to determine the knowledge level, respond with NA.

User Template

Defines the user's query format with placeholders
Respond with a JSON object for {binomial_name} with the knowledge level category in lowercase in this format:

{
    "knowledge_group": "<limited|moderate|extensive|NA>"
}

Validation Config

Defines expected response structure and validation rules
{
  "template_info": {
    "name": "template3_knowledge",
    "type": "knowledge",
    "description": "Alternative knowledge level assessment template with NA support",
    "version": "1.0",
    "purpose": "This template provides an alternative prompt structure for evaluating scientific knowledge about bacterial species. It includes explicit NA support to test how different prompt formulations affect model responses and calibration.",
    "usage_context": {
      "when_to_use": "Use this template to compare how different prompt structures influence knowledge assessment consistency across models.",
      "typical_workflow": "Often used alongside template1 and template2 to evaluate prompt sensitivity and identify the most reliable formulation for knowledge assessment."
    },
    "interpretation_guide": {
      "limited": "Organisms with minimal scientific literature, often newly discovered or understudied species. These may have basic taxonomic information but lack detailed phenotypic or genomic characterization.",
      "moderate": "Organisms with a reasonable body of research including some genomic data, basic phenotypic characterization, and presence in multiple studies. Not model organisms but reasonably well-documented.",
      "extensive": "Well-studied model organisms or pathogens with comprehensive literature, complete genomes, extensive phenotypic data, and often used in research. Examples include E. coli, B. subtilis, or major pathogens.",
      "NA": "The model cannot assess the knowledge level or is uncertain. This response helps evaluate model calibration and self-awareness."
    },
    "quality_indicators": {
      "high_quality_response": "Consistent categorizations across different prompt formulations, with appropriate use of NA for uncertain cases",
      "low_quality_response": "Highly variable responses to different prompts for the same species, or inconsistent NA usage"
    }
  },
  "expected_response": {
    "format": "json",
    "required_fields": [
      "knowledge_group"
    ],
    "optional_fields": []
  },
  "field_definitions": {
    "knowledge_group": {
      "type": "string",
      "required": true,
      "description": "Knowledge level category for the organism",
      "allowed_values": [
        "limited",
        "moderate", 
        "extensive",
        "NA"
      ],
      "validation_rules": {
        "case_sensitive": false,
        "trim_whitespace": true,
        "normalize_mapping": {
          "limited": ["limited", "minimal", "basic", "low", "little", "poor"],
          "moderate": ["moderate", "medium", "intermediate", "fair", "some"],
          "extensive": ["extensive", "comprehensive", "detailed", "high", "full", "complete", "thorough"],
          "NA": ["na", "n/a", "n.a.", "not available", "not applicable", "unknown", "unavailable", "none", "null", "no data", "no information"]
        }
      },
      "validation_error_messages": {
        "missing": "Required field 'knowledge_group' is missing from response",
        "invalid_value": "Invalid knowledge level. Expected one of: limited, moderate, extensive, NA",
        "wrong_type": "Field 'knowledge_group' must be a string"
      }
    }
  },
  "parsing_instructions": {
    "json_extraction": {
      "method": "regex",
      "pattern": "\\{.*\\}",
      "flags": ["DOTALL"]
    },
    "fallback_parsing": {
      "enabled": true,
      "method": "keyword_search",
      "keywords": ["knowledge_group", "knowledge level", "level"]
    }
  },
  "success_criteria": {
    "minimum_required_fields": 1,
    "require_all_mandatory": true,
    "allow_extra_fields": false
  },
  "error_handling": {
    "on_parse_failure": "return_null",
    "on_validation_failure": "return_errors",
    "on_missing_required": "return_errors"
  }
} 
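
As an illustration of how this config's single required field could be checked, the sketch below applies the normalization mapping and returns the matching error message from "validation_error_messages". The mapping and messages are copied from the config above; the function name and the (value, errors) return shape are assumptions, loosely modelled on "on_validation_failure": "return_errors".

```python
NORMALIZE_MAPPING = {
    "limited": ["limited", "minimal", "basic", "low", "little", "poor"],
    "moderate": ["moderate", "medium", "intermediate", "fair", "some"],
    "extensive": ["extensive", "comprehensive", "detailed", "high", "full", "complete", "thorough"],
    "NA": ["na", "n/a", "n.a.", "not available", "not applicable", "unknown",
           "unavailable", "none", "null", "no data", "no information"],
}

ERROR_MESSAGES = {
    "missing": "Required field 'knowledge_group' is missing from response",
    "invalid_value": "Invalid knowledge level. Expected one of: limited, moderate, extensive, NA",
    "wrong_type": "Field 'knowledge_group' must be a string",
}


def validate_knowledge_group(response):
    """Return (normalized_value, errors); the value is None when validation fails."""
    if "knowledge_group" not in response:
        return None, [ERROR_MESSAGES["missing"]]
    raw = response["knowledge_group"]
    if not isinstance(raw, str):
        return None, [ERROR_MESSAGES["wrong_type"]]
    key = raw.strip().lower()  # trim_whitespace + case_sensitive: false
    for canonical, synonyms in NORMALIZE_MAPPING.items():
        if key in synonyms:
            return canonical, []
    return None, [ERROR_MESSAGES["invalid_value"]]


print(validate_knowledge_group({"knowledge_group": " Comprehensive "}))
print(validate_knowledge_group({"knowledge_group": "N/A"}))
print(validate_knowledge_group({}))
```

Under this reading, an explicit "unknown" or "none" from the model is normalized to NA rather than rejected, which is what separates "cannot assess" from an invalid reply.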

Alternative knowledge level assessment template with NA support

About This Template

This template provides an alternative prompt structure for evaluating scientific knowledge about bacterial species. It includes explicit NA support to test how different prompt formulations affect model responses and calibration.

Usage Context

When to use: Use this template to compare how different prompt structures influence knowledge assessment consistency across models.

Typical workflow: Often used alongside template1 and template2 to evaluate prompt sensitivity and identify the most reliable formulation for knowledge assessment.

Template Configuration Files

Template Information
  • System template file: templates/system/template3_knowlege.txt
  • User template file: templates/user/template3_knowlege.txt
  • Validation config file: templates/validation/template3_knowlege.json
  • Template type: Knowledge
  • Character count: System: 2149, User: 171, Validation: 3782

Usage Notes
  • The system template sets the context and instructions for the AI model
  • The user template contains placeholders like {binomial_name} that get replaced with actual values
  • The validation config defines expected response structure and automatically normalizes LLM outputs
  • All three files work together to ensure consistent, validated results from the language model

Validation Details
  • Description: Alternative knowledge level assessment template with NA support
  • Required fields: knowledge_group

template1_knowledge_aerophilicity Templates

System: template1_knowledge_aerophilicity.txt | User: template1_knowledge_aerophilicity.txt

System Template

Defines the assistant's role and instructions
Determine your level of scientific knowledge specifically about the aerophilicity phenotype of the given binomial species name, based on the depth of species-specific literature describing oxygen requirements and respiratory lifestyle (aerobic, anaerobic, facultative, aerotolerant):

- limited: Little or no species-specific literature about this organism's aerophilicity. You cannot confidently state this phenotype except by generic inference from its genus or higher taxonomic rank.
- moderate: Some species-specific information on aerophilicity is available, including basic oxygen-tolerance observations or growth-condition notes on standard media, or a small number of primary-literature reports.
- extensive: Comprehensive species-specific literature on aerophilicity, including detailed respiratory-metabolism studies (terminal oxidases, electron-transport chain components, anaerobic energy conservation, oxygen-sensitivity thresholds), and consistent coverage across multiple independent publications.

If the species name is not a real or recognized species, or if you cannot meaningfully separate your knowledge of its aerophilicity from a generic taxonomic assumption, respond with NA.

User Template

Defines the user's query format with placeholders
Respond with a JSON object for {binomial_name} indicating your level of species-specific scientific knowledge about its aerophilicity phenotype, in lowercase, in this format:

{
    "knowledge_group": "<limited|moderate|extensive|NA>"
}

Validation Config

Defines expected response structure and validation rules
{
  "template_info": {
    "name": "template1_knowledge_aerophilicity",
    "type": "knowledge",
    "description": "Trait-specific knowledge-level assessment template (aerophilicity).",
    "version": "1.0",
    "purpose": "Member of the trait-audit battery generated for the reviewer rebuttal: asks the model, for a single phenotype, how much species-specific knowledge it holds. Used alongside the species-level knowledge-rating template to test whether species-level confidence approximates the aggregate of trait-specific confidences across all 13 phenotypes.",
    "usage_context": {
      "when_to_use": "Run together with the other 12 trait-specific knowledge templates and the species-level knowledge-rating template on a shared species sample; compare the species-level rating to aggregates (mean/max/mode) of the 13 trait-specific ratings.",
      "typical_workflow": "(1) Pick a species sample (e.g. trait_audit_sample.txt). (2) Run species-level template3_knowlege. (3) Run all 13 trait-specific template1_knowledge_* templates on the same sample + model. (4) Aggregate the 13 trait ratings per species and correlate with the species-level rating."
    },
    "interpretation_guide": {
      "limited": "Only generic, taxonomy-derived expectation about aerophilicity; no species-specific literature recalled.",
      "moderate": "Some species-specific aerophilicity observations or mechanistic notes recalled.",
      "extensive": "Rich species-specific aerophilicity literature recalled, including mechanistic, regulatory, and phenotypic detail.",
      "NA": "Cannot place the species in any of the above tiers (unrecognised name, or no meaningful trait-specific signal beyond taxonomic inference)."
    },
    "quality_indicators": {
      "high_quality_response": "Clear categorisation that is coherent with other trait-specific ratings for the same species.",
      "low_quality_response": "Categorisation appears to mirror taxonomic family-level guesses rather than species-specific recall."
    }
  },
  "expected_response": {
    "format": "json",
    "required_fields": [
      "knowledge_group"
    ],
    "optional_fields": []
  },
  "field_definitions": {
    "knowledge_group": {
      "type": "string",
      "required": true,
      "description": "Trait-specific knowledge level for aerophilicity of the given organism.",
      "allowed_values": [
        "limited",
        "moderate",
        "extensive",
        "NA"
      ],
      "validation_rules": {
        "case_sensitive": false,
        "trim_whitespace": true,
        "normalize_mapping": {
          "limited": [
            "limited",
            "minimal",
            "basic",
            "low",
            "little",
            "poor"
          ],
          "moderate": [
            "moderate",
            "medium",
            "intermediate",
            "fair",
            "some"
          ],
          "extensive": [
            "extensive",
            "comprehensive",
            "detailed",
            "high",
            "full",
            "complete",
            "thorough"
          ],
          "NA": [
            "na",
            "n/a",
            "none",
            "unknown",
            "not applicable"
          ]
        }
      },
      "validation_error_messages": {
        "missing": "Required field 'knowledge_group' is missing from response",
        "invalid_value": "Invalid knowledge level. Expected one of: limited, moderate, extensive, NA",
        "wrong_type": "Field 'knowledge_group' must be a string"
      }
    }
  },
  "parsing_instructions": {
    "json_extraction": {
      "method": "regex",
      "pattern": "\\{.*\\}",
      "flags": [
        "DOTALL"
      ]
    },
    "fallback_parsing": {
      "enabled": true,
      "method": "keyword_search",
      "keywords": [
        "knowledge_group",
        "knowledge level",
        "level"
      ]
    }
  },
  "success_criteria": {
    "minimum_required_fields": 1,
    "require_all_mandatory": true,
    "allow_extra_fields": false
  },
  "error_handling": {
    "on_parse_failure": "return_null",
    "on_validation_failure": "return_errors",
    "on_missing_required": "return_errors"
  }
}

Trait-specific knowledge-level assessment template (aerophilicity).

About This Template

Member of the trait-audit battery generated for the reviewer rebuttal: asks the model, for a single phenotype, how much species-specific knowledge it holds. Used alongside the species-level knowledge-rating template to test whether species-level confidence approximates the aggregate of trait-specific confidences across all 13 phenotypes.

Usage Context

When to use: Run together with the other 12 trait-specific knowledge templates and the species-level knowledge-rating template on a shared species sample; compare the species-level rating to aggregates (mean/max/mode) of the 13 trait-specific ratings.

Typical workflow: (1) Pick a species sample (e.g. trait_audit_sample.txt). (2) Run species-level template3_knowlege. (3) Run all 13 trait-specific template1_knowledge_* templates on the same sample + model. (4) Aggregate the 13 trait ratings per species and correlate with the species-level rating.
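
Step (4) of the workflow can be sketched as follows. This is a minimal illustration, not MicrobeLLM's own code: the numeric scores in `SCORE` and the helper name `aggregate_trait_ratings` are assumptions, since the templates define only the labels.

```python
from statistics import mean, mode

# Assumed ordinal encoding; the templates define labels, not numbers.
SCORE = {"limited": 1, "moderate": 2, "extensive": 3}

def aggregate_trait_ratings(ratings):
    """Summarise one species' trait ratings; "NA" entries are skipped."""
    scores = [SCORE[r] for r in ratings if r in SCORE]
    if not scores:
        return None  # nothing to aggregate for this species
    return {
        "mean": mean(scores),
        "max": max(scores),
        "mode": mode(scores),
        "n": len(scores),
    }

# Hypothetical ratings for one species (a real run would supply 13).
example = ["limited", "moderate", "moderate", "NA", "extensive"]
summary = aggregate_trait_ratings(example)
```

Skipping NA before aggregating is a design choice; counting NA as the lowest tier would be an equally defensible alternative and would lower the mean.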

Template Configuration Files

Template Information
  • System template file: templates/system/template1_knowledge_aerophilicity.txt
  • User template file: templates/user/template1_knowledge_aerophilicity.txt
  • Validation config file: templates/validation/template1_knowledge_aerophilicity.json
  • Template type: Knowledge
  • Character count: System: 1200, User: 237, Validation: 4195

Usage Notes
  • The system template sets the context and instructions for the AI model
  • The user template contains placeholders like {binomial_name} that get replaced with actual values
  • The validation config defines expected response structure and automatically normalizes LLM outputs
  • All three files work together to ensure consistent, validated results from the language model

Validation Details
  • Description: Trait-specific knowledge-level assessment template (aerophilicity).
  • Required fields: knowledge_group
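
The normalization that the validation config describes (`case_sensitive: false`, `trim_whitespace: true`, and the `normalize_mapping` synonym lists) could be applied roughly as below. The mapping is copied from the config; the function name `normalize_knowledge_group` and the choice to return `None` for unmapped values are assumptions made for this sketch.

```python
# Synonym lists copied from the validation config's normalize_mapping.
NORMALIZE_MAPPING = {
    "limited": ["limited", "minimal", "basic", "low", "little", "poor"],
    "moderate": ["moderate", "medium", "intermediate", "fair", "some"],
    "extensive": ["extensive", "comprehensive", "detailed", "high",
                  "full", "complete", "thorough"],
    "NA": ["na", "n/a", "none", "unknown", "not applicable"],
}

# Invert once so every synonym points at its canonical label.
_LOOKUP = {
    synonym: canonical
    for canonical, synonyms in NORMALIZE_MAPPING.items()
    for synonym in synonyms
}

def normalize_knowledge_group(value):
    """Return the canonical label, or None if the value cannot be mapped."""
    if not isinstance(value, str):
        return None
    # Trim whitespace and lowercase, matching the config's validation_rules.
    return _LOOKUP.get(value.strip().lower())
```

For example, `normalize_knowledge_group(" High ")` maps to `extensive`, while a value outside every synonym list yields `None` and would be reported as invalid.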

template1_knowledge_animal_pathogenicity Templates

System: template1_knowledge_animal_pathogenicity.txt | User: template1_knowledge_animal_pathogenicity.txt

System Template

Defines the assistant's role and instructions
Determine your level of scientific knowledge specifically about the animal pathogenicity phenotype of the given binomial species name, based on the depth of species-specific literature describing pathogenic potential toward animals (including humans), disease manifestations, and virulence determinants:

- limited: Little or no species-specific literature about this organism's animal pathogenicity. You cannot confidently state this phenotype except by generic inference from its genus or higher taxonomic rank.
- moderate: Some species-specific information on animal pathogenicity is available, including reports of clinical isolation or basic virulence observations, or a small number of primary-literature reports.
- extensive: Comprehensive species-specific literature on animal pathogenicity, including detailed virulence mechanisms (toxins, adhesins, immune evasion), defined infection models, and epidemiological characterisation, and consistent coverage across multiple independent publications.

If the species name is not a real or recognized species, or if you cannot meaningfully separate your knowledge of its animal pathogenicity from a generic taxonomic assumption, respond with NA.

User Template

Defines the user's query format with placeholders
Respond with a JSON object for {binomial_name} indicating your level of species-specific scientific knowledge about its animal pathogenicity phenotype, in lowercase, in this format:

{
    "knowledge_group": "<limited|moderate|extensive|NA>"
}
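
Because the user template contains literal JSON braces next to the `{binomial_name}` placeholder, plain string replacement is a safer way to fill it than `str.format`, which would try to interpret the JSON braces as replacement fields. The sketch below assumes the template text has already been read into a string; `render_user_template` is a hypothetical helper, not a documented MicrobeLLM function.

```python
# Text of the user template shown above, held in a string for illustration.
USER_TEMPLATE = (
    "Respond with a JSON object for {binomial_name} indicating your level of "
    "species-specific scientific knowledge about its animal pathogenicity "
    "phenotype, in lowercase, in this format:\n\n"
    "{\n"
    '    "knowledge_group": "<limited|moderate|extensive|NA>"\n'
    "}\n"
)

def render_user_template(template, binomial_name):
    """Fill the species placeholder without touching the JSON braces."""
    return template.replace("{binomial_name}", binomial_name)

prompt = render_user_template(USER_TEMPLATE, "Escherichia coli")
```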

Validation Config

Defines expected response structure and validation rules
{
  "template_info": {
    "name": "template1_knowledge_animal_pathogenicity",
    "type": "knowledge",
    "description": "Trait-specific knowledge-level assessment template (animal pathogenicity).",
    "version": "1.0",
    "purpose": "Member of the trait-audit battery generated for the reviewer rebuttal: asks the model, for a single phenotype, how much species-specific knowledge it holds. Used alongside the species-level knowledge-rating template to test whether species-level confidence approximates the aggregate of trait-specific confidences across all 13 phenotypes.",
    "usage_context": {
      "when_to_use": "Run together with the other 12 trait-specific knowledge templates and the species-level knowledge-rating template on a shared species sample; compare the species-level rating to aggregates (mean/max/mode) of the 13 trait-specific ratings.",
      "typical_workflow": "(1) Pick a species sample (e.g. trait_audit_sample.txt). (2) Run species-level template3_knowlege. (3) Run all 13 trait-specific template1_knowledge_* templates on the same sample + model. (4) Aggregate the 13 trait ratings per species and correlate with the species-level rating."
    },
    "interpretation_guide": {
      "limited": "Only generic, taxonomy-derived expectation about animal pathogenicity; no species-specific literature recalled.",
      "moderate": "Some species-specific animal pathogenicity observations or mechanistic notes recalled.",
      "extensive": "Rich species-specific animal pathogenicity literature recalled, including mechanistic, regulatory, and phenotypic detail.",
      "NA": "Cannot place the species in any of the above tiers (unrecognised name, or no meaningful trait-specific signal beyond taxonomic inference)."
    },
    "quality_indicators": {
      "high_quality_response": "Clear categorisation that is coherent with other trait-specific ratings for the same species.",
      "low_quality_response": "Categorisation appears to mirror taxonomic family-level guesses rather than species-specific recall."
    }
  },
  "expected_response": {
    "format": "json",
    "required_fields": [
      "knowledge_group"
    ],
    "optional_fields": []
  },
  "field_definitions": {
    "knowledge_group": {
      "type": "string",
      "required": true,
      "description": "Trait-specific knowledge level for animal pathogenicity of the given organism.",
      "allowed_values": [
        "limited",
        "moderate",
        "extensive",
        "NA"
      ],
      "validation_rules": {
        "case_sensitive": false,
        "trim_whitespace": true,
        "normalize_mapping": {
          "limited": [
            "limited",
            "minimal",
            "basic",
            "low",
            "little",
            "poor"
          ],
          "moderate": [
            "moderate",
            "medium",
            "intermediate",
            "fair",
            "some"
          ],
          "extensive": [
            "extensive",
            "comprehensive",
            "detailed",
            "high",
            "full",
            "complete",
            "thorough"
          ],
          "NA": [
            "na",
            "n/a",
            "none",
            "unknown",
            "not applicable"
          ]
        }
      },
      "validation_error_messages": {
        "missing": "Required field 'knowledge_group' is missing from response",
        "invalid_value": "Invalid knowledge level. Expected one of: limited, moderate, extensive, NA",
        "wrong_type": "Field 'knowledge_group' must be a string"
      }
    }
  },
  "parsing_instructions": {
    "json_extraction": {
      "method": "regex",
      "pattern": "\\{.*\\}",
      "flags": [
        "DOTALL"
      ]
    },
    "fallback_parsing": {
      "enabled": true,
      "method": "keyword_search",
      "keywords": [
        "knowledge_group",
        "knowledge level",
        "level"
      ]
    }
  },
  "success_criteria": {
    "minimum_required_fields": 1,
    "require_all_mandatory": true,
    "allow_extra_fields": false
  },
  "error_handling": {
    "on_parse_failure": "return_null",
    "on_validation_failure": "return_errors",
    "on_missing_required": "return_errors"
  }
}

Trait-specific knowledge-level assessment template (animal pathogenicity).

About This Template

This template belongs to the trait-audit battery generated for the reviewer rebuttal. It asks the model how much species-specific knowledge it holds about a single phenotype (here, animal pathogenicity). It is used alongside the species-level knowledge-rating template to test whether species-level confidence approximates the aggregate of the trait-specific confidences across all 13 phenotypes.

Usage Context

When to use: Run together with the other 12 trait-specific knowledge templates and the species-level knowledge-rating template on a shared species sample; compare the species-level rating to aggregates (mean/max/mode) of the 13 trait-specific ratings.

Typical workflow: (1) Pick a species sample (e.g. trait_audit_sample.txt). (2) Run species-level template3_knowlege. (3) Run all 13 trait-specific template1_knowledge_* templates on the same sample + model. (4) Aggregate the 13 trait ratings per species and correlate with the species-level rating.

Template Configuration Files

Template Information
  • System template file: templates/system/template1_knowledge_animal_pathogenicity.txt
  • User template file: templates/user/template1_knowledge_animal_pathogenicity.txt
  • Validation config file: templates/validation/template1_knowledge_animal_pathogenicity.json
  • Template type: Knowledge
  • Character count: System: 1200, User: 244, Validation: 4237

Usage Notes
  • The system template sets the context and instructions for the AI model
  • The user template contains placeholders like {binomial_name} that get replaced with actual values
  • The validation config defines expected response structure and automatically normalizes LLM outputs
  • All three files work together to ensure consistent, validated results from the language model

Validation Details
  • Description: Trait-specific knowledge-level assessment template (animal pathogenicity).
  • Required fields: knowledge_group
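
The `parsing_instructions.json_extraction` block in the validation config names a regex pattern with the `DOTALL` flag, and `error_handling.on_parse_failure` is set to `return_null`. A minimal sketch of that behaviour, assuming the raw model reply is a string, might look like this (`extract_json` is a hypothetical name):

```python
import json
import re

# Pattern and flag as given in the validation config (JSON-unescaped).
JSON_PATTERN = re.compile(r"\{.*\}", re.DOTALL)

def extract_json(raw_response):
    """Return the first-to-last-brace span parsed as a dict, else None."""
    match = JSON_PATTERN.search(raw_response)
    if match is None:
        return None  # no braces at all: treated as a parse failure
    try:
        parsed = json.loads(match.group(0))
    except json.JSONDecodeError:
        return None  # mirrors on_parse_failure: return_null
    return parsed if isinstance(parsed, dict) else None
```

The pattern is greedy, so it spans from the first `{` to the last `}` in the reply. A reply that contains two separate JSON objects therefore fails to parse in this sketch rather than returning the first object.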

template1_knowledge_biofilm_formation Templates

System: template1_knowledge_biofilm_formation.txt | User: template1_knowledge_biofilm_formation.txt

System Template

Defines the assistant's role and instructions
Determine your level of scientific knowledge specifically about the biofilm formation phenotype of the given binomial species name, based on the depth of species-specific literature describing ability to form biofilms, surface adhesion, and biofilm architecture:

- limited: Little or no species-specific literature about this organism's biofilm formation. You cannot confidently state this phenotype except by generic inference from its genus or higher taxonomic rank.
- moderate: Some species-specific information on biofilm formation is available, including basic observations of biofilm growth or adhesion assays on standard substrates, or a small number of primary-literature reports.
- extensive: Comprehensive species-specific literature on biofilm formation, including mechanistic studies (quorum sensing, exopolysaccharide composition, dispersal signalling) and reproducible experimental biofilm characterisation, and consistent coverage across multiple independent publications.

If the species name is not a real or recognized species, or if you cannot meaningfully separate your knowledge of its biofilm formation from a generic taxonomic assumption, respond with NA.

User Template

Defines the user's query format with placeholders
Respond with a JSON object for {binomial_name} indicating your level of species-specific scientific knowledge about its biofilm formation phenotype, in lowercase, in this format:

{
    "knowledge_group": "<limited|moderate|extensive|NA>"
}

Validation Config

Defines expected response structure and validation rules
{
  "template_info": {
    "name": "template1_knowledge_biofilm_formation",
    "type": "knowledge",
    "description": "Trait-specific knowledge-level assessment template (biofilm formation).",
    "version": "1.0",
    "purpose": "Member of the trait-audit battery generated for the reviewer rebuttal: asks the model, for a single phenotype, how much species-specific knowledge it holds. Used alongside the species-level knowledge-rating template to test whether species-level confidence approximates the aggregate of trait-specific confidences across all 13 phenotypes.",
    "usage_context": {
      "when_to_use": "Run together with the other 12 trait-specific knowledge templates and the species-level knowledge-rating template on a shared species sample; compare the species-level rating to aggregates (mean/max/mode) of the 13 trait-specific ratings.",
      "typical_workflow": "(1) Pick a species sample (e.g. trait_audit_sample.txt). (2) Run species-level template3_knowlege. (3) Run all 13 trait-specific template1_knowledge_* templates on the same sample + model. (4) Aggregate the 13 trait ratings per species and correlate with the species-level rating."
    },
    "interpretation_guide": {
      "limited": "Only generic, taxonomy-derived expectation about biofilm formation; no species-specific literature recalled.",
      "moderate": "Some species-specific biofilm formation observations or mechanistic notes recalled.",
      "extensive": "Rich species-specific biofilm formation literature recalled, including mechanistic, regulatory, and phenotypic detail.",
      "NA": "Cannot place the species in any of the above tiers (unrecognised name, or no meaningful trait-specific signal beyond taxonomic inference)."
    },
    "quality_indicators": {
      "high_quality_response": "Clear categorisation that is coherent with other trait-specific ratings for the same species.",
      "low_quality_response": "Categorisation appears to mirror taxonomic family-level guesses rather than species-specific recall."
    }
  },
  "expected_response": {
    "format": "json",
    "required_fields": [
      "knowledge_group"
    ],
    "optional_fields": []
  },
  "field_definitions": {
    "knowledge_group": {
      "type": "string",
      "required": true,
      "description": "Trait-specific knowledge level for biofilm formation of the given organism.",
      "allowed_values": [
        "limited",
        "moderate",
        "extensive",
        "NA"
      ],
      "validation_rules": {
        "case_sensitive": false,
        "trim_whitespace": true,
        "normalize_mapping": {
          "limited": [
            "limited",
            "minimal",
            "basic",
            "low",
            "little",
            "poor"
          ],
          "moderate": [
            "moderate",
            "medium",
            "intermediate",
            "fair",
            "some"
          ],
          "extensive": [
            "extensive",
            "comprehensive",
            "detailed",
            "high",
            "full",
            "complete",
            "thorough"
          ],
          "NA": [
            "na",
            "n/a",
            "none",
            "unknown",
            "not applicable"
          ]
        }
      },
      "validation_error_messages": {
        "missing": "Required field 'knowledge_group' is missing from response",
        "invalid_value": "Invalid knowledge level. Expected one of: limited, moderate, extensive, NA",
        "wrong_type": "Field 'knowledge_group' must be a string"
      }
    }
  },
  "parsing_instructions": {
    "json_extraction": {
      "method": "regex",
      "pattern": "\\{.*\\}",
      "flags": [
        "DOTALL"
      ]
    },
    "fallback_parsing": {
      "enabled": true,
      "method": "keyword_search",
      "keywords": [
        "knowledge_group",
        "knowledge level",
        "level"
      ]
    }
  },
  "success_criteria": {
    "minimum_required_fields": 1,
    "require_all_mandatory": true,
    "allow_extra_fields": false
  },
  "error_handling": {
    "on_parse_failure": "return_null",
    "on_validation_failure": "return_errors",
    "on_missing_required": "return_errors"
  }
}

Trait-specific knowledge-level assessment template (biofilm formation).

About This Template

This template belongs to the trait-audit battery generated for the reviewer rebuttal. It asks the model how much species-specific knowledge it holds about a single phenotype (here, biofilm formation). It is used alongside the species-level knowledge-rating template to test whether species-level confidence approximates the aggregate of the trait-specific confidences across all 13 phenotypes.

Usage Context

When to use: Run together with the other 12 trait-specific knowledge templates and the species-level knowledge-rating template on a shared species sample; compare the species-level rating to aggregates (mean/max/mode) of the 13 trait-specific ratings.

Typical workflow: (1) Pick a species sample (e.g. trait_audit_sample.txt). (2) Run species-level template3_knowlege. (3) Run all 13 trait-specific template1_knowledge_* templates on the same sample + model. (4) Aggregate the 13 trait ratings per species and correlate with the species-level rating.
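
The workflow does not say which correlation to use in step (4). Since the ratings are ordinal, a rank correlation is one reasonable choice; the sketch below implements Spearman's coefficient with average ranks for ties, using only the standard library. The function names and the example numbers are illustrative assumptions, not results.

```python
def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; None when either input has zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return None
    return cov / (var_x * var_y) ** 0.5

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical data: species-level scores vs. mean trait-level scores.
species_level = [1, 2, 3, 3]
trait_mean = [1.2, 1.9, 2.5, 2.8]
rho = spearman(species_level, trait_mean)
```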

Template Configuration Files

Template Information
  • System template file: templates/system/template1_knowledge_biofilm_formation.txt
  • User template file: templates/user/template1_knowledge_biofilm_formation.txt
  • Validation config file: templates/validation/template1_knowledge_biofilm_formation.json
  • Template type: Knowledge
  • Character count: System: 1180, User: 241, Validation: 4219

Usage Notes
  • The system template sets the context and instructions for the AI model
  • The user template contains placeholders like {binomial_name} that get replaced with actual values
  • The validation config defines expected response structure and automatically normalizes LLM outputs
  • All three files work together to ensure consistent, validated results from the language model

Validation Details
  • Description: Trait-specific knowledge-level assessment template (biofilm formation).
  • Required fields: knowledge_group
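
The `field_definitions`, `success_criteria`, and `error_handling` blocks together imply a check that returns a list of error messages rather than raising. The sketch below is one possible reading: the first three messages are copied from `validation_error_messages`, while the "Unexpected fields" message and the function name are assumptions. It checks canonical labels only and does not apply the synonym mapping.

```python
ALLOWED_VALUES = {"limited", "moderate", "extensive", "NA"}
_ALLOWED_LOWER = {value.lower() for value in ALLOWED_VALUES}

def validate_response(parsed):
    """Return a list of error strings; an empty list means the reply is valid."""
    errors = []
    if "knowledge_group" not in parsed:
        errors.append("Required field 'knowledge_group' is missing from response")
    else:
        value = parsed["knowledge_group"]
        if not isinstance(value, str):
            errors.append("Field 'knowledge_group' must be a string")
        elif value.strip().lower() not in _ALLOWED_LOWER:
            # case_sensitive is false and trim_whitespace is true in the config
            errors.append(
                "Invalid knowledge level. "
                "Expected one of: limited, moderate, extensive, NA"
            )
    # allow_extra_fields is false; this message text is hypothetical.
    extra = sorted(set(parsed) - {"knowledge_group"})
    if extra:
        errors.append(f"Unexpected fields: {', '.join(extra)}")
    return errors
```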

template1_knowledge_biosafety_level Templates

System: template1_knowledge_biosafety_level.txt | User: template1_knowledge_biosafety_level.txt

System Template

Defines the assistant's role and instructions
Determine your level of scientific knowledge specifically about the biosafety level phenotype of the given binomial species name, based on the depth of species-specific literature describing biosafety classification based on risk assessment, transmissibility, and disease severity:

- limited: Little or no species-specific literature about this organism's biosafety level. You cannot confidently state this phenotype except by generic inference from its genus or higher taxonomic rank.
- moderate: Some species-specific information on biosafety level is available, including an assigned biosafety level from a recognised authority (ABSA, WHO, national lists), or a small number of primary-literature reports.
- extensive: Comprehensive species-specific literature on biosafety level, including detailed risk-group discussion including exposure routes, laboratory-acquired-infection history, and containment recommendations, and consistent coverage across multiple independent publications.

If the species name is not a real or recognized species, or if you cannot meaningfully separate your knowledge of its biosafety level from a generic taxonomic assumption, respond with NA.

User Template

Defines the user's query format with placeholders
Respond with a JSON object for {binomial_name} indicating your level of species-specific scientific knowledge about its biosafety level phenotype, in lowercase, in this format:

{
    "knowledge_group": "<limited|moderate|extensive|NA>"
}

Validation Config

Defines expected response structure and validation rules
{
  "template_info": {
    "name": "template1_knowledge_biosafety_level",
    "type": "knowledge",
    "description": "Trait-specific knowledge-level assessment template (biosafety level).",
    "version": "1.0",
    "purpose": "Member of the trait-audit battery generated for the reviewer rebuttal: asks the model, for a single phenotype, how much species-specific knowledge it holds. Used alongside the species-level knowledge-rating template to test whether species-level confidence approximates the aggregate of trait-specific confidences across all 13 phenotypes.",
    "usage_context": {
      "when_to_use": "Run together with the other 12 trait-specific knowledge templates and the species-level knowledge-rating template on a shared species sample; compare the species-level rating to aggregates (mean/max/mode) of the 13 trait-specific ratings.",
      "typical_workflow": "(1) Pick a species sample (e.g. trait_audit_sample.txt). (2) Run species-level template3_knowlege. (3) Run all 13 trait-specific template1_knowledge_* templates on the same sample + model. (4) Aggregate the 13 trait ratings per species and correlate with the species-level rating."
    },
    "interpretation_guide": {
      "limited": "Only generic, taxonomy-derived expectation about biosafety level; no species-specific literature recalled.",
      "moderate": "Some species-specific biosafety level observations or mechanistic notes recalled.",
      "extensive": "Rich species-specific biosafety level literature recalled, including mechanistic, regulatory, and phenotypic detail.",
      "NA": "Cannot place the species in any of the above tiers (unrecognised name, or no meaningful trait-specific signal beyond taxonomic inference)."
    },
    "quality_indicators": {
      "high_quality_response": "Clear categorisation that is coherent with other trait-specific ratings for the same species.",
      "low_quality_response": "Categorisation appears to mirror taxonomic family-level guesses rather than species-specific recall."
    }
  },
  "expected_response": {
    "format": "json",
    "required_fields": [
      "knowledge_group"
    ],
    "optional_fields": []
  },
  "field_definitions": {
    "knowledge_group": {
      "type": "string",
      "required": true,
      "description": "Trait-specific knowledge level for biosafety level of the given organism.",
      "allowed_values": [
        "limited",
        "moderate",
        "extensive",
        "NA"
      ],
      "validation_rules": {
        "case_sensitive": false,
        "trim_whitespace": true,
        "normalize_mapping": {
          "limited": [
            "limited",
            "minimal",
            "basic",
            "low",
            "little",
            "poor"
          ],
          "moderate": [
            "moderate",
            "medium",
            "intermediate",
            "fair",
            "some"
          ],
          "extensive": [
            "extensive",
            "comprehensive",
            "detailed",
            "high",
            "full",
            "complete",
            "thorough"
          ],
          "NA": [
            "na",
            "n/a",
            "none",
            "unknown",
            "not applicable"
          ]
        }
      },
      "validation_error_messages": {
        "missing": "Required field 'knowledge_group' is missing from response",
        "invalid_value": "Invalid knowledge level. Expected one of: limited, moderate, extensive, NA",
        "wrong_type": "Field 'knowledge_group' must be a string"
      }
    }
  },
  "parsing_instructions": {
    "json_extraction": {
      "method": "regex",
      "pattern": "\\{.*\\}",
      "flags": [
        "DOTALL"
      ]
    },
    "fallback_parsing": {
      "enabled": true,
      "method": "keyword_search",
      "keywords": [
        "knowledge_group",
        "knowledge level",
        "level"
      ]
    }
  },
  "success_criteria": {
    "minimum_required_fields": 1,
    "require_all_mandatory": true,
    "allow_extra_fields": false
  },
  "error_handling": {
    "on_parse_failure": "return_null",
    "on_validation_failure": "return_errors",
    "on_missing_required": "return_errors"
  }
}

Trait-specific knowledge-level assessment template (biosafety level).

About This Template

This template belongs to the trait-audit battery generated for the reviewer rebuttal. It asks the model how much species-specific knowledge it holds about a single phenotype (here, biosafety level). It is used alongside the species-level knowledge-rating template to test whether species-level confidence approximates the aggregate of the trait-specific confidences across all 13 phenotypes.

Usage Context

When to use: Run together with the other 12 trait-specific knowledge templates and the species-level knowledge-rating template on a shared species sample; compare the species-level rating to aggregates (mean/max/mode) of the 13 trait-specific ratings.

Typical workflow: (1) Pick a species sample (e.g. trait_audit_sample.txt). (2) Run species-level template3_knowlege. (3) Run all 13 trait-specific template1_knowledge_* templates on the same sample + model. (4) Aggregate the 13 trait ratings per species and correlate with the species-level rating.

Template Configuration Files

Template Information
  • System template file: templates/system/template1_knowledge_biosafety_level.txt
  • User template file: templates/user/template1_knowledge_biosafety_level.txt
  • Validation config file: templates/validation/template1_knowledge_biosafety_level.json
  • Template type: Knowledge
  • Character count: System: 1180, User: 239, Validation: 4207

Usage Notes
  • The system template sets the context and instructions for the AI model
  • The user template contains placeholders like {binomial_name} that get replaced with actual values
  • The validation config defines expected response structure and automatically normalizes LLM outputs
  • All three files work together to ensure consistent, validated results from the language model

Validation Details
  • Description: Trait-specific knowledge-level assessment template (biosafety level).
  • Required fields: knowledge_group
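
The config enables a `keyword_search` fallback with three keywords but does not describe how the search works. The following is only one plausible interpretation, written as an assumption: find the first configured keyword in the reply, then take the first canonical label that appears after it.

```python
import re

# Keywords copied from parsing_instructions.fallback_parsing.
KEYWORDS = ["knowledge_group", "knowledge level", "level"]

# Assumed label pattern; synonyms from normalize_mapping are not handled here.
LABEL_PATTERN = re.compile(r"\b(limited|moderate|extensive|na)\b")

def keyword_fallback(raw_response):
    """Return a canonical label found after a keyword, else None."""
    text = raw_response.lower()
    for keyword in KEYWORDS:
        position = text.find(keyword)
        if position == -1:
            continue
        match = LABEL_PATTERN.search(text[position + len(keyword):])
        if match:
            label = match.group(1)
            return "NA" if label == "na" else label
    return None
```

A search like this cannot detect negation: a reply such as "knowledge level is not extensive" would still be read as `extensive`, so results recovered this way are worth flagging separately from cleanly parsed JSON.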

template1_knowledge_cell_shape Templates

System: template1_knowledge_cell_shape.txt | User: template1_knowledge_cell_shape.txt

System Template

Defines the assistant's role and instructions
Determine your level of scientific knowledge specifically about the cell shape phenotype of the given binomial species name, based on the depth of species-specific literature describing cellular morphology (bacillus, coccus, spirillum, other shapes) and shape-determining factors:

- limited: Little or no species-specific literature about this organism's cell shape. You cannot confidently state this phenotype except by generic inference from its genus or higher taxonomic rank.
- moderate: Some species-specific information on cell shape is available, including basic light-microscopy descriptions, or a small number of primary-literature reports.
- extensive: Comprehensive species-specific literature on cell shape, including detailed shape-determinant studies (MreB/FtsZ roles, peptidoglycan architecture) and consistent microscopic/electron-microscopic characterisation, and consistent coverage across multiple independent publications.

If the species name is not a real or recognized species, or if you cannot meaningfully separate your knowledge of its cell shape from a generic taxonomic assumption, respond with NA.

User Template

Defines the user's query format with placeholders
Respond with a JSON object for {binomial_name} indicating your level of species-specific scientific knowledge about its cell shape phenotype, in lowercase, in this format:

{
    "knowledge_group": "<limited|moderate|extensive|NA>"
}

Validation Config

Defines expected response structure and validation rules
{
  "template_info": {
    "name": "template1_knowledge_cell_shape",
    "type": "knowledge",
    "description": "Trait-specific knowledge-level assessment template (cell shape).",
    "version": "1.0",
    "purpose": "Member of the trait-audit battery generated for the reviewer rebuttal: asks the model, for a single phenotype, how much species-specific knowledge it holds. Used alongside the species-level knowledge-rating template to test whether species-level confidence approximates the aggregate of trait-specific confidences across all 13 phenotypes.",
    "usage_context": {
      "when_to_use": "Run together with the other 12 trait-specific knowledge templates and the species-level knowledge-rating template on a shared species sample; compare the species-level rating to aggregates (mean/max/mode) of the 13 trait-specific ratings.",
      "typical_workflow": "(1) Pick a species sample (e.g. trait_audit_sample.txt). (2) Run species-level template3_knowlege. (3) Run all 13 trait-specific template1_knowledge_* templates on the same sample + model. (4) Aggregate the 13 trait ratings per species and correlate with the species-level rating."
    },
    "interpretation_guide": {
      "limited": "Only generic, taxonomy-derived expectation about cell shape; no species-specific literature recalled.",
      "moderate": "Some species-specific cell shape observations or mechanistic notes recalled.",
      "extensive": "Rich species-specific cell shape literature recalled, including mechanistic, regulatory, and phenotypic detail.",
      "NA": "Cannot place the species in any of the above tiers (unrecognised name, or no meaningful trait-specific signal beyond taxonomic inference)."
    },
    "quality_indicators": {
      "high_quality_response": "Clear categorisation that is coherent with other trait-specific ratings for the same species.",
      "low_quality_response": "Categorisation appears to mirror taxonomic family-level guesses rather than species-specific recall."
    }
  },
  "expected_response": {
    "format": "json",
    "required_fields": [
      "knowledge_group"
    ],
    "optional_fields": []
  },
  "field_definitions": {
    "knowledge_group": {
      "type": "string",
      "required": true,
      "description": "Trait-specific knowledge level for cell shape of the given organism.",
      "allowed_values": [
        "limited",
        "moderate",
        "extensive",
        "NA"
      ],
      "validation_rules": {
        "case_sensitive": false,
        "trim_whitespace": true,
        "normalize_mapping": {
          "limited": [
            "limited",
            "minimal",
            "basic",
            "low",
            "little",
            "poor"
          ],
          "moderate": [
            "moderate",
            "medium",
            "intermediate",
            "fair",
            "some"
          ],
          "extensive": [
            "extensive",
            "comprehensive",
            "detailed",
            "high",
            "full",
            "complete",
            "thorough"
          ],
          "NA": [
            "na",
            "n/a",
            "none",
            "unknown",
            "not applicable"
          ]
        }
      },
      "validation_error_messages": {
        "missing": "Required field 'knowledge_group' is missing from response",
        "invalid_value": "Invalid knowledge level. Expected one of: limited, moderate, extensive, NA",
        "wrong_type": "Field 'knowledge_group' must be a string"
      }
    }
  },
  "parsing_instructions": {
    "json_extraction": {
      "method": "regex",
      "pattern": "\\{.*\\}",
      "flags": [
        "DOTALL"
      ]
    },
    "fallback_parsing": {
      "enabled": true,
      "method": "keyword_search",
      "keywords": [
        "knowledge_group",
        "knowledge level",
        "level"
      ]
    }
  },
  "success_criteria": {
    "minimum_required_fields": 1,
    "require_all_mandatory": true,
    "allow_extra_fields": false
  },
  "error_handling": {
    "on_parse_failure": "return_null",
    "on_validation_failure": "return_errors",
    "on_missing_required": "return_errors"
  }
}

Trait-specific knowledge-level assessment template (cell shape).

About This Template

This template belongs to the trait-audit battery generated for the reviewer rebuttal. For a single phenotype, it asks the model how much species-specific knowledge it holds. It is used alongside the species-level knowledge-rating template to test whether species-level confidence approximates the aggregate of trait-specific confidences across all 13 phenotypes.

Usage Context

When to use: Run together with the other 12 trait-specific knowledge templates and the species-level knowledge-rating template on a shared species sample; compare the species-level rating to aggregates (mean/max/mode) of the 13 trait-specific ratings.

Typical workflow: (1) Pick a species sample (e.g. trait_audit_sample.txt). (2) Run species-level template3_knowlege. (3) Run all 13 trait-specific template1_knowledge_* templates on the same sample + model. (4) Aggregate the 13 trait ratings per species and correlate with the species-level rating.
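Step (4) of the workflow can be sketched as follows. This is a minimal illustration, not the project's actual analysis code: the ordinal encoding (limited=1, moderate=2, extensive=3), the decision to drop NA ratings, and the function names `aggregate` and `pearson` are all assumptions made for the example.

```python
from collections import Counter
from math import sqrt

# Assumed ordinal encoding; the templates define only the labels, not scores.
SCORE = {"limited": 1, "moderate": 2, "extensive": 3}


def aggregate(trait_ratings):
    """Return mean, max and mode of one species' trait ratings, ignoring NA."""
    scores = [SCORE[r] for r in trait_ratings if r in SCORE]
    if not scores:
        return None  # every trait was rated NA (or the list was empty)
    # On ties, Counter.most_common returns the value that was seen first.
    mode = Counter(scores).most_common(1)[0][0]
    return {"mean": sum(scores) / len(scores), "max": max(scores), "mode": mode}


def pearson(xs, ys):
    """Plain Pearson correlation; returns None if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return None
    return cov / sqrt(vx * vy)


# Hypothetical usage: one aggregate per species, then correlate across species.
per_species_mean = [aggregate(r)["mean"] for r in (
    ["limited", "limited", "moderate"],
    ["moderate", "extensive", "extensive"],
    ["extensive", "extensive", "extensive"],
)]
species_level = [SCORE[s] for s in ("limited", "moderate", "extensive")]
r = pearson(per_species_mean, species_level)
```

Because the ratings are ordinal, a rank-based correlation may be a better fit than Pearson. The source does not say which statistic is intended.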

Template Configuration Files

Template Information
  • System template file: templates/system/template1_knowledge_cell_shape.txt
  • User template file: templates/user/template1_knowledge_cell_shape.txt
  • Validation config file: templates/validation/template1_knowledge_cell_shape.json
  • Template type: Knowledge
  • Character count: System: 1128, User: 234, Validation: 4177

Usage Notes
  • The system template sets the context and instructions for the AI model
  • The user template contains placeholders like {binomial_name} that get replaced with actual values
  • The validation config defines expected response structure and automatically normalizes LLM outputs
  • All three files work together to ensure consistent, validated results from the language model

Validation Details
  • Description: Trait-specific knowledge-level assessment template (cell shape).
  • Required fields: knowledge_group
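
The normalization described by validation_rules (case_sensitive: false, trim_whitespace: true, normalize_mapping) might be applied roughly as shown below. This is a sketch under assumptions: the mapping is copied from the config above, but the function name `normalize_knowledge_group` and the choice to return None for unmapped values are illustrative, not taken from MicrobeLLM.

```python
# Synonym lists copied from the "normalize_mapping" section of the config.
NORMALIZE_MAPPING = {
    "limited": ["limited", "minimal", "basic", "low", "little", "poor"],
    "moderate": ["moderate", "medium", "intermediate", "fair", "some"],
    "extensive": ["extensive", "comprehensive", "detailed", "high",
                  "full", "complete", "thorough"],
    "NA": ["na", "n/a", "none", "unknown", "not applicable"],
}

# Invert the mapping once so each synonym points at its canonical label.
_CANONICAL = {
    synonym: canonical
    for canonical, synonyms in NORMALIZE_MAPPING.items()
    for synonym in synonyms
}


def normalize_knowledge_group(raw):
    """Map a raw model answer to a canonical label, or None if unmapped."""
    if not isinstance(raw, str):
        return None
    # trim_whitespace: true, case_sensitive: false
    key = raw.strip().lower()
    return _CANONICAL.get(key)
```

For example, `normalize_knowledge_group("  High ")` yields "extensive" and `normalize_knowledge_group("N/A")` yields "NA". An answer outside the synonym lists yields None.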

template1_knowledge_extreme_environment_tolerance Templates

System: template1_knowledge_extreme_environment_tolerance.txt | User: template1_knowledge_extreme_environment_tolerance.txt

System Template

Defines the assistant's role and instructions
Determine your level of scientific knowledge specifically about the extreme-environment tolerance phenotype of the given binomial species name, based on the depth of species-specific literature describing growth and survival under extreme conditions (temperature, pH, salinity, pressure, radiation, desiccation):

- limited: Little or no species-specific literature about this organism's extreme-environment tolerance. You cannot confidently state this phenotype except by generic inference from its genus or higher taxonomic rank.
- moderate: Some species-specific information on extreme-environment tolerance is available, including reports of survival at non-standard conditions or basic tolerance-range observations, or a small number of primary-literature reports.
- extensive: Comprehensive species-specific literature on extreme-environment tolerance, including mechanistic studies of stress responses (compatible solutes, heat-shock proteins, DNA-repair systems, membrane adaptations) and well-characterised tolerance boundaries, and consistent coverage across multiple independent publications.

If the species name is not a real or recognized species, or if you cannot meaningfully separate your knowledge of its extreme-environment tolerance from a generic taxonomic assumption, respond with NA.

User Template

Defines the user's query format with placeholders
Respond with a JSON object for {binomial_name} indicating your level of species-specific scientific knowledge about its extreme-environment tolerance phenotype, in lowercase, in this format:

{
    "knowledge_group": "<limited|moderate|extensive|NA>"
}
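
Filling the {binomial_name} placeholder needs some care because the template also contains literal JSON braces. The sketch below uses plain string replacement. The helper name `render_user_template` and the example species are hypothetical, and how MicrobeLLM performs the substitution internally is not stated in this document.

```python
# User template text as shown above.
USER_TEMPLATE = """Respond with a JSON object for {binomial_name} indicating your level of species-specific scientific knowledge about its extreme-environment tolerance phenotype, in lowercase, in this format:

{
    "knowledge_group": "<limited|moderate|extensive|NA>"
}
"""


def render_user_template(template, binomial_name):
    """Substitute the species name into the user template."""
    # str.replace leaves the literal JSON braces untouched; str.format would
    # try to interpret them as replacement fields and raise an error.
    return template.replace("{binomial_name}", binomial_name)


prompt = render_user_template(USER_TEMPLATE, "Deinococcus radiodurans")
```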

Validation Config

Defines expected response structure and validation rules
{
  "template_info": {
    "name": "template1_knowledge_extreme_environment_tolerance",
    "type": "knowledge",
    "description": "Trait-specific knowledge-level assessment template (extreme-environment tolerance).",
    "version": "1.0",
    "purpose": "Member of the trait-audit battery generated for the reviewer rebuttal: asks the model, for a single phenotype, how much species-specific knowledge it holds. Used alongside the species-level knowledge-rating template to test whether species-level confidence approximates the aggregate of trait-specific confidences across all 13 phenotypes.",
    "usage_context": {
      "when_to_use": "Run together with the other 12 trait-specific knowledge templates and the species-level knowledge-rating template on a shared species sample; compare the species-level rating to aggregates (mean/max/mode) of the 13 trait-specific ratings.",
      "typical_workflow": "(1) Pick a species sample (e.g. trait_audit_sample.txt). (2) Run species-level template3_knowlege. (3) Run all 13 trait-specific template1_knowledge_* templates on the same sample + model. (4) Aggregate the 13 trait ratings per species and correlate with the species-level rating."
    },
    "interpretation_guide": {
      "limited": "Only generic, taxonomy-derived expectation about extreme-environment tolerance; no species-specific literature recalled.",
      "moderate": "Some species-specific extreme-environment tolerance observations or mechanistic notes recalled.",
      "extensive": "Rich species-specific extreme-environment tolerance literature recalled, including mechanistic, regulatory, and phenotypic detail.",
      "NA": "Cannot place the species in any of the above tiers (unrecognised name, or no meaningful trait-specific signal beyond taxonomic inference)."
    },
    "quality_indicators": {
      "high_quality_response": "Clear categorisation that is coherent with other trait-specific ratings for the same species.",
      "low_quality_response": "Categorisation appears to mirror taxonomic family-level guesses rather than species-specific recall."
    }
  },
  "expected_response": {
    "format": "json",
    "required_fields": [
      "knowledge_group"
    ],
    "optional_fields": []
  },
  "field_definitions": {
    "knowledge_group": {
      "type": "string",
      "required": true,
      "description": "Trait-specific knowledge level for extreme-environment tolerance of the given organism.",
      "allowed_values": [
        "limited",
        "moderate",
        "extensive",
        "NA"
      ],
      "validation_rules": {
        "case_sensitive": false,
        "trim_whitespace": true,
        "normalize_mapping": {
          "limited": [
            "limited",
            "minimal",
            "basic",
            "low",
            "little",
            "poor"
          ],
          "moderate": [
            "moderate",
            "medium",
            "intermediate",
            "fair",
            "some"
          ],
          "extensive": [
            "extensive",
            "comprehensive",
            "detailed",
            "high",
            "full",
            "complete",
            "thorough"
          ],
          "NA": [
            "na",
            "n/a",
            "none",
            "unknown",
            "not applicable"
          ]
        }
      },
      "validation_error_messages": {
        "missing": "Required field 'knowledge_group' is missing from response",
        "invalid_value": "Invalid knowledge level. Expected one of: limited, moderate, extensive, NA",
        "wrong_type": "Field 'knowledge_group' must be a string"
      }
    }
  },
  "parsing_instructions": {
    "json_extraction": {
      "method": "regex",
      "pattern": "\\{.*\\}",
      "flags": [
        "DOTALL"
      ]
    },
    "fallback_parsing": {
      "enabled": true,
      "method": "keyword_search",
      "keywords": [
        "knowledge_group",
        "knowledge level",
        "level"
      ]
    }
  },
  "success_criteria": {
    "minimum_required_fields": 1,
    "require_all_mandatory": true,
    "allow_extra_fields": false
  },
  "error_handling": {
    "on_parse_failure": "return_null",
    "on_validation_failure": "return_errors",
    "on_missing_required": "return_errors"
  }
}

Trait-specific knowledge-level assessment template (extreme-environment tolerance).

About This Template

This template belongs to the trait-audit battery generated for the reviewer rebuttal. For a single phenotype, it asks the model how much species-specific knowledge it holds. It is used alongside the species-level knowledge-rating template to test whether species-level confidence approximates the aggregate of trait-specific confidences across all 13 phenotypes.

Usage Context

When to use: Run together with the other 12 trait-specific knowledge templates and the species-level knowledge-rating template on a shared species sample; compare the species-level rating to aggregates (mean/max/mode) of the 13 trait-specific ratings.

Typical workflow: (1) Pick a species sample (e.g. trait_audit_sample.txt). (2) Run species-level template3_knowlege. (3) Run all 13 trait-specific template1_knowledge_* templates on the same sample + model. (4) Aggregate the 13 trait ratings per species and correlate with the species-level rating.

Template Configuration Files

Template Information
  • System template file: templates/system/template1_knowledge_extreme_environment_tolerance.txt
  • User template file: templates/user/template1_knowledge_extreme_environment_tolerance.txt
  • Validation config file: templates/validation/template1_knowledge_extreme_environment_tolerance.json
  • Template type: Knowledge
  • Character count: System: 1307, User: 253, Validation: 4291

Usage Notes
  • The system template sets the context and instructions for the AI model
  • The user template contains placeholders like {binomial_name} that get replaced with actual values
  • The validation config defines expected response structure and automatically normalizes LLM outputs
  • All three files work together to ensure consistent, validated results from the language model

Validation Details
  • Description: Trait-specific knowledge-level assessment template (extreme-environment tolerance).
  • Required fields: knowledge_group
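
The json_extraction rule in parsing_instructions (regex pattern \{.*\} with the DOTALL flag) and the on_parse_failure: return_null setting could be realised along these lines. The function name `extract_json` is an assumption. The keyword_search fallback is left out because the config names its keywords but does not describe how the search works.

```python
import json
import re

# Pattern and flag taken from "parsing_instructions.json_extraction".
# DOTALL lets "." match newlines, so a multi-line JSON object is captured.
JSON_PATTERN = re.compile(r"\{.*\}", re.DOTALL)


def extract_json(response_text):
    """Return the JSON object embedded in a model reply, or None on failure."""
    match = JSON_PATTERN.search(response_text)
    if match is None:
        return None  # on_parse_failure: return_null
    try:
        parsed = json.loads(match.group(0))
    except json.JSONDecodeError:
        return None
    # Only a JSON object (dict) is a usable response here.
    return parsed if isinstance(parsed, dict) else None
```

The pattern is greedy, so it spans from the first opening brace to the last closing brace in the reply. A reply with two separate JSON objects therefore fails to parse and yields None in this sketch.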

template1_knowledge_gram_staining Templates

System: template1_knowledge_gram_staining.txt | User: template1_knowledge_gram_staining.txt

System Template

Defines the assistant's role and instructions
Determine your level of scientific knowledge specifically about the gram staining phenotype of the given binomial species name, based on the depth of species-specific literature describing cell-wall architecture and how the organism behaves under Gram's staining procedure:

- limited: Little or no species-specific literature about this organism's gram staining. You cannot confidently state this phenotype except by generic inference from its genus or higher taxonomic rank.
- moderate: Some species-specific information on gram staining is available, including basic staining observations or cell-wall composition notes, or a small number of primary-literature reports.
- extensive: Comprehensive species-specific literature on gram staining, including detailed cell-wall chemistry (peptidoglycan thickness, teichoic acid content), membrane architecture, and consistent microscopic characterisation, and consistent coverage across multiple independent publications.

If the species name is not a real or recognized species, or if you cannot meaningfully separate your knowledge of its gram staining from a generic taxonomic assumption, respond with NA.

User Template

Defines the user's query format with placeholders
Respond with a JSON object for {binomial_name} indicating your level of species-specific scientific knowledge about its gram staining phenotype, in lowercase, in this format:

{
    "knowledge_group": "<limited|moderate|extensive|NA>"
}

Validation Config

Defines expected response structure and validation rules
{
  "template_info": {
    "name": "template1_knowledge_gram_staining",
    "type": "knowledge",
    "description": "Trait-specific knowledge-level assessment template (gram staining).",
    "version": "1.0",
    "purpose": "Member of the trait-audit battery generated for the reviewer rebuttal: asks the model, for a single phenotype, how much species-specific knowledge it holds. Used alongside the species-level knowledge-rating template to test whether species-level confidence approximates the aggregate of trait-specific confidences across all 13 phenotypes.",
    "usage_context": {
      "when_to_use": "Run together with the other 12 trait-specific knowledge templates and the species-level knowledge-rating template on a shared species sample; compare the species-level rating to aggregates (mean/max/mode) of the 13 trait-specific ratings.",
      "typical_workflow": "(1) Pick a species sample (e.g. trait_audit_sample.txt). (2) Run species-level template3_knowlege. (3) Run all 13 trait-specific template1_knowledge_* templates on the same sample + model. (4) Aggregate the 13 trait ratings per species and correlate with the species-level rating."
    },
    "interpretation_guide": {
      "limited": "Only generic, taxonomy-derived expectation about gram staining; no species-specific literature recalled.",
      "moderate": "Some species-specific gram staining observations or mechanistic notes recalled.",
      "extensive": "Rich species-specific gram staining literature recalled, including mechanistic, regulatory, and phenotypic detail.",
      "NA": "Cannot place the species in any of the above tiers (unrecognised name, or no meaningful trait-specific signal beyond taxonomic inference)."
    },
    "quality_indicators": {
      "high_quality_response": "Clear categorisation that is coherent with other trait-specific ratings for the same species.",
      "low_quality_response": "Categorisation appears to mirror taxonomic family-level guesses rather than species-specific recall."
    }
  },
  "expected_response": {
    "format": "json",
    "required_fields": [
      "knowledge_group"
    ],
    "optional_fields": []
  },
  "field_definitions": {
    "knowledge_group": {
      "type": "string",
      "required": true,
      "description": "Trait-specific knowledge level for gram staining of the given organism.",
      "allowed_values": [
        "limited",
        "moderate",
        "extensive",
        "NA"
      ],
      "validation_rules": {
        "case_sensitive": false,
        "trim_whitespace": true,
        "normalize_mapping": {
          "limited": [
            "limited",
            "minimal",
            "basic",
            "low",
            "little",
            "poor"
          ],
          "moderate": [
            "moderate",
            "medium",
            "intermediate",
            "fair",
            "some"
          ],
          "extensive": [
            "extensive",
            "comprehensive",
            "detailed",
            "high",
            "full",
            "complete",
            "thorough"
          ],
          "NA": [
            "na",
            "n/a",
            "none",
            "unknown",
            "not applicable"
          ]
        }
      },
      "validation_error_messages": {
        "missing": "Required field 'knowledge_group' is missing from response",
        "invalid_value": "Invalid knowledge level. Expected one of: limited, moderate, extensive, NA",
        "wrong_type": "Field 'knowledge_group' must be a string"
      }
    }
  },
  "parsing_instructions": {
    "json_extraction": {
      "method": "regex",
      "pattern": "\\{.*\\}",
      "flags": [
        "DOTALL"
      ]
    },
    "fallback_parsing": {
      "enabled": true,
      "method": "keyword_search",
      "keywords": [
        "knowledge_group",
        "knowledge level",
        "level"
      ]
    }
  },
  "success_criteria": {
    "minimum_required_fields": 1,
    "require_all_mandatory": true,
    "allow_extra_fields": false
  },
  "error_handling": {
    "on_parse_failure": "return_null",
    "on_validation_failure": "return_errors",
    "on_missing_required": "return_errors"
  }
}

Trait-specific knowledge-level assessment template (gram staining).

About This Template

This template belongs to the trait-audit battery generated for the reviewer rebuttal. For a single phenotype, it asks the model how much species-specific knowledge it holds. It is used alongside the species-level knowledge-rating template to test whether species-level confidence approximates the aggregate of trait-specific confidences across all 13 phenotypes.

Usage Context

When to use: Run together with the other 12 trait-specific knowledge templates and the species-level knowledge-rating template on a shared species sample; compare the species-level rating to aggregates (mean/max/mode) of the 13 trait-specific ratings.

Typical workflow: (1) Pick a species sample (e.g. trait_audit_sample.txt). (2) Run species-level template3_knowlege. (3) Run all 13 trait-specific template1_knowledge_* templates on the same sample + model. (4) Aggregate the 13 trait ratings per species and correlate with the species-level rating.

Template Configuration Files

Template Information
  • System template file: templates/system/template1_knowledge_gram_staining.txt
  • User template file: templates/user/template1_knowledge_gram_staining.txt
  • Validation config file: templates/validation/template1_knowledge_gram_staining.json
  • Template type: Knowledge
  • Character count: System: 1156, User: 237, Validation: 4195

Usage Notes
  • The system template sets the context and instructions for the AI model
  • The user template contains placeholders like {binomial_name} that get replaced with actual values
  • The validation config defines expected response structure and automatically normalizes LLM outputs
  • All three files work together to ensure consistent, validated results from the language model

Validation Details
  • Description: Trait-specific knowledge-level assessment template (gram staining).
  • Required fields: knowledge_group
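
The field checks implied by field_definitions and success_criteria can be illustrated with a small validator that returns error messages, in line with on_validation_failure: return_errors. The three messages are copied from validation_error_messages. The function name `validate_response` and the wording of the extra-field message are hypothetical, and the sketch assumes synonym normalization runs as a separate step before this check.

```python
ALLOWED_VALUES = ["limited", "moderate", "extensive", "NA"]
REQUIRED_FIELDS = ["knowledge_group"]

# Copied from "validation_error_messages" in the config.
MESSAGES = {
    "missing": "Required field 'knowledge_group' is missing from response",
    "invalid_value": "Invalid knowledge level. Expected one of: "
                     "limited, moderate, extensive, NA",
    "wrong_type": "Field 'knowledge_group' must be a string",
}


def validate_response(parsed):
    """Return a list of error strings; an empty list means the response passed."""
    errors = []
    # allow_extra_fields: false -> reject keys other than the required ones.
    extra = sorted(set(parsed) - set(REQUIRED_FIELDS))
    if extra:
        # This message wording is hypothetical; the config does not define one.
        errors.append("Unexpected field(s): " + ", ".join(extra))
    if "knowledge_group" not in parsed:
        errors.append(MESSAGES["missing"])
        return errors
    value = parsed["knowledge_group"]
    if not isinstance(value, str):
        errors.append(MESSAGES["wrong_type"])
    # case_sensitive: false and trim_whitespace: true
    elif value.strip().lower() not in {v.lower() for v in ALLOWED_VALUES}:
        errors.append(MESSAGES["invalid_value"])
    return errors
```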

template1_knowledge_health_association Templates

System: template1_knowledge_health_association.txt | User: template1_knowledge_health_association.txt

System Template

Defines the assistant's role and instructions
Determine your level of scientific knowledge specifically about the health association phenotype of the given binomial species name, based on the depth of species-specific literature describing association with host health (commensal, probiotic, or beneficial roles):

- limited: Little or no species-specific literature about this organism's health association. You cannot confidently state this phenotype except by generic inference from its genus or higher taxonomic rank.
- moderate: Some species-specific information on health association is available, including reports of presence in healthy microbiota or basic beneficial-role observations, or a small number of primary-literature reports.
- extensive: Comprehensive species-specific literature on health association, including detailed studies of commensal/probiotic mechanisms, host health outcomes, and reproducible health-association evidence, and consistent coverage across multiple independent publications.

If the species name is not a real or recognized species, or if you cannot meaningfully separate your knowledge of its health association from a generic taxonomic assumption, respond with NA.

User Template

Defines the user's query format with placeholders
Respond with a JSON object for {binomial_name} indicating your level of species-specific scientific knowledge about its health association phenotype, in lowercase, in this format:

{
    "knowledge_group": "<limited|moderate|extensive|NA>"
}

Validation Config

Defines expected response structure and validation rules
{
  "template_info": {
    "name": "template1_knowledge_health_association",
    "type": "knowledge",
    "description": "Trait-specific knowledge-level assessment template (health association).",
    "version": "1.0",
    "purpose": "Member of the trait-audit battery generated for the reviewer rebuttal: asks the model, for a single phenotype, how much species-specific knowledge it holds. Used alongside the species-level knowledge-rating template to test whether species-level confidence approximates the aggregate of trait-specific confidences across all 13 phenotypes.",
    "usage_context": {
      "when_to_use": "Run together with the other 12 trait-specific knowledge templates and the species-level knowledge-rating template on a shared species sample; compare the species-level rating to aggregates (mean/max/mode) of the 13 trait-specific ratings.",
      "typical_workflow": "(1) Pick a species sample (e.g. trait_audit_sample.txt). (2) Run species-level template3_knowlege. (3) Run all 13 trait-specific template1_knowledge_* templates on the same sample + model. (4) Aggregate the 13 trait ratings per species and correlate with the species-level rating."
    },
    "interpretation_guide": {
      "limited": "Only generic, taxonomy-derived expectation about health association; no species-specific literature recalled.",
      "moderate": "Some species-specific health association observations or mechanistic notes recalled.",
      "extensive": "Rich species-specific health association literature recalled, including mechanistic, regulatory, and phenotypic detail.",
      "NA": "Cannot place the species in any of the above tiers (unrecognised name, or no meaningful trait-specific signal beyond taxonomic inference)."
    },
    "quality_indicators": {
      "high_quality_response": "Clear categorisation that is coherent with other trait-specific ratings for the same species.",
      "low_quality_response": "Categorisation appears to mirror taxonomic family-level guesses rather than species-specific recall."
    }
  },
  "expected_response": {
    "format": "json",
    "required_fields": [
      "knowledge_group"
    ],
    "optional_fields": []
  },
  "field_definitions": {
    "knowledge_group": {
      "type": "string",
      "required": true,
      "description": "Trait-specific knowledge level for health association of the given organism.",
      "allowed_values": [
        "limited",
        "moderate",
        "extensive",
        "NA"
      ],
      "validation_rules": {
        "case_sensitive": false,
        "trim_whitespace": true,
        "normalize_mapping": {
          "limited": [
            "limited",
            "minimal",
            "basic",
            "low",
            "little",
            "poor"
          ],
          "moderate": [
            "moderate",
            "medium",
            "intermediate",
            "fair",
            "some"
          ],
          "extensive": [
            "extensive",
            "comprehensive",
            "detailed",
            "high",
            "full",
            "complete",
            "thorough"
          ],
          "NA": [
            "na",
            "n/a",
            "none",
            "unknown",
            "not applicable"
          ]
        }
      },
      "validation_error_messages": {
        "missing": "Required field 'knowledge_group' is missing from response",
        "invalid_value": "Invalid knowledge level. Expected one of: limited, moderate, extensive, NA",
        "wrong_type": "Field 'knowledge_group' must be a string"
      }
    }
  },
  "parsing_instructions": {
    "json_extraction": {
      "method": "regex",
      "pattern": "\\{.*\\}",
      "flags": [
        "DOTALL"
      ]
    },
    "fallback_parsing": {
      "enabled": true,
      "method": "keyword_search",
      "keywords": [
        "knowledge_group",
        "knowledge level",
        "level"
      ]
    }
  },
  "success_criteria": {
    "minimum_required_fields": 1,
    "require_all_mandatory": true,
    "allow_extra_fields": false
  },
  "error_handling": {
    "on_parse_failure": "return_null",
    "on_validation_failure": "return_errors",
    "on_missing_required": "return_errors"
  }
}

Trait-specific knowledge-level assessment template (health association).

About This Template

This template belongs to the trait-audit battery generated for the reviewer rebuttal. For a single phenotype, it asks the model how much species-specific knowledge it holds. It is used alongside the species-level knowledge-rating template to test whether species-level confidence approximates the aggregate of trait-specific confidences across all 13 phenotypes.

Usage Context

When to use: Run together with the other 12 trait-specific knowledge templates and the species-level knowledge-rating template on a shared species sample; compare the species-level rating to aggregates (mean/max/mode) of the 13 trait-specific ratings.

Typical workflow: (1) Pick a species sample (e.g. trait_audit_sample.txt). (2) Run species-level template3_knowlege. (3) Run all 13 trait-specific template1_knowledge_* templates on the same sample + model. (4) Aggregate the 13 trait ratings per species and correlate with the species-level rating.

Template Configuration Files

Template Information
  • System template file: templates/system/template1_knowledge_health_association.txt
  • User template file: templates/user/template1_knowledge_health_association.txt
  • Validation config file: templates/validation/template1_knowledge_health_association.json
  • Template type: Knowledge
  • Character count: System: 1164, User: 242, Validation: 4225

Usage Notes
  • The system template sets the context and instructions for the AI model
  • The user template contains placeholders like {binomial_name} that get replaced with actual values
  • The validation config defines expected response structure and automatically normalizes LLM outputs
  • All three files work together to ensure consistent, validated results from the language model

Validation Details
  • Description: Trait-specific knowledge-level assessment template (health association).
  • Required fields: knowledge_group
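
The three configuration files share one base name across the system, user and validation directories, so their paths can be derived from the template name alone. The helpers below are a hypothetical sketch: `template_paths` and `load_template` are not MicrobeLLM functions, and the default root directory "templates" is taken from the paths listed above.

```python
import json
from pathlib import Path


def template_paths(name, root="templates"):
    """Build the system, user and validation file paths for a template name."""
    root = Path(root)
    return {
        "system": root / "system" / f"{name}.txt",
        "user": root / "user" / f"{name}.txt",
        "validation": root / "validation" / f"{name}.json",
    }


def load_template(name, root="templates"):
    """Read all three files; the validation config is parsed as JSON."""
    paths = template_paths(name, root)
    return {
        "system": paths["system"].read_text(encoding="utf-8"),
        "user": paths["user"].read_text(encoding="utf-8"),
        "validation": json.loads(paths["validation"].read_text(encoding="utf-8")),
    }


paths = template_paths("template1_knowledge_health_association")
```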

template1_knowledge_hemolysis Templates

System: template1_knowledge_hemolysis.txt | User: template1_knowledge_hemolysis.txt

System Template

Defines the assistant's role and instructions
Determine your level of scientific knowledge specifically about the hemolysis phenotype of the given binomial species name, based on the depth of species-specific literature describing hemolytic activity on blood agar (alpha, beta, gamma) and the underlying hemolysins:

- limited: Little or no species-specific literature about this organism's hemolysis. You cannot confidently state this phenotype except by generic inference from its genus or higher taxonomic rank.
- moderate: Some species-specific information on hemolysis is available, including basic blood-agar observations or simple hemolysin descriptions, or a small number of primary-literature reports.
- extensive: Comprehensive species-specific literature on hemolysis, including detailed hemolysin characterisation (gene identity, regulatory control, pore-forming mechanisms) and reproducible hemolytic-phenotype documentation, and consistent coverage across multiple independent publications.

If the species name is not a real or recognized species, or if you cannot meaningfully separate your knowledge of its hemolysis from a generic taxonomic assumption, respond with NA.

User Template

Defines the user's query format with placeholders
Respond with a JSON object for {binomial_name} indicating your level of species-specific scientific knowledge about its hemolysis phenotype, in lowercase, in this format:

{
    "knowledge_group": "<limited|moderate|extensive|NA>"
}

Validation Config

Defines expected response structure and validation rules
{
  "template_info": {
    "name": "template1_knowledge_hemolysis",
    "type": "knowledge",
    "description": "Trait-specific knowledge-level assessment template (hemolysis).",
    "version": "1.0",
    "purpose": "Member of the trait-audit battery generated for the reviewer rebuttal: asks the model, for a single phenotype, how much species-specific knowledge it holds. Used alongside the species-level knowledge-rating template to test whether species-level confidence approximates the aggregate of trait-specific confidences across all 13 phenotypes.",
    "usage_context": {
      "when_to_use": "Run together with the other 12 trait-specific knowledge templates and the species-level knowledge-rating template on a shared species sample; compare the species-level rating to aggregates (mean/max/mode) of the 13 trait-specific ratings.",
      "typical_workflow": "(1) Pick a species sample (e.g. trait_audit_sample.txt). (2) Run species-level template3_knowlege. (3) Run all 13 trait-specific template1_knowledge_* templates on the same sample + model. (4) Aggregate the 13 trait ratings per species and correlate with the species-level rating."
    },
    "interpretation_guide": {
      "limited": "Only generic, taxonomy-derived expectation about hemolysis; no species-specific literature recalled.",
      "moderate": "Some species-specific hemolysis observations or mechanistic notes recalled.",
      "extensive": "Rich species-specific hemolysis literature recalled, including mechanistic, regulatory, and phenotypic detail.",
      "NA": "Cannot place the species in any of the above tiers (unrecognised name, or no meaningful trait-specific signal beyond taxonomic inference)."
    },
    "quality_indicators": {
      "high_quality_response": "Clear categorisation that is coherent with other trait-specific ratings for the same species.",
      "low_quality_response": "Categorisation appears to mirror taxonomic family-level guesses rather than species-specific recall."
    }
  },
  "expected_response": {
    "format": "json",
    "required_fields": [
      "knowledge_group"
    ],
    "optional_fields": []
  },
  "field_definitions": {
    "knowledge_group": {
      "type": "string",
      "required": true,
      "description": "Trait-specific knowledge level for hemolysis of the given organism.",
      "allowed_values": [
        "limited",
        "moderate",
        "extensive",
        "NA"
      ],
      "validation_rules": {
        "case_sensitive": false,
        "trim_whitespace": true,
        "normalize_mapping": {
          "limited": [
            "limited",
            "minimal",
            "basic",
            "low",
            "little",
            "poor"
          ],
          "moderate": [
            "moderate",
            "medium",
            "intermediate",
            "fair",
            "some"
          ],
          "extensive": [
            "extensive",
            "comprehensive",
            "detailed",
            "high",
            "full",
            "complete",
            "thorough"
          ],
          "NA": [
            "na",
            "n/a",
            "none",
            "unknown",
            "not applicable"
          ]
        }
      },
      "validation_error_messages": {
        "missing": "Required field 'knowledge_group' is missing from response",
        "invalid_value": "Invalid knowledge level. Expected one of: limited, moderate, extensive, NA",
        "wrong_type": "Field 'knowledge_group' must be a string"
      }
    }
  },
  "parsing_instructions": {
    "json_extraction": {
      "method": "regex",
      "pattern": "\\{.*\\}",
      "flags": [
        "DOTALL"
      ]
    },
    "fallback_parsing": {
      "enabled": true,
      "method": "keyword_search",
      "keywords": [
        "knowledge_group",
        "knowledge level",
        "level"
      ]
    }
  },
  "success_criteria": {
    "minimum_required_fields": 1,
    "require_all_mandatory": true,
    "allow_extra_fields": false
  },
  "error_handling": {
    "on_parse_failure": "return_null",
    "on_validation_failure": "return_errors",
    "on_missing_required": "return_errors"
  }
}

Trait-specific knowledge-level assessment template (hemolysis).

About This Template

This template belongs to the trait-audit battery generated for the reviewer rebuttal. For a single phenotype, it asks the model how much species-specific knowledge it holds. It is used alongside the species-level knowledge-rating template to test whether species-level confidence approximates the aggregate of trait-specific confidences across all 13 phenotypes.

Usage Context

When to use: Run together with the other 12 trait-specific knowledge templates and the species-level knowledge-rating template on a shared species sample; compare the species-level rating to aggregates (mean/max/mode) of the 13 trait-specific ratings.

Typical workflow: (1) Pick a species sample (e.g., trait_audit_sample.txt). (2) Run the species-level template3_knowlege template. (3) Run all 13 trait-specific template1_knowledge_* templates on the same sample and model. (4) Aggregate the 13 trait ratings per species and correlate the aggregate with the species-level rating.
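
Step (4) of this workflow could be sketched as follows. The ordinal encoding (limited = 1, moderate = 2, extensive = 3), the decision to skip NA ratings, and the function name `aggregate_trait_ratings` are all assumptions made for illustration; the source does not prescribe them:

```python
from statistics import mean, mode
from typing import Dict, List, Optional

# Assumed ordinal encoding of the three tiers; NA has no score and is skipped.
SCORE = {"limited": 1, "moderate": 2, "extensive": 3}


def aggregate_trait_ratings(trait_ratings: List[str]) -> Optional[Dict[str, float]]:
    """Summarise one species' trait-level ratings as mean, max, and mode."""
    scores = [SCORE[rating] for rating in trait_ratings if rating in SCORE]
    if not scores:
        return None  # every trait was rated NA, so there is nothing to aggregate
    return {"mean": mean(scores), "max": max(scores), "mode": mode(scores)}
```

Each aggregate can then be paired with the species-level score for the same species and passed to a correlation routine of choice.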

Template Configuration Files

Template Information
  • System template file: templates/system/template1_knowledge_hemolysis.txt
  • User template file: templates/user/template1_knowledge_hemolysis.txt
  • Validation config file: templates/validation/template1_knowledge_hemolysis.json
  • Template type: Knowledge
  • Character count: System: 1142, User: 233, Validation: 4171
Usage Notes
  • The system template sets the context and instructions for the AI model
  • The user template contains placeholders like {binomial_name} that get replaced with actual values
  • The validation config defines expected response structure and automatically normalizes LLM outputs
  • All three files work together to ensure consistent, validated results from the language model
Validation Details
  • Description: Trait-specific knowledge-level assessment template (hemolysis).
  • Required fields: knowledge_group
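
The placeholder substitution mentioned in the usage notes can be sketched as below. Because the user templates also contain a literal JSON example wrapped in braces, `str.format` would try to interpret those braces as replacement fields, so this sketch uses a plain string replacement instead. The helper name `fill_user_template` is hypothetical and the actual substitution mechanism in MicrobeLLM may differ:

```python
def fill_user_template(template: str, binomial_name: str) -> str:
    """Insert the species name while leaving the literal JSON braces untouched."""
    return template.replace("{binomial_name}", binomial_name)


# Abbreviated stand-in for a user template file (illustrative text only).
example_template = (
    "Respond with a JSON object for {binomial_name} in this format:\n"
    "\n"
    "{\n"
    '    "knowledge_group": "<limited|moderate|extensive|NA>"\n'
    "}"
)

prompt = fill_user_template(example_template, "Bacillus subtilis")
```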

template1_knowledge_host_association Templates

System: template1_knowledge_host_association.txt | User: template1_knowledge_host_association.txt

System Template

Defines the assistant's role and instructions
Determine your level of scientific knowledge specifically about the host association phenotype of the given binomial species name, based on the depth of species-specific literature describing host-colonisation patterns and lifestyle (free-living, commensal, parasitic, symbiotic):

- limited: Little or no species-specific literature about this organism's host association. You cannot confidently state this phenotype except by generic inference from its genus or higher taxonomic rank.
- moderate: Some species-specific information on host association is available, including basic host-range observations or isolation-source notes, or a small number of primary-literature reports.
- extensive: Comprehensive species-specific literature on host association, including detailed host-interaction studies (colonisation factors, host-specificity determinants, ecological-niche characterisation), and consistent coverage across multiple independent publications.

If the species name is not a real or recognized species, or if you cannot meaningfully separate your knowledge of its host association from a generic taxonomic assumption, respond with NA.

User Template

Defines the user's query format with placeholders
Respond with a JSON object for {binomial_name} indicating your level of species-specific scientific knowledge about its host association phenotype, in lowercase, in this format:

{
    "knowledge_group": "<limited|moderate|extensive|NA>"
}

Validation Config

Defines expected response structure and validation rules
{
  "template_info": {
    "name": "template1_knowledge_host_association",
    "type": "knowledge",
    "description": "Trait-specific knowledge-level assessment template (host association).",
    "version": "1.0",
    "purpose": "Member of the trait-audit battery generated for the reviewer rebuttal: asks the model, for a single phenotype, how much species-specific knowledge it holds. Used alongside the species-level knowledge-rating template to test whether species-level confidence approximates the aggregate of trait-specific confidences across all 13 phenotypes.",
    "usage_context": {
      "when_to_use": "Run together with the other 12 trait-specific knowledge templates and the species-level knowledge-rating template on a shared species sample; compare the species-level rating to aggregates (mean/max/mode) of the 13 trait-specific ratings.",
      "typical_workflow": "(1) Pick a species sample (e.g. trait_audit_sample.txt). (2) Run species-level template3_knowlege. (3) Run all 13 trait-specific template1_knowledge_* templates on the same sample + model. (4) Aggregate the 13 trait ratings per species and correlate with the species-level rating."
    },
    "interpretation_guide": {
      "limited": "Only generic, taxonomy-derived expectation about host association; no species-specific literature recalled.",
      "moderate": "Some species-specific host association observations or mechanistic notes recalled.",
      "extensive": "Rich species-specific host association literature recalled, including mechanistic, regulatory, and phenotypic detail.",
      "NA": "Cannot place the species in any of the above tiers (unrecognised name, or no meaningful trait-specific signal beyond taxonomic inference)."
    },
    "quality_indicators": {
      "high_quality_response": "Clear categorisation that is coherent with other trait-specific ratings for the same species.",
      "low_quality_response": "Categorisation appears to mirror taxonomic family-level guesses rather than species-specific recall."
    }
  },
  "expected_response": {
    "format": "json",
    "required_fields": [
      "knowledge_group"
    ],
    "optional_fields": []
  },
  "field_definitions": {
    "knowledge_group": {
      "type": "string",
      "required": true,
      "description": "Trait-specific knowledge level for host association of the given organism.",
      "allowed_values": [
        "limited",
        "moderate",
        "extensive",
        "NA"
      ],
      "validation_rules": {
        "case_sensitive": false,
        "trim_whitespace": true,
        "normalize_mapping": {
          "limited": [
            "limited",
            "minimal",
            "basic",
            "low",
            "little",
            "poor"
          ],
          "moderate": [
            "moderate",
            "medium",
            "intermediate",
            "fair",
            "some"
          ],
          "extensive": [
            "extensive",
            "comprehensive",
            "detailed",
            "high",
            "full",
            "complete",
            "thorough"
          ],
          "NA": [
            "na",
            "n/a",
            "none",
            "unknown",
            "not applicable"
          ]
        }
      },
      "validation_error_messages": {
        "missing": "Required field 'knowledge_group' is missing from response",
        "invalid_value": "Invalid knowledge level. Expected one of: limited, moderate, extensive, NA",
        "wrong_type": "Field 'knowledge_group' must be a string"
      }
    }
  },
  "parsing_instructions": {
    "json_extraction": {
      "method": "regex",
      "pattern": "\\{.*\\}",
      "flags": [
        "DOTALL"
      ]
    },
    "fallback_parsing": {
      "enabled": true,
      "method": "keyword_search",
      "keywords": [
        "knowledge_group",
        "knowledge level",
        "level"
      ]
    }
  },
  "success_criteria": {
    "minimum_required_fields": 1,
    "require_all_mandatory": true,
    "allow_extra_fields": false
  },
  "error_handling": {
    "on_parse_failure": "return_null",
    "on_validation_failure": "return_errors",
    "on_missing_required": "return_errors"
  }
}
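
The `parsing_instructions` block above can be read as a two-stage strategy: try the regex first, then fall back to a keyword search. The sketch below shows one way this might work. The pattern, flag, and keywords come from the config; the fallback logic (looking for an allowed value after the first keyword hit) and the function name are assumptions, since the config does not spell out how `keyword_search` behaves:

```python
import json
import re
from typing import Any, Optional

JSON_PATTERN = re.compile(r"\{.*\}", re.DOTALL)  # pattern and flag from the config
KEYWORDS = ("knowledge_group", "knowledge level", "level")
# Hypothetical helper pattern for the fallback stage.
VALUE_PATTERN = re.compile(r"\b(limited|moderate|extensive|n/?a)\b", re.IGNORECASE)


def extract_knowledge_group(response: str) -> Optional[Any]:
    """Return the raw knowledge_group value, or None if nothing usable is found."""
    match = JSON_PATTERN.search(response)
    if match:
        try:
            data = json.loads(match.group(0))
            if isinstance(data, dict) and "knowledge_group" in data:
                return data["knowledge_group"]
        except json.JSONDecodeError:
            pass  # fall through to the keyword search
    lowered = response.lower()
    for keyword in KEYWORDS:
        pos = lowered.find(keyword)
        if pos != -1:
            value = VALUE_PATTERN.search(response, pos + len(keyword))
            if value:
                return value.group(1)
    return None
```

Note that `.*` is greedy and, with DOTALL, spans newlines: the match runs from the first `{` to the last `}` in the response. A reply containing two separate JSON objects therefore yields text that `json.loads` rejects, which is where a fallback becomes useful.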

Trait-specific knowledge-level assessment template (host association).

About This Template

This template belongs to the trait-audit battery generated for the reviewer rebuttal. It asks the model how much species-specific knowledge it holds about a single phenotype (here, host association). It is used alongside the species-level knowledge-rating template to test whether species-level confidence approximates the aggregate of trait-specific confidences across all 13 phenotypes.

Usage Context

When to use: Run together with the other 12 trait-specific knowledge templates and the species-level knowledge-rating template on a shared species sample; compare the species-level rating to aggregates (mean/max/mode) of the 13 trait-specific ratings.

Typical workflow: (1) Pick a species sample (e.g., trait_audit_sample.txt). (2) Run the species-level template3_knowlege template. (3) Run all 13 trait-specific template1_knowledge_* templates on the same sample and model. (4) Aggregate the 13 trait ratings per species and correlate the aggregate with the species-level rating.

Template Configuration Files

Template Information
  • System template file: templates/system/template1_knowledge_host_association.txt
  • User template file: templates/user/template1_knowledge_host_association.txt
  • Validation config file: templates/validation/template1_knowledge_host_association.json
  • Template type: Knowledge
  • Character count: System: 1149, User: 240, Validation: 4213
Validation Details
  • Description: Trait-specific knowledge-level assessment template (host association).
  • Required fields: knowledge_group

template1_knowledge_motility Templates

System: template1_knowledge_motility.txt | User: template1_knowledge_motility.txt

System Template

Defines the assistant's role and instructions
Determine your level of scientific knowledge specifically about the motility phenotype of the given binomial species name, based on the depth of species-specific literature describing whether the organism is motile, its motility mechanisms (flagella, pili, gliding, swarming, etc.), and any regulatory or environmental context for motility:

- limited: Little or no species-specific literature about this organism's motility. You cannot confidently state whether it is motile or non-motile except by generic inference from its genus or higher taxonomic rank.
- moderate: Some species-specific information on motility is available, including documented observations of the motility phenotype, basic mechanistic notes (e.g., flagellation pattern), or a small number of primary-literature reports.
- extensive: Comprehensive species-specific literature on motility, including repeated experimental observations, mechanistic studies (flagellar genetics, chemotaxis, regulation, swarming behaviour), and consistent coverage across multiple independent publications.

If the species name is not a real or recognized species, or if you cannot meaningfully separate your knowledge of its motility from a generic taxonomic assumption, respond with NA.

User Template

Defines the user's query format with placeholders
Respond with a JSON object for {binomial_name} indicating your level of species-specific scientific knowledge about its motility phenotype, in lowercase, in this format:

{
    "knowledge_group": "<limited|moderate|extensive|NA>"
}

Validation Config

Defines expected response structure and validation rules
{
  "template_info": {
    "name": "template1_knowledge_motility",
    "type": "knowledge",
    "description": "Trait-specific knowledge-level assessment template (motility).",
    "version": "1.0",
    "purpose": "Pilot template for reviewer rebuttal: evaluates whether the model's confidence in its knowledge of a single phenotype (motility) tracks its species-level confidence. Output schema is identical to the generic knowledge-rating templates so downstream normalization, validation, and aggregation code is reused unchanged.",
    "usage_context": {
      "when_to_use": "Use alongside a matching species-level knowledge-rating job (e.g. template3_knowlege) on the same model and species file, then compute agreement between the two sets of knowledge_group calls to test whether trait-specific self-assessment differs meaningfully from species-level self-assessment.",
      "typical_workflow": "(1) Run species-level job: model x species_file x template3_knowlege. (2) Run this job: same model x same species_file x template1_knowledge_motility. (3) Join predictions by binomial_name, compare knowledge_group labels."
    },
    "interpretation_guide": {
      "limited": "Only generic, taxonomy-derived expectation about motility; no species-specific literature recalled.",
      "moderate": "Some species-specific motility observations or mechanistic notes recalled.",
      "extensive": "Rich species-specific motility literature recalled, including mechanistic, regulatory, and behavioural detail.",
      "NA": "Cannot place the species in any of the above tiers (unrecognised name, or no meaningful trait-specific signal beyond taxonomic inference)."
    },
    "quality_indicators": {
      "high_quality_response": "Clear categorisation; the same species called 'extensive' at the trait level is typically 'extensive' at the species level.",
      "low_quality_response": "Categorisation appears to mirror taxonomic family-level guesses rather than species-specific recall."
    }
  },
  "expected_response": {
    "format": "json",
    "required_fields": ["knowledge_group"],
    "optional_fields": []
  },
  "field_definitions": {
    "knowledge_group": {
      "type": "string",
      "required": true,
      "description": "Trait-specific knowledge level for motility of the given organism.",
      "allowed_values": ["limited", "moderate", "extensive", "NA"],
      "validation_rules": {
        "case_sensitive": false,
        "trim_whitespace": true,
        "normalize_mapping": {
          "limited": ["limited", "minimal", "basic", "low", "little", "poor"],
          "moderate": ["moderate", "medium", "intermediate", "fair", "some"],
          "extensive": ["extensive", "comprehensive", "detailed", "high", "full", "complete", "thorough"],
          "NA": ["na", "n/a", "none", "unknown", "not applicable"]
        }
      },
      "validation_error_messages": {
        "missing": "Required field 'knowledge_group' is missing from response",
        "invalid_value": "Invalid knowledge level. Expected one of: limited, moderate, extensive, NA",
        "wrong_type": "Field 'knowledge_group' must be a string"
      }
    }
  },
  "parsing_instructions": {
    "json_extraction": {
      "method": "regex",
      "pattern": "\\{.*\\}",
      "flags": ["DOTALL"]
    },
    "fallback_parsing": {
      "enabled": true,
      "method": "keyword_search",
      "keywords": ["knowledge_group", "knowledge level", "level"]
    }
  },
  "success_criteria": {
    "minimum_required_fields": 1,
    "require_all_mandatory": true,
    "allow_extra_fields": false
  },
  "error_handling": {
    "on_parse_failure": "return_null",
    "on_validation_failure": "return_errors",
    "on_missing_required": "return_errors"
  }
}
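
The `field_definitions` and `success_criteria` sections above imply a small set of checks on the parsed response. A minimal sketch, assuming values are compared case-insensitively after trimming and that extra fields are reported as errors (the "Unexpected fields" message and the function name are hypothetical; the other messages are quoted from the config):

```python
from typing import Any, Dict, List

ALLOWED = {"limited", "moderate", "extensive", "na"}  # compared in lowercase


def validate_response(data: Dict[str, Any]) -> List[str]:
    """Return a list of error messages; an empty list means the response passed."""
    errors: List[str] = []
    if "knowledge_group" not in data:
        errors.append("Required field 'knowledge_group' is missing from response")
    else:
        value = data["knowledge_group"]
        if not isinstance(value, str):
            errors.append("Field 'knowledge_group' must be a string")
        elif value.strip().lower() not in ALLOWED:
            errors.append(
                "Invalid knowledge level. Expected one of: "
                "limited, moderate, extensive, NA"
            )
    extra = sorted(set(data) - {"knowledge_group"})
    if extra:  # allow_extra_fields: false
        errors.append(f"Unexpected fields: {', '.join(extra)}")
    return errors
```

Returning a list rather than raising matches the spirit of `on_validation_failure: return_errors`, though the exact return shape used by MicrobeLLM is not stated in the config.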

Trait-specific knowledge-level assessment template (motility).

About This Template

This is the pilot template for the reviewer rebuttal. It evaluates whether the model's confidence in its knowledge of a single phenotype (motility) tracks its species-level confidence. The output schema is identical to that of the generic knowledge-rating templates, so downstream normalization, validation, and aggregation code can be reused unchanged.

Usage Context

When to use: Use alongside a matching species-level knowledge-rating job (e.g. template3_knowlege) on the same model and species file, then compute agreement between the two sets of knowledge_group calls to test whether trait-specific self-assessment differs meaningfully from species-level self-assessment.

Typical workflow: (1) Run the species-level job: model x species_file x template3_knowlege. (2) Run this job: same model x same species_file x template1_knowledge_motility. (3) Join the predictions by binomial_name and compare the knowledge_group labels.
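
Step (3) of this workflow, joining the two prediction sets and comparing labels, might look like the sketch below. It assumes each job's output has already been reduced to a mapping from binomial_name to a normalized knowledge_group label; the function name `label_agreement` and the use of simple percent agreement are illustrative choices, not something the source specifies:

```python
from typing import Dict, Tuple


def label_agreement(
    species_level: Dict[str, str], trait_level: Dict[str, str]
) -> Tuple[int, float]:
    """Join on binomial_name and return (shared species count, fraction agreeing)."""
    shared = sorted(set(species_level) & set(trait_level))
    if not shared:
        return 0, float("nan")  # nothing to compare
    matches = sum(species_level[name] == trait_level[name] for name in shared)
    return len(shared), matches / len(shared)
```

Species present in only one of the two jobs are dropped by the join, so the returned count is worth checking against the size of the species file.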

Template Configuration Files

Template Information
  • System template file: templates/system/template1_knowledge_motility.txt
  • User template file: templates/user/template1_knowledge_motility.txt
  • Validation config file: templates/validation/template1_knowledge_motility.json
  • Template type: Knowledge
  • Character count: System: 1243, User: 232, Validation: 3747
Validation Details
  • Description: Trait-specific knowledge-level assessment template (motility).
  • Required fields: knowledge_group

template1_knowledge_plant_pathogenicity Templates

System: template1_knowledge_plant_pathogenicity.txt | User: template1_knowledge_plant_pathogenicity.txt

System Template

Defines the assistant's role and instructions
Determine your level of scientific knowledge specifically about the plant pathogenicity phenotype of the given binomial species name, based on the depth of species-specific literature describing pathogenic potential toward plants, plant-disease manifestations, and phytopathogenic mechanisms:

- limited: Little or no species-specific literature about this organism's plant pathogenicity. You cannot confidently state this phenotype except by generic inference from its genus or higher taxonomic rank.
- moderate: Some species-specific information on plant pathogenicity is available, including reports of plant infection or isolation from diseased plants, or a small number of primary-literature reports.
- extensive: Comprehensive species-specific literature on plant pathogenicity, including detailed phytopathogenic-mechanism studies (effector proteins, type III secretion, host-susceptibility determinants) and characterised plant disease cycles, and consistent coverage across multiple independent publications.

If the species name is not a real or recognized species, or if you cannot meaningfully separate your knowledge of its plant pathogenicity from a generic taxonomic assumption, respond with NA.

User Template

Defines the user's query format with placeholders
Respond with a JSON object for {binomial_name} indicating your level of species-specific scientific knowledge about its plant pathogenicity phenotype, in lowercase, in this format:

{
    "knowledge_group": "<limited|moderate|extensive|NA>"
}

Validation Config

Defines expected response structure and validation rules
{
  "template_info": {
    "name": "template1_knowledge_plant_pathogenicity",
    "type": "knowledge",
    "description": "Trait-specific knowledge-level assessment template (plant pathogenicity).",
    "version": "1.0",
    "purpose": "Member of the trait-audit battery generated for the reviewer rebuttal: asks the model, for a single phenotype, how much species-specific knowledge it holds. Used alongside the species-level knowledge-rating template to test whether species-level confidence approximates the aggregate of trait-specific confidences across all 13 phenotypes.",
    "usage_context": {
      "when_to_use": "Run together with the other 12 trait-specific knowledge templates and the species-level knowledge-rating template on a shared species sample; compare the species-level rating to aggregates (mean/max/mode) of the 13 trait-specific ratings.",
      "typical_workflow": "(1) Pick a species sample (e.g. trait_audit_sample.txt). (2) Run species-level template3_knowlege. (3) Run all 13 trait-specific template1_knowledge_* templates on the same sample + model. (4) Aggregate the 13 trait ratings per species and correlate with the species-level rating."
    },
    "interpretation_guide": {
      "limited": "Only generic, taxonomy-derived expectation about plant pathogenicity; no species-specific literature recalled.",
      "moderate": "Some species-specific plant pathogenicity observations or mechanistic notes recalled.",
      "extensive": "Rich species-specific plant pathogenicity literature recalled, including mechanistic, regulatory, and phenotypic detail.",
      "NA": "Cannot place the species in any of the above tiers (unrecognised name, or no meaningful trait-specific signal beyond taxonomic inference)."
    },
    "quality_indicators": {
      "high_quality_response": "Clear categorisation that is coherent with other trait-specific ratings for the same species.",
      "low_quality_response": "Categorisation appears to mirror taxonomic family-level guesses rather than species-specific recall."
    }
  },
  "expected_response": {
    "format": "json",
    "required_fields": [
      "knowledge_group"
    ],
    "optional_fields": []
  },
  "field_definitions": {
    "knowledge_group": {
      "type": "string",
      "required": true,
      "description": "Trait-specific knowledge level for plant pathogenicity of the given organism.",
      "allowed_values": [
        "limited",
        "moderate",
        "extensive",
        "NA"
      ],
      "validation_rules": {
        "case_sensitive": false,
        "trim_whitespace": true,
        "normalize_mapping": {
          "limited": [
            "limited",
            "minimal",
            "basic",
            "low",
            "little",
            "poor"
          ],
          "moderate": [
            "moderate",
            "medium",
            "intermediate",
            "fair",
            "some"
          ],
          "extensive": [
            "extensive",
            "comprehensive",
            "detailed",
            "high",
            "full",
            "complete",
            "thorough"
          ],
          "NA": [
            "na",
            "n/a",
            "none",
            "unknown",
            "not applicable"
          ]
        }
      },
      "validation_error_messages": {
        "missing": "Required field 'knowledge_group' is missing from response",
        "invalid_value": "Invalid knowledge level. Expected one of: limited, moderate, extensive, NA",
        "wrong_type": "Field 'knowledge_group' must be a string"
      }
    }
  },
  "parsing_instructions": {
    "json_extraction": {
      "method": "regex",
      "pattern": "\\{.*\\}",
      "flags": [
        "DOTALL"
      ]
    },
    "fallback_parsing": {
      "enabled": true,
      "method": "keyword_search",
      "keywords": [
        "knowledge_group",
        "knowledge level",
        "level"
      ]
    }
  },
  "success_criteria": {
    "minimum_required_fields": 1,
    "require_all_mandatory": true,
    "allow_extra_fields": false
  },
  "error_handling": {
    "on_parse_failure": "return_null",
    "on_validation_failure": "return_errors",
    "on_missing_required": "return_errors"
  }
}

Trait-specific knowledge-level assessment template (plant pathogenicity).

About This Template

This template belongs to the trait-audit battery generated for the reviewer rebuttal. It asks the model how much species-specific knowledge it holds about a single phenotype (here, plant pathogenicity). It is used alongside the species-level knowledge-rating template to test whether species-level confidence approximates the aggregate of trait-specific confidences across all 13 phenotypes.

Usage Context

When to use: Run together with the other 12 trait-specific knowledge templates and the species-level knowledge-rating template on a shared species sample; compare the species-level rating to aggregates (mean/max/mode) of the 13 trait-specific ratings.

Typical workflow: (1) Pick a species sample (e.g., trait_audit_sample.txt). (2) Run the species-level template3_knowlege template. (3) Run all 13 trait-specific template1_knowledge_* templates on the same sample and model. (4) Aggregate the 13 trait ratings per species and correlate the aggregate with the species-level rating.
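
The correlation in step (4) is not tied to a particular statistic in the source. Since the ratings are ordinal, a rank correlation is one reasonable choice; the sketch below implements Spearman's coefficient with average ranks for ties, using only the standard library. Both function names are hypothetical:

```python
from statistics import mean
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of the ranks; undefined if either input is constant."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5
```

Here `x` would hold the species-level scores and `y` the per-species aggregates of the 13 trait ratings, in the same species order. With only three tiers, ties are frequent, which is why average ranks are used.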

Template Configuration Files

Template Information
  • System template file: templates/system/template1_knowledge_plant_pathogenicity.txt
  • User template file: templates/user/template1_knowledge_plant_pathogenicity.txt
  • Validation config file: templates/validation/template1_knowledge_plant_pathogenicity.json
  • Template type: Knowledge
  • Character count: System: 1211, User: 243, Validation: 4231
Validation Details
  • Description: Trait-specific knowledge-level assessment template (plant pathogenicity).
  • Required fields: knowledge_group

template1_knowledge_spore_formation Templates

System: template1_knowledge_spore_formation.txt | User: template1_knowledge_spore_formation.txt

System Template

Defines the assistant's role and instructions
Determine your level of scientific knowledge specifically about the spore formation phenotype of the given binomial species name, based on the depth of species-specific literature describing endospore formation, sporulation regulation, and spore properties:

- limited: Little or no species-specific literature about this organism's spore formation. You cannot confidently state this phenotype except by generic inference from its genus or higher taxonomic rank.
- moderate: Some species-specific information on spore formation is available, including basic observations of spore formation under starvation conditions or morphological descriptions, or a small number of primary-literature reports.
- extensive: Comprehensive species-specific literature on spore formation, including detailed sporulation regulation (sigma-factor cascade, forespore/mother-cell differentiation), spore ultrastructure, and germination mechanisms, and consistent coverage across multiple independent publications.

If the species name is not a real or recognized species, or if you cannot meaningfully separate your knowledge of its spore formation from a generic taxonomic assumption, respond with NA.

User Template

Defines the user's query format with placeholders
Respond with a JSON object for {binomial_name} indicating your level of species-specific scientific knowledge about its spore formation phenotype, in lowercase, in this format:

{
    "knowledge_group": "<limited|moderate|extensive|NA>"
}

Validation Config

Defines expected response structure and validation rules
{
  "template_info": {
    "name": "template1_knowledge_spore_formation",
    "type": "knowledge",
    "description": "Trait-specific knowledge-level assessment template (spore formation).",
    "version": "1.0",
    "purpose": "Member of the trait-audit battery generated for the reviewer rebuttal: asks the model, for a single phenotype, how much species-specific knowledge it holds. Used alongside the species-level knowledge-rating template to test whether species-level confidence approximates the aggregate of trait-specific confidences across all 13 phenotypes.",
    "usage_context": {
      "when_to_use": "Run together with the other 12 trait-specific knowledge templates and the species-level knowledge-rating template on a shared species sample; compare the species-level rating to aggregates (mean/max/mode) of the 13 trait-specific ratings.",
      "typical_workflow": "(1) Pick a species sample (e.g. trait_audit_sample.txt). (2) Run species-level template3_knowlege. (3) Run all 13 trait-specific template1_knowledge_* templates on the same sample + model. (4) Aggregate the 13 trait ratings per species and correlate with the species-level rating."
    },
    "interpretation_guide": {
      "limited": "Only generic, taxonomy-derived expectation about spore formation; no species-specific literature recalled.",
      "moderate": "Some species-specific spore formation observations or mechanistic notes recalled.",
      "extensive": "Rich species-specific spore formation literature recalled, including mechanistic, regulatory, and phenotypic detail.",
      "NA": "Cannot place the species in any of the above tiers (unrecognised name, or no meaningful trait-specific signal beyond taxonomic inference)."
    },
    "quality_indicators": {
      "high_quality_response": "Clear categorisation that is coherent with other trait-specific ratings for the same species.",
      "low_quality_response": "Categorisation appears to mirror taxonomic family-level guesses rather than species-specific recall."
    }
  },
  "expected_response": {
    "format": "json",
    "required_fields": [
      "knowledge_group"
    ],
    "optional_fields": []
  },
  "field_definitions": {
    "knowledge_group": {
      "type": "string",
      "required": true,
      "description": "Trait-specific knowledge level for spore formation of the given organism.",
      "allowed_values": [
        "limited",
        "moderate",
        "extensive",
        "NA"
      ],
      "validation_rules": {
        "case_sensitive": false,
        "trim_whitespace": true,
        "normalize_mapping": {
          "limited": [
            "limited",
            "minimal",
            "basic",
            "low",
            "little",
            "poor"
          ],
          "moderate": [
            "moderate",
            "medium",
            "intermediate",
            "fair",
            "some"
          ],
          "extensive": [
            "extensive",
            "comprehensive",
            "detailed",
            "high",
            "full",
            "complete",
            "thorough"
          ],
          "NA": [
            "na",
            "n/a",
            "none",
            "unknown",
            "not applicable"
          ]
        }
      },
      "validation_error_messages": {
        "missing": "Required field 'knowledge_group' is missing from response",
        "invalid_value": "Invalid knowledge level. Expected one of: limited, moderate, extensive, NA",
        "wrong_type": "Field 'knowledge_group' must be a string"
      }
    }
  },
  "parsing_instructions": {
    "json_extraction": {
      "method": "regex",
      "pattern": "\\{.*\\}",
      "flags": [
        "DOTALL"
      ]
    },
    "fallback_parsing": {
      "enabled": true,
      "method": "keyword_search",
      "keywords": [
        "knowledge_group",
        "knowledge level",
        "level"
      ]
    }
  },
  "success_criteria": {
    "minimum_required_fields": 1,
    "require_all_mandatory": true,
    "allow_extra_fields": false
  },
  "error_handling": {
    "on_parse_failure": "return_null",
    "on_validation_failure": "return_errors",
    "on_missing_required": "return_errors"
  }
}

Trait-specific knowledge-level assessment template (spore formation).

About This Template

This template belongs to the trait-audit battery generated for the reviewer rebuttal. It asks the model how much species-specific knowledge it holds about a single phenotype (here, spore formation). It is used alongside the species-level knowledge-rating template to test whether species-level confidence approximates the aggregate of trait-specific confidences across all 13 phenotypes.

Usage Context

When to use: Run together with the other 12 trait-specific knowledge templates and the species-level knowledge-rating template on a shared species sample; compare the species-level rating to aggregates (mean/max/mode) of the 13 trait-specific ratings.

Typical workflow: (1) Pick a species sample (e.g., trait_audit_sample.txt). (2) Run the species-level template3_knowlege template. (3) Run all 13 trait-specific template1_knowledge_* templates on the same sample and model. (4) Aggregate the 13 trait ratings per species and correlate the aggregate with the species-level rating.

Template Configuration Files

Template Information
  • System template file: templates/system/template1_knowledge_spore_formation.txt
  • User template file: templates/user/template1_knowledge_spore_formation.txt
  • Validation config file: templates/validation/template1_knowledge_spore_formation.json
  • Template type: Knowledge
  • Character count: System: 1183, User: 239, Validation: 4207
Validation Details
  • Description: Trait-specific knowledge-level assessment template (spore formation).
  • Required fields: knowledge_group
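
The three configuration files listed under Template Information share a base name and differ only in directory and extension. A minimal loader sketch, assuming that layout holds for every template (the function name `load_template` and the `root` parameter are hypothetical):

```python
import json
from pathlib import Path
from typing import Any, Dict, Tuple


def load_template(
    name: str, root: Path = Path("templates")
) -> Tuple[str, str, Dict[str, Any]]:
    """Read the system text, user text, and validation config for one template."""
    system = (root / "system" / f"{name}.txt").read_text(encoding="utf-8")
    user = (root / "user" / f"{name}.txt").read_text(encoding="utf-8")
    validation = json.loads(
        (root / "validation" / f"{name}.json").read_text(encoding="utf-8")
    )
    return system, user, validation
```

For example, `load_template("template1_knowledge_spore_formation")` would read the three files listed above, relative to the current working directory.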