
Eval File Reference

Top-level schema for selection eval YAML files.

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| model | string | No | - | Default model for all evals in this file. |
| timeout | number | No | 30 | Default timeout in seconds for evals in this file. |
| skills | "all" \| string[] | No | "all" | Skills to register for evals. Use "all" or a list of skill names. |
| run-mode | "all" \| "variants-only" \| "current-only" | No | "all" | Default run mode for evals: "all", "variants-only", or "current-only". |
| variants | Variant[] | No | - | Variant definitions available to evals in this file. |
| evals | SelectionEval[] | Yes | - | List of selection evals to run. |
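
As an illustration, the top-level fields above might be combined as in the following minimal sketch. The skill names, eval name, and prompt are hypothetical placeholders, not values defined by this reference. `model` and `variants` are left out because this page lists neither valid model identifiers nor the fields of `Variant`.

```yaml
# Minimal sketch of a selection eval file.
# All skill names, eval names, and prompts are hypothetical.
timeout: 60              # seconds; overrides the default of 30
skills:                  # register only these skills instead of "all"
  - pdf-tools
  - web-search
run-mode: "current-only" # skip variant runs for every eval in this file
evals:                   # the only required top-level field
  - name: selects-pdf-skill
    prompt: "Extract the tables from the attached PDF report."
```

Every field omitted here falls back to the default shown in the table, so a file containing only `evals` is also valid under this schema.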

Schema for individual eval entries within the evals array.

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| name | string | Yes | - | Unique name for this eval. |
| prompt | string | Yes | - | The prompt to send to the agent being evaluated. |
| model | string | No | - | Override the model for this eval. |
| timeout | number | No | - | Timeout in seconds for this eval. Overrides the file-level timeout. |
| enabled | boolean | No | true | Whether this eval is active. |
| skills | "all" \| string[] | No | - | Skills to register for this eval. Use "all" or a list of skill names. |
| run-mode | "all" \| "variants-only" \| "current-only" | No | - | Controls which runs to perform: "all" runs current + variants, "variants-only" skips current, "current-only" skips variants. |
| assert | string[] \| "none" \| "any" | No | - | Expected skill selection. An array of skill names, "none" if no skill should load, or "any" to accept any selection. Defaults to the owning skill for skill-scoped evals. |
| variants | "all" \| string[] \| Variant[] | No | "all" | Variants to run: "all" uses file-level variants, or specify inline/by name. |
| decoys | Decoy[] | No | - | Decoy skills to register alongside real skills for this eval. |
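
The per-eval fields above can be sketched as follows, showing the three forms that `assert` accepts along with a few overrides. The eval names, skill names, and prompts are hypothetical. `decoys` and inline `variants` are omitted because the `Decoy` and `Variant` shapes are not described on this page.

```yaml
# Sketch of individual eval entries; all names and prompts are hypothetical.
evals:
  - name: pdf-extraction
    prompt: "Extract the tables from the attached PDF report."
    timeout: 45              # overrides the file-level timeout
    skills: "all"
    run-mode: "current-only" # skip variant runs for this eval only
    assert:                  # array form: these skills are expected to load
      - pdf-tools

  - name: small-talk
    prompt: "What is the capital of France?"
    assert: "none"           # no skill should load for this prompt

  - name: open-ended-research
    prompt: "Summarize recent changes to this project."
    assert: "any"            # accept whatever selection the agent makes
    variants: "all"          # the default: use the file-level variants

  - name: work-in-progress
    prompt: "Draft a changelog entry."
    enabled: false           # kept in the file but marked inactive
```

Quoting `"none"`, `"any"`, and `"all"` is optional in YAML, but it makes clear that they are string literals and not nulls or anchors.
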
v0.3.3