evaly compare

Compare two Evalytic report files of the same type.

evaly compare --baseline <REPORT_A> --candidate <REPORT_B> [OPTIONS]

`evaly compare` produces a summary-level diff between two report files that share the same `eval_type`. Use it to see what changed before you decide whether to gate the run.

Options

Flag	Type	Description
--baseline	TEXT	Required. Baseline report JSON.
--candidate	TEXT	Required. Candidate report JSON.
--json-output	TEXT	Write structured diff JSON to a file. Use `-` for stdout.

What Gets Compared

Report Type	Comparison Target
`bench`	Per-model `overall_score` and `dimension_averages`
`rag`	`summary.metric_averages`
`text`	`summary.metric_averages`
`agent`	`summary.metric_averages`

Same-type only: `evaly compare` reports an error if the two reports have different `eval_type` values. It only shows differences; for enforced regression checks, use `evaly gate --baseline`.
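The same-type rule and the target lookup from the table can be sketched in a few lines of Python. This is an illustration, not Evalytic's implementation. It assumes each report is a JSON object with a top-level `eval_type` field and a nested `summary.metric_averages` object. The function name `comparison_target` and the sample metric name are hypothetical, and `bench` reports are left out because their per-model layout is not described here.

```python
# Hypothetical sketch: the field layout is inferred from the table above,
# not taken from Evalytic's source code.
METRIC_AVERAGE_TYPES = {"rag", "text", "agent"}


def comparison_target(baseline: dict, candidate: dict) -> tuple:
    """Return both metric_averages mappings, or raise on a type mismatch."""
    base_type = baseline.get("eval_type")
    cand_type = candidate.get("eval_type")
    if base_type != cand_type:
        # Mirrors the same-type rule: mismatched reports are rejected.
        raise ValueError(f"eval_type mismatch: {base_type!r} vs {cand_type!r}")
    if base_type not in METRIC_AVERAGE_TYPES:
        # bench reports use per-model scores and are out of scope here.
        raise ValueError(f"not handled by this sketch: {base_type!r}")
    return (
        baseline["summary"]["metric_averages"],
        candidate["summary"]["metric_averages"],
    )


# Made-up reports, for illustration only.
report_a = {"eval_type": "rag", "summary": {"metric_averages": {"faithfulness": 0.5}}}
report_b = {"eval_type": "rag", "summary": {"metric_averages": {"faithfulness": 0.75}}}
base_avgs, cand_avgs = comparison_target(report_a, report_b)
```

In practice the two dictionaries would come from `json.load` on the files passed as `--baseline` and `--candidate`.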

Examples

# Compare two visual benchmark reports
evaly compare \
    --baseline bench-a.json \
    --candidate bench-b.json

# Compare two RAG runs
evaly compare \
    --baseline rag-a.json \
    --candidate rag-b.json

# Write diff JSON to stdout
evaly compare \
    --baseline text-a.json \
    --candidate text-b.json \
    --json-output -

Output

For each key present in both reports, terminal output shows the baseline value, the candidate value, and the signed delta. JSON output returns the same rows in a machine-readable structure.
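To make the row shape concrete, here is a minimal Python sketch of how such rows could be built from two averages mappings. It is an assumption-laden illustration, not the tool's actual code: the function name `diff_rows`, the row field names, and the metric names are hypothetical, and the delta is assumed to be candidate minus baseline, which this page does not confirm.

```python
def diff_rows(baseline_avgs: dict, candidate_avgs: dict) -> list:
    """Build one row per key that appears in both mappings."""
    shared_keys = sorted(baseline_avgs.keys() & candidate_avgs.keys())
    rows = []
    for key in shared_keys:
        base = baseline_avgs[key]
        cand = candidate_avgs[key]
        # Assumed sign convention: positive means the candidate scored higher.
        rows.append({"key": key, "baseline": base, "candidate": cand, "delta": cand - base})
    return rows


# Made-up metric names and values, for illustration only.
baseline_avgs = {"faithfulness": 0.5, "relevance": 0.75, "latency": 1.0}
candidate_avgs = {"faithfulness": 0.75, "relevance": 0.5}

for row in diff_rows(baseline_avgs, candidate_avgs):
    # Key, baseline, candidate, then the delta with an explicit sign.
    print(f"{row['key']:<14}{row['baseline']:>8.2f}{row['candidate']:>8.2f}{row['delta']:>+8.2f}")
```

Keys that appear in only one mapping (`latency` here) produce no row, which matches the "shared keys" behavior described above.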
