evaly compare

Compare two Evalytic report files of the same type.

evaly compare --baseline <REPORT_A> --candidate <REPORT_B> [OPTIONS]

The compare command produces a summary-level diff between two report files of the same eval_type. Use it to understand what changed before you decide whether to gate the run.

Options

Flag            Type   Description
--baseline      TEXT   Required. Baseline report JSON.
--candidate     TEXT   Required. Candidate report JSON.
--json-output   TEXT   Write structured diff JSON to a file. Use - for stdout.

What Gets Compared

Report Type   Comparison Target
bench         Per-model overall_score and dimension_averages
rag           summary.metric_averages
text          summary.metric_averages
agent         summary.metric_averages

Same-type only: evaly compare errors if the two reports have different eval_type values. For enforced regression checks, use evaly gate --baseline.
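
If you script around evaly compare, it can help to check the same-type rule yourself before invoking the command. The following is a minimal sketch, not Evalytic's own implementation. It assumes each report is a JSON object with eval_type as a top-level field; the helper names load_report and check_same_type are hypothetical.

```python
import json


def load_report(path):
    """Load a report JSON file into a dict (hypothetical helper)."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def check_same_type(baseline, candidate):
    """Return the shared eval_type, or raise ValueError if the reports differ.

    Assumption: eval_type is a top-level key in each report.
    """
    base_type = baseline.get("eval_type")
    cand_type = candidate.get("eval_type")
    if base_type is None or cand_type is None:
        raise ValueError("report is missing eval_type")
    if base_type != cand_type:
        raise ValueError(f"eval_type mismatch: {base_type!r} vs {cand_type!r}")
    return base_type
```

A wrapper script could call check_same_type(load_report("rag-a.json"), load_report("rag-b.json")) and skip the comparison with its own message when the types differ.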

Examples

# Compare two visual benchmark reports
evaly compare \
    --baseline bench-a.json \
    --candidate bench-b.json

# Compare two RAG runs
evaly compare \
    --baseline rag-a.json \
    --candidate rag-b.json

# Write diff JSON to stdout
evaly compare \
    --baseline text-a.json \
    --candidate text-b.json \
    --json-output -

Output

For each key present in both reports, terminal output shows the baseline value, the candidate value, and the signed delta. JSON output returns the same rows in a machine-readable structure.
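
To make the row structure concrete, here is a rough sketch of how such rows could be derived from two metric mappings such as summary.metric_averages. It is an illustration under stated assumptions, not the tool's actual code: the function name diff_rows and the row field names are hypothetical, and the delta is assumed to be candidate minus baseline.

```python
def diff_rows(baseline_metrics, candidate_metrics):
    """Build one row per key that appears in both metric mappings.

    Assumptions: values are numeric, and delta = candidate - baseline.
    """
    shared = sorted(set(baseline_metrics) & set(candidate_metrics))
    rows = []
    for key in shared:
        base = baseline_metrics[key]
        cand = candidate_metrics[key]
        rows.append(
            {"key": key, "baseline": base, "candidate": cand, "delta": cand - base}
        )
    return rows


# Example with made-up metric names and values
baseline = {"faithfulness": 0.5, "relevance": 0.75, "only_in_baseline": 1.0}
candidate = {"faithfulness": 0.75, "relevance": 0.5}
for row in diff_rows(baseline, candidate):
    print(f"{row['key']}: {row['baseline']} -> {row['candidate']} ({row['delta']:+.2f})")
```

Keys that appear in only one report are skipped in this sketch, which matches the description of shared keys above; the field names in the real JSON output may differ.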