Merged
Commits
17 commits
3c0c995
Support multiple output path
fanny-riols Apr 10, 2026
4b9c055
Add toggle to show only the latest run per system
fanny-riols Apr 10, 2026
e2f0833
Show system name before timestamp in cross-run comparison run labels
fanny-riols Apr 10, 2026
7a05cf2
Reformat timestamp-prefixed run names as system_name (timestamp) in l…
fanny-riols Apr 10, 2026
4e60c12
Split run label into separate System and Timestamp columns in cross-r…
fanny-riols Apr 10, 2026
e9b1c97
Fix link generation for multiple output dirs in cross-run comparison …
fanny-riols Apr 10, 2026
b9a2679
Add toggle to hide failed attempts in run overview per-record table
fanny-riols Apr 13, 2026
5c4e67c
Split cross-run comparison table into Accuracy, EVA-X, and Diagnostic…
fanny-riols Apr 13, 2026
369171b
Fix failed attempt detection to check trial label instead of record id
fanny-riols Apr 13, 2026
fc49c3e
Update analysis app run command to use uv run
fanny-riols Apr 14, 2026
f7f822a
Group metrics by category in record detail metrics tab
fanny-riols Apr 14, 2026
e812682
Show dimension scores inline in metrics detail tab
fanny-riols Apr 14, 2026
327fb88
Show metric groups in 4 columns in conversation trace tab
fanny-riols Apr 14, 2026
c7719ff
Fix per-turn breakdown boxes invisible text in dark mode
fanny-riols Apr 14, 2026
090e914
Merge branch 'main' into pr/fr/analysis-app-multi-output
JosephMarinier Apr 14, 2026
c77a04c
Fix system/timestamp split for timestamp-only run directory names
fanny-riols Apr 14, 2026
fa30ec5
Address PR review comments
fanny-riols Apr 15, 2026