feat(frontend): Run-on modes in the evaluator creation drawer (shared controls) by mmabrouk · Pull Request #4557 · Agenta-AI/agenta

mmabrouk · 2026-06-05T12:46:12Z

Why

The Run on selector (test case / app output / trace) was only wired into the full-page evaluator playground. The evaluator-creation drawer still hardcoded runDisabled={!hasAppConnected} and only showed the test-set dropdown after an app was connected — so in the drawer you were forced to pick an app even when you wanted to run the evaluator directly on a test case. The drawer had silently drifted out of sync with the page.

What

Rather than paste the run-on wiring into the drawer (a fourth copy), this extracts the logic the page and drawer were already duplicating and shares it:

- `useEvaluatorRunControls()`: one hook for the app adapter, app-select handler, run-on mode + `handlePickRunOn`, and the run gate (`runDisabled = runOnMode === "app" && !hasAppConnected`).
- `EvaluatorRunControls`: the run-on selector + app picker + disconnect affordance + test-set dropdown, as one cluster used by both the page header and the drawer header, so they can't diverge again.
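To make the change in the run gate concrete, here is a minimal sketch that compares the drawer's old hardcoded gate with the shared one quoted above. The function names and the `"testcase"` / `"trace"` mode values are assumptions for illustration; only `runOnMode`, `hasAppConnected`, `runDisabled`, and the `"app"` value appear in this PR's description.

```typescript
// Assumed mode values: only "app" appears literally in the PR description.
type RunOnMode = "testcase" | "app" | "trace";

// Old drawer gate (runDisabled={!hasAppConnected}): blocked whenever no app is connected.
function legacyRunDisabled(hasAppConnected: boolean): boolean {
    return !hasAppConnected;
}

// Shared gate as stated in the PR: only "app" mode needs a connected app.
function computeRunDisabled(runOnMode: RunOnMode, hasAppConnected: boolean): boolean {
    return runOnMode === "app" && !hasAppConnected;
}

// Print both gates for every combination of mode and connection state.
const modes: RunOnMode[] = ["testcase", "app", "trace"];
for (const mode of modes) {
    for (const connected of [false, true]) {
        console.log(
            `mode=${mode} connected=${connected}`,
            `legacy=${legacyRunDisabled(connected)}`,
            `shared=${computeRunDisabled(mode, connected)}`,
        );
    }
}
```

The two gates differ only when no app is connected and the mode is not `"app"`, which is the case the drawer previously blocked.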

Result:

- Page: behavior-preserving (it just sources its controls from the shared hook/cluster).
- Drawer: gains all three run-on modes, the run-on selector, a disconnect affordance, and an always-available test-set dropdown. Test-case mode now runs without forcing an app, which fixes the bug.
- Removes the `appWorkflowAdapter` / `handleAppSelect` / evaluator-node-lookup triplication across the page body, drawer header, and drawer body.

Net: 218 insertions / 274 deletions across 5 files (2 new, 3 slimmed).

Notes

- `runOnMode` stays persisted per project (shared by page and drawer); the per-evaluator question is tracked separately for a later PR, as discussed.
- `runDisabled` only manifests where the run panel renders (the page and the expanded drawer); the collapsed/config-only drawer ignores it, unchanged.
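As an illustration of what "persisted per project" can look like, the sketch below keys the stored mode by project ID, so the page and the drawer read the same value within a project. This is not the PR's implementation: the storage key format, the default mode, the helper names, and the `"testcase"` / `"trace"` values are all hypothetical.

```typescript
type RunOnMode = "testcase" | "app" | "trace"; // assumed values

// Minimal storage interface so the sketch runs outside a browser;
// in a browser, window.localStorage satisfies it structurally.
interface KeyValueStorage {
    getItem(key: string): string | null;
    setItem(key: string, value: string): void;
}

class MemoryStorage implements KeyValueStorage {
    private data = new Map<string, string>();
    getItem(key: string): string | null {
        return this.data.get(key) ?? null;
    }
    setItem(key: string, value: string): void {
        this.data.set(key, value);
    }
}

const DEFAULT_MODE: RunOnMode = "app"; // assumed default; the PR does not state it

// Hypothetical key format: one entry per project, none per evaluator.
const storageKey = (projectId: string): string => `evaluator-run-on:${projectId}`;

function isRunOnMode(value: string | null): value is RunOnMode {
    return value === "testcase" || value === "app" || value === "trace";
}

// Fall back to the default when nothing valid is stored for the project.
function readRunOnMode(storage: KeyValueStorage, projectId: string): RunOnMode {
    const raw = storage.getItem(storageKey(projectId));
    return isRunOnMode(raw) ? raw : DEFAULT_MODE;
}

function writeRunOnMode(storage: KeyValueStorage, projectId: string, mode: RunOnMode): void {
    storage.setItem(storageKey(projectId), mode);
}
```

Because the key contains only the project ID, every evaluator in a project shares one mode, which is the behavior the note above describes as unchanged.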

Stacked on

Based on fe-fix/app-workflow-router-unification-regression-fix (the merged evaluator-playground branch, which already contains the page-side run-on feature from #4553).

Test plan

- Open the New Evaluation flow → create-evaluator drawer → switch Run on to "Run directly on a test case": the test-case editor is usable and runs without selecting an app.
- Switch to "Run on an app output" with no app: the run panel shows the "Select an app" empty state; pick an app → it runs.
- Confirm the full-page evaluator playground is unchanged (modes, default, dark mode, disconnect).

The Run-on selector (test case / app output / trace) was only wired into the full-page evaluator playground. The evaluator-creation drawer still hardcoded `runDisabled={!hasAppConnected}` and only showed the test-set dropdown after an app was connected, so it forced the user to pick an app even when they wanted to run the evaluator directly on a test case.

Rather than copy the run-on wiring into the drawer (a fourth duplicate), extract the shared logic the page and drawer were already duplicating:

- useEvaluatorRunControls(): app adapter, app-select handler, run-on mode + handlePickRunOn, and the runDisabled gate (runOnMode === 'app' && !appConnected).
- EvaluatorRunControls: the run-on selector + app picker + disconnect + test-set cluster, shared by the page header and the drawer header so they can't drift.

The page is behavior-preserving; the drawer gains all three modes, the run-on selector, a disconnect affordance, and an always-available test-set dropdown. This also removes the adapter/handleAppSelect/evaluator-node triplication across the page body, drawer header, and drawer body.

vercel · 2026-06-05T12:46:18Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
agenta-documentation	Ready	Preview, Comment	Jun 5, 2026 1:45pm

coderabbitai · 2026-06-05T12:46:20Z

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.


github-actions · 2026-06-05T13:07:13Z

Railway Preview Environment


Preview URL	https://gateway-production-d7bf.up.railway.app/w
Project	`agenta-oss-pr-4557`
Image tag	`pr-4557-1bda40a`
Status	Deployed
Railway logs	Open logs
Workflow logs	View workflow run
Updated at 2026-06-05T13:54:26.050Z

The creation drawer renders inside EvaluationRunsTableStoreProvider, a scoped jotai store that mirrors only a handful of global atoms. The playground state, however, runs on the default store (the playground package uses getDefaultStore() throughout). So in the drawer the run-on mode was read/written in the scoped store while the playground lived in the default store. The two split, and switching to test-case mode never reached the run panel: it stayed stuck on the 'Select an app' empty state.

Read and write all run-on / playground atoms through getDefaultStore() in useEvaluatorRunControls, mirroring the existing workaround in usePreviewVariantConfig and TestsetCells.

On the full page (no scoped store) this is a no-op; in the drawer it aligns run-on state with the playground so test-case mode shows the inputs/outputs as it does on the page.
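The store split described above can be reproduced with a small dependency-free model. This is a sketch, not jotai's implementation or the PR's code: the `Store` class and `Atom` interface are stand-ins written for illustration, `defaultStore` plays the role of the store returned by jotai's `getDefaultStore()`, `scopedStore` plays the role of the drawer's provider store, and the atom name and mode values are assumed.

```typescript
// Stand-in for an atom: its object identity is the key, init is the fallback value.
interface Atom<T> {
    readonly init: T;
}

// Stand-in for a store: each store keeps its own value for a given atom.
class Store {
    private values = new Map<Atom<unknown>, unknown>();
    get<T>(atom: Atom<T>): T {
        return this.values.has(atom) ? (this.values.get(atom) as T) : atom.init;
    }
    set<T>(atom: Atom<T>, value: T): void {
        this.values.set(atom, value);
    }
}

type RunOnMode = "testcase" | "app" | "trace"; // assumed values
const runOnModeAtom: Atom<RunOnMode> = { init: "app" }; // hypothetical atom and default

const defaultStore = new Store(); // where the playground (run panel) reads state
const scopedStore = new Store();  // where the drawer's hooks resolved before the fix

// The run panel always reads from the default store.
const runPanelMode = (): RunOnMode => defaultStore.get(runOnModeAtom);

// Before the fix: the drawer writes to the scoped store, so the panel never sees it.
scopedStore.set(runOnModeAtom, "testcase");
const beforeFix = runPanelMode();

// After the fix: the write goes through the default store, so the panel follows.
defaultStore.set(runOnModeAtom, "testcase");
const afterFix = runPanelMode();

console.log({ beforeFix, afterFix });
```

In the model, as in the commit message, routing both reads and writes through the default store is a no-op when there is only one store and removes the split when there are two.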

dosubot Bot added the size:L label (this PR changes 100-499 lines, ignoring generated files) Jun 5, 2026

dosubot Bot added the Frontend label Jun 5, 2026

vercel Bot deployed to Preview June 5, 2026 12:46 View deployment

dosubot Bot added the size:XL label (this PR changes 500-999 lines, ignoring generated files) and removed the size:L label Jun 5, 2026

vercel Bot deployed to Preview June 5, 2026 13:45 View deployment

mmabrouk wants to merge 2 commits into fe-fix/app-workflow-router-unification-regression-fix from fe-feat/evaluator-drawer-run-on
