[release] v0.102.0 #4547
Open
github-actions[bot] wants to merge 52 commits into
- Created unit tests for data transformation utilities including error extraction, response status preservation, and metadata stripping.
- Added tests for formatting utilities covering number, currency, latency, and percentage formatting.
- Implemented tests for path utilities to validate object navigation and manipulation.
- Developed tests for slug generation and validation functions.
- Added tests for template variable validation and extraction.
- Included tests for various validators including UUID and HTTP URL validation.
- Configured Vitest for running tests with coverage reporting and JUnit output.
…annotation packages
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…verage, template-variable alignment
- Replace `as any` fixture casts with `as unknown as T` in annotation tests
- Fix incorrect Annotation import source in testset-sync (now from @agenta/entities/annotation)
- Add Testcase type import and remove all as-any call-site casts in testset-sync
- Add falsy-root short-circuit tests for getValueAtPath (0, false, "", null)
- Realign template-variable tests to the strict envelope-slot behavior on main
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
PR #4384 disabled EVALUATOR_FULL_PAGE_NAV_ENABLED because the app-style playground was a regression for evaluators (lost the upstream-app connection) and app-scoped observability defaulted to "invocation" instead of "annotation" for evaluator workflows. This change addresses both blockers and re-enables the flow by default.
Playground
- ConfigureEvaluatorPage: upstream app workflow can be connected via EntityPicker (skip-variant adapter, filtered to non-evaluator non-feedback workflows). Disconnect affordance on the picker trigger and as a popup footer.
- Standalone evaluator runs no longer require an upstream app (TestsetDropdown is always available; runDisabled gate removed).
- Playground chain traces now write evaluator references (evaluator / evaluator_variant / evaluator_revision slots) so the per-evaluator observability page can find them.
EntityPicker search bar respects a new parentLabel option so app pickers no longer show "Search evaluator...".
Observability filters
- Per-workflow-kind trace_type default extracted into @agenta/entities (defaultTraceTypeForWorkflow): annotation for evaluators, invocation otherwise. Pure helper unit-tested with vitest.
- References scope filter adapts to the effective trace_type: evaluators with trace_type=annotation pin to references.evaluator, invocation pins to references.application, and "no trace_type" ORs across both slots so all traces mentioning the evaluator surface.
- Dialog reconciliation: live label flip while editing trace_type in the filter dialog ("Application ID" / "Evaluator ID") via an opt-in reconcileFilterRows callback on Filters; observability page provides an evaluator-workflow-aware reconciler.
- Filter persistence across reloads: per-app via atomWithStorage under "agenta:observability:filters", with __global__ fallback for project-level pages. Both userFilters and traceTypeChoice share one packed storage atom.
- Cleaner state machine for trace_type intent: tagged union (default / value / cleared) replaces the dual-atom dance that could silently revert.
- application_id URL param dropped for evaluator workflows; the query is gated on workflow context being settled to avoid firing with the wrong scope.
Tests
- vitest unit tests for defaultTraceTypeForWorkflow.
- Playwright acceptance for full-page playground: post-create nav, row click for LLM and declarative evaluators, direct URL, sidebar switcher; fixes the previously broken select-app-and-run test for the new flow.
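A minimal sketch of the trace_type default and the tagged-union intent described above. Only the name `defaultTraceTypeForWorkflow` and the three intent kinds come from the commit message; the signatures, the `WorkflowKind` type, and the helpers `effectiveTraceType` and `evaluatorReferenceSlots` are assumptions made for illustration, not the code in @agenta/entities.

```typescript
type TraceType = "invocation" | "annotation";
// Assumed shape: the real workflow-kind type may carry more variants.
type WorkflowKind = "app" | "evaluator";

// Tagged union replacing the "dual-atom" state: exactly one intent at a time.
type TraceTypeIntent =
    | {kind: "default"}
    | {kind: "value"; value: TraceType}
    | {kind: "cleared"};

// annotation for evaluators, invocation otherwise.
function defaultTraceTypeForWorkflow(kind: WorkflowKind): TraceType {
    return kind === "evaluator" ? "annotation" : "invocation";
}

// Hypothetical helper: resolve the intent into the trace_type actually applied.
function effectiveTraceType(
    intent: TraceTypeIntent,
    kind: WorkflowKind,
): TraceType | undefined {
    switch (intent.kind) {
        case "default":
            return defaultTraceTypeForWorkflow(kind);
        case "value":
            return intent.value;
        case "cleared":
            return undefined;
    }
}

// Hypothetical helper: which reference slots the scope filter pins to
// for an evaluator workflow, given the effective trace_type.
function evaluatorReferenceSlots(traceType: TraceType | undefined): string[] {
    if (traceType === "annotation") return ["references.evaluator"];
    if (traceType === "invocation") return ["references.application"];
    // No trace_type: OR across both slots.
    return ["references.evaluator", "references.application"];
}
```

Because "cleared" is its own variant, an explicit user choice to remove the filter cannot be confused with "nothing chosen yet", which is the silent-revert case the commit describes.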
CodeRabbit flagged 5 issues on the evaluator-full-page rollout PR.
This commit addresses each:
1. PlaygroundRouter — `is_feedback` evaluators skip the full-page swap.
`workflowKind === "evaluator"` was too broad. Human/feedback
evaluators are drawer-only in /evaluators (they capture human input,
they don't run), so routing them to ConfigureEvaluatorPage produced
a run-controls UI for a workflow with nothing to run. Added a
`flags.is_feedback` exclusion next to the workflowKind check.
2. Sidebar — switcher filters out `is_feedback` evaluators.
`nonArchivedEvaluatorsAtom` only filters by `deleted_at` and
includes human evaluators; the switcher was exposing entries that,
when clicked, would land on the (now-correctly-gated) generic
<Playground /> for a feedback workflow. Filtered the list at the
switcher boundary.
3. controls.ts — handle array-valued `trace_type` for in/not_in.
The dialog dispatches `{operator: "in", value: ["annotation"]}` for
the IN operator family, but the intent setter only normalized
scalars — so the user's choice was silently dropped to
`{kind: "cleared"}`. Normalize to an array, filter to enum values,
and collapse single-value arrays back to a scalar. Multi-value
selections (which mean "no filter" for a 2-value enum) still map
to `cleared`.
4. Playwright — drop stale `[data-row-key]` poll in select-app-and-run.
The test asserted post-create navigation to /apps/<id>/playground
AFTER polling for the new row in the evaluators table — but the
redirect wins first, the table disappears, and the poll became a
timing-dependent failure. Removed the registry-side wait;
evaluator-in-registry assertion is covered by the
post-create-row-click test alongside.
5. ConfigureEvaluator/atoms.ts — fix persistedAppSelectionAtom race.
`connectAppToEvaluatorAtom` persisted the app selection BEFORE
`changePrimaryNode` ran, so a failed swap (returns `null` with no
primary to swap from) left a stale localStorage record that the
next mount re-hydrated into a phantom "connected" state. Moved the
persist call to after both graph mutations succeed.
`disconnectAppFromEvaluatorAtom` early-returned on no-downstream
without clearing the persisted state, allowing the same phantom
record to survive a disconnect attempt. Clear it on that branch
too.
No behavior change for the happy-path full-page flow; these are all
narrow edge cases the reviewer flagged.
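The normalization in item 3 can be sketched roughly as follows. This is an illustration under assumptions, not the code in controls.ts: the function name `intentFromFilterValue` is hypothetical, the operator is ignored, and de-duplicating repeated values is an extra choice of this sketch.

```typescript
type TraceType = "invocation" | "annotation";
type TraceTypeIntent = {kind: "value"; value: TraceType} | {kind: "cleared"};

const TRACE_TYPES: ReadonlySet<string> = new Set(["invocation", "annotation"]);

function isTraceType(v: unknown): v is TraceType {
    return typeof v === "string" && TRACE_TYPES.has(v);
}

// Hypothetical name; accepts the scalar form and the array form that the
// in / not_in operator family dispatches.
function intentFromFilterValue(raw: unknown): TraceTypeIntent {
    // 1. Normalize to an array.
    const candidates: unknown[] = Array.isArray(raw) ? raw : [raw];
    // 2. Keep only enum values (de-duplicated, an assumption of this sketch).
    const valid = Array.from(new Set(candidates.filter(isTraceType)));
    // 3. Collapse a single-value array back to a scalar.
    const [only] = valid;
    if (valid.length === 1 && only !== undefined) {
        return {kind: "value", value: only};
    }
    // Empty, or both values of the 2-value enum: no filter.
    return {kind: "cleared"};
}
```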
…ssion-fix
Resolves a single conflict in `web/packages/agenta-entities/src/workflow/core/schema.ts`: release v0.100.4 added `artifact_slug` / `variant_slug` to the revision schema alongside the `workflow_slug` / `workflow_variant_slug` fields this branch had introduced for emitting evaluator references on playground chain runs. Both sides added `workflow_slug` and `workflow_variant_slug` with overlapping intent; resolution keeps all four fields and merges the two doc comments into one that covers both purposes (parent-workflow identification for ID-less callers + evaluator chain-trace emission). No source behavior change: schema is additive on both sides.
…cation-regression-fix
…cation-regression-fix
…ssion-fix
Resolved conflicts:
- web/oss/src/components/Filters/Filters.tsx: kept this branch's `displayedFilter` (reconciles filter rows for evaluator workflows) with main's `filterContainerClass` plain-class styling.
- web/packages/agenta-playground/src/state/execution/executionRunner.ts: kept this branch's `stageReferences` builder, which merges upstream app references with the evaluator's own self-references (via `buildEvaluatorSelfReferences`). Main's variant dropped references for evaluator stages, which was the regression PR #4474 is fixing: evaluator traces need `references.evaluator.slug` attached so they are searchable on the evaluator's /apps/<evalId>/traces page.
Issue: In the LLM-as-a-judge playground, switching the chained app from a
chat application to a completion application kept sending `context` and
`messages` from the previous app in the new request body.
Root cause: At `executionRunner.ts` for depth=0 (the root entity), the
runner spreads the entire row's data into `nodeInputs` (`{...data}`) and
hands it to the stage handle as `inputValues`. The downstream filter in
`resolveVariableValues` / `buildCompletionInputRow` correctly drops keys
that aren't in the entity's input variables — when `variables` is non-
empty. But when the entity's input ports haven't resolved yet (entity
mid-hydration) or genuinely declares no input variables, that filter
falls back to "spread every key from the row", which is exactly the
window in which stale chat-shape keys (`messages`, `context`) leak into
a completion request.
Fix: Filter `data` at the runner against the entity's declared
`inputSchema.properties` BEFORE building `nodeInputs`. This applies to
both the first execution (line ~417) and the repetition retries (line
~689). When the entity has no resolvable input schema, the helper falls
back to `{...data}` so workflows that genuinely depend on free-form
input (e.g. `__rawBody` app workflows whose variables live in
`__meta.variables`) keep working.
The fix is safe for chat mode: chat strips `messages` separately at line
587 of `executionItems.ts` and rebuilds the conversation from
`chatHistory` via `messageIdsAtomFamily(loadableId)` — independent of
`inputValues`.
Defense-in-depth: this complements the existing
`resolveVariableValues` filter rather than replacing it.
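As a rough illustration of the runner-level filter described in this commit, the sketch below keeps only row keys declared in `inputSchema.properties` and falls back to a shallow copy when no schema is resolvable. The function name, the `InputSchema` shape, and the choice to treat an empty `properties` object as unresolved are assumptions of this sketch, not the code in `executionRunner.ts`.

```typescript
type Row = Record<string, unknown>;

// Assumed minimal shape of the entity's input schema.
interface InputSchema {
    properties?: Record<string, unknown>;
}

// Hypothetical helper name.
function filterDataToInputSchema(data: Row, schema: InputSchema | undefined): Row {
    const properties = schema?.properties;
    const declared =
        properties && typeof properties === "object" ? Object.keys(properties) : [];

    // No resolvable schema (or nothing declared): keep free-form input intact,
    // e.g. for workflows whose variables live outside the static schema.
    if (declared.length === 0) return {...data};

    const allowed = new Set(declared);
    const filtered: Row = {};
    for (const key of Object.keys(data)) {
        if (allowed.has(key)) filtered[key] = data[key];
    }
    return filtered;
}

// A completion entity declaring only `country`: stale chat keys are dropped.
const staleRow: Row = {country: "France", messages: [], context: "old chat"};
const nodeInputs = filterDataToInputSchema(staleRow, {properties: {country: {}}});
```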
The evaluator info notice in SingleLayout rendered with hardcoded light-mode colors (bg-blue-50, text-gray-700) and was unreadable against the dark UI. Add dark: variants to background, border, icon, body text, and dismiss button to match the existing dark:bg-blue-900/* pattern used elsewhere in the app.
Previous attempt used dark:text-gray-200 which conflicted with the themeAwareColors CSS-variable layer — the gray scale is role-inverted in dark mode, so dark:text-gray-200 resolved to a dark shade against the dark callout background. Switch overrides to the blue scale (not theme-flipped): dark:text-blue-50 for body text, dark:text-blue-300 for the icon, and dark:text-blue-200 for the dismiss button. All readable against dark:bg-blue-900/20.
The first #4525 fix only covered the depth=0 (root entity) path. In the LLM-as-a-judge evaluator playground the chained app sits at depth>0, where input construction goes through resolveChainInputs (spreads testcaseData on the no-mapping branch) or buildEvaluatorExecutionInputs (spreads testcaseData when the schema allows additionalProperties). Both paths re-leak the stale `messages` field from a previous chat app into the current target entity's request body.
Add stripChatTransportForEntity, a targeted strip of known chat-transport keys (currently just `messages`) that runs unless the target entity's input schema explicitly declares them. Applied:
- depth=0 path: as a defense-in-depth pass after the strict filterDataToEntityInputSchema, so the spread fallback (taken while the new app's schema is mid-hydration) can't leak the stale field either.
- depth=0 repetition path: same.
- depth>0 path: pre-filters `data` before chain / evaluator input construction. Uses a targeted strip (rather than the strict schema filter) so evaluators that legitimately depend on additionalProperties: true spread of testcase columns keep receiving them.
The helper short-circuits to the input reference when no chat transport keys are present, so there's no allocation in the common path.
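A small sketch of a targeted chat-transport strip with the short-circuit described above. The signature is an assumption: here the caller passes the set of keys the target entity's input schema declares, whereas the real stripChatTransportForEntity presumably looks the schema up itself.

```typescript
type Row = Record<string, unknown>;

// Known chat-transport keys (currently just `messages`, per the commit).
const CHAT_TRANSPORT_KEYS: readonly string[] = ["messages"];

// Hypothetical signature: `declaredKeys` stands in for the entity's
// declared input schema properties.
function stripChatTransport(data: Row, declaredKeys: ReadonlySet<string>): Row {
    let needsStrip = false;
    for (const key of CHAT_TRANSPORT_KEYS) {
        if (Object.prototype.hasOwnProperty.call(data, key) && !declaredKeys.has(key)) {
            needsStrip = true;
            break;
        }
    }
    // Common path: nothing to strip, return the same reference (no copy).
    if (!needsStrip) return data;

    const stripped: Row = {...data};
    for (const key of CHAT_TRANSPORT_KEYS) {
        if (!declaredKeys.has(key)) delete stripped[key];
    }
    return stripped;
}
```

Unlike a strict allow-list, this leaves every other testcase column in place, which is why it is safe for evaluators that rely on an open schema.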
…chat keys
Diagnostic telemetry for #4525 / AGE-3793: three console.warn signals in executionRunner so we can tell which layer is actually rescuing the request body during a chat→completion swap:
1. filterDataToEntityInputSchema schema-not-resolved fallback: the strict allow-list can't run because workflowMolecule.selectors.ioSchemas returned no inputSchema.properties. Logs the entityId, the reason (no-properties vs properties-not-object), the data keys present, and whether `messages` is among them.
2. filterDataToEntityInputSchema empty-properties fallback: schema resolved but Object.keys(properties).length === 0. Same payload.
3. stripChatTransportForEntity strip: emits only when a chat-transport key was actually dropped, with which keys and whether the entity schema was resolved at the time of the strip.
All three are warn-level so they're visible in production console without code changes, and gated to the unusual paths so the happy path stays quiet.
…4525)
Move the stale-key fix from execution-time stripping to the layer where it belongs: the testcase row store, on swap of the primary entity.
The testcaseMolecule is shared across loadables, so when the user swaps the chained app in the LLM-as-a-judge playground (anchor positional swap in setEntityIdsAtom), the same rows now carry every key the previous primary populated: `messages` from a prior chat app, completion variables from a prior completion app, etc.
Reconciliation strategy (decided with the user):
- Closed schema (additionalProperties: false): drop any row key not declared by the new entity's inputSchema.properties. Drops silently: no toast, no confirm modal. Matches what the user typed for the new app and nothing more.
- Open schema (additionalProperties not set or true): only strip the CHAT_TRANSPORT_KEYS set (currently `messages`). Evaluator workflows that legitimately depend on additionalProperties spread keep receiving their extra testcase columns.
- Schema not resolved: skip. The execution-time strip in executionRunner.ts is the fallback during this hydration window; it will be removed in a follow-up commit once the row-layer fix is verified end-to-end and a reactive deferred reconciliation handles the hydration race.
Mutation goes through testcaseMolecule.actions.batchUpdate with stale keys set to `undefined` (the store's update reducer interprets that as a delete). Drafts are created per affected row.
A console.warn is emitted in two cases:
- schema-not-resolved on swap (so we can verify the hydration race surface area in practice).
- one summary per swap that lists which keys were dropped per row and the schema mode (closed vs open).
…chema (#4525)
Root cause of the `context` leak that survived the prior fixes: both the row prune and the runtime filter read the allow-list from `workflowMolecule.selectors.ioSchemas(entityId).inputSchema.properties`, which is EMPTY for completion apps. Completion apps express their variables as prompt template placeholders surfaced through `inputPorts`, not through the static input schema. So the filter degraded to its empty-properties fallback (keep everything) and only the hardcoded chat-transport strip removed `messages`; `context` (a real chat template var, stale on the row) sailed through.
Diagnostic confirmation from the repro console:
    [executionRunner.filter] empty-properties fallback {entityId, dataKeys: ['messages','context','country'], hasMessagesKey: true}
Fix: new shared helper `state/helpers/entityInputContract.ts` that resolves the allow-list the SAME way executionItems builds request `variables`:
    variablesFromInputPorts = inputPorts[].key
    variablesFromPayload = requestPayload.__meta.variables ?? requestPayload.variables ?? []
    variables = inputPorts.length > 0 ? inputPorts : payload (+ `messages` when executionMode === 'chat')
`reconcileRowDataForEntity` applies the policy:
- app with resolved contract → strict allow-list (drops context+messages)
- evaluator → chat-transport-only strip (preserves additionalProperties spread of extra testcase columns)
- unresolved contract → chat-transport-only safety strip
Both consumers now delegate to it:
- playgroundController.pruneTestcaseRowsForEntity (swap-time, primary fix)
- executionRunner.reconcileEntityInputData (exec-time hydration safety net)
This collapses the three ad-hoc helpers (filterDataToEntityInputSchema, stripChatTransportForEntity, getEntityInputSchema) into one correct source-of-truth resolution and removes the now-misleading empty-properties / schema-not-resolved diagnostics.
…nputs (#4525)
The prior fix stripped stale keys from the APP's request inputs, but the trace still showed `context` because it surfaces in the downstream EVALUATOR's {inputs, outputs} envelope. The evaluator reads the SAME shared testcase row, and the evaluator policy (chat-transport-only) intentionally preserves non-`messages` keys, so `context` survived there. The UI row also still showed it.
Also: the swap-time prune in setEntityIdsAtom never fired for this flow: the evaluator playground selects the app via add/remove node actions in ConfigureEvaluator, not a setEntityIds positional swap (no [playgroundController.prune] log appeared in the repro).
Fix: reconcile the shared testcase row against the ROOT entity's input contract in webWorkerIntegration, right before execution: path-agnostic, fires on every run regardless of how the app was selected. The cleaned row is:
- passed to the runner (so app request AND evaluator envelope are clean),
- written back via loadableController.actions.updateRow (so the UI and future runs reflect it; undefined values delete the keys).
Evaluator-referenced columns are protected: collectDownstreamReferencedColumns gathers testcase columns named by downstream evaluator `<input>_key` settings (e.g. correct_answer_key → ground_truth) and passes them as protectedKeys, so a strict clean against the app contract never drops intentional evaluation inputs. reconcileRowDataForEntity gains an optional protectedKeys set; a key survives strict filtering when it's in the app allow-list OR protected.
Emits [webWorker.reconcile] when keys are dropped, listing the strategy, dropped keys, and protected columns.
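The contract resolution and reconciliation policy from the two commits above might look roughly like this. Only `reconcileRowDataForEntity`, `protectedKeys`, `inputPorts`, `requestPayload`, and `executionMode` are names from the commit messages; the `EntityContractSource` shape, `resolveAllowList`, and the rule that "no variables found" means "unresolved" are assumptions of this sketch.

```typescript
type Row = Record<string, unknown>;

const CHAT_TRANSPORT_KEYS: ReadonlySet<string> = new Set(["messages"]);

// Assumed input shape; the real helper reads these from workflow state.
interface EntityContractSource {
    isEvaluator: boolean;
    executionMode: "chat" | "completion";
    inputPorts: {key: string}[];
    requestPayload?: {__meta?: {variables?: string[]}; variables?: string[]};
}

// Hypothetical helper: returns null when the contract is unresolved.
function resolveAllowList(entity: EntityContractSource): Set<string> | null {
    const fromPorts = entity.inputPorts.map((port) => port.key);
    const fromPayload =
        entity.requestPayload?.__meta?.variables ?? entity.requestPayload?.variables ?? [];
    const variables = fromPorts.length > 0 ? fromPorts : fromPayload;
    if (variables.length === 0) return null; // assumption: nothing declared = unresolved
    const allow = new Set(variables);
    if (entity.executionMode === "chat") allow.add("messages");
    return allow;
}

function reconcileRowDataForEntity(
    row: Row,
    entity: EntityContractSource,
    protectedKeys: ReadonlySet<string> = new Set<string>(),
): Row {
    // Evaluators and unresolved contracts only get the chat-transport strip.
    const allow = entity.isEvaluator ? null : resolveAllowList(entity);
    const result: Row = {};
    for (const key of Object.keys(row)) {
        const keep = allow
            ? allow.has(key) || protectedKeys.has(key) // strict allow-list OR protected
            : !CHAT_TRANSPORT_KEYS.has(key);
        if (keep) result[key] = row[key];
    }
    return result;
}
```

For a row `{messages, context, country, ground_truth}` reconciled against a completion app whose only input port is `country`, with `ground_truth` protected, the sketch keeps `country` and `ground_truth` and drops the two stale chat keys.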
…4525)
Two follow-ups now that the row-reconciliation fix is verified working:
1. Clean-on-swap: the evaluator playground selects an app via changePrimaryNode + connectDownstreamNode (ConfigureEvaluator), not a setEntityIds positional swap, so the previously-wired swap-time prune never fired there. Add playgroundController.actions.reconcileRowsToPrimary and call it from connectAppToEvaluatorAtom AFTER connectDownstreamNode, so the shared testcase row is cleaned the instant the app changes (not only at run time). Running after the downstream connect means the evaluator's referenced columns (correct_answer_key → ground_truth, etc.) are protected from the strict app-contract clean.
   pruneTestcaseRowsForEntity now:
   - collects downstream-evaluator protected columns,
   - returns a status ('acted' | 'noop' | 'unresolved').
   reconcileRowsToPrimary handles the hydration race: if the new primary's inputPorts aren't resolved yet AND the entity isn't loaded, it subscribes to inputPorts and retries once, then unsubscribes. If the entity is loaded but genuinely has no variables, it doesn't subscribe (no dangling sub). The run-time reconciliation in webWorkerIntegration remains the backstop.
2. Remove diagnostic logs added while tracing the bug: [executionRunner.filter], [webWorker.reconcile], [playgroundController.prune]. The reconcile + writeback logic stays; only the console.warn telemetry is dropped.
The beautified/markdown view forced H2 headings to uppercase via text-transform, rewriting the user's own prompt text. H1 was also lighter than H2, and H3-H6 had no styling. Apply a consistent best-practice scale (descending sizes, shared weight/color/spacing) across H1-H6 in both light and dark mode, with no case transform.
Drop the bottom bar that showed the GitHub, LinkedIn, and X icons and the 'Copyright © <year> | Agenta.' line from the platform layout. Remove the FooterIsland component, its styles, and the footerHeight resize observer that only existed to size the footer.
The message editors had the Text/Markdown view inverted: the view-mode dropdown mapped markdownView to the wrong boolean, so picking 'Markdown' showed raw source and the default 'Text' showed rendered markdown. Fixed the mapping across every message editor (chat turns, prompt messages, variable inputs, the JSON object field) and the live markdown toggle button, whose icon and tooltip were also inverted.
Also:
- The view mode is now a shared, persisted atom (messageViewModeAtom), so switching one message switches all and the choice survives a refresh.
- Text mode renders with the editor's proportional font instead of monospace, with spacing that matches the rendered markdown view.
- Message text and placeholder align with the role label above them.
Adds a 'Run on' control to the evaluator (LLM-as-a-judge) playground header so the first/empty state explains itself instead of leaving the user with two disconnected loaders. Three modes, each drawing its own data-flow:
- Run directly on a test case (Data -> Evaluator -> Score)
- Run on an app output (Data -> App -> Output -> Evaluator -> Score), the default
- Run on a trace (Trace -> Evaluator -> Score), disabled for now
The mode is persisted per project; a connected app forces effective 'app' mode. In app mode with no app connected, the run panel hides the testcases and shows a centered 'Select an app' empty state (shared with the evaluator-creation drawer). All colors come from the antd theme token so it follows light/dark mode. Prompt playground is intentionally untouched.
The useStyles call cast its arg to StyleProps, but StyleProps was never
imported in Layout.tsx (a latent issue, flagged by review). With footerHeight
gone, StyleProps is just {themeMode}, so the cast is unnecessary. Pass the
arg directly; tsc confirms it type-checks.
…notation-packages
QA round 2: in the evaluator playground, selecting an app → disconnecting → re-selecting the SAME app connected nothing in the UI (workflow selector + generation panel stayed on the 'Select an app' empty state). Root cause (pinned via runtime instrumentation): the node graph IS correct after reconnect — connectAppToEvaluatorAtom writes playgroundNodesAtom to [app, evaluator] and a follow-up read confirms 2 nodes in the single store. But on a disconnect→reconnect cycle jotai applies the two sequential playgroundNodesAtom writes (changePrimaryNode → connectDownstreamNode) WITHOUT notifying the mounted dependents, so selectedAppLabelAtom / hasAppConnectedAtom (and the package's generation-panel atoms) never recompute and the UI shows stale 'disconnected' state. First-connect and disconnect notify fine; only the reconnect drops the notification. Fix: after the graph mutations in connectAppToEvaluatorAtom, read the node-derived display atoms (selectedAppLabelAtom, hasAppConnectedAtom) to re-establish the dependency and flush the pending notification to their subscribers. Verified locally: reconnect now updates both the selector and the generation panel. Also removes the temporary [B-repro] diagnostics added while root-causing.
…QA critical) The critical QA bug: invoking an LLM-as-a-judge evaluator opened from the drawer 422'd because the request shipped references.evaluator_revision.id = "local-…" (an unsaved local-draft id), which the backend /invoke validator rejects as a non-UUID. buildEvaluatorSelfReferences (chain stage refs) was already guarded, but references can also arrive from the requestPayload builder and from trace-span extraction. Rather than chase each builder, add a single final sanitization at the one chokepoint where the request body's references are assembled (buildExecutionItem, after all sources are merged): drop any reference id that is a local-draft or placeholder id, keep slug/version, and drop a slot that ends up empty. Path-agnostic — covers the drawer direct-invoke, the chained evaluator playground, and any future reference source.
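The final sanitization step could be sketched as below. The `Reference` shape and the function name `sanitizeReferences` are hypothetical, and "anything that is not a UUID" is used here as a stand-in for "local-draft or placeholder id"; the actual detection in buildExecutionItem may differ.

```typescript
// Assumed reference shape: an optional id plus slug/version.
interface Reference {
    id?: string;
    slug?: string;
    version?: string;
}
type References = Record<string, Reference>;

// Assumption: ids the backend accepts are UUID-shaped, so e.g. "local-…" fails.
const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Hypothetical helper, run once after all reference sources are merged.
function sanitizeReferences(references: References): References {
    const sanitized: References = {};
    for (const slot of Object.keys(references)) {
        const ref = references[slot];
        if (!ref) continue;
        const cleaned: Reference = {};
        // Drop non-UUID ids; keep slug and version as they are.
        if (ref.id !== undefined && UUID_RE.test(ref.id)) cleaned.id = ref.id;
        if (ref.slug !== undefined) cleaned.slug = ref.slug;
        if (ref.version !== undefined) cleaned.version = ref.version;
        // Drop a slot that ends up empty.
        if (Object.keys(cleaned).length > 0) sanitized[slot] = cleaned;
    }
    return sanitized;
}
```

Doing this at one chokepoint means a new reference source does not need its own guard, which is the design argument the commit makes.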
…4474 QA E) In the evaluator playground the primary (app) result row exposed an 'open trace' affordance, but the downstream evaluator result card (DownstreamNodeCard) rendered only its output fields — no way to open the evaluator's own trace to debug a grade (QA 2026-06-05: 'show the trace links (icon) for evaluators too'). The downstream result already carries a traceId; the card just never read it. Read it and pass a compact 'open trace' icon (SharedGenerationResultUtils in actionsOnly mode) into NodeResultCard's headerActions slot, so it appears next to the evaluator node name on hover — same trace drawer the app row opens. Adds actionsOnly to the package's SharedGenerationResultUtilsProps provider type (the OSS wrapper + entity component already support it).
…ions & Overview
Evaluators can be evaluated as subjects (#4237). Show those runs on the evaluator's own Evaluations tab and Overview summaries:
- Add a run-list reference predicate (entities/evaluationRun/etl) that drops runs by the ROLE their references play - keeping runs where the workflow is the evaluated subject (application/invocation ref) and excluding runs where it was merely a grader (evaluator/annotation ref). Replaces the flaky meta.application heuristic with the structural data.steps source of truth.
- Wire the subject filter into the eval-runs fetch, with a hit-ratio meter reporting the v1->v2 escalation signal, and a bounded over-fetch so the fixed-size Overview summaries fill instead of falsely reading empty.
- Re-enable Overview eval summaries + Evaluations route for evaluators (sidebar links, route guards, DISABLED_FOR_EVALUATOR).
- Resolve the locked Apps filter chip to the workflow name for evaluators.
Evaluators aren't deployed to environments, but deploy actions leaked onto their surfaces. Gate at the reusable chokepoints:
- DeployVariantButton self-guards via the workflow-level is_evaluator flag (correct even on v0 revisions), covering the revision drawer + every other reuse without per-call-site checks.
- Recent Prompts (VariantsOverview) passes hideDeployActions for evaluators, matching the variants dashboard.
Switch ArchivedAppsPage to the shared PageLayout with an inline back-arrow title and no subtitle, matching the Archived Evaluators page.
…valuation modal When the modal is app-scoped to an evaluator route, resolve the Application panel label/kind from the evaluators list so it shows the evaluator's name (not its raw id) - the app-scoped pre-lock never sets selectedWorkflowMeta. Also drop the full evaluator query-result objects from the derived-evaluators memo deps (and a dead humanEvaluatorsQuery subscription); they changed identity every query tick and churned the modal's renders.
…ification-regression-fix [FE Fix]: Re-enable full-page playground for evaluator workflows
[fix] Fix inverted Text/Markdown view in playground message editors
…-annotation-packages test(frontend): add unit tests for @agenta/shared and @agenta/annotation packages
The switcher used fullPagePlaygroundEvaluatorsAtom, which narrows to evaluators that have a full-page playground (LLM, code) and so dropped the declarative matchers (exact match, regex, similarity, json diff, contains json, ...). Add nonHumanEvaluatorsAtom - non-archived evaluators with only the human (is_feedback, resolved from the latest revision) exclusion - and point the switcher at it, so every automatic evaluator is listed while human ones stay out.
Two fixes found while auditing the bot against all open PRs:
- Only a non-empty Summary plus a demo (for functional changes) are required. Missing Testing/Checklist sections no longer close a PR. The demo is now detected anywhere in the body, not just the Demo section. This fixes a PR that had a YouTube demo and full testing notes but was closed for lacking the checklist section.
- Drop the 'reopened' trigger so a maintainer who manually reopens a flagged PR wins, instead of the bot immediately re-closing it. Auto-reopen on a fixed description still works via 'edited'/'synchronize'.
ci: stop PR bot over-closing on missing checklist + fix reopen loop
The evaluator drawer rendered by WorkflowRevisionDrawerWrapper reimplemented
the run panel gate as `runDisabled={!hasAppConnected}`, ignoring the run-on
mode. So switching its Run-on selector to 'test case' updated the header while
the panel kept showing the 'Select an app' empty state and demanding an app —
the page and creation drawer respected the mode, only this third surface didn't.
Route it through the shared useEvaluatorRunControls hook (+ SelectAppEmptyState
and the prop-less EvaluatorPlaygroundHeader), the same wiring the page and the
creation drawer use, so the gate is `runOnMode === 'app' && !hasAppConnected`
everywhere and the three surfaces can't drift again. Removes this drawer's
duplicated app adapter / app-select / run-gate logic.
Also drop the getDefaultStore() patch from useEvaluatorRunControls: runtime
debugging proved these surfaces are not in a scoped store (the drawer that was
broken is WorkflowRevisionDrawerWrapper, not the scoped-store CreateEvaluator
drawer), so the override was a no-op based on a wrong hypothesis.
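The shared gate is small enough to sketch in full. The mode identifiers other than 'app' and the two function names are assumptions; the real logic lives in the useEvaluatorRunControls hook, which this sketch does not reproduce.

```typescript
// Assumed identifiers for the three Run-on modes.
type RunOnMode = "testcase" | "app" | "trace";

// A connected app forces effective 'app' mode, whatever mode was persisted.
function effectiveRunOnMode(storedMode: RunOnMode, hasAppConnected: boolean): RunOnMode {
    return hasAppConnected ? "app" : storedMode;
}

// The single gate all three surfaces should share:
// only 'app' mode without a connected app blocks the run panel.
function isRunDisabled(runOnMode: RunOnMode, hasAppConnected: boolean): boolean {
    return runOnMode === "app" && !hasAppConnected;
}
```

The drawer's old gate, `!hasAppConnected` alone, is the special case of this function where the mode is always assumed to be 'app', which is why test-case mode still demanded an app there.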
feat(frontend): Run-on modes in the evaluator creation drawer (shared controls)
mmabrouk
approved these changes
Jun 8, 2026
New version v0.102.0 in