Skip to content

test: add e2e tests for annotation queue flows#4568

Open
GanJiaKouN16 wants to merge 8 commits into
Agenta-AI:mainfrom
GanJiaKouN16:feat/annotation-queue-e2e-tests
Open

test: add e2e tests for annotation queue flows#4568
GanJiaKouN16 wants to merge 8 commits into
Agenta-AI:mainfrom
GanJiaKouN16:feat/annotation-queue-e2e-tests

Conversation

@GanJiaKouN16
Copy link
Copy Markdown

Summary

Fixes #4528 — adds Playwright e2e test coverage for annotation queue flows.

Tests added (WEB-ACC-ANNOTATION-001 through 006)

# Test Coverage
001 Navigate to annotation queues page, verify empty state or list renders SMOKE
002 Create a new annotation queue with testcases kind SMOKE
003 Create a new annotation queue with traces kind SMOKE
004 Open queue detail page by clicking a queue row LIGHT
005 Search for annotation queues by name LIGHT
006 Delete an annotation queue via the actions menu LIGHT

Structure

Follows the existing human-annotation/ test pattern:

annotation-queue/
  annotation-queue.spec.ts   # Entry point
  index.ts                   # Test definitions with tags
  tests.ts                   # Custom fixtures (navigateToAnnotations, createAnnotationQueue)
  assets/
    types.ts                 # Type definitions for fixtures

Fixtures

  • navigateToAnnotations() — navigates to /{projectURL}/annotations and waits for the page to load
  • createAnnotationQueue({name, kind}) — opens the Create Queue drawer, fills in name/kind, submits, and waits for drawer to close

Closes #4528

Wire the existing GenerateResetLinkModal and PasswordResetLinkModal
into the Actions dropdown in the workspace members table.

- Add 'Reset password' menu item for workspace members (not self)
- Add resetPassword API function in profile service
- Show confirmation dialog before generating the reset link
- Display the generated password reset link with copy functionality

Closes Agenta-AI#2572
Several tables with row-level click navigation were missing the
shouldIgnoreRowClick guard, causing clicks on interactive elements
(checkboxes, dropdowns, buttons) to accidentally trigger row navigation.

Changes:
- Consolidate shouldIgnoreRowClick with broader selector list (merges
  EvaluationRunsTablePOC's extra selectors: [role='button'],
  [role='menuitem'], [role='checkbox'], .ant-btn, etc.)
- Export INTERACTIVE_ROW_SELECTORS constant for reuse
- Add guard to ObservabilityTable (traces)
- Add guard to SessionsTable
- Add guard to PromptsPage
- Add guard to TestcasesTableShell
- Add guard to EntityTable
- Replace partial data-ivt-stop-row-click check in ScenarioListView
  with full shouldIgnoreRowClick
- Update useEntityTableState to use consolidated selectors
- Remove duplicate shouldIgnoreRowClick from navigationActions.ts
- Update EvaluationRunsTablePOC to import from shared utility

Closes Agenta-AI#3254
The evaluation table was showing a generic 'too many requests' message
instead of the actual provider error because:

1. executeViaFetch never checked for body-level errors on HTTP 200.
   The Python SDK can return HTTP 200 with a non-200 status.code
   embedded in the response body (WorkflowBatchResponse.status.code).
   This path was silently treated as success.

2. Error stacktrace/type/code were not propagated through the pipeline.
   Even when the HTTP error path was taken, only the message was
   extracted — the SDK's status.type, status.code, and status.stacktrace
   were dropped.

Changes:
- executeViaFetch: detect body-level errors on HTTP 200 by checking
  responseData.status.code !== 200 and return an error result
- executeViaFetch: extract stacktrace (coercing string[] to string),
  type, and code from both HTTP-error and body-error paths
- Add stacktrace and type to ExecutionResult, RunResult, and
  ExecuteWorkflowRevisionResult error shapes
- runInvocationAction: pass stacktrace and type through to
  upsertStepResultWithInvocation
- upsertStepResultWithInvocation: accept type field in error param

No UI changes needed — InvocationCell already renders stepError.message
and stepError.stacktrace when present; extractStepError already reads
error.code, error.type, error.stacktrace from persisted step data.

Closes Agenta-AI#3324
…iddleware

The vault middleware built env var names using f'{provider.upper()}_API_KEY'
which produces TOGETHER_AI_API_KEY for the 'together_ai' provider kind.
The actual env var is TOGETHERAI_API_KEY (no underscore), matching the
frontend (llmProviders.ts, transforms.ts), backend (env.py), and the
Daytona sandbox runner (daytona.py).

Add an explicit _PROVIDER_ENV_VAR_MAP dict (mirroring the Daytona runner
pattern) that maps each provider kind to its correct env var name, with
fallback to the original f-string pattern for any future providers.

Closes Agenta-AI#3659
… drawer

The 'Open evaluator registry' button in the Trace Drawer's
EvaluatorDetailsPopover navigated human evaluators to the registry
page (/evaluators?tab=human&openEvaluator=...) instead of the
evaluator playground. Automatic evaluators already linked correctly.

Unify both evaluator types to navigate to the playground
(/evaluators/playground?revisions=...), consistent with how other
parts of the codebase link to evaluators (EvaluatorSection,
ConfigurationView). Update button text to 'Open evaluator playground'.

Closes Agenta-AI#4535
Replace the SendGrid-only email backend with SMTP support, keeping
SendGrid as a legacy fallback for existing deployments.

Changes:
- email_service.py: use smtplib for SMTP (priority), SendGrid as
  fallback, no-op when neither is configured
- env.py: add SmtpConfig (SMTP_HOST, SMTP_PORT, SMTP_USERNAME,
  SMTP_PASSWORD, SMTP_FROM_ADDRESS, SMTP_USE_TLS), keep SendgridConfig
  for backwards compatibility; update AuthFacade.email_method to check
  both
- OSS/EE organization_service.py: use env.smtp.enabled || env.sendgrid
  .enabled; use configured from_address instead of hardcoded email
- user_service.py: same email-enabled check update
- db_manager_ee.py: remove dead sendgrid import and unused sg client
- pyproject.toml: remove sendgrid dependency (imported lazily only when
  SENDGRID_API_KEY is set)
- env example files: add SMTP vars, mark SendGrid as legacy
- docs: add SMTP config table, mark SendGrid as legacy

Closes Agenta-AI#4536
The breadcrumb always showed 'Auto Evals' for SDK evaluations because:

1. test.tsx normalized type='custom' to 'auto' before passing to
   EvalRunPreviewPage, losing the SDK type information
2. Page.tsx typeMap had no 'custom' entry
3. buildBreadcrumbs.ts hardcoded 'auto evaluation' as fallback label

Fix:
- Remove the custom→auto normalization in test.tsx
- Add 'custom' → 'SDK Evals' entry to Page.tsx typeMap (matches the
  tab label in EvaluationsView.tsx)
- Change buildBreadcrumbs.ts fallback from 'auto evaluation' to
  'Evaluations'

Closes Agenta-AI#4549
Add Playwright acceptance tests for the annotation queue feature,
following the existing human-annotation test patterns.

Tests cover (WEB-ACC-ANNOTATION-001 through 006):
1. Navigate to annotation queues page and verify empty state or list
2. Create a new annotation queue with testcases kind
3. Create a new annotation queue with traces kind
4. Open a queue detail page by clicking a queue row
5. Search for annotation queues by name
6. Delete an annotation queue via the actions menu

Structure mirrors human-annotation/:
- annotation-queue.spec.ts — entry point
- index.ts — test definitions with tags
- tests.ts — fixtures (navigateToAnnotations, createAnnotationQueue)
- assets/types.ts — type definitions

Closes Agenta-AI#4528
@dosubot dosubot Bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Jun 6, 2026
@vercel
Copy link
Copy Markdown

vercel Bot commented Jun 6, 2026

Someone is attempting to deploy a commit to the agenta projects Team on Vercel.

A member of the Team first needs to authorize it.

@CLAassistant
Copy link
Copy Markdown

CLAassistant commented Jun 6, 2026

CLA assistant check
All committers have signed the CLA.

@dosubot dosubot Bot added the tests label Jun 6, 2026
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Jun 6, 2026

Review Change Stack

📝 Walkthrough

Summary by CodeRabbit

  • New Features

    • SMTP added as primary email option (SendGrid marked legacy)
    • Password reset action for workspace members
    • Annotation Queue UI for managing traces and test cases
    • Richer error details surfaced (type and stack trace)
  • Bug Fixes

    • Improved table row click handling to ignore interactive elements
    • Evaluator navigation and labels updated to consistently use the playground
  • Documentation

    • Added SMTP configuration docs and example environment entries

Walkthrough

This PR introduces SMTP email backend configuration alongside SendGrid, enriches execution error objects with type and stacktrace details, centralizes interactive row-click suppression logic across table components, adds custom evaluation type support, implements user password reset in workspace settings, and adds E2E annotation queue tests with supporting minor updates to provider mapping and evaluator navigation.

Changes

SMTP Email Backend Configuration

Layer / File(s) Summary
SMTP configuration model and env setup
api/oss/src/utils/env.py
SmtpConfig Pydantic model added for SMTP settings with host/port/credentials/from address/TLS flag; smtp field added to EnvironSettings singleton; AuthFacade.email_method updated to check SMTP first for OTP availability.
Email service backend selection and SMTP/SendGrid routing
api/oss/src/services/email_service.py
Backend selection at import time with SMTP taking precedence; conditional SendGrid imports; added _send_via_smtp and _send_via_sendgrid helpers; send_email dispatches based on backend flags.
Organization and user services using configurable sender
api/ee/src/services/organization_service.py, api/oss/src/services/organization_service.py, api/oss/src/services/user_service.py
Email sender resolution prefers env.smtp.from_address over env.sendgrid.from_address; both SMTP and SendGrid checked before returning manual sharing URL; consistent sender wiring across all email-sending paths.
Configuration documentation and environment examples
docs/docs/self-host/02-configuration.mdx, hosting/docker-compose/*/*, api/pyproject.toml
SMTP configuration section added to docs with env var mappings; environment example files updated with SMTP block and SendGrid marked legacy; sendgrid dependency removed from project dependencies.

Error Details Enrichment

Layer / File(s) Summary
Error type definitions
web/packages/agenta-entities/src/runnable/types.ts, web/packages/agenta-playground/src/executeWorkflowRevision.ts, web/packages/agenta-playground/src/state/execution/types.ts
ExecutionResult.error, ExecuteWorkflowRevisionResult.error, and RunResult.error now support optional type and stacktrace fields in addition to existing message and code.
Execution runner error extraction and lifecycle
web/packages/agenta-playground/src/state/execution/executionRunner.ts
executeViaFetch now parses error code, type, and stacktrace from HTTP error responses; body-level error detection added for non-200 status codes in successful HTTP responses; ExecutionSessionLifecycleCallbacks.onFail signature updated with richer error fields.
Invocation atom and API error capture
web/oss/src/components/EvalRunDetails/atoms/runInvocationAction.ts, web/oss/src/services/evaluations/invocations/api.ts
triggerRunInvocationAtom captures stacktrace and type from step result errors; upsertStepResultWithInvocation API signature updated to accept type field in error object.

Interactive Row Click Handler Centralization

Layer / File(s) Summary
Consolidated selectors constant and refactored function
web/oss/src/components/InfiniteVirtualTable/hooks/useTableManager.tsx, web/packages/agenta-ui/src/InfiniteVirtualTable/hooks/useTableManager.tsx
INTERACTIVE_ROW_SELECTORS exported as comma-separated CSS selector string; shouldIgnoreRowClick refactored to use target.closest(INTERACTIVE_ROW_SELECTORS) instead of hardcoded checks.
UI table state hook integration
web/packages/agenta-ui/src/InfiniteVirtualTable/hooks/useEntityTableState.ts
useEntityTableState imports INTERACTIVE_ROW_SELECTORS and derives default click-exclusion selectors from it.
Table component consumers
web/oss/src/components/EvaluationRunsTablePOC/actions/navigationActions.ts, web/oss/src/components/EvaluationRunsTablePOC/components/EvaluationRunsTable/index.tsx, web/oss/src/components/TestcasesTableNew/components/TestcasesTableShell.tsx, web/oss/src/components/pages/observability/components/ObservabilityTable/index.tsx, web/oss/src/components/pages/observability/components/SessionsTable/index.tsx, web/oss/src/components/pages/prompts/PromptsPage.tsx, web/packages/agenta-annotation-ui/src/components/AnnotationSession/ScenarioListView.tsx, web/packages/agenta-entity-ui/src/shared/EntityTable.tsx
shouldIgnoreRowClick moved from local navigationActions to centralized InfiniteVirtualTable export; all table components updated to import and use shouldIgnoreRowClick before triggering row selection/navigation.

Custom Evaluation Type Support

Layer / File(s) Summary
Evaluation type breadcrumb and routing
web/oss/src/components/EvalRunDetails/components/Page.tsx, web/oss/src/components/EvalRunDetails/test.tsx, web/oss/src/lib/helpers/buildBreadcrumbs.ts
"custom" evaluation type added to breadcrumb typeMap (label: "SDK Evals", kind: "custom"); EvalRunTestPage now passes through "custom" type without normalizing to "auto"; evaluations breadcrumb label updated from "auto evaluation" to "Evaluations".

User Password Reset Feature

Layer / File(s) Summary
Password reset API endpoint
web/oss/src/services/profile/index.ts
resetPassword(userId: string) added to profile service; constructs /api/profile/reset-password request with user_id query parameter; returns generated reset link string.
Reset password UI integration
web/oss/src/components/pages/settings/WorkspaceManage/cellRenderers.tsx
Added imports for resetPassword API and reset modal components; introduced state for reset modal visibility, generated link, and loading; implemented handleResetPassword handler; added "Reset password" menu item (Key icon) for members; wired GenerateResetLinkModal and PasswordResetLinkModal into Actions component flow.

Annotation Queue End-to-End Tests

Layer / File(s) Summary
Test fixtures and types
web/oss/tests/playwright/acceptance/annotation-queue/assets/types.ts, web/oss/tests/playwright/acceptance/annotation-queue/tests.ts
AnnotationQueueFixtures interface defines navigateToAnnotations and createAnnotationQueue fixture methods; test helpers compute annotations URL path and wait for queue-list load completion.
Acceptance test suite
web/oss/tests/playwright/acceptance/annotation-queue/index.ts
Six tagged acceptance tests covering: queue page navigation and empty/table states, creation of testcases and traces queues, detail page navigation, queue name search filtering, and queue deletion via row actions.
Test spec registration
web/oss/tests/playwright/acceptance/annotation-queue/annotation-queue.spec.ts
Registers annotation queue test suite with Playwright runner.

Supporting Updates

Layer / File(s) Summary
Provider API key environment variable mapping
sdks/python/agenta/sdk/middlewares/running/vault.py
_PROVIDER_ENV_VAR_MAP constant added to normalize provider-specific API key env var names (e.g., together_aiTOGETHERAI_API_KEY); get_secrets updated to use mapping with fallback to default pattern.
Evaluator playground navigation
web/oss/src/components/SharedDrawers/TraceDrawer/components/EvaluatorDetailsPopover.tsx, web/oss/src/components/SharedDrawers/TraceDrawer/hooks/useEvaluatorNavigation.ts
buildEvaluatorTarget now always uses playground route for both human and auto evaluators; popover button label changed from "Open evaluator registry" to "Open evaluator playground".

🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Out of Scope Changes check ⚠️ Warning PR contains extensive out-of-scope changes unrelated to annotation queue e2e testing: SMTP email configuration refactoring, error object type extensions, evaluator navigation changes, and row-click interaction logic refactoring across multiple components. Separate the annotation queue e2e tests into their own PR; move the email, error handling, navigation, and interaction logic changes to a distinct PR with appropriate issue links.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed Title 'test: add e2e tests for annotation queue flows' accurately describes the main change: adding Playwright e2e tests for annotation queue functionality.
Description check ✅ Passed Description clearly relates to the changeset by detailing the annotation queue e2e tests being added, their coverage, test structure, and fixtures.
Linked Issues check ✅ Passed PR implements issue #4528 requirements: adds e2e test coverage for annotation queue flows with six tests (WEB-ACC-ANNOTATION-001 through 006) covering navigation, creation, detail viewing, searching, and deletion.
Docstring Coverage ✅ Passed Docstring coverage is 86.67% which is sufficient. The required threshold is 60.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 8

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
web/packages/agenta-entities/src/runnable/types.ts (1)

228-231: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Align StageExecutionResult.error with ExecutionResult.error to include the new fields.

ExecutionResult.error now includes optional type and stacktrace fields (lines 207-208), but StageExecutionResult.error was not updated to match. Since stage results represent individual nodes in chain execution and are stored in chainResults (line 272), this inconsistency could cause type mismatches or lost error details when stage results are processed.

🔧 Suggested fix
     error?: {
         message: string
         code?: string
+        type?: string
+        stacktrace?: string
     }
api/ee/src/services/organization_service.py (1)

107-129: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Reconsider hardcoded default sender address.

Lines 121-122 provide a hardcoded default sender "account@hello.agenta.ai" when neither SMTP nor SendGrid from_address is configured. This creates several issues:

  1. SPF/DKIM failures: Emails sent from this address without proper domain configuration will likely fail authentication checks and be marked as spam or rejected.
  2. Unmonitored inbox: Replies will go to an address that may not be monitored.
  3. Inconsistency with OSS: The OSS version (oss/src/services/organization_service.py lines 172-174) raises a ValueError instead, which prevents misconfiguration.

Consider aligning with the OSS behavior by raising an error when no sender is configured, forcing explicit configuration.

🔧 Proposed fix to align with OSS behavior
-    from_address = env.smtp.from_address or env.sendgrid.from_address or "account@hello.agenta.ai"
-
+    from_address = env.smtp.from_address or env.sendgrid.from_address
+    if not from_address:
+        raise ValueError("Email requires a sender email address (SMTP_FROM_ADDRESS or SENDGRID_FROM_ADDRESS)")
+
     await email_service.send_email(
🧹 Nitpick comments (8)
web/oss/tests/playwright/acceptance/annotation-queue/index.ts (3)

45-90: ⚡ Quick win

Consider adding cleanup for created test queues.

Tests 002 and 003 create annotation queues but don't delete them afterward. Over many test runs, this could accumulate test data and potentially affect test reliability or list-rendering performance. Test 006 demonstrates the deletion flow, so consider adding similar cleanup to these tests (e.g., in a try-finally block or afterEach hook).


105-126: ⚡ Quick win

Consider adding cleanup for the created test queue.

Test 004 creates a queue for testing detail-page navigation but doesn't delete it. While Playwright test isolation might handle this, explicit cleanup would prevent test data accumulation and make the test more self-contained.


141-171: ⚡ Quick win

Consider adding cleanup for the created test queue.

Test 005 creates a queue for testing search functionality but doesn't delete it afterward. Adding cleanup (similar to test 006's delete flow) would prevent test data pollution.

web/packages/agenta-annotation-ui/src/components/AnnotationSession/ScenarioListView.tsx (1)

1677-1678: ⚡ Quick win

Remove underscore prefix from used parameter.

The parameter _event is prefixed with an underscore, which conventionally indicates an intentionally unused parameter. However, the parameter is used on line 1678 in the shouldIgnoreRowClick call. Remove the underscore prefix to follow naming conventions.

♻️ Rename parameter
-    const handleRowClick = useCallback((_event: React.MouseEvent, record: ScenarioTableRow) => {
-        if (shouldIgnoreRowClick(_event)) return
+    const handleRowClick = useCallback((event: React.MouseEvent, record: ScenarioTableRow) => {
+        if (shouldIgnoreRowClick(event)) return
         setDrawerScenarioId(record.scenarioId)
     }, [])
api/oss/src/utils/env.py (1)

942-942: ⚡ Quick win

Align use_tls truthiness check with global _TRUTHY constant.

The file defines a global _TRUTHY set (line 10) used consistently across all other boolean config fields, but use_tls uses a hardcoded tuple ("true", "1", "yes") instead. This creates inconsistency and means inputs like "t", "y", "on", "enable", or "enabled" won't be recognized as truthy here.

♻️ Proposed fix for consistency
-    use_tls: bool = os.getenv("SMTP_USE_TLS", "true").lower() in ("true", "1", "yes")
+    use_tls: bool = os.getenv("SMTP_USE_TLS", "true").lower() in _TRUTHY
api/oss/src/services/email_service.py (2)

47-47: ⚡ Quick win

Use keyword-only parameters in function signature.

The function signature should include * to enforce keyword-only parameters for better call-site clarity and to prevent positional argument mistakes. As per coding guidelines, Python API code should prefer keyword-only parameters.

♻️ Proposed refactor
-def _send_via_smtp(to_email: str, subject: str, html_content: str, from_email: str) -> None:
+def _send_via_smtp(*, to_email: str, subject: str, html_content: str, from_email: str) -> None:
     """Send email using SMTP."""

Source: Coding guidelines


74-74: ⚡ Quick win

Use keyword-only parameters in function signature.

The function signature should include * to enforce keyword-only parameters. As per coding guidelines, Python API code should prefer keyword-only parameters.

♻️ Proposed refactor
-def _send_via_sendgrid(to_email: str, subject: str, html_content: str, from_email: str) -> None:
+def _send_via_sendgrid(*, to_email: str, subject: str, html_content: str, from_email: str) -> None:
     """Send email using SendGrid (legacy fallback)."""

Source: Coding guidelines

api/oss/src/services/organization_service.py (1)

172-174: 💤 Low value

Consider making the error message more specific.

The error message correctly prevents misconfiguration by raising a ValueError, but could be more helpful by mentioning the specific environment variables needed.

♻️ Optional improvement
     from_address = env.smtp.from_address or env.sendgrid.from_address
     if not from_address:
-        raise ValueError("Email requires a sender email address to work.")
+        raise ValueError("Email requires a sender email address. Set SMTP_FROM_ADDRESS or SENDGRID_FROM_ADDRESS.")

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 899f18e1-9768-4f4a-8454-14e331ea1de3

📥 Commits

Reviewing files that changed from the base of the PR and between 98b8a9d and 1e8500c.

📒 Files selected for processing (41)
  • api/ee/src/services/db_manager_ee.py
  • api/ee/src/services/organization_service.py
  • api/oss/src/services/email_service.py
  • api/oss/src/services/organization_service.py
  • api/oss/src/services/user_service.py
  • api/oss/src/utils/env.py
  • api/pyproject.toml
  • docs/docs/self-host/02-configuration.mdx
  • hosting/docker-compose/ee/env.ee.dev.example
  • hosting/docker-compose/ee/env.ee.gh.example
  • hosting/docker-compose/oss/env.oss.dev.example
  • hosting/docker-compose/oss/env.oss.gh.example
  • sdks/python/agenta/sdk/middlewares/running/vault.py
  • web/oss/src/components/EvalRunDetails/atoms/runInvocationAction.ts
  • web/oss/src/components/EvalRunDetails/components/Page.tsx
  • web/oss/src/components/EvalRunDetails/test.tsx
  • web/oss/src/components/EvaluationRunsTablePOC/actions/navigationActions.ts
  • web/oss/src/components/EvaluationRunsTablePOC/components/EvaluationRunsTable/index.tsx
  • web/oss/src/components/InfiniteVirtualTable/hooks/useTableManager.tsx
  • web/oss/src/components/SharedDrawers/TraceDrawer/components/EvaluatorDetailsPopover.tsx
  • web/oss/src/components/SharedDrawers/TraceDrawer/hooks/useEvaluatorNavigation.ts
  • web/oss/src/components/TestcasesTableNew/components/TestcasesTableShell.tsx
  • web/oss/src/components/pages/observability/components/ObservabilityTable/index.tsx
  • web/oss/src/components/pages/observability/components/SessionsTable/index.tsx
  • web/oss/src/components/pages/prompts/PromptsPage.tsx
  • web/oss/src/components/pages/settings/WorkspaceManage/cellRenderers.tsx
  • web/oss/src/lib/helpers/buildBreadcrumbs.ts
  • web/oss/src/services/evaluations/invocations/api.ts
  • web/oss/src/services/profile/index.ts
  • web/oss/tests/playwright/acceptance/annotation-queue/annotation-queue.spec.ts
  • web/oss/tests/playwright/acceptance/annotation-queue/assets/types.ts
  • web/oss/tests/playwright/acceptance/annotation-queue/index.ts
  • web/oss/tests/playwright/acceptance/annotation-queue/tests.ts
  • web/packages/agenta-annotation-ui/src/components/AnnotationSession/ScenarioListView.tsx
  • web/packages/agenta-entities/src/runnable/types.ts
  • web/packages/agenta-entity-ui/src/shared/EntityTable.tsx
  • web/packages/agenta-playground/src/executeWorkflowRevision.ts
  • web/packages/agenta-playground/src/state/execution/executionRunner.ts
  • web/packages/agenta-playground/src/state/execution/types.ts
  • web/packages/agenta-ui/src/InfiniteVirtualTable/hooks/useEntityTableState.ts
  • web/packages/agenta-ui/src/InfiniteVirtualTable/hooks/useTableManager.tsx
💤 Files with no reviewable changes (3)
  • api/ee/src/services/db_manager_ee.py
  • api/pyproject.toml
  • web/oss/src/components/EvaluationRunsTablePOC/actions/navigationActions.ts

Comment on lines +157 to +158
from_address = env.smtp.from_address or env.sendgrid.from_address or "account@hello.agenta.ai"

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Reconsider hardcoded default sender address.

Lines 157-158 use the same problematic hardcoded default sender "account@hello.agenta.ai". This has the same issues as in send_invitation_email: SPF/DKIM failures, unmonitored replies, and inconsistency with the OSS version.

Consider raising an error when no sender is configured.

🔧 Proposed fix
-    from_address = env.smtp.from_address or env.sendgrid.from_address or "account@hello.agenta.ai"
-
+    from_address = env.smtp.from_address or env.sendgrid.from_address
+    if not from_address:
+        raise ValueError("Email requires a sender email address (SMTP_FROM_ADDRESS or SENDGRID_FROM_ADDRESS)")
+
     workspace_admins = await db_manager_ee.get_workspace_administrators(workspace)

Comment on lines +47 to +72
def _send_via_smtp(to_email: str, subject: str, html_content: str, from_email: str) -> None:
"""Send email using SMTP."""
msg = MIMEMultipart("alternative")
msg["Subject"] = subject
msg["From"] = from_email
msg["To"] = to_email
msg.attach(MIMEText(html_content, "html"))

smtp_host = env.smtp.host
smtp_port = env.smtp.port

if env.smtp.use_tls:
server = smtplib.SMTP(smtp_host, smtp_port)
server.ehlo()
server.starttls()
server.ehlo()
else:
server = smtplib.SMTP(smtp_host, smtp_port)

try:
if env.smtp.username and env.smtp.password:
server.login(env.smtp.username, env.smtp.password)
server.sendmail(from_email, [to_email], msg.as_string())
finally:
server.quit()

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Add timeout to SMTP connection to prevent hangs.

The SMTP connection lacks a timeout parameter. If the SMTP server is unresponsive, the application will hang indefinitely on the request thread, degrading availability.

🔧 Proposed fix to add timeout
-    if env.smtp.use_tls:
-        server = smtplib.SMTP(smtp_host, smtp_port)
+    smtp_timeout = 30  # seconds
+    if env.smtp.use_tls:
+        server = smtplib.SMTP(smtp_host, smtp_port, timeout=smtp_timeout)
         server.ehlo()
         server.starttls()
         server.ehlo()
     else:
-        server = smtplib.SMTP(smtp_host, smtp_port)
+        server = smtplib.SMTP(smtp_host, smtp_port, timeout=smtp_timeout)

Comment on lines 116 to 117
except Exception as e:
raise HTTPException(status_code=500, detail=str(e))
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Refine exception handling to preserve error context.

Catching all exceptions with except Exception as e and converting to HTTPException(status_code=500, detail=str(e)) loses important error context and makes debugging harder. SMTP and SendGrid errors have different root causes (network timeouts, authentication failures, rate limits, invalid recipients, etc.) and should be handled with more granularity.

♻️ Suggested improvement
     try:
         if _USE_SMTP:
             _send_via_smtp(to_email, subject, html_content, from_email)
         else:
             _send_via_sendgrid(to_email, subject, html_content, from_email)
         return True
-    except Exception as e:
-        raise HTTPException(status_code=500, detail=str(e))
+    except smtplib.SMTPException as e:
+        log.error(f"SMTP error sending email to {to_email}: {e}", exc_info=True)
+        raise HTTPException(status_code=500, detail=f"Failed to send email via SMTP: {type(e).__name__}")
+    except Exception as e:
+        log.error(f"Unexpected error sending email to {to_email}: {e}", exc_info=True)
+        raise HTTPException(status_code=500, detail="Failed to send email")

Comment thread api/oss/src/utils/env.py
"""SMTP Email configuration"""

host: str | None = os.getenv("SMTP_HOST")
port: int = int(os.getenv("SMTP_PORT", "587"))
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Guard against ValueError if SMTP_PORT contains non-numeric input.

If SMTP_PORT is set to a non-numeric string, int(os.getenv(...)) will raise ValueError and crash the application at startup. Consider wrapping in a try-except or using a helper that validates and falls back to the default.

🛡️ Proposed fix with validation
-    port: int = int(os.getenv("SMTP_PORT", "587"))
+    port: int = (
+        int(port_str)
+        if (port_str := os.getenv("SMTP_PORT", "587")).isdigit()
+        else 587
+    )
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
port: int = int(os.getenv("SMTP_PORT", "587"))
port: int = (
int(port_str)
if (port_str := os.getenv("SMTP_PORT", "587")).isdigit()
else 587
)

const evaluationTypeBreadcrumb = useMemo(() => {
const typeMap: Record<string, {label: string; kind: string}> = {
auto: {label: "Auto Evals", kind: "auto"},
custom: {label: "SDK Evals", kind: "custom"},
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Expand evaluationType prop union to include "custom" to avoid a TS contract break.

Line 51 adds "custom" behavior, but EvalRunPreviewPageProps still excludes it. With EvalRunTestPage now forwarding "custom", this becomes a type error at the component boundary.

Suggested fix
 interface EvalRunPreviewPageProps {
     runId: string
-    evaluationType: "auto" | "human" | "online"
+    evaluationType: "auto" | "human" | "online" | "custom"
     projectId?: string | null
 }

Comment on lines +64 to +71
export const resetPassword = async (userId: string): Promise<string> => {
const base = getBaseUrl()
const url = new URL("api/profile/reset-password", base)
url.searchParams.set("user_id", userId)
const data = await fetchJson<string>(url, {
method: "POST",
})
return data
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Route this through the Fern client instead of fetchJson.

This is new frontend API code in web, but it manually builds the URL/query string and calls fetchJson directly. That bypasses the generated contract for this endpoint and the repo’s required queryParams call shape. Please add/use the matching Fern client method here instead of composing the request by hand.

As per coding guidelines, "All new frontend API code must go through the Fern-generated client, not raw axios", "Use getAgentaSdkClient({host: getAgentaApiUrl()})", and "Pass query params via {queryParams: {...}}, NOT axios's {params: {...}}."

Source: Coding guidelines

Comment on lines +686 to +687
const st = errorData.status.stacktrace
errorStacktrace = Array.isArray(st) ? st.join("\n") : st
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add type guard for stacktrace normalization to ensure string output.

The current logic assumes non-array values are strings, but if errorData.status.stacktrace is an object, number, or boolean, it will be assigned directly, violating the string | undefined type constraint.

🛡️ Suggested fix
                     const st = errorData.status.stacktrace
-                    errorStacktrace = Array.isArray(st) ? st.join("\n") : st
+                    errorStacktrace = Array.isArray(st) 
+                        ? st.join("\n") 
+                        : typeof st === "string" 
+                            ? st 
+                            : undefined

message: bodyStatus.message || "Invocation failed",
...(bodyStatus.code ? {code: bodyStatus.code.toString()} : {}),
...(bodyStatus.type ? {type: bodyStatus.type} : {}),
...(st ? {stacktrace: Array.isArray(st) ? st.join("\n") : st} : {}),
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add type guard for inline stacktrace normalization.

Same issue as line 687: if st is not an array or string (e.g., object, number), it will be assigned directly, violating the type constraint.

🛡️ Suggested fix
-                    ...(st ? {stacktrace: Array.isArray(st) ? st.join("\n") : st} : {}),
+                    ...(st ? {stacktrace: Array.isArray(st) ? st.join("\n") : typeof st === "string" ? st : undefined} : {}),
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
...(st ? {stacktrace: Array.isArray(st) ? st.join("\n") : st} : {}),
...(st ? {stacktrace: Array.isArray(st) ? st.join("\n") : typeof st === "string" ? st : undefined} : {}),

@GanJiaKouN16 GanJiaKouN16 force-pushed the feat/annotation-queue-e2e-tests branch from 1e8500c to cd01041 Compare June 6, 2026 15:43
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
web/oss/src/services/profile/index.ts (1)

64-71: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

This concern was already raised: route through the Fern client instead of fetchJson.

The previous review flagged that this new frontend API code manually builds the URL and calls fetchJson, bypassing the Fern-generated contract. Additionally, the current implementation passes user_id as a query parameter (url.searchParams.set), but the ResetUserPasswordRequest DTO in the generated client shows user_id as a body field, suggesting a potential parameter-passing mismatch.

Source: Coding guidelines

🧹 Nitpick comments (1)
api/oss/src/services/user_service.py (1)

162-164: ⚡ Quick win

Consider validating that the enabled backend has its corresponding from_address configured.

The fallback from env.smtp.from_address to env.sendgrid.from_address provides migration flexibility, but could mask configuration errors in production. For example, if SMTP is enabled but env.smtp.from_address is None, the code silently uses SendGrid's from_address with the SMTP backend, which may fail depending on server validation.

Consider logging a warning when the fallback occurs, or validating that the enabled backend has its corresponding from_address configured.

💡 Proposed validation improvement
-from_address = env.smtp.from_address or env.sendgrid.from_address
+from_address = env.smtp.from_address if env.smtp.enabled else env.sendgrid.from_address
 if not from_address:
-    raise ValueError("Email requires a sender email address to work.")
+    backend_name = "SMTP" if env.smtp.enabled else "SendGrid"
+    raise ValueError(f"Email requires a sender email address to work. Please configure {backend_name} from_address.")
+
+# Warn if using fallback sender
+if env.smtp.enabled and not env.smtp.from_address and env.sendgrid.from_address:
+    log.warning("SMTP is enabled but smtp.from_address is not set; falling back to SendGrid from_address")

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: c627e001-7494-4efc-851c-5db6dedbfa5c

📥 Commits

Reviewing files that changed from the base of the PR and between 1e8500c and cd01041.

📒 Files selected for processing (41)
  • api/ee/src/services/db_manager_ee.py
  • api/ee/src/services/organization_service.py
  • api/oss/src/services/email_service.py
  • api/oss/src/services/organization_service.py
  • api/oss/src/services/user_service.py
  • api/oss/src/utils/env.py
  • api/pyproject.toml
  • docs/docs/self-host/02-configuration.mdx
  • hosting/docker-compose/ee/env.ee.dev.example
  • hosting/docker-compose/ee/env.ee.gh.example
  • hosting/docker-compose/oss/env.oss.dev.example
  • hosting/docker-compose/oss/env.oss.gh.example
  • sdks/python/agenta/sdk/middlewares/running/vault.py
  • web/oss/src/components/EvalRunDetails/atoms/runInvocationAction.ts
  • web/oss/src/components/EvalRunDetails/components/Page.tsx
  • web/oss/src/components/EvalRunDetails/test.tsx
  • web/oss/src/components/EvaluationRunsTablePOC/actions/navigationActions.ts
  • web/oss/src/components/EvaluationRunsTablePOC/components/EvaluationRunsTable/index.tsx
  • web/oss/src/components/InfiniteVirtualTable/hooks/useTableManager.tsx
  • web/oss/src/components/SharedDrawers/TraceDrawer/components/EvaluatorDetailsPopover.tsx
  • web/oss/src/components/SharedDrawers/TraceDrawer/hooks/useEvaluatorNavigation.ts
  • web/oss/src/components/TestcasesTableNew/components/TestcasesTableShell.tsx
  • web/oss/src/components/pages/observability/components/ObservabilityTable/index.tsx
  • web/oss/src/components/pages/observability/components/SessionsTable/index.tsx
  • web/oss/src/components/pages/prompts/PromptsPage.tsx
  • web/oss/src/components/pages/settings/WorkspaceManage/cellRenderers.tsx
  • web/oss/src/lib/helpers/buildBreadcrumbs.ts
  • web/oss/src/services/evaluations/invocations/api.ts
  • web/oss/src/services/profile/index.ts
  • web/oss/tests/playwright/acceptance/annotation-queue/annotation-queue.spec.ts
  • web/oss/tests/playwright/acceptance/annotation-queue/assets/types.ts
  • web/oss/tests/playwright/acceptance/annotation-queue/index.ts
  • web/oss/tests/playwright/acceptance/annotation-queue/tests.ts
  • web/packages/agenta-annotation-ui/src/components/AnnotationSession/ScenarioListView.tsx
  • web/packages/agenta-entities/src/runnable/types.ts
  • web/packages/agenta-entity-ui/src/shared/EntityTable.tsx
  • web/packages/agenta-playground/src/executeWorkflowRevision.ts
  • web/packages/agenta-playground/src/state/execution/executionRunner.ts
  • web/packages/agenta-playground/src/state/execution/types.ts
  • web/packages/agenta-ui/src/InfiniteVirtualTable/hooks/useEntityTableState.ts
  • web/packages/agenta-ui/src/InfiniteVirtualTable/hooks/useTableManager.tsx
💤 Files with no reviewable changes (3)
  • web/oss/src/components/EvaluationRunsTablePOC/actions/navigationActions.ts
  • api/pyproject.toml
  • api/ee/src/services/db_manager_ee.py
✅ Files skipped from review due to trivial changes (6)
  • web/oss/src/components/SharedDrawers/TraceDrawer/components/EvaluatorDetailsPopover.tsx
  • web/oss/tests/playwright/acceptance/annotation-queue/annotation-queue.spec.ts
  • hosting/docker-compose/oss/env.oss.dev.example
  • hosting/docker-compose/ee/env.ee.dev.example
  • hosting/docker-compose/oss/env.oss.gh.example
  • docs/docs/self-host/02-configuration.mdx
🚧 Files skipped from review as they are similar to previous changes (27)
  • web/packages/agenta-entities/src/runnable/types.ts
  • web/packages/agenta-playground/src/state/execution/types.ts
  • web/oss/src/components/EvalRunDetails/atoms/runInvocationAction.ts
  • web/oss/src/components/pages/prompts/PromptsPage.tsx
  • web/oss/src/components/pages/observability/components/SessionsTable/index.tsx
  • web/oss/src/components/EvalRunDetails/test.tsx
  • web/packages/agenta-ui/src/InfiniteVirtualTable/hooks/useEntityTableState.ts
  • web/packages/agenta-annotation-ui/src/components/AnnotationSession/ScenarioListView.tsx
  • web/oss/src/components/EvalRunDetails/components/Page.tsx
  • web/packages/agenta-playground/src/executeWorkflowRevision.ts
  • web/oss/src/components/TestcasesTableNew/components/TestcasesTableShell.tsx
  • web/oss/src/components/pages/observability/components/ObservabilityTable/index.tsx
  • web/oss/src/components/EvaluationRunsTablePOC/components/EvaluationRunsTable/index.tsx
  • api/oss/src/services/organization_service.py
  • web/oss/src/services/evaluations/invocations/api.ts
  • web/packages/agenta-ui/src/InfiniteVirtualTable/hooks/useTableManager.tsx
  • web/oss/tests/playwright/acceptance/annotation-queue/tests.ts
  • api/ee/src/services/organization_service.py
  • api/oss/src/utils/env.py
  • api/oss/src/services/email_service.py
  • hosting/docker-compose/ee/env.ee.gh.example
  • sdks/python/agenta/sdk/middlewares/running/vault.py
  • web/oss/src/components/pages/settings/WorkspaceManage/cellRenderers.tsx
  • web/oss/src/components/InfiniteVirtualTable/hooks/useTableManager.tsx
  • web/packages/agenta-entity-ui/src/shared/EntityTable.tsx
  • web/oss/tests/playwright/acceptance/annotation-queue/index.ts
  • web/packages/agenta-playground/src/state/execution/executionRunner.ts

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size:L This PR changes 100-499 lines, ignoring generated files. tests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add e2e web tests for annotation queue

2 participants