feat(studio): API to expose test case row level data in experiments#198
feat(studio): API to expose test case row level data in experiments#198shanaiabuggy wants to merge 2 commits into
Conversation
Signed-off-by: shanaiabuggy <59746633+shanaiabuggy@users.noreply.github.com>
Documentation preview is readyPreview: https://nvidia-nemo.github.io/nemo-platform/pr-preview/pr-198/pr-198/ Built from This preview is deployed from this PR branch, updates when docs changes are pushed, and will be removed when the PR closes. |
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Enterprise Run ID: 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (2)
📝 WalkthroughWalkthroughThis PR adds a complete list-experiment-sessions endpoint across the intake API. It defines OpenAPI contracts, authorizes access, implements a ClickHouse-backed repository with aggregation and scoring, wires the endpoint handler with dependency injection, and validates end-to-end with integration tests covering pagination, filtering, and error cases. ChangesExperiment Sessions Listing
Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 5
🧹 Nitpick comments (1)
services/intake/tests/integration/spans/test_experiment_sessions.py (1)
86-133: ⚡ Quick winAdd integration coverage for
statusfiltering and the 503 path.Current tests miss two new endpoint behaviors:
statusquery filtering and deterministic 503 when telemetry storage is unavailable. Add both cases to lock the contract.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@services/intake/tests/integration/spans/test_experiment_sessions.py` around lines 86 - 133, Add two integration tests: one to cover status filtering and one to cover deterministic 503 when telemetry storage is unavailable. For status filtering, create sessions (using the same pattern in test_list_experiment_sessions_filter_by_test_case via client.post to ATIF_INGEST and EXPERIMENTS) with different "status" values, then call GET f"{EXPERIMENTS}/{experiment_name}/sessions" with params={"status":"<value>"} and assert pagination total_results and returned data match the filtered status. For the 503 path, add a test (similar setup to test_list_experiment_sessions_returns_404_for_unknown_experiment) that simulates telemetry storage being unavailable (mock or configure the telemetry/storage client used by the service to raise or be None) before calling the sessions endpoint and assert the response.status_code == 503 and appropriate error shape; reference the TestClient, EXPERIMENTS, ATIF_INGEST, and the existing test patterns for setup and assertions.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@openapi/ga/individual/platform.openapi.yaml`:
- Around line 15425-15429: The schema for the test_case_id property currently
disallows nulls but the description says it may be null; update the OpenAPI
schema for the test_case_id property (the YAML block defining test_case_id) to
include nullable: true so null values are accepted, keeping the existing title,
description, and type entries unchanged.
In `@openapi/ga/openapi.yaml`:
- Around line 15425-15429: The OpenAPI schema for the property test_case_id
declares it can be null in the description but omits the nullable marker; update
the test_case_id property in openapi.yaml to allow nulls by adding nullable:
true beneath the test_case_id definition (keeping title, description and type
intact) so clients will accept null values for test_case_id.
- Around line 15441-15443: Update the ExperimentSessionResponse schema in
openapi.yaml to mark fields that the backend can return as null as nullable:
true; specifically add nullable: true for ended_at, latency_ms, input,
input_tokens, output_tokens, cached_tokens, and cost_total_usd in the
ExperimentSessionResponse object so the OpenAPI spec accepts nulls consistent
with services/intake/src/nmp/intake/spans/experiment_session_repository.py.
In `@openapi/openapi.yaml`:
- Around line 15425-15430: The schema for the property test_case_id currently
lists only type: string but the description states it can be null; update the
test_case_id schema to allow nulls consistently with other fields by adding
nullable: true (or changing the type to a string/null union if your style
prefers) under the test_case_id entry in openapi/openapi.yaml so the spec and
description match.
In `@services/intake/src/nmp/intake/api/v2/experiments/endpoints.py`:
- Around line 432-439: Wrap the call to session_repository.list_sessions in a
try/except that catches ClickHouse-related backend exceptions and converts them
into an HTTP 503 response; specifically, around the block calling
session_repository.list_sessions capture exceptions from the ClickHouse client
(e.g., clickhouse_driver.errors.Error / connection-related exceptions) and raise
fastapi.HTTPException(status_code=503, detail="Telemetry store unavailable")
while logging the original exception for debugging. Ensure you import
fastapi.HTTPException (or use the existing FastAPI HTTPException) and only map
backend/unavailability errors to 503, re-raising other exceptions unchanged.
---
Nitpick comments:
In `@services/intake/tests/integration/spans/test_experiment_sessions.py`:
- Around line 86-133: Add two integration tests: one to cover status filtering
and one to cover deterministic 503 when telemetry storage is unavailable. For
status filtering, create sessions (using the same pattern in
test_list_experiment_sessions_filter_by_test_case via client.post to ATIF_INGEST
and EXPERIMENTS) with different "status" values, then call GET
f"{EXPERIMENTS}/{experiment_name}/sessions" with params={"status":"<value>"} and
assert pagination total_results and returned data match the filtered status. For
the 503 path, add a test (similar setup to
test_list_experiment_sessions_returns_404_for_unknown_experiment) that simulates
telemetry storage being unavailable (mock or configure the telemetry/storage
client used by the service to raise or be None) before calling the sessions
endpoint and assert the response.status_code == 503 and appropriate error shape;
reference the TestClient, EXPERIMENTS, ATIF_INGEST, and the existing test
patterns for setup and assertions.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: c63875b3-6333-457d-a1e8-fca92fe0229f
⛔ Files ignored due to path filters (12)
sdk/python/nemo-platform/.nmpcontext/openapi.yamlis excluded by!sdk/**sdk/python/nemo-platform/.nmpcontext/stainless.yamlis excluded by!sdk/**sdk/python/nemo-platform/src/nemo_platform/resources/experiments/__init__.pyis excluded by!sdk/**sdk/python/nemo-platform/src/nemo_platform/resources/experiments/api.mdis excluded by!sdk/**sdk/python/nemo-platform/src/nemo_platform/resources/experiments/experiments.pyis excluded by!sdk/**sdk/python/nemo-platform/src/nemo_platform/resources/experiments/sessions.pyis excluded by!sdk/**sdk/python/nemo-platform/src/nemo_platform/types/experiments/__init__.pyis excluded by!sdk/**sdk/python/nemo-platform/src/nemo_platform/types/experiments/experiment_session_response.pyis excluded by!sdk/**sdk/python/nemo-platform/src/nemo_platform/types/experiments/experiment_session_responses_page.pyis excluded by!sdk/**sdk/python/nemo-platform/src/nemo_platform/types/experiments/session_list_params.pyis excluded by!sdk/**sdk/python/nemo-platform/tests/api_resources/experiments/test_sessions.pyis excluded by!sdk/**sdk/stainless.yamlis excluded by!sdk/**
📒 Files selected for processing (8)
openapi/ga/individual/platform.openapi.yamlopenapi/ga/openapi.yamlopenapi/openapi.yamlservices/core/auth/src/nmp/core/auth/assets/static-authz.yamlservices/intake/src/nmp/intake/api/v2/experiments/endpoints.pyservices/intake/src/nmp/intake/api/v2/experiments/schemas.pyservices/intake/src/nmp/intake/spans/experiment_session_repository.pyservices/intake/tests/integration/spans/test_experiment_sessions.py
|
| # Sessions are the response payload (not enrichment), so we can't silently degrade like | ||
| # _hydrate_rollups does. Convert backend failures (ClickHouse connection drop, query | ||
| # timeout, etc.) to a deterministic 503 instead of letting them bubble as 500s. | ||
| logger.exception("Per-session read failed for workspace=%s experiment=%s", workspace, name) |
| # Sessions are the response payload (not enrichment), so we can't silently degrade like | ||
| # _hydrate_rollups does. Convert backend failures (ClickHouse connection drop, query | ||
| # timeout, etc.) to a deterministic 503 instead of letting them bubble as 500s. | ||
| logger.exception("Per-session read failed for workspace=%s experiment=%s", workspace, name) |
Summary by CodeRabbit
New Features
Tests