Description
Add a new GET /api/experimental/pipeline_runs/search_stats endpoint that returns aggregate pipeline run counts grouped by status. Supports the same filter_query format used by the existing GET /api/pipeline_runs/ list endpoint, so the stats cards on the homepage stay in sync with whatever filters the user has applied.
The goal is to return counts of pipeline runs broken down by status, making it easy to surface how many runs are running, succeeded, failed, etc. for a given filter. The exact semantics of what constitutes a "run status" for counting purposes are left to the implementer's discretion.
Filtering
The endpoint accepts filter_query — the same JSON predicate format already used by the list endpoint. The "me" placeholder in filter predicates is resolved server-side from the authenticated session, consistent with how the existing list endpoint handles it. No additional current_user param is needed — it is the UI's responsibility to set the filter_query appropriately.
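The server-side resolution of the "me" placeholder could look like the sketch below. The function name `resolve_me` and the traversal shape are hypothetical; the real logic would live alongside the existing list endpoint's filter handling.

```python
from typing import Any


def resolve_me(node: Any, current_user_id: str) -> Any:
    """Recursively replace the literal "me" placeholder in predicate
    "value" fields with the authenticated user's id (hypothetical helper)."""
    if isinstance(node, dict):
        return {
            key: (
                current_user_id
                if key == "value" and value == "me"
                else resolve_me(value, current_user_id)
            )
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [resolve_me(item, current_user_id) for item in node]
    return node
```

Resolving at the predicate level keeps the stats endpoint consistent with the list endpoint: the same `filter_query` string yields the same effective filter for both.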
| Param | Type | Description |
|---|---|---|
| `filter_query` | string \| null | JSON filter query (same format as the list endpoint) |
Supported filter predicates
These are the filters currently used on the homepage:
| Filter | Backend key | Predicate type | Example |
|---|---|---|---|
| Created by | `system/pipeline_run.created_by` | `value_equals` | `{"value_equals": {"key": "system/pipeline_run.created_by", "value": "me"}}` |
| Pipeline name | `system/pipeline_run.name` | `value_contains` | `{"value_contains": {"key": "system/pipeline_run.name", "value_substring": "training"}}` |
| Date range | `system/pipeline_run.date.created_at` | `time_range` | `{"time_range": {"key": "system/pipeline_run.date.created_at", "start_time": "...", "end_time": "..."}}` |
| Annotations | Custom annotation keys | `key_exists` / `value_contains` | `{"key_exists": {"key": "my_annotation"}}` |
Predicates are combined with "and" logic:

```json
{
  "and": [
    { "value_equals": { "key": "system/pipeline_run.created_by", "value": "me" } },
    { "time_range": { "key": "system/pipeline_run.date.created_at", "start_time": "2026-03-01T00:00:00Z", "end_time": "2026-03-31T23:59:59Z" } }
  ]
}
```

Examples
- No filters → stats for all runs in the system
- `created_by: "me"` → stats for my runs
- `created_by: "me"` + date range → stats for my runs in the last 7 days
- `pipeline_name` contains "training" → stats for all "training" runs
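A client call combining the "created by me" and date-range cases above could be built like this. The host, auth, and exact query-string encoding are assumptions; only the path and the `filter_query` param come from this proposal.

```python
import json
from urllib.parse import urlencode

# Combined filter: my runs, created in March 2026 (dates are illustrative).
filter_query = {
    "and": [
        {"value_equals": {"key": "system/pipeline_run.created_by", "value": "me"}},
        {"time_range": {
            "key": "system/pipeline_run.date.created_at",
            "start_time": "2026-03-01T00:00:00Z",
            "end_time": "2026-03-31T23:59:59Z",
        }},
    ]
}

# The JSON predicate is passed as a single URL-encoded query parameter.
params = urlencode({"filter_query": json.dumps(filter_query)})
url = f"/api/experimental/pipeline_runs/search_stats?{params}"
```

Passing the same serialized predicate to both the list endpoint and the stats endpoint is what keeps the homepage cards in sync with the filtered list.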
Response shape
```json
{
  "total_runs": 1250,
  "container_execution_status_stats": {
    "RUNNING": 3,
    "SUCCEEDED": 1100,
    "FAILED": 87,
    "CANCELLED": 50,
    "PENDING": 10
  },
  "cached_at": "2026-03-19T12:00:00Z"
}
```

Performance approach
Doing a full table scan on every request is not acceptable at scale. Some form of caching should be used — whether that is server-side in-memory caching, HTTP caching headers, or another mechanism is left to the implementer's discretion. The important thing is that repeated identical queries are not hitting the database on every request.
- No impact on existing list queries — this is a completely separate endpoint
- The existing `filter_query_sql.py` already builds WHERE clauses from filter predicates — reuse those same clauses for the stats query
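Reusing the existing WHERE clause in an aggregate query could look like the sketch below. The `pipeline_runs` table name, the `container_execution_status` column, and the pre-built `where_sql`/`params` pair are assumptions inferred from the response shape; the real clause would come from `filter_query_sql.py`.

```python
import sqlite3


def pipeline_run_stats(conn: sqlite3.Connection, where_sql: str, params: tuple) -> dict:
    """Aggregate run counts grouped by status in a single GROUP BY query.

    where_sql/params are a stand-in for the clause filter_query_sql.py builds.
    """
    sql = (
        "SELECT container_execution_status, COUNT(*) "
        "FROM pipeline_runs "
        f"WHERE {where_sql} "
        "GROUP BY container_execution_status"
    )
    rows = conn.execute(sql, params).fetchall()
    status_stats = {status: count for status, count in rows}
    return {
        "total_runs": sum(status_stats.values()),
        "container_execution_status_stats": status_stats,
    }
```

A single GROUP BY pass gives both the per-status counts and the total, so no second COUNT(*) query is needed.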
Relevant files
- `api_server_sql.py` — add new `stats()` method
- `api_router.py` — register route
- `filter_query_sql.py` — reuse existing WHERE clause builder for the stats query
- `filter_query_models.py` — existing filter predicate models (no changes needed)
- `backend_types_sql.py` — existing `PipelineRun` model; existing indexes support this query