Skip to content

feat: add GET /run/list endpoint (#39)#279

Open
saathviksheerla wants to merge 6 commits intoopenml:mainfrom
saathviksheerla:feat/run-list-endpoint
Open

feat: add GET /run/list endpoint (#39)#279
saathviksheerla wants to merge 6 commits intoopenml:mainfrom
saathviksheerla:feat/run-list-endpoint

Conversation

@saathviksheerla
Copy link
Copy Markdown
Contributor

Description

Adds the GET /run/list and POST /run/list endpoints for listing runs with optional filters.

Fixes: #39

Filters supported (all optional, combinable with AND logic):

  • run_id — list of run IDs
  • task_id — list of task IDs
  • flow_id — list of flow IDs
  • setup_id — list of setup IDs
  • uploader — list of user IDs
  • tag — single tag string (validated via SystemString64)

Pagination is supported via the nested pagination body parameter (consistent with other list endpoints).

Returns 404 with code 372 when no runs match the filters.

Deviation from PHP: The PHP API requires at least one filter (error 510). This endpoint follows the pattern of other list endpoints in this API and allows no-filter requests, returning all runs paginated.

Checklist

  • I have performed a self-review of my own pull request to ensure it contains all relevant information, and the proposed changes are minimal but sufficient to accomplish their task.
  • Tests pass locally
  • I have commented my code in hard-to-understand areas, and provided or updated docstrings as needed
  • I have added tests that cover the changes (only required if not already under coverage)

@codecov
Copy link
Copy Markdown

codecov bot commented Mar 17, 2026

Codecov Report

❌ Patch coverage is 25.64103% with 29 lines in your changes missing coverage. Please review.
✅ Project coverage is 53.37%. Comparing base (fde90bf) to head (dde9b4d).

Files with missing lines Patch % Lines
src/routers/openml/runs.py 26.31% 28 Missing ⚠️
src/main.py 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #279      +/-   ##
==========================================
- Coverage   54.15%   53.37%   -0.79%     
==========================================
  Files          37       37              
  Lines        1553     1587      +34     
  Branches      135      142       +7     
==========================================
+ Hits          841      847       +6     
- Misses        710      738      +28     
  Partials        2        2              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@saathviksheerla saathviksheerla marked this pull request as ready for review March 18, 2026 04:37
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Mar 18, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Adds a second registration of the OpenML runs router in create_api() (resulting in run_router being included twice). Introduces GET/POST /run/list implemented by list_runs, which accepts pagination, optional list filters (run_id, task_id, flow_id, setup_id, uploader) and an optional tag; it builds a dynamic SQL WHERE using a new _add_in_filter helper and a tag subquery, queries run joined with algorithm_setup, formats upload_time as "%Y-%m-%d %H:%M:%S", and raises NoResultsError when no rows match. Adds extensive tests for filtering, pagination, validation, and error cases.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and concisely describes the primary change: adding the GET /run/list endpoint. It is specific, directly related to the main changeset, and follows a standard PR convention.
Description check ✅ Passed The description thoroughly explains the changes, including endpoint details, supported filters, pagination, error handling, deviation from PHP API, and a completed checklist—all directly related to the changeset.
Linked Issues check ✅ Passed The PR directly addresses issue #39 by implementing the GET /run/list and POST /run/list endpoints with support for filtering and pagination as required.
Out of Scope Changes check ✅ Passed All changes in the PR are scoped to implementing the /run/list endpoints: router includes a new endpoint handler, tests cover the new functionality, and modifications to run_router registration support the feature.
Docstring Coverage ✅ Passed Docstring coverage is 96.77% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've left some high level feedback:

  • Using a request body (Body) for the GET /run/list parameters (including pagination) can be problematic for standard HTTP clients and tooling; consider using query parameters for the GET variant and keeping the body-based interface only for POST /run/list, while still sharing the same handler logic.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- Using a request body (Body) for the GET /run/list parameters (including pagination) can be problematic for standard HTTP clients and tooling; consider using query parameters for the GET variant and keeping the body-based interface only for POST /run/list, while still sharing the same handler logic.

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
src/routers/openml/runs.py (1)

107-124: Consider adding ORDER BY for deterministic pagination.

The query lacks an ORDER BY clause, which means pagination results could be inconsistent across requests (same offset may return different rows). Adding ORDER BY r.rid would ensure stable, predictable pagination.

♻️ Proposed fix
         FROM run r
         JOIN algorithm_setup a ON r.setup = a.sid
         {where_clause}
+        ORDER BY r.rid
         LIMIT :limit OFFSET :offset
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/routers/openml/runs.py` around lines 107 - 124, The SQL constructed in
the variable query in runs.py uses LIMIT/OFFSET without an ORDER BY, causing
non-deterministic pagination; update the text(...) SQL to include a
deterministic ORDER BY clause (e.g., ORDER BY r.rid) placed before the LIMIT
:limit OFFSET :offset so that queries built with where_clause and the query
variable return stable, repeatable pages for functions that rely on run_id
ordering.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@src/routers/openml/runs.py`:
- Around line 107-124: The SQL constructed in the variable query in runs.py uses
LIMIT/OFFSET without an ORDER BY, causing non-deterministic pagination; update
the text(...) SQL to include a deterministic ORDER BY clause (e.g., ORDER BY
r.rid) placed before the LIMIT :limit OFFSET :offset so that queries built with
where_clause and the query variable return stable, repeatable pages for
functions that rely on run_id ordering.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 215d38de-d419-460b-92fe-fa8de6ea3b0b

📥 Commits

Reviewing files that changed from the base of the PR and between 5357b01 and 42bc467.

📒 Files selected for processing (3)
  • src/main.py
  • src/routers/openml/runs.py
  • tests/routers/openml/runs_list_test.py

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/routers/openml/runs.py (1)

1-24: 🛠️ Refactor suggestion | 🟠 Major

Remove duplicate module docstring and consolidate imports.

The file has two module-level docstrings (lines 1 and 12) and several duplicate imports. Python only recognizes the first docstring; the second becomes a no-op string expression. The duplicates appear to be leftover from merging the new endpoint code.

Duplicated imports:

  • Annotated (lines 3, 14)
  • APIRouter, Depends (lines 5, 16)
  • AsyncConnection (lines 7, 17)
  • expdb_connection (lines 10, 21)
♻️ Proposed consolidation
 """Endpoints for OpenML Run resources."""

-from typing import Annotated, Any
+from typing import Annotated, Any

 from fastapi import APIRouter, Body, Depends
 from sqlalchemy import text
 from sqlalchemy.ext.asyncio import AsyncConnection

+import database.runs
+from core.errors import NoResultsError, RunNotFoundError, RunTraceNotFoundError
 from routers.dependencies import Pagination, expdb_connection
 from routers.types import SystemString64
-"""Endpoints for run-related data."""
-
-from typing import Annotated
-
-from fastapi import APIRouter, Depends
-from sqlalchemy.ext.asyncio import AsyncConnection
-
-import database.runs
-from core.errors import RunNotFoundError, RunTraceNotFoundError
-from routers.dependencies import expdb_connection
 from schemas.runs import RunTrace, TraceIteration
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/routers/openml/runs.py` around lines 1 - 24, Remove the duplicate
module-level docstring and consolidate the repeated imports into a single block:
keep one top docstring (either "Endpoints for OpenML Run resources." or
"Endpoints for run-related data.") and merge imports so that typing.Annotated
(and Any if used), fastapi.APIRouter and Depends,
sqlalchemy.ext.asyncio.AsyncConnection, routers.dependencies.expdb_connection,
database.runs, core.errors.RunNotFoundError/RunTraceNotFoundError, and
schemas.runs.RunTrace/TraceIteration are each imported exactly once; ensure
router = APIRouter(...) remains and remove the redundant second docstring and
duplicate import lines (e.g., the extra Annotated, APIRouter, Depends,
AsyncConnection, expdb_connection).
🧹 Nitpick comments (1)
src/routers/openml/runs.py (1)

80-80: Inconsistent default on expdb parameter.

The = None default is unnecessary for a Depends() injected parameter and inconsistent with the existing get_run_trace endpoint at line 153, which omits the default.

♻️ Proposed fix for consistency
-    expdb: Annotated[AsyncConnection, Depends(expdb_connection)] = None,
+    expdb: Annotated[AsyncConnection, Depends(expdb_connection)],
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/routers/openml/runs.py` at line 80, Remove the unnecessary "= None"
default on the Depends-injected parameter so the signature uses expdb:
Annotated[AsyncConnection, Depends(expdb_connection)] without a default; update
the function that declares the expdb parameter (the one in
src/routers/openml/runs.py where expdb is annotated with AsyncConnection and
Depends(expdb_connection)) to match the existing get_run_trace endpoint style
and ensure the dependency injection is consistent.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@src/routers/openml/runs.py`:
- Around line 1-24: Remove the duplicate module-level docstring and consolidate
the repeated imports into a single block: keep one top docstring (either
"Endpoints for OpenML Run resources." or "Endpoints for run-related data.") and
merge imports so that typing.Annotated (and Any if used), fastapi.APIRouter and
Depends, sqlalchemy.ext.asyncio.AsyncConnection,
routers.dependencies.expdb_connection, database.runs,
core.errors.RunNotFoundError/RunTraceNotFoundError, and
schemas.runs.RunTrace/TraceIteration are each imported exactly once; ensure
router = APIRouter(...) remains and remove the redundant second docstring and
duplicate import lines (e.g., the extra Annotated, APIRouter, Depends,
AsyncConnection, expdb_connection).

---

Nitpick comments:
In `@src/routers/openml/runs.py`:
- Line 80: Remove the unnecessary "= None" default on the Depends-injected
parameter so the signature uses expdb: Annotated[AsyncConnection,
Depends(expdb_connection)] without a default; update the function that declares
the expdb parameter (the one in src/routers/openml/runs.py where expdb is
annotated with AsyncConnection and Depends(expdb_connection)) to match the
existing get_run_trace endpoint style and ensure the dependency injection is
consistent.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: d762c642-0985-4912-86f0-f7a8fd70ea88

📥 Commits

Reviewing files that changed from the base of the PR and between 366f273 and 3db091f.

📒 Files selected for processing (2)
  • src/main.py
  • src/routers/openml/runs.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/main.py

@saathviksheerla saathviksheerla marked this pull request as draft March 26, 2026 06:03
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
src/routers/openml/runs.py (1)

1-13: Remove duplicate module docstring.

Line 13 contains a redundant docstring that duplicates line 1. This appears to be a leftover from code merging.

🧹 Proposed cleanup
 from routers.types import SystemString64
 
-"""Endpoints for run-related data."""
-
-
 import database.runs
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/routers/openml/runs.py` around lines 1 - 13, Remove the redundant module
docstring duplicate by keeping a single top-level docstring and deleting the
second triple-quoted string (the one reading """Endpoints for run-related
data."""), leaving only the intended module description (e.g., """Endpoints for
OpenML Run resources.""") at the top of the module; this cleanup touches the
module-level docstring in the runs.py module.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/routers/openml/runs.py`:
- Line 75: The tag parameter in the list_runs endpoint is missing the Body(...)
annotation, causing inconsistency with other filters; update the list_runs
function signature to annotate tag the same way as
run_id/task_id/flow_id/setup_id/uploader (e.g., tag: Annotated[str | None,
SystemString64] = Body(None, description=...)) so FastAPI treats it as a request
body field in the hybrid GET/POST handling—apply this change to the list_runs
parameter list where tag is declared.

---

Nitpick comments:
In `@src/routers/openml/runs.py`:
- Around line 1-13: Remove the redundant module docstring duplicate by keeping a
single top-level docstring and deleting the second triple-quoted string (the one
reading """Endpoints for run-related data."""), leaving only the intended module
description (e.g., """Endpoints for OpenML Run resources.""") at the top of the
module; this cleanup touches the module-level docstring in the runs.py module.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: dfb45d4a-e021-47bf-aa99-aca4a78783d7

📥 Commits

Reviewing files that changed from the base of the PR and between 3db091f and 15d0b13.

📒 Files selected for processing (1)
  • src/routers/openml/runs.py

@saathviksheerla saathviksheerla marked this pull request as ready for review March 26, 2026 06:34
Copy link
Copy Markdown
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 5 issues, and left some high level feedback:

  • In main.py the run_router is now included twice (once newly added and once existing); the earlier include is redundant and could cause confusion, so it should be removed.
  • The GET /run/list endpoint currently takes all filters via the request body, which is unusual for GET and may not be supported by some clients or proxies; consider moving filters to query parameters for GET and keeping body-based filters only for POST.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- In `main.py` the `run_router` is now included twice (once newly added and once existing); the earlier include is redundant and could cause confusion, so it should be removed.
- The `GET /run/list` endpoint currently takes all filters via the request body, which is unusual for GET and may not be supported by some clients or proxies; consider moving filters to query parameters for GET and keeping body-based filters only for POST.

## Individual Comments

### Comment 1
<location path="src/main.py" line_range="73" />
<code_context>
     app.include_router(task_router)
     app.include_router(flows_router)
     app.include_router(study_router)
+    app.include_router(run_router)
     app.include_router(setup_router)
     app.include_router(run_router)
</code_context>
<issue_to_address>
**issue (bug_risk):** The `run_router` is being included twice, which will cause route duplication conflicts.

It’s now included once in the new line you added and again in the existing line below, which will cause FastAPI to raise duplicate route errors for identical paths/methods. Please keep only one `app.include_router(run_router)` here.
</issue_to_address>

### Comment 2
<location path="src/routers/openml/runs.py" line_range="70-74" />
<code_context>
+        list[int] | None,
+        Body(description="Only include runs uploaded by these user id(s)."),
+    ] = None,
+    tag: Annotated[
+        str | None,
+        SystemString64,
+        Body(description="Only include runs with this tag(s)."),
+    ] = None,
+    expdb: Annotated[AsyncConnection, Depends(expdb_connection)] = None,
</code_context>
<issue_to_address>
**suggestion:** The `tag` parameter description suggests multiple tags but the type only allows a single string.

The description uses `tag(s)` but the parameter is `str | None`, not a list. If multiple tags should be supported (like the other list filters), consider `list[str] | None` with an `IN`-style filter; otherwise, make the description singular to match the current type.

```suggestion
    tag: Annotated[
        str | None,
        SystemString64,
        Body(description="Only include runs with this tag."),
    ] = None,
```
</issue_to_address>

### Comment 3
<location path="src/routers/openml/runs.py" line_range="75" />
<code_context>
+        SystemString64,
+        Body(description="Only include runs with this tag(s)."),
+    ] = None,
+    expdb: Annotated[AsyncConnection, Depends(expdb_connection)] = None,
+) -> list[dict[str, Any]]:
+    """List runs, optionally filtered by one or more criteria.
</code_context>
<issue_to_address>
**suggestion:** The `expdb` dependency defaulting to `None` is unnecessary and slightly misleading.

Given `Depends(expdb_connection)`, FastAPI will always provide an `AsyncConnection` here, so `expdb` can never be `None`. Removing the `= None` default (using just `expdb: Annotated[AsyncConnection, Depends(expdb_connection)]`) makes the type accurate and discourages unnecessary `None` checks downstream.

```suggestion
    expdb: Annotated[AsyncConnection, Depends(expdb_connection)],
```
</issue_to_address>

### Comment 4
<location path="tests/routers/openml/runs_list_test.py" line_range="113-117" />
<code_context>
+    }
+
+
+async def test_list_runs_filter_multiple_run_ids(py_api: httpx.AsyncClient) -> None:
+    """Filter by multiple run_ids returns exactly those runs."""
+    response = await py_api.post("/run/list", json={"run_id": [RUN_ID_24, RUN_ID_26]})
+    assert response.status_code == HTTPStatus.OK
+    assert {r["run_id"] for r in response.json()} == {RUN_ID_24, RUN_ID_26}
+
+
</code_context>
<issue_to_address>
**question (testing):** There are no tests for list filters being provided as empty lists, which currently behave like "no filter" due to truthiness checks.

Since list filters are only applied when the parameter is truthy, a payload like `{"run_id": []}` is currently treated the same as omitting `run_id`. Please add tests for empty-list inputs (e.g. `run_id: []`, `task_id: []`) to lock in the current behavior and catch regressions if this logic changes (for example, to interpret empty lists as “match nothing”).
</issue_to_address>

### Comment 5
<location path="tests/routers/openml/runs_list_test.py" line_range="256-260" />
<code_context>
+    )
+
+
+async def test_list_runs_pagination_limit(py_api: httpx.AsyncClient) -> None:
+    """Pagination limit=1 returns exactly 1 run."""
+    response = await py_api.post("/run/list", json={"pagination": {"limit": 1, "offset": 0}})
+    assert response.status_code == HTTPStatus.OK
+    assert len(response.json()) == 1
+
+
</code_context>
<issue_to_address>
**suggestion (testing):** Pagination validation edge cases (e.g. negative limit/offset or zero limit) are not covered.

Please add tests for invalid pagination inputs that the `Pagination` schema should reject (e.g., negative `limit`/`offset`, or `limit=0` if that's not allowed), and assert that the API returns 422 (or the correct error status) for those cases.

Suggested implementation:

```python
async def test_list_runs_pagination_limit(py_api: httpx.AsyncClient) -> None:
    """Pagination limit=1 returns exactly 1 run."""
    response = await py_api.post("/run/list", json={"pagination": {"limit": 1, "offset": 0}})
    assert response.status_code == HTTPStatus.OK
    assert len(response.json()) == 1


@pytest.mark.parametrize(
    "pagination",
    [
        {"limit": -1, "offset": 0},   # negative limit
        {"limit": 1, "offset": -5},   # negative offset
        {"limit": 0, "offset": 0},    # zero limit, if not allowed by schema
    ],
)
async def test_list_runs_pagination_invalid_inputs(
    py_api: httpx.AsyncClient,
    pagination: dict,
) -> None:
    """Invalid pagination values should be rejected by the API."""
    response = await py_api.post("/run/list", json={"pagination": pagination})
    assert response.status_code == HTTPStatus.UNPROCESSABLE_ENTITY


async def test_list_runs_combined_run_id_and_mismatched_task_id(py_api: httpx.AsyncClient) -> None:

```

If `pytest` and `HTTPStatus` are not already imported in this file, you should ensure the following imports exist near the top of the file:

- `import pytest`
- `from http import HTTPStatus`

No other behavioral changes are required; the new test parametrizes invalid pagination payloads and asserts the API returns `422 Unprocessable Entity`.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

app.include_router(task_router)
app.include_router(flows_router)
app.include_router(study_router)
app.include_router(run_router)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (bug_risk): The run_router is being included twice, which will cause route duplication conflicts.

It’s now included once in the new line you added and again in the existing line below, which will cause FastAPI to raise duplicate route errors for identical paths/methods. Please keep only one app.include_router(run_router) here.

Comment on lines +70 to +74
tag: Annotated[
str | None,
SystemString64,
Body(description="Only include runs with this tag(s)."),
] = None,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: The tag parameter description suggests multiple tags but the type only allows a single string.

The description uses tag(s) but the parameter is str | None, not a list. If multiple tags should be supported (like the other list filters), consider list[str] | None with an IN-style filter; otherwise, make the description singular to match the current type.

Suggested change
tag: Annotated[
str | None,
SystemString64,
Body(description="Only include runs with this tag(s)."),
] = None,
tag: Annotated[
str | None,
SystemString64,
Body(description="Only include runs with this tag."),
] = None,

SystemString64,
Body(description="Only include runs with this tag(s)."),
] = None,
expdb: Annotated[AsyncConnection, Depends(expdb_connection)] = None,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: The expdb dependency defaulting to None is unnecessary and slightly misleading.

Given Depends(expdb_connection), FastAPI will always provide an AsyncConnection here, so expdb can never be None. Removing the = None default (using just expdb: Annotated[AsyncConnection, Depends(expdb_connection)]) makes the type accurate and discourages unnecessary None checks downstream.

Suggested change
expdb: Annotated[AsyncConnection, Depends(expdb_connection)] = None,
expdb: Annotated[AsyncConnection, Depends(expdb_connection)],

Comment on lines +113 to +117
async def test_list_runs_filter_multiple_run_ids(py_api: httpx.AsyncClient) -> None:
"""Filter by multiple run_ids returns exactly those runs."""
response = await py_api.post("/run/list", json={"run_id": [RUN_ID_24, RUN_ID_26]})
assert response.status_code == HTTPStatus.OK
assert {r["run_id"] for r in response.json()} == {RUN_ID_24, RUN_ID_26}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question (testing): There are no tests for list filters being provided as empty lists, which currently behave like "no filter" due to truthiness checks.

Since list filters are only applied when the parameter is truthy, a payload like {"run_id": []} is currently treated the same as omitting run_id. Please add tests for empty-list inputs (e.g. run_id: [], task_id: []) to lock in the current behavior and catch regressions if this logic changes (for example, to interpret empty lists as “match nothing”).

Comment on lines +256 to +260
async def test_list_runs_pagination_limit(py_api: httpx.AsyncClient) -> None:
"""Pagination limit=1 returns exactly 1 run."""
response = await py_api.post("/run/list", json={"pagination": {"limit": 1, "offset": 0}})
assert response.status_code == HTTPStatus.OK
assert len(response.json()) == 1
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (testing): Pagination validation edge cases (e.g. negative limit/offset or zero limit) are not covered.

Please add tests for invalid pagination inputs that the Pagination schema should reject (e.g., negative limit/offset, or limit=0 if that's not allowed), and assert that the API returns 422 (or the correct error status) for those cases.

Suggested implementation:

async def test_list_runs_pagination_limit(py_api: httpx.AsyncClient) -> None:
    """Pagination limit=1 returns exactly 1 run."""
    response = await py_api.post("/run/list", json={"pagination": {"limit": 1, "offset": 0}})
    assert response.status_code == HTTPStatus.OK
    assert len(response.json()) == 1


@pytest.mark.parametrize(
    "pagination",
    [
        {"limit": -1, "offset": 0},   # negative limit
        {"limit": 1, "offset": -5},   # negative offset
        {"limit": 0, "offset": 0},    # zero limit, if not allowed by schema
    ],
)
async def test_list_runs_pagination_invalid_inputs(
    py_api: httpx.AsyncClient,
    pagination: dict,
) -> None:
    """Invalid pagination values should be rejected by the API."""
    response = await py_api.post("/run/list", json={"pagination": pagination})
    assert response.status_code == HTTPStatus.UNPROCESSABLE_ENTITY


async def test_list_runs_combined_run_id_and_mismatched_task_id(py_api: httpx.AsyncClient) -> None:

If pytest and HTTPStatus are not already imported in this file, you should ensure the following imports exist near the top of the file:

  • import pytest
  • from http import HTTPStatus

No other behavioral changes are required; the new test parametrizes invalid pagination payloads and asserts the API returns 422 Unprocessable Entity.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/routers/openml/runs.py`:
- Around line 47-69: The current filter handling treats empty lists like None so
inputs such as run_id=[] broaden the search; update the checks in the run
filtering logic (where run_id, task_id, flow_id, setup_id, uploader are
evaluated) to explicitly distinguish None from an empty list: if a filter is
None skip it, if it is an empty list either return an empty result
(short-circuit) or raise a 400 validation error (choose the project's
convention), and apply this same pattern to task_id, flow_id, setup_id and
uploader so empty arrays do not fall through to the no-filter path.
- Line 46: Clamp Pagination fields before they are used in SQL: validate and
sanitize the Pagination instance (the pagination parameter and other usages at
the other spots where Pagination is accepted) by forcing offset = max(0, offset)
and limit = min(MAX_PAGE_SIZE, max(1, limit)) (pick a sensible MAX_PAGE_SIZE
constant), then pass the sanitized values into the SQL/LIMIT/OFFSET binding;
update the code paths that accept Annotated[Pagination, Body(...)] (the
parameter named pagination at the shown diff and the other occurrences around
lines 90 and 128) to perform this clamping immediately after receiving the body
and before any query construction/execution.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 131ee599-ef4e-4b6a-9f0a-f0741b4b6756

📥 Commits

Reviewing files that changed from the base of the PR and between 15d0b13 and ec8b17e.

📒 Files selected for processing (1)
  • src/routers/openml/runs.py

Comment on lines +47 to +69
run_id: Annotated[
list[int] | None,
Body(
description="The run(s) to include in the search. "
"If none are specified, all runs are included.",
),
] = None,
task_id: Annotated[
list[int] | None,
Body(description="Only include runs for these task id(s)."),
] = None,
flow_id: Annotated[
list[int] | None,
Body(description="Only include runs using these flow id(s)."),
] = None,
setup_id: Annotated[
list[int] | None,
Body(description="Only include runs with these setup id(s)."),
] = None,
uploader: Annotated[
list[int] | None,
Body(description="Only include runs uploaded by these user id(s)."),
] = None,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Explicit empty filters currently broaden the search.

The truthiness checks on Lines 95-104 treat [] the same as None, so a request like { "run_id": [] } falls through to the no-filter path and can return unrelated runs. That contradicts the documented “all provided filters” behavior; handle is not None separately and reject or short-circuit empty lists.

🔧 One possible guard
-    if run_id:
+    if run_id is not None:
+        if not run_id:
+            raise NoResultsError("No runs match the search criteria.")
         _add_in_filter(filters, params, "r.rid", "run_id", run_id)

Apply the same pattern to task_id, flow_id, setup_id, and uploader.

Also applies to: 95-104

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/routers/openml/runs.py` around lines 47 - 69, The current filter handling
treats empty lists like None so inputs such as run_id=[] broaden the search;
update the checks in the run filtering logic (where run_id, task_id, flow_id,
setup_id, uploader are evaluated) to explicitly distinguish None from an empty
list: if a filter is None skip it, if it is an empty list either return an empty
result (short-circuit) or raise a 400 validation error (choose the project's
convention), and apply this same pattern to task_id, flow_id, setup_id and
uploader so empty arrays do not fall through to the no-filter path.

@saathviksheerla saathviksheerla marked this pull request as draft March 26, 2026 07:19
@saathviksheerla saathviksheerla marked this pull request as ready for review March 26, 2026 07:56
Copy link
Copy Markdown
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 1 issue, and left some high level feedback:

  • Using Body for filters and pagination on the GET /run/list route makes it hard to use with common HTTP clients and tooling; consider moving these parameters to query params for GET (keeping the body shape for POST) to make the API more conventional and ergonomic.
  • The hard-coded run IDs, setup IDs, and especially the exact upload_time value in runs_list_test.py make the tests tightly coupled to the current DB snapshot; consider setting up fixture data or loosening assertions around stable but non-essential fields like timestamps to avoid brittleness.
  • The pagination clamp limit = max(1, min(pagination.limit, 1000)) embeds a magic constant and logic locally; consider extracting a shared max-page-size constant (or helper) so pagination behavior stays consistent and centrally adjustable across list endpoints.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- Using `Body` for filters and pagination on the `GET /run/list` route makes it hard to use with common HTTP clients and tooling; consider moving these parameters to query params for GET (keeping the body shape for POST) to make the API more conventional and ergonomic.
- The hard-coded run IDs, setup IDs, and especially the exact `upload_time` value in `runs_list_test.py` make the tests tightly coupled to the current DB snapshot; consider setting up fixture data or loosening assertions around stable but non-essential fields like timestamps to avoid brittleness.
- The pagination clamp `limit = max(1, min(pagination.limit, 1000))` embeds a magic constant and logic locally; consider extracting a shared max-page-size constant (or helper) so pagination behavior stays consistent and centrally adjustable across list endpoints.

## Individual Comments

### Comment 1
<location path="tests/routers/openml/runs_list_test.py" line_range="68-74" />
<code_context>
+    assert response.json()["code"] == "372"
+
+
+async def test_get_and_post_list_runs_return_same_results(py_api: httpx.AsyncClient) -> None:
+    """GET and POST /run/list with no filters must return identical results."""
+    get_resp = await py_api.get("/run/list")
+    post_resp = await py_api.post("/run/list", json={})
+    assert get_resp.status_code == HTTPStatus.OK
+    assert post_resp.status_code == HTTPStatus.OK
+    assert get_resp.json() == post_resp.json()
+
+
</code_context>
<issue_to_address>
**suggestion (testing):** Add a test that GET /run/list with filters in the request body behaves the same as POST /run/list.

This currently only checks GET vs POST equivalence with no filters. Since the endpoint defines filters and pagination in the request body, please add a test (e.g. `test_get_and_post_list_runs_with_filters_return_same_results`) that sends a JSON body including some filters (like `run_id` or `task_id` plus pagination) in both GET and POST requests and asserts identical responses. This helps catch regressions where GET might ignore or mis-handle body filters.

```suggestion
async def test_get_and_post_list_runs_return_same_results(py_api: httpx.AsyncClient) -> None:
    """GET and POST /run/list with no filters must return identical results."""
    get_resp = await py_api.get("/run/list")
    post_resp = await py_api.post("/run/list", json={})
    assert get_resp.status_code == HTTPStatus.OK
    assert post_resp.status_code == HTTPStatus.OK
    assert get_resp.json() == post_resp.json()


async def test_get_and_post_list_runs_with_filters_return_same_results(py_api: httpx.AsyncClient) -> None:
    """GET and POST /run/list with filters in the request body must return identical results."""
    payload = {
        "run_id": [24, 25],
        "limit": 10,
        "offset": 0,
    }

    get_resp = await py_api.get("/run/list", json=payload)
    post_resp = await py_api.post("/run/list", json=payload)

    assert get_resp.status_code == HTTPStatus.OK
    assert post_resp.status_code == HTTPStatus.OK
    assert get_resp.json() == post_resp.json()
```
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment on lines +68 to +74
async def test_get_and_post_list_runs_return_same_results(py_api: httpx.AsyncClient) -> None:
"""GET and POST /run/list with no filters must return identical results."""
get_resp = await py_api.get("/run/list")
post_resp = await py_api.post("/run/list", json={})
assert get_resp.status_code == HTTPStatus.OK
assert post_resp.status_code == HTTPStatus.OK
assert get_resp.json() == post_resp.json()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (testing): Add a test that GET /run/list with filters in the request body behaves the same as POST /run/list.

This currently only checks GET vs POST equivalence with no filters. Since the endpoint defines filters and pagination in the request body, please add a test (e.g. test_get_and_post_list_runs_with_filters_return_same_results) that sends a JSON body including some filters (like run_id or task_id plus pagination) in both GET and POST requests and asserts identical responses. This helps catch regressions where GET might ignore or mis-handle body filters.

Suggested change
async def test_get_and_post_list_runs_return_same_results(py_api: httpx.AsyncClient) -> None:
"""GET and POST /run/list with no filters must return identical results."""
get_resp = await py_api.get("/run/list")
post_resp = await py_api.post("/run/list", json={})
assert get_resp.status_code == HTTPStatus.OK
assert post_resp.status_code == HTTPStatus.OK
assert get_resp.json() == post_resp.json()
async def test_get_and_post_list_runs_return_same_results(py_api: httpx.AsyncClient) -> None:
"""GET and POST /run/list with no filters must return identical results."""
get_resp = await py_api.get("/run/list")
post_resp = await py_api.post("/run/list", json={})
assert get_resp.status_code == HTTPStatus.OK
assert post_resp.status_code == HTTPStatus.OK
assert get_resp.json() == post_resp.json()
async def test_get_and_post_list_runs_with_filters_return_same_results(py_api: httpx.AsyncClient) -> None:
"""GET and POST /run/list with filters in the request body must return identical results."""
payload = {
"run_id": [24, 25],
"limit": 10,
"offset": 0,
}
get_resp = await py_api.get("/run/list", json=payload)
post_resp = await py_api.post("/run/list", json=payload)
assert get_resp.status_code == HTTPStatus.OK
assert post_resp.status_code == HTTPStatus.OK
assert get_resp.json() == post_resp.json()

Copy link
Copy Markdown
Contributor

@PGijsbers PGijsbers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Holding off on reviewing this. Let's focus on #277, and then maybe see how/if we should extract common functionality for these list functions.

The wontfix label is just to mean that the PR will not be processed for now. We do still want the feature.

@PGijsbers PGijsbers added the wontfix This will not be worked on label Mar 26, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

wontfix This will not be worked on

Projects

None yet

Development

Successfully merging this pull request may close these issues.

GET /run/list/{filters}

2 participants