feat: add GET /run/list endpoint (#39) by saathviksheerla · Pull Request #279 · openml/server-api

saathviksheerla · 2026-03-17T16:37:33Z

Description

Adds the GET /run/list and POST /run/list endpoints for listing runs with optional filters.

Fixes: #39

Filters supported (all optional, combinable with AND logic):

run_id — list of run IDs
task_id — list of task IDs
flow_id — list of flow IDs
setup_id — list of setup IDs
uploader — list of user IDs
tag — single tag string (validated via SystemString64)

Pagination is supported via the nested pagination body parameter (consistent with other list endpoints).

Returns 404 with code 372 when no runs match the filters.

Deviation from PHP: The PHP API requires at least one filter (error 510). This endpoint follows the pattern of other list endpoints in this API and allows no-filter requests, returning all runs paginated.

Checklist

I have performed a self-review of my own pull request to ensure it contains all relevant information, and the proposed changes are minimal but sufficient to accomplish their task.
Tests pass locally
I have commented my code in hard-to-understand areas, and provided or updated docstrings as needed
I have added tests that cover the changes (only required if not already under coverage)

codecov · 2026-03-17T16:39:04Z

Codecov Report

❌ Patch coverage is 25.64103% with 29 lines in your changes missing coverage. Please review.
✅ Project coverage is 53.37%. Comparing base (fde90bf) to head (dde9b4d).

Files with missing lines	Patch %	Lines
src/routers/openml/runs.py	26.31%	28 Missing ⚠️
src/main.py	0.00%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #279      +/-   ##
==========================================
- Coverage   54.15%   53.37%   -0.79%     
==========================================
  Files          37       37              
  Lines        1553     1587      +34     
  Branches      135      142       +7     
==========================================
+ Hits          841      847       +6     
- Misses        710      738      +28     
  Partials        2        2

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

coderabbitai · 2026-03-18T04:38:11Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

Walkthrough

Adds a second registration of the OpenML runs router in create_api() (resulting in run_router being included twice). Introduces GET/POST /run/list implemented by list_runs, which accepts pagination, optional list filters (run_id, task_id, flow_id, setup_id, uploader) and an optional tag; it builds a dynamic SQL WHERE using a new _add_in_filter helper and a tag subquery, queries run joined with algorithm_setup, formats upload_time as "%Y-%m-%d %H:%M:%S", and raises NoResultsError when no rows match. Adds extensive tests for filtering, pagination, validation, and error cases.

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly and concisely describes the primary change: adding the GET /run/list endpoint. It is specific, directly related to the main changeset, and follows a standard PR convention.
Description check	✅ Passed	The description thoroughly explains the changes, including endpoint details, supported filters, pagination, error handling, deviation from PHP API, and a completed checklist—all directly related to the changeset.
Linked Issues check	✅ Passed	The PR directly addresses issue `#39` by implementing the GET /run/list and POST /run/list endpoints with support for filtering and pagination as required.
Out of Scope Changes check	✅ Passed	All changes in the PR are scoped to implementing the /run/list endpoints: router includes a new endpoint handler, tests cover the new functionality, and modifications to run_router registration support the feature.
Docstring Coverage	✅ Passed	Docstring coverage is 96.77% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

sourcery-ai

Hey - I've left some high level feedback:

Using a request body (Body) for the GET /run/list parameters (including pagination) can be problematic for standard HTTP clients and tooling; consider using query parameters for the GET variant and keeping the body-based interface only for POST /run/list, while still sharing the same handler logic.

Prompt for AI Agents

Please address the comments from this code review:

## Overall Comments
- Using a request body (Body) for the GET /run/list parameters (including pagination) can be problematic for standard HTTP clients and tooling; consider using query parameters for the GET variant and keeping the body-based interface only for POST /run/list, while still sharing the same handler logic.

Sourcery is free for open source - if you like our reviews please consider sharing them ✨

_{Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.}

coderabbitai

🧹 Nitpick comments (1)

src/routers/openml/runs.py (1)
107-124: Consider adding ORDER BY for deterministic pagination.

The query lacks an ORDER BY clause, which means pagination results could be inconsistent across requests (same offset may return different rows). Adding ORDER BY r.rid would ensure stable, predictable pagination.
♻️ Proposed fix
         FROM run r
         JOIN algorithm_setup a ON r.setup = a.sid
         {where_clause}
+        ORDER BY r.rid
         LIMIT :limit OFFSET :offset
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/routers/openml/runs.py` around lines 107 - 124, The SQL constructed in
the variable query in runs.py uses LIMIT/OFFSET without an ORDER BY, causing
non-deterministic pagination; update the text(...) SQL to include a
deterministic ORDER BY clause (e.g., ORDER BY r.rid) placed before the LIMIT
:limit OFFSET :offset so that queries built with where_clause and the query
variable return stable, repeatable pages for functions that rely on run_id
ordering.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@src/routers/openml/runs.py`:
- Around line 107-124: The SQL constructed in the variable query in runs.py uses
LIMIT/OFFSET without an ORDER BY, causing non-deterministic pagination; update
the text(...) SQL to include a deterministic ORDER BY clause (e.g., ORDER BY
r.rid) placed before the LIMIT :limit OFFSET :offset so that queries built with
where_clause and the query variable return stable, repeatable pages for
functions that rely on run_id ordering.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 215d38de-d419-460b-92fe-fa8de6ea3b0b

📥 Commits

Reviewing files that changed from the base of the PR and between 5357b01 and 42bc467.

📒 Files selected for processing (3)

src/main.py
src/routers/openml/runs.py
tests/routers/openml/runs_list_test.py

for more information, see https://pre-commit.ci

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/routers/openml/runs.py (1)

1-24: 🛠️ Refactor suggestion | 🟠 Major

Remove duplicate module docstring and consolidate imports.

The file has two module-level docstrings (lines 1 and 12) and several duplicate imports. Python only recognizes the first docstring; the second becomes a no-op string expression. The duplicates appear to be leftover from merging the new endpoint code.

Duplicated imports:

Annotated (lines 3, 14)
APIRouter, Depends (lines 5, 16)
AsyncConnection (lines 7, 17)
expdb_connection (lines 10, 21)

♻️ Proposed consolidation

 """Endpoints for OpenML Run resources."""

-from typing import Annotated, Any
+from typing import Annotated, Any

 from fastapi import APIRouter, Body, Depends
 from sqlalchemy import text
 from sqlalchemy.ext.asyncio import AsyncConnection

+import database.runs
+from core.errors import NoResultsError, RunNotFoundError, RunTraceNotFoundError
 from routers.dependencies import Pagination, expdb_connection
 from routers.types import SystemString64
-"""Endpoints for run-related data."""
-
-from typing import Annotated
-
-from fastapi import APIRouter, Depends
-from sqlalchemy.ext.asyncio import AsyncConnection
-
-import database.runs
-from core.errors import RunNotFoundError, RunTraceNotFoundError
-from routers.dependencies import expdb_connection
 from schemas.runs import RunTrace, TraceIteration

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/routers/openml/runs.py` around lines 1 - 24, Remove the duplicate
module-level docstring and consolidate the repeated imports into a single block:
keep one top docstring (either "Endpoints for OpenML Run resources." or
"Endpoints for run-related data.") and merge imports so that typing.Annotated
(and Any if used), fastapi.APIRouter and Depends,
sqlalchemy.ext.asyncio.AsyncConnection, routers.dependencies.expdb_connection,
database.runs, core.errors.RunNotFoundError/RunTraceNotFoundError, and
schemas.runs.RunTrace/TraceIteration are each imported exactly once; ensure
router = APIRouter(...) remains and remove the redundant second docstring and
duplicate import lines (e.g., the extra Annotated, APIRouter, Depends,
AsyncConnection, expdb_connection).

🧹 Nitpick comments (1)

src/routers/openml/runs.py (1)

80-80: Inconsistent default on expdb parameter.

The = None default is unnecessary for a Depends() injected parameter and inconsistent with the existing get_run_trace endpoint at line 153, which omits the default.

♻️ Proposed fix for consistency

-    expdb: Annotated[AsyncConnection, Depends(expdb_connection)] = None,
+    expdb: Annotated[AsyncConnection, Depends(expdb_connection)],

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/routers/openml/runs.py` at line 80, Remove the unnecessary "= None"
default on the Depends-injected parameter so the signature uses expdb:
Annotated[AsyncConnection, Depends(expdb_connection)] without a default; update
the function that declares the expdb parameter (the one in
src/routers/openml/runs.py where expdb is annotated with AsyncConnection and
Depends(expdb_connection)) to match the existing get_run_trace endpoint style
and ensure the dependency injection is consistent.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@src/routers/openml/runs.py`:
- Around line 1-24: Remove the duplicate module-level docstring and consolidate
the repeated imports into a single block: keep one top docstring (either
"Endpoints for OpenML Run resources." or "Endpoints for run-related data.") and
merge imports so that typing.Annotated (and Any if used), fastapi.APIRouter and
Depends, sqlalchemy.ext.asyncio.AsyncConnection,
routers.dependencies.expdb_connection, database.runs,
core.errors.RunNotFoundError/RunTraceNotFoundError, and
schemas.runs.RunTrace/TraceIteration are each imported exactly once; ensure
router = APIRouter(...) remains and remove the redundant second docstring and
duplicate import lines (e.g., the extra Annotated, APIRouter, Depends,
AsyncConnection, expdb_connection).

---

Nitpick comments:
In `@src/routers/openml/runs.py`:
- Line 80: Remove the unnecessary "= None" default on the Depends-injected
parameter so the signature uses expdb: Annotated[AsyncConnection,
Depends(expdb_connection)] without a default; update the function that declares
the expdb parameter (the one in src/routers/openml/runs.py where expdb is
annotated with AsyncConnection and Depends(expdb_connection)) to match the
existing get_run_trace endpoint style and ensure the dependency injection is
consistent.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: d762c642-0985-4912-86f0-f7a8fd70ea88

📥 Commits

Reviewing files that changed from the base of the PR and between 366f273 and 3db091f.

📒 Files selected for processing (2)

src/main.py
src/routers/openml/runs.py

🚧 Files skipped from review as they are similar to previous changes (1)

src/main.py

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

src/routers/openml/runs.py (1)

1-13: Remove duplicate module docstring.

Line 13 contains a redundant docstring that duplicates line 1. This appears to be a leftover from code merging.

🧹 Proposed cleanup

 from routers.types import SystemString64
 
-"""Endpoints for run-related data."""
-
-
 import database.runs

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/routers/openml/runs.py` around lines 1 - 13, Remove the redundant module
docstring duplicate by keeping a single top-level docstring and deleting the
second triple-quoted string (the one reading """Endpoints for run-related
data."""), leaving only the intended module description (e.g., """Endpoints for
OpenML Run resources.""") at the top of the module; this cleanup touches the
module-level docstring in the runs.py module.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/routers/openml/runs.py`:
- Line 75: The tag parameter in the list_runs endpoint is missing the Body(...)
annotation, causing inconsistency with other filters; update the list_runs
function signature to annotate tag the same way as
run_id/task_id/flow_id/setup_id/uploader (e.g., tag: Annotated[str | None,
SystemString64] = Body(None, description=...)) so FastAPI treats it as a request
body field in the hybrid GET/POST handling—apply this change to the list_runs
parameter list where tag is declared.

---

Nitpick comments:
In `@src/routers/openml/runs.py`:
- Around line 1-13: Remove the redundant module docstring duplicate by keeping a
single top-level docstring and deleting the second triple-quoted string (the one
reading """Endpoints for run-related data."""), leaving only the intended module
description (e.g., """Endpoints for OpenML Run resources.""") at the top of the
module; this cleanup touches the module-level docstring in the runs.py module.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: dfb45d4a-e021-47bf-aa99-aca4a78783d7

📥 Commits

Reviewing files that changed from the base of the PR and between 3db091f and 15d0b13.

📒 Files selected for processing (1)

src/routers/openml/runs.py

src/routers/openml/runs.py

sourcery-ai

Hey - I've found 5 issues, and left some high level feedback:

In main.py the run_router is now included twice (once newly added and once existing); the earlier include is redundant and could cause confusion, so it should be removed.
The GET /run/list endpoint currently takes all filters via the request body, which is unusual for GET and may not be supported by some clients or proxies; consider moving filters to query parameters for GET and keeping body-based filters only for POST.

Prompt for AI Agents

Please address the comments from this code review:

## Overall Comments
- In `main.py` the `run_router` is now included twice (once newly added and once existing); the earlier include is redundant and could cause confusion, so it should be removed.
- The `GET /run/list` endpoint currently takes all filters via the request body, which is unusual for GET and may not be supported by some clients or proxies; consider moving filters to query parameters for GET and keeping body-based filters only for POST.

## Individual Comments

### Comment 1
<location path="src/main.py" line_range="73" />
<code_context>
     app.include_router(task_router)
     app.include_router(flows_router)
     app.include_router(study_router)
+    app.include_router(run_router)
     app.include_router(setup_router)
     app.include_router(run_router)
</code_context>
<issue_to_address>
**issue (bug_risk):** The `run_router` is being included twice, which will cause route duplication conflicts.

It’s now included once in the new line you added and again in the existing line below, which will cause FastAPI to raise duplicate route errors for identical paths/methods. Please keep only one `app.include_router(run_router)` here.
</issue_to_address>

### Comment 2
<location path="src/routers/openml/runs.py" line_range="70-74" />
<code_context>
+        list[int] | None,
+        Body(description="Only include runs uploaded by these user id(s)."),
+    ] = None,
+    tag: Annotated[
+        str | None,
+        SystemString64,
+        Body(description="Only include runs with this tag(s)."),
+    ] = None,
+    expdb: Annotated[AsyncConnection, Depends(expdb_connection)] = None,
</code_context>
<issue_to_address>
**suggestion:** The `tag` parameter description suggests multiple tags but the type only allows a single string.

The description uses `tag(s)` but the parameter is `str | None`, not a list. If multiple tags should be supported (like the other list filters), consider `list[str] | None` with an `IN`-style filter; otherwise, make the description singular to match the current type.

```suggestion
    tag: Annotated[
        str | None,
        SystemString64,
        Body(description="Only include runs with this tag."),
    ] = None,
```
</issue_to_address>

### Comment 3
<location path="src/routers/openml/runs.py" line_range="75" />
<code_context>
+        SystemString64,
+        Body(description="Only include runs with this tag(s)."),
+    ] = None,
+    expdb: Annotated[AsyncConnection, Depends(expdb_connection)] = None,
+) -> list[dict[str, Any]]:
+    """List runs, optionally filtered by one or more criteria.
</code_context>
<issue_to_address>
**suggestion:** The `expdb` dependency defaulting to `None` is unnecessary and slightly misleading.

Given `Depends(expdb_connection)`, FastAPI will always provide an `AsyncConnection` here, so `expdb` can never be `None`. Removing the `= None` default (using just `expdb: Annotated[AsyncConnection, Depends(expdb_connection)]`) makes the type accurate and discourages unnecessary `None` checks downstream.

```suggestion
    expdb: Annotated[AsyncConnection, Depends(expdb_connection)],
```
</issue_to_address>

### Comment 4
<location path="tests/routers/openml/runs_list_test.py" line_range="113-117" />
<code_context>
+    }
+
+
+async def test_list_runs_filter_multiple_run_ids(py_api: httpx.AsyncClient) -> None:
+    """Filter by multiple run_ids returns exactly those runs."""
+    response = await py_api.post("/run/list", json={"run_id": [RUN_ID_24, RUN_ID_26]})
+    assert response.status_code == HTTPStatus.OK
+    assert {r["run_id"] for r in response.json()} == {RUN_ID_24, RUN_ID_26}
+
+
</code_context>
<issue_to_address>
**question (testing):** There are no tests for list filters being provided as empty lists, which currently behave like "no filter" due to truthiness checks.

Since list filters are only applied when the parameter is truthy, a payload like `{"run_id": []}` is currently treated the same as omitting `run_id`. Please add tests for empty-list inputs (e.g. `run_id: []`, `task_id: []`) to lock in the current behavior and catch regressions if this logic changes (for example, to interpret empty lists as “match nothing”).
</issue_to_address>

### Comment 5
<location path="tests/routers/openml/runs_list_test.py" line_range="256-260" />
<code_context>
+    )
+
+
+async def test_list_runs_pagination_limit(py_api: httpx.AsyncClient) -> None:
+    """Pagination limit=1 returns exactly 1 run."""
+    response = await py_api.post("/run/list", json={"pagination": {"limit": 1, "offset": 0}})
+    assert response.status_code == HTTPStatus.OK
+    assert len(response.json()) == 1
+
+
</code_context>
<issue_to_address>
**suggestion (testing):** Pagination validation edge cases (e.g. negative limit/offset or zero limit) are not covered.

Please add tests for invalid pagination inputs that the `Pagination` schema should reject (e.g., negative `limit`/`offset`, or `limit=0` if that's not allowed), and assert that the API returns 422 (or the correct error status) for those cases.

Suggested implementation:

```python
async def test_list_runs_pagination_limit(py_api: httpx.AsyncClient) -> None:
    """Pagination limit=1 returns exactly 1 run."""
    response = await py_api.post("/run/list", json={"pagination": {"limit": 1, "offset": 0}})
    assert response.status_code == HTTPStatus.OK
    assert len(response.json()) == 1


@pytest.mark.parametrize(
    "pagination",
    [
        {"limit": -1, "offset": 0},   # negative limit
        {"limit": 1, "offset": -5},   # negative offset
        {"limit": 0, "offset": 0},    # zero limit, if not allowed by schema
    ],
)
async def test_list_runs_pagination_invalid_inputs(
    py_api: httpx.AsyncClient,
    pagination: dict,
) -> None:
    """Invalid pagination values should be rejected by the API."""
    response = await py_api.post("/run/list", json={"pagination": pagination})
    assert response.status_code == HTTPStatus.UNPROCESSABLE_ENTITY


async def test_list_runs_combined_run_id_and_mismatched_task_id(py_api: httpx.AsyncClient) -> None:

```

If `pytest` and `HTTPStatus` are not already imported in this file, you should ensure the following imports exist near the top of the file:

- `import pytest`
- `from http import HTTPStatus`

No other behavioral changes are required; the new test parametrizes invalid pagination payloads and asserts the API returns `422 Unprocessable Entity`.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨

_{Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.}

sourcery-ai · 2026-03-26T06:35:30Z

src/main.py

    app.include_router(task_router)
    app.include_router(flows_router)
    app.include_router(study_router)
+    app.include_router(run_router)


issue (bug_risk): The run_router is being included twice, which will cause route duplication conflicts.

It’s now included once in the new line you added and again in the existing line below, which will cause FastAPI to raise duplicate route errors for identical paths/methods. Please keep only one app.include_router(run_router) here.

sourcery-ai · 2026-03-26T06:35:30Z

src/routers/openml/runs.py

+    tag: Annotated[
+        str | None,
+        SystemString64,
+        Body(description="Only include runs with this tag(s)."),
+    ] = None,


suggestion: The tag parameter description suggests multiple tags but the type only allows a single string.

The description uses tag(s) but the parameter is str | None, not a list. If multiple tags should be supported (like the other list filters), consider list[str] | None with an IN-style filter; otherwise, make the description singular to match the current type.

Suggested change

tag: Annotated[

str | None,

SystemString64,

Body(description="Only include runs with this tag(s)."),

] = None,

tag: Annotated[

str | None,

SystemString64,

Body(description="Only include runs with this tag."),

] = None,

sourcery-ai · 2026-03-26T06:35:30Z

src/routers/openml/runs.py

+        SystemString64,
+        Body(description="Only include runs with this tag(s)."),
+    ] = None,
+    expdb: Annotated[AsyncConnection, Depends(expdb_connection)] = None,


suggestion: The expdb dependency defaulting to None is unnecessary and slightly misleading.

Given Depends(expdb_connection), FastAPI will always provide an AsyncConnection here, so expdb can never be None. Removing the = None default (using just expdb: Annotated[AsyncConnection, Depends(expdb_connection)]) makes the type accurate and discourages unnecessary None checks downstream.

Suggested change

expdb: Annotated[AsyncConnection, Depends(expdb_connection)] = None,

expdb: Annotated[AsyncConnection, Depends(expdb_connection)],

sourcery-ai · 2026-03-26T06:35:30Z

tests/routers/openml/runs_list_test.py

+async def test_list_runs_filter_multiple_run_ids(py_api: httpx.AsyncClient) -> None:
+    """Filter by multiple run_ids returns exactly those runs."""
+    response = await py_api.post("/run/list", json={"run_id": [RUN_ID_24, RUN_ID_26]})
+    assert response.status_code == HTTPStatus.OK
+    assert {r["run_id"] for r in response.json()} == {RUN_ID_24, RUN_ID_26}


question (testing): There are no tests for list filters being provided as empty lists, which currently behave like "no filter" due to truthiness checks.

Since list filters are only applied when the parameter is truthy, a payload like {"run_id": []} is currently treated the same as omitting run_id. Please add tests for empty-list inputs (e.g. run_id: [], task_id: []) to lock in the current behavior and catch regressions if this logic changes (for example, to interpret empty lists as “match nothing”).

sourcery-ai · 2026-03-26T06:35:30Z

tests/routers/openml/runs_list_test.py

+async def test_list_runs_pagination_limit(py_api: httpx.AsyncClient) -> None:
+    """Pagination limit=1 returns exactly 1 run."""
+    response = await py_api.post("/run/list", json={"pagination": {"limit": 1, "offset": 0}})
+    assert response.status_code == HTTPStatus.OK
+    assert len(response.json()) == 1


suggestion (testing): Pagination validation edge cases (e.g. negative limit/offset or zero limit) are not covered.

Please add tests for invalid pagination inputs that the Pagination schema should reject (e.g., negative limit/offset, or limit=0 if that's not allowed), and assert that the API returns 422 (or the correct error status) for those cases.

Suggested implementation:

async def test_list_runs_pagination_limit(py_api: httpx.AsyncClient) -> None: """Pagination limit=1 returns exactly 1 run.""" response = await py_api.post("/run/list", json={"pagination": {"limit": 1, "offset": 0}}) assert response.status_code == HTTPStatus.OK assert len(response.json()) == 1 @pytest.mark.parametrize( "pagination", [ {"limit": -1, "offset": 0}, # negative limit {"limit": 1, "offset": -5}, # negative offset {"limit": 0, "offset": 0}, # zero limit, if not allowed by schema ], ) async def test_list_runs_pagination_invalid_inputs( py_api: httpx.AsyncClient, pagination: dict, ) -> None: """Invalid pagination values should be rejected by the API.""" response = await py_api.post("/run/list", json={"pagination": pagination}) assert response.status_code == HTTPStatus.UNPROCESSABLE_ENTITY async def test_list_runs_combined_run_id_and_mismatched_task_id(py_api: httpx.AsyncClient) -> None:

If pytest and HTTPStatus are not already imported in this file, you should ensure the following imports exist near the top of the file:

import pytest

from http import HTTPStatus

No other behavioral changes are required; the new test parametrizes invalid pagination payloads and asserts the API returns 422 Unprocessable Entity.

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/routers/openml/runs.py`:
- Around line 47-69: The current filter handling treats empty lists like None so
inputs such as run_id=[] broaden the search; update the checks in the run
filtering logic (where run_id, task_id, flow_id, setup_id, uploader are
evaluated) to explicitly distinguish None from an empty list: if a filter is
None skip it, if it is an empty list either return an empty result
(short-circuit) or raise a 400 validation error (choose the project's
convention), and apply this same pattern to task_id, flow_id, setup_id and
uploader so empty arrays do not fall through to the no-filter path.
- Line 46: Clamp Pagination fields before they are used in SQL: validate and
sanitize the Pagination instance (the pagination parameter and other usages at
the other spots where Pagination is accepted) by forcing offset = max(0, offset)
and limit = min(MAX_PAGE_SIZE, max(1, limit)) (pick a sensible MAX_PAGE_SIZE
constant), then pass the sanitized values into the SQL/LIMIT/OFFSET binding;
update the code paths that accept Annotated[Pagination, Body(...)] (the
parameter named pagination at the shown diff and the other occurrences around
lines 90 and 128) to perform this clamping immediately after receiving the body
and before any query construction/execution.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 131ee599-ef4e-4b6a-9f0a-f0741b4b6756

📥 Commits

Reviewing files that changed from the base of the PR and between 15d0b13 and ec8b17e.

📒 Files selected for processing (1)

src/routers/openml/runs.py

src/routers/openml/runs.py

coderabbitai · 2026-03-26T06:41:22Z

src/routers/openml/runs.py

+    run_id: Annotated[
+        list[int] | None,
+        Body(
+            description="The run(s) to include in the search. "
+            "If none are specified, all runs are included.",
+        ),
+    ] = None,
+    task_id: Annotated[
+        list[int] | None,
+        Body(description="Only include runs for these task id(s)."),
+    ] = None,
+    flow_id: Annotated[
+        list[int] | None,
+        Body(description="Only include runs using these flow id(s)."),
+    ] = None,
+    setup_id: Annotated[
+        list[int] | None,
+        Body(description="Only include runs with these setup id(s)."),
+    ] = None,
+    uploader: Annotated[
+        list[int] | None,
+        Body(description="Only include runs uploaded by these user id(s)."),
+    ] = None,


⚠️ Potential issue | 🟠 Major

Explicit empty filters currently broaden the search.

The truthiness checks on Lines 95-104 treat [] the same as None, so a request like { "run_id": [] } falls through to the no-filter path and can return unrelated runs. That contradicts the documented “all provided filters” behavior; handle is not None separately and reject or short-circuit empty lists.

🔧 One possible guard

- if run_id: + if run_id is not None: + if not run_id: + raise NoResultsError("No runs match the search criteria.") _add_in_filter(filters, params, "r.rid", "run_id", run_id)

Apply the same pattern to task_id, flow_id, setup_id, and uploader.

Also applies to: 95-104

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/routers/openml/runs.py` around lines 47 - 69, The current filter handling treats empty lists like None so inputs such as run_id=[] broaden the search; update the checks in the run filtering logic (where run_id, task_id, flow_id, setup_id, uploader are evaluated) to explicitly distinguish None from an empty list: if a filter is None skip it, if it is an empty list either return an empty result (short-circuit) or raise a 400 validation error (choose the project's convention), and apply this same pattern to task_id, flow_id, setup_id and uploader so empty arrays do not fall through to the no-filter path.

…merge

sourcery-ai

Hey - I've found 1 issue, and left some high level feedback:

Using Body for filters and pagination on the GET /run/list route makes it hard to use with common HTTP clients and tooling; consider moving these parameters to query params for GET (keeping the body shape for POST) to make the API more conventional and ergonomic.
The hard-coded run IDs, setup IDs, and especially the exact upload_time value in runs_list_test.py make the tests tightly coupled to the current DB snapshot; consider setting up fixture data or loosening assertions around stable but non-essential fields like timestamps to avoid brittleness.
The pagination clamp limit = max(1, min(pagination.limit, 1000)) embeds a magic constant and logic locally; consider extracting a shared max-page-size constant (or helper) so pagination behavior stays consistent and centrally adjustable across list endpoints.

Prompt for AI Agents

Please address the comments from this code review:

## Overall Comments
- Using `Body` for filters and pagination on the `GET /run/list` route makes it hard to use with common HTTP clients and tooling; consider moving these parameters to query params for GET (keeping the body shape for POST) to make the API more conventional and ergonomic.
- The hard-coded run IDs, setup IDs, and especially the exact `upload_time` value in `runs_list_test.py` make the tests tightly coupled to the current DB snapshot; consider setting up fixture data or loosening assertions around stable but non-essential fields like timestamps to avoid brittleness.
- The pagination clamp `limit = max(1, min(pagination.limit, 1000))` embeds a magic constant and logic locally; consider extracting a shared max-page-size constant (or helper) so pagination behavior stays consistent and centrally adjustable across list endpoints.

## Individual Comments

### Comment 1
<location path="tests/routers/openml/runs_list_test.py" line_range="68-74" />
<code_context>
+    assert response.json()["code"] == "372"
+
+
+async def test_get_and_post_list_runs_return_same_results(py_api: httpx.AsyncClient) -> None:
+    """GET and POST /run/list with no filters must return identical results."""
+    get_resp = await py_api.get("/run/list")
+    post_resp = await py_api.post("/run/list", json={})
+    assert get_resp.status_code == HTTPStatus.OK
+    assert post_resp.status_code == HTTPStatus.OK
+    assert get_resp.json() == post_resp.json()
+
+
</code_context>
<issue_to_address>
**suggestion (testing):** Add a test that GET /run/list with filters in the request body behaves the same as POST /run/list.

This currently only checks GET vs POST equivalence with no filters. Since the endpoint defines filters and pagination in the request body, please add a test (e.g. `test_get_and_post_list_runs_with_filters_return_same_results`) that sends a JSON body including some filters (like `run_id` or `task_id` plus pagination) in both GET and POST requests and asserts identical responses. This helps catch regressions where GET might ignore or mis-handle body filters.

```suggestion
async def test_get_and_post_list_runs_return_same_results(py_api: httpx.AsyncClient) -> None:
    """GET and POST /run/list with no filters must return identical results."""
    get_resp = await py_api.get("/run/list")
    post_resp = await py_api.post("/run/list", json={})
    assert get_resp.status_code == HTTPStatus.OK
    assert post_resp.status_code == HTTPStatus.OK
    assert get_resp.json() == post_resp.json()


async def test_get_and_post_list_runs_with_filters_return_same_results(py_api: httpx.AsyncClient) -> None:
    """GET and POST /run/list with filters in the request body must return identical results."""
    payload = {
        "run_id": [24, 25],
        "limit": 10,
        "offset": 0,
    }

    get_resp = await py_api.get("/run/list", json=payload)
    post_resp = await py_api.post("/run/list", json=payload)

    assert get_resp.status_code == HTTPStatus.OK
    assert post_resp.status_code == HTTPStatus.OK
    assert get_resp.json() == post_resp.json()
```
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨

_{Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.}

sourcery-ai · 2026-03-26T07:58:22Z

tests/routers/openml/runs_list_test.py

+async def test_get_and_post_list_runs_return_same_results(py_api: httpx.AsyncClient) -> None:
+    """GET and POST /run/list with no filters must return identical results."""
+    get_resp = await py_api.get("/run/list")
+    post_resp = await py_api.post("/run/list", json={})
+    assert get_resp.status_code == HTTPStatus.OK
+    assert post_resp.status_code == HTTPStatus.OK
+    assert get_resp.json() == post_resp.json()


suggestion (testing): Add a test that GET /run/list with filters in the request body behaves the same as POST /run/list.

This currently only checks GET vs POST equivalence with no filters. Since the endpoint defines filters and pagination in the request body, please add a test (e.g. test_get_and_post_list_runs_with_filters_return_same_results) that sends a JSON body including some filters (like run_id or task_id plus pagination) in both GET and POST requests and asserts identical responses. This helps catch regressions where GET might ignore or mis-handle body filters.

Suggested change

async def test_get_and_post_list_runs_return_same_results(py_api: httpx.AsyncClient) -> None:

"""GET and POST /run/list with no filters must return identical results."""

get_resp = await py_api.get("/run/list")

post_resp = await py_api.post("/run/list", json={})

assert get_resp.status_code == HTTPStatus.OK

assert post_resp.status_code == HTTPStatus.OK

assert get_resp.json() == post_resp.json()

async def test_get_and_post_list_runs_return_same_results(py_api: httpx.AsyncClient) -> None:

"""GET and POST /run/list with no filters must return identical results."""

get_resp = await py_api.get("/run/list")

post_resp = await py_api.post("/run/list", json={})

assert get_resp.status_code == HTTPStatus.OK

assert post_resp.status_code == HTTPStatus.OK

assert get_resp.json() == post_resp.json()

async def test_get_and_post_list_runs_with_filters_return_same_results(py_api: httpx.AsyncClient) -> None:

"""GET and POST /run/list with filters in the request body must return identical results."""

payload = {

"run_id": [24, 25],

"limit": 10,

"offset": 0,

}

get_resp = await py_api.get("/run/list", json=payload)

post_resp = await py_api.post("/run/list", json=payload)

assert get_resp.status_code == HTTPStatus.OK

assert post_resp.status_code == HTTPStatus.OK

assert get_resp.json() == post_resp.json()

PGijsbers

Holding off on reviewing this. Let's focus on #277, and then maybe see how/if we should extract common functionality for these list functions.

The wontfix label is just to mean that the PR will not be processed for now. We do still want the feature.

feat: add GET /run/list endpoint (openml#39)

42bc467

saathviksheerla marked this pull request as ready for review March 18, 2026 04:37

sourcery-ai bot reviewed Mar 18, 2026

View reviewed changes

coderabbitai bot reviewed Mar 18, 2026

View reviewed changes

saathviksheerla and others added 3 commits March 18, 2026 10:33

fix: add ORDER BY r.rid for deterministic pagination in run list

366f273

Merge branch 'main' into feat/run-list-endpoint

3db091f

[pre-commit.ci] auto fixes from pre-commit.com hooks

15d0b13

for more information, see https://pre-commit.ci

coderabbitai bot reviewed Mar 26, 2026

View reviewed changes

saathviksheerla marked this pull request as draft March 26, 2026 06:03

coderabbitai bot reviewed Mar 26, 2026

View reviewed changes

src/routers/openml/runs.py Outdated Show resolved Hide resolved

clean up imports and ensure consistent behavior

ec8b17e

saathviksheerla marked this pull request as ready for review March 26, 2026 06:34

sourcery-ai bot reviewed Mar 26, 2026

View reviewed changes

coderabbitai bot reviewed Mar 26, 2026

View reviewed changes

secure run listing with pagination limits and remove duplicates from …

dde9b4d

…merge

saathviksheerla marked this pull request as draft March 26, 2026 07:19

saathviksheerla marked this pull request as ready for review March 26, 2026 07:56

sourcery-ai bot reviewed Mar 26, 2026

View reviewed changes

PGijsbers reviewed Mar 26, 2026

View reviewed changes

PGijsbers added the wontfix This will not be worked on label Mar 26, 2026

	expdb: Annotated[AsyncConnection, Depends(expdb_connection)] = None,
	expdb: Annotated[AsyncConnection, Depends(expdb_connection)],

Uh oh!

Conversation

saathviksheerla commented Mar 17, 2026

Description

Checklist

Uh oh!

codecov bot commented Mar 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

coderabbitai bot commented Mar 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Uh oh!

sourcery-ai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

sourcery-ai bot left a comment

Choose a reason for hiding this comment

Uh oh!

sourcery-ai bot Mar 26, 2026

Choose a reason for hiding this comment

Uh oh!

sourcery-ai bot Mar 26, 2026

Choose a reason for hiding this comment

Uh oh!

sourcery-ai bot Mar 26, 2026

Choose a reason for hiding this comment

Uh oh!

sourcery-ai bot Mar 26, 2026

Choose a reason for hiding this comment

Uh oh!

sourcery-ai bot Mar 26, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot Mar 26, 2026

Choose a reason for hiding this comment

Uh oh!

sourcery-ai bot left a comment

Choose a reason for hiding this comment

Uh oh!

sourcery-ai bot Mar 26, 2026

Choose a reason for hiding this comment

Uh oh!

PGijsbers left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

codecov bot commented Mar 17, 2026 •

edited

Loading

coderabbitai bot commented Mar 18, 2026 •

edited

Loading

PGijsbers left a comment •

edited

Loading