feat: add GET /run/list endpoint (#39)#279
feat: add GET /run/list endpoint (#39)#279saathviksheerla wants to merge 6 commits intoopenml:mainfrom
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #279 +/- ##
==========================================
- Coverage 54.15% 53.37% -0.79%
==========================================
Files 37 37
Lines 1553 1587 +34
Branches 135 142 +7
==========================================
+ Hits 841 847 +6
- Misses 710 738 +28
Partials 2 2 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughAdds a second registration of the OpenML runs router in 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Hey - I've left some high level feedback:
- Using a request body (Body) for the GET /run/list parameters (including pagination) can be problematic for standard HTTP clients and tooling; consider using query parameters for the GET variant and keeping the body-based interface only for POST /run/list, while still sharing the same handler logic.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Using a request body (Body) for the GET /run/list parameters (including pagination) can be problematic for standard HTTP clients and tooling; consider using query parameters for the GET variant and keeping the body-based interface only for POST /run/list, while still sharing the same handler logic.Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
There was a problem hiding this comment.
🧹 Nitpick comments (1)
src/routers/openml/runs.py (1)
107-124: Consider adding ORDER BY for deterministic pagination.The query lacks an ORDER BY clause, which means pagination results could be inconsistent across requests (same offset may return different rows). Adding
ORDER BY r.ridwould ensure stable, predictable pagination.♻️ Proposed fix
FROM run r JOIN algorithm_setup a ON r.setup = a.sid {where_clause} + ORDER BY r.rid LIMIT :limit OFFSET :offset🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/routers/openml/runs.py` around lines 107 - 124, The SQL constructed in the variable query in runs.py uses LIMIT/OFFSET without an ORDER BY, causing non-deterministic pagination; update the text(...) SQL to include a deterministic ORDER BY clause (e.g., ORDER BY r.rid) placed before the LIMIT :limit OFFSET :offset so that queries built with where_clause and the query variable return stable, repeatable pages for functions that rely on run_id ordering.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/routers/openml/runs.py`:
- Around line 107-124: The SQL constructed in the variable query in runs.py uses
LIMIT/OFFSET without an ORDER BY, causing non-deterministic pagination; update
the text(...) SQL to include a deterministic ORDER BY clause (e.g., ORDER BY
r.rid) placed before the LIMIT :limit OFFSET :offset so that queries built with
where_clause and the query variable return stable, repeatable pages for
functions that rely on run_id ordering.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 215d38de-d419-460b-92fe-fa8de6ea3b0b
📒 Files selected for processing (3)
src/main.pysrc/routers/openml/runs.pytests/routers/openml/runs_list_test.py
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/routers/openml/runs.py (1)
1-24: 🛠️ Refactor suggestion | 🟠 MajorRemove duplicate module docstring and consolidate imports.
The file has two module-level docstrings (lines 1 and 12) and several duplicate imports. Python only recognizes the first docstring; the second becomes a no-op string expression. The duplicates appear to be leftover from merging the new endpoint code.
Duplicated imports:
Annotated(lines 3, 14)APIRouter,Depends(lines 5, 16)AsyncConnection(lines 7, 17)expdb_connection(lines 10, 21)♻️ Proposed consolidation
"""Endpoints for OpenML Run resources.""" -from typing import Annotated, Any +from typing import Annotated, Any from fastapi import APIRouter, Body, Depends from sqlalchemy import text from sqlalchemy.ext.asyncio import AsyncConnection +import database.runs +from core.errors import NoResultsError, RunNotFoundError, RunTraceNotFoundError from routers.dependencies import Pagination, expdb_connection from routers.types import SystemString64 -"""Endpoints for run-related data.""" - -from typing import Annotated - -from fastapi import APIRouter, Depends -from sqlalchemy.ext.asyncio import AsyncConnection - -import database.runs -from core.errors import RunNotFoundError, RunTraceNotFoundError -from routers.dependencies import expdb_connection from schemas.runs import RunTrace, TraceIteration🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/routers/openml/runs.py` around lines 1 - 24, Remove the duplicate module-level docstring and consolidate the repeated imports into a single block: keep one top docstring (either "Endpoints for OpenML Run resources." or "Endpoints for run-related data.") and merge imports so that typing.Annotated (and Any if used), fastapi.APIRouter and Depends, sqlalchemy.ext.asyncio.AsyncConnection, routers.dependencies.expdb_connection, database.runs, core.errors.RunNotFoundError/RunTraceNotFoundError, and schemas.runs.RunTrace/TraceIteration are each imported exactly once; ensure router = APIRouter(...) remains and remove the redundant second docstring and duplicate import lines (e.g., the extra Annotated, APIRouter, Depends, AsyncConnection, expdb_connection).
🧹 Nitpick comments (1)
src/routers/openml/runs.py (1)
80-80: Inconsistent default onexpdbparameter.The
= Nonedefault is unnecessary for aDepends()injected parameter and inconsistent with the existingget_run_traceendpoint at line 153, which omits the default.♻️ Proposed fix for consistency
- expdb: Annotated[AsyncConnection, Depends(expdb_connection)] = None, + expdb: Annotated[AsyncConnection, Depends(expdb_connection)],🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/routers/openml/runs.py` at line 80, Remove the unnecessary "= None" default on the Depends-injected parameter so the signature uses expdb: Annotated[AsyncConnection, Depends(expdb_connection)] without a default; update the function that declares the expdb parameter (the one in src/routers/openml/runs.py where expdb is annotated with AsyncConnection and Depends(expdb_connection)) to match the existing get_run_trace endpoint style and ensure the dependency injection is consistent.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@src/routers/openml/runs.py`:
- Around line 1-24: Remove the duplicate module-level docstring and consolidate
the repeated imports into a single block: keep one top docstring (either
"Endpoints for OpenML Run resources." or "Endpoints for run-related data.") and
merge imports so that typing.Annotated (and Any if used), fastapi.APIRouter and
Depends, sqlalchemy.ext.asyncio.AsyncConnection,
routers.dependencies.expdb_connection, database.runs,
core.errors.RunNotFoundError/RunTraceNotFoundError, and
schemas.runs.RunTrace/TraceIteration are each imported exactly once; ensure
router = APIRouter(...) remains and remove the redundant second docstring and
duplicate import lines (e.g., the extra Annotated, APIRouter, Depends,
AsyncConnection, expdb_connection).
---
Nitpick comments:
In `@src/routers/openml/runs.py`:
- Line 80: Remove the unnecessary "= None" default on the Depends-injected
parameter so the signature uses expdb: Annotated[AsyncConnection,
Depends(expdb_connection)] without a default; update the function that declares
the expdb parameter (the one in src/routers/openml/runs.py where expdb is
annotated with AsyncConnection and Depends(expdb_connection)) to match the
existing get_run_trace endpoint style and ensure the dependency injection is
consistent.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: d762c642-0985-4912-86f0-f7a8fd70ea88
📒 Files selected for processing (2)
src/main.pysrc/routers/openml/runs.py
🚧 Files skipped from review as they are similar to previous changes (1)
- src/main.py
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
src/routers/openml/runs.py (1)
1-13: Remove duplicate module docstring.Line 13 contains a redundant docstring that duplicates line 1. This appears to be a leftover from code merging.
🧹 Proposed cleanup
from routers.types import SystemString64 -"""Endpoints for run-related data.""" - - import database.runs🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/routers/openml/runs.py` around lines 1 - 13, Remove the redundant module docstring duplicate by keeping a single top-level docstring and deleting the second triple-quoted string (the one reading """Endpoints for run-related data."""), leaving only the intended module description (e.g., """Endpoints for OpenML Run resources.""") at the top of the module; this cleanup touches the module-level docstring in the runs.py module.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/routers/openml/runs.py`:
- Line 75: The tag parameter in the list_runs endpoint is missing the Body(...)
annotation, causing inconsistency with other filters; update the list_runs
function signature to annotate tag the same way as
run_id/task_id/flow_id/setup_id/uploader (e.g., tag: Annotated[str | None,
SystemString64] = Body(None, description=...)) so FastAPI treats it as a request
body field in the hybrid GET/POST handling—apply this change to the list_runs
parameter list where tag is declared.
---
Nitpick comments:
In `@src/routers/openml/runs.py`:
- Around line 1-13: Remove the redundant module docstring duplicate by keeping a
single top-level docstring and deleting the second triple-quoted string (the one
reading """Endpoints for run-related data."""), leaving only the intended module
description (e.g., """Endpoints for OpenML Run resources.""") at the top of the
module; this cleanup touches the module-level docstring in the runs.py module.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: dfb45d4a-e021-47bf-aa99-aca4a78783d7
📒 Files selected for processing (1)
src/routers/openml/runs.py
There was a problem hiding this comment.
Hey - I've found 5 issues, and left some high level feedback:
- In
main.pytherun_routeris now included twice (once newly added and once existing); the earlier include is redundant and could cause confusion, so it should be removed. - The
GET /run/listendpoint currently takes all filters via the request body, which is unusual for GET and may not be supported by some clients or proxies; consider moving filters to query parameters for GET and keeping body-based filters only for POST.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `main.py` the `run_router` is now included twice (once newly added and once existing); the earlier include is redundant and could cause confusion, so it should be removed.
- The `GET /run/list` endpoint currently takes all filters via the request body, which is unusual for GET and may not be supported by some clients or proxies; consider moving filters to query parameters for GET and keeping body-based filters only for POST.
## Individual Comments
### Comment 1
<location path="src/main.py" line_range="73" />
<code_context>
app.include_router(task_router)
app.include_router(flows_router)
app.include_router(study_router)
+ app.include_router(run_router)
app.include_router(setup_router)
app.include_router(run_router)
</code_context>
<issue_to_address>
**issue (bug_risk):** The `run_router` is being included twice, which will cause route duplication conflicts.
It’s now included once in the new line you added and again in the existing line below, which will cause FastAPI to raise duplicate route errors for identical paths/methods. Please keep only one `app.include_router(run_router)` here.
</issue_to_address>
### Comment 2
<location path="src/routers/openml/runs.py" line_range="70-74" />
<code_context>
+ list[int] | None,
+ Body(description="Only include runs uploaded by these user id(s)."),
+ ] = None,
+ tag: Annotated[
+ str | None,
+ SystemString64,
+ Body(description="Only include runs with this tag(s)."),
+ ] = None,
+ expdb: Annotated[AsyncConnection, Depends(expdb_connection)] = None,
</code_context>
<issue_to_address>
**suggestion:** The `tag` parameter description suggests multiple tags but the type only allows a single string.
The description uses `tag(s)` but the parameter is `str | None`, not a list. If multiple tags should be supported (like the other list filters), consider `list[str] | None` with an `IN`-style filter; otherwise, make the description singular to match the current type.
```suggestion
tag: Annotated[
str | None,
SystemString64,
Body(description="Only include runs with this tag."),
] = None,
```
</issue_to_address>
### Comment 3
<location path="src/routers/openml/runs.py" line_range="75" />
<code_context>
+ SystemString64,
+ Body(description="Only include runs with this tag(s)."),
+ ] = None,
+ expdb: Annotated[AsyncConnection, Depends(expdb_connection)] = None,
+) -> list[dict[str, Any]]:
+ """List runs, optionally filtered by one or more criteria.
</code_context>
<issue_to_address>
**suggestion:** The `expdb` dependency defaulting to `None` is unnecessary and slightly misleading.
Given `Depends(expdb_connection)`, FastAPI will always provide an `AsyncConnection` here, so `expdb` can never be `None`. Removing the `= None` default (using just `expdb: Annotated[AsyncConnection, Depends(expdb_connection)]`) makes the type accurate and discourages unnecessary `None` checks downstream.
```suggestion
expdb: Annotated[AsyncConnection, Depends(expdb_connection)],
```
</issue_to_address>
### Comment 4
<location path="tests/routers/openml/runs_list_test.py" line_range="113-117" />
<code_context>
+ }
+
+
+async def test_list_runs_filter_multiple_run_ids(py_api: httpx.AsyncClient) -> None:
+ """Filter by multiple run_ids returns exactly those runs."""
+ response = await py_api.post("/run/list", json={"run_id": [RUN_ID_24, RUN_ID_26]})
+ assert response.status_code == HTTPStatus.OK
+ assert {r["run_id"] for r in response.json()} == {RUN_ID_24, RUN_ID_26}
+
+
</code_context>
<issue_to_address>
**question (testing):** There are no tests for list filters being provided as empty lists, which currently behave like "no filter" due to truthiness checks.
Since list filters are only applied when the parameter is truthy, a payload like `{"run_id": []}` is currently treated the same as omitting `run_id`. Please add tests for empty-list inputs (e.g. `run_id: []`, `task_id: []`) to lock in the current behavior and catch regressions if this logic changes (for example, to interpret empty lists as “match nothing”).
</issue_to_address>
### Comment 5
<location path="tests/routers/openml/runs_list_test.py" line_range="256-260" />
<code_context>
+ )
+
+
+async def test_list_runs_pagination_limit(py_api: httpx.AsyncClient) -> None:
+ """Pagination limit=1 returns exactly 1 run."""
+ response = await py_api.post("/run/list", json={"pagination": {"limit": 1, "offset": 0}})
+ assert response.status_code == HTTPStatus.OK
+ assert len(response.json()) == 1
+
+
</code_context>
<issue_to_address>
**suggestion (testing):** Pagination validation edge cases (e.g. negative limit/offset or zero limit) are not covered.
Please add tests for invalid pagination inputs that the `Pagination` schema should reject (e.g., negative `limit`/`offset`, or `limit=0` if that's not allowed), and assert that the API returns 422 (or the correct error status) for those cases.
Suggested implementation:
```python
async def test_list_runs_pagination_limit(py_api: httpx.AsyncClient) -> None:
"""Pagination limit=1 returns exactly 1 run."""
response = await py_api.post("/run/list", json={"pagination": {"limit": 1, "offset": 0}})
assert response.status_code == HTTPStatus.OK
assert len(response.json()) == 1
@pytest.mark.parametrize(
"pagination",
[
{"limit": -1, "offset": 0}, # negative limit
{"limit": 1, "offset": -5}, # negative offset
{"limit": 0, "offset": 0}, # zero limit, if not allowed by schema
],
)
async def test_list_runs_pagination_invalid_inputs(
py_api: httpx.AsyncClient,
pagination: dict,
) -> None:
"""Invalid pagination values should be rejected by the API."""
response = await py_api.post("/run/list", json={"pagination": pagination})
assert response.status_code == HTTPStatus.UNPROCESSABLE_ENTITY
async def test_list_runs_combined_run_id_and_mismatched_task_id(py_api: httpx.AsyncClient) -> None:
```
If `pytest` and `HTTPStatus` are not already imported in this file, you should ensure the following imports exist near the top of the file:
- `import pytest`
- `from http import HTTPStatus`
No other behavioral changes are required; the new test parametrizes invalid pagination payloads and asserts the API returns `422 Unprocessable Entity`.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| app.include_router(task_router) | ||
| app.include_router(flows_router) | ||
| app.include_router(study_router) | ||
| app.include_router(run_router) |
There was a problem hiding this comment.
issue (bug_risk): The run_router is being included twice, which will cause route duplication conflicts.
It’s now included once in the new line you added and again in the existing line below, which will cause FastAPI to raise duplicate route errors for identical paths/methods. Please keep only one app.include_router(run_router) here.
| tag: Annotated[ | ||
| str | None, | ||
| SystemString64, | ||
| Body(description="Only include runs with this tag(s)."), | ||
| ] = None, |
There was a problem hiding this comment.
suggestion: The tag parameter description suggests multiple tags but the type only allows a single string.
The description uses tag(s) but the parameter is str | None, not a list. If multiple tags should be supported (like the other list filters), consider list[str] | None with an IN-style filter; otherwise, make the description singular to match the current type.
| tag: Annotated[ | |
| str | None, | |
| SystemString64, | |
| Body(description="Only include runs with this tag(s)."), | |
| ] = None, | |
| tag: Annotated[ | |
| str | None, | |
| SystemString64, | |
| Body(description="Only include runs with this tag."), | |
| ] = None, |
src/routers/openml/runs.py
Outdated
| SystemString64, | ||
| Body(description="Only include runs with this tag(s)."), | ||
| ] = None, | ||
| expdb: Annotated[AsyncConnection, Depends(expdb_connection)] = None, |
There was a problem hiding this comment.
suggestion: The expdb dependency defaulting to None is unnecessary and slightly misleading.
Given Depends(expdb_connection), FastAPI will always provide an AsyncConnection here, so expdb can never be None. Removing the = None default (using just expdb: Annotated[AsyncConnection, Depends(expdb_connection)]) makes the type accurate and discourages unnecessary None checks downstream.
| expdb: Annotated[AsyncConnection, Depends(expdb_connection)] = None, | |
| expdb: Annotated[AsyncConnection, Depends(expdb_connection)], |
| async def test_list_runs_filter_multiple_run_ids(py_api: httpx.AsyncClient) -> None: | ||
| """Filter by multiple run_ids returns exactly those runs.""" | ||
| response = await py_api.post("/run/list", json={"run_id": [RUN_ID_24, RUN_ID_26]}) | ||
| assert response.status_code == HTTPStatus.OK | ||
| assert {r["run_id"] for r in response.json()} == {RUN_ID_24, RUN_ID_26} |
There was a problem hiding this comment.
question (testing): There are no tests for list filters being provided as empty lists, which currently behave like "no filter" due to truthiness checks.
Since list filters are only applied when the parameter is truthy, a payload like {"run_id": []} is currently treated the same as omitting run_id. Please add tests for empty-list inputs (e.g. run_id: [], task_id: []) to lock in the current behavior and catch regressions if this logic changes (for example, to interpret empty lists as “match nothing”).
| async def test_list_runs_pagination_limit(py_api: httpx.AsyncClient) -> None: | ||
| """Pagination limit=1 returns exactly 1 run.""" | ||
| response = await py_api.post("/run/list", json={"pagination": {"limit": 1, "offset": 0}}) | ||
| assert response.status_code == HTTPStatus.OK | ||
| assert len(response.json()) == 1 |
There was a problem hiding this comment.
suggestion (testing): Pagination validation edge cases (e.g. negative limit/offset or zero limit) are not covered.
Please add tests for invalid pagination inputs that the Pagination schema should reject (e.g., negative limit/offset, or limit=0 if that's not allowed), and assert that the API returns 422 (or the correct error status) for those cases.
Suggested implementation:
async def test_list_runs_pagination_limit(py_api: httpx.AsyncClient) -> None:
"""Pagination limit=1 returns exactly 1 run."""
response = await py_api.post("/run/list", json={"pagination": {"limit": 1, "offset": 0}})
assert response.status_code == HTTPStatus.OK
assert len(response.json()) == 1
@pytest.mark.parametrize(
"pagination",
[
{"limit": -1, "offset": 0}, # negative limit
{"limit": 1, "offset": -5}, # negative offset
{"limit": 0, "offset": 0}, # zero limit, if not allowed by schema
],
)
async def test_list_runs_pagination_invalid_inputs(
py_api: httpx.AsyncClient,
pagination: dict,
) -> None:
"""Invalid pagination values should be rejected by the API."""
response = await py_api.post("/run/list", json={"pagination": pagination})
assert response.status_code == HTTPStatus.UNPROCESSABLE_ENTITY
async def test_list_runs_combined_run_id_and_mismatched_task_id(py_api: httpx.AsyncClient) -> None:If pytest and HTTPStatus are not already imported in this file, you should ensure the following imports exist near the top of the file:
import pytestfrom http import HTTPStatus
No other behavioral changes are required; the new test parametrizes invalid pagination payloads and asserts the API returns 422 Unprocessable Entity.
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/routers/openml/runs.py`:
- Around line 47-69: The current filter handling treats empty lists like None so
inputs such as run_id=[] broaden the search; update the checks in the run
filtering logic (where run_id, task_id, flow_id, setup_id, uploader are
evaluated) to explicitly distinguish None from an empty list: if a filter is
None skip it, if it is an empty list either return an empty result
(short-circuit) or raise a 400 validation error (choose the project's
convention), and apply this same pattern to task_id, flow_id, setup_id and
uploader so empty arrays do not fall through to the no-filter path.
- Line 46: Clamp Pagination fields before they are used in SQL: validate and
sanitize the Pagination instance (the pagination parameter and other usages at
the other spots where Pagination is accepted) by forcing offset = max(0, offset)
and limit = min(MAX_PAGE_SIZE, max(1, limit)) (pick a sensible MAX_PAGE_SIZE
constant), then pass the sanitized values into the SQL/LIMIT/OFFSET binding;
update the code paths that accept Annotated[Pagination, Body(...)] (the
parameter named pagination at the shown diff and the other occurrences around
lines 90 and 128) to perform this clamping immediately after receiving the body
and before any query construction/execution.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 131ee599-ef4e-4b6a-9f0a-f0741b4b6756
📒 Files selected for processing (1)
src/routers/openml/runs.py
| run_id: Annotated[ | ||
| list[int] | None, | ||
| Body( | ||
| description="The run(s) to include in the search. " | ||
| "If none are specified, all runs are included.", | ||
| ), | ||
| ] = None, | ||
| task_id: Annotated[ | ||
| list[int] | None, | ||
| Body(description="Only include runs for these task id(s)."), | ||
| ] = None, | ||
| flow_id: Annotated[ | ||
| list[int] | None, | ||
| Body(description="Only include runs using these flow id(s)."), | ||
| ] = None, | ||
| setup_id: Annotated[ | ||
| list[int] | None, | ||
| Body(description="Only include runs with these setup id(s)."), | ||
| ] = None, | ||
| uploader: Annotated[ | ||
| list[int] | None, | ||
| Body(description="Only include runs uploaded by these user id(s)."), | ||
| ] = None, |
There was a problem hiding this comment.
Explicit empty filters currently broaden the search.
The truthiness checks on Lines 95-104 treat [] the same as None, so a request like { "run_id": [] } falls through to the no-filter path and can return unrelated runs. That contradicts the documented “all provided filters” behavior; handle is not None separately and reject or short-circuit empty lists.
🔧 One possible guard
- if run_id:
+ if run_id is not None:
+ if not run_id:
+ raise NoResultsError("No runs match the search criteria.")
_add_in_filter(filters, params, "r.rid", "run_id", run_id)Apply the same pattern to task_id, flow_id, setup_id, and uploader.
Also applies to: 95-104
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/routers/openml/runs.py` around lines 47 - 69, The current filter handling
treats empty lists like None so inputs such as run_id=[] broaden the search;
update the checks in the run filtering logic (where run_id, task_id, flow_id,
setup_id, uploader are evaluated) to explicitly distinguish None from an empty
list: if a filter is None skip it, if it is an empty list either return an empty
result (short-circuit) or raise a 400 validation error (choose the project's
convention), and apply this same pattern to task_id, flow_id, setup_id and
uploader so empty arrays do not fall through to the no-filter path.
There was a problem hiding this comment.
Hey - I've found 1 issue, and left some high level feedback:
- Using
Bodyfor filters and pagination on theGET /run/listroute makes it hard to use with common HTTP clients and tooling; consider moving these parameters to query params for GET (keeping the body shape for POST) to make the API more conventional and ergonomic. - The hard-coded run IDs, setup IDs, and especially the exact
upload_timevalue inruns_list_test.pymake the tests tightly coupled to the current DB snapshot; consider setting up fixture data or loosening assertions around stable but non-essential fields like timestamps to avoid brittleness. - The pagination clamp
limit = max(1, min(pagination.limit, 1000))embeds a magic constant and logic locally; consider extracting a shared max-page-size constant (or helper) so pagination behavior stays consistent and centrally adjustable across list endpoints.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Using `Body` for filters and pagination on the `GET /run/list` route makes it hard to use with common HTTP clients and tooling; consider moving these parameters to query params for GET (keeping the body shape for POST) to make the API more conventional and ergonomic.
- The hard-coded run IDs, setup IDs, and especially the exact `upload_time` value in `runs_list_test.py` make the tests tightly coupled to the current DB snapshot; consider setting up fixture data or loosening assertions around stable but non-essential fields like timestamps to avoid brittleness.
- The pagination clamp `limit = max(1, min(pagination.limit, 1000))` embeds a magic constant and logic locally; consider extracting a shared max-page-size constant (or helper) so pagination behavior stays consistent and centrally adjustable across list endpoints.
## Individual Comments
### Comment 1
<location path="tests/routers/openml/runs_list_test.py" line_range="68-74" />
<code_context>
+ assert response.json()["code"] == "372"
+
+
+async def test_get_and_post_list_runs_return_same_results(py_api: httpx.AsyncClient) -> None:
+ """GET and POST /run/list with no filters must return identical results."""
+ get_resp = await py_api.get("/run/list")
+ post_resp = await py_api.post("/run/list", json={})
+ assert get_resp.status_code == HTTPStatus.OK
+ assert post_resp.status_code == HTTPStatus.OK
+ assert get_resp.json() == post_resp.json()
+
+
</code_context>
<issue_to_address>
**suggestion (testing):** Add a test that GET /run/list with filters in the request body behaves the same as POST /run/list.
This currently only checks GET vs POST equivalence with no filters. Since the endpoint defines filters and pagination in the request body, please add a test (e.g. `test_get_and_post_list_runs_with_filters_return_same_results`) that sends a JSON body including some filters (like `run_id` or `task_id` plus pagination) in both GET and POST requests and asserts identical responses. This helps catch regressions where GET might ignore or mis-handle body filters.
```suggestion
async def test_get_and_post_list_runs_return_same_results(py_api: httpx.AsyncClient) -> None:
"""GET and POST /run/list with no filters must return identical results."""
get_resp = await py_api.get("/run/list")
post_resp = await py_api.post("/run/list", json={})
assert get_resp.status_code == HTTPStatus.OK
assert post_resp.status_code == HTTPStatus.OK
assert get_resp.json() == post_resp.json()
async def test_get_and_post_list_runs_with_filters_return_same_results(py_api: httpx.AsyncClient) -> None:
"""GET and POST /run/list with filters in the request body must return identical results."""
payload = {
"run_id": [24, 25],
"limit": 10,
"offset": 0,
}
get_resp = await py_api.get("/run/list", json=payload)
post_resp = await py_api.post("/run/list", json=payload)
assert get_resp.status_code == HTTPStatus.OK
assert post_resp.status_code == HTTPStatus.OK
assert get_resp.json() == post_resp.json()
```
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| async def test_get_and_post_list_runs_return_same_results(py_api: httpx.AsyncClient) -> None: | ||
| """GET and POST /run/list with no filters must return identical results.""" | ||
| get_resp = await py_api.get("/run/list") | ||
| post_resp = await py_api.post("/run/list", json={}) | ||
| assert get_resp.status_code == HTTPStatus.OK | ||
| assert post_resp.status_code == HTTPStatus.OK | ||
| assert get_resp.json() == post_resp.json() |
There was a problem hiding this comment.
suggestion (testing): Add a test that GET /run/list with filters in the request body behaves the same as POST /run/list.
This currently only checks GET vs POST equivalence with no filters. Since the endpoint defines filters and pagination in the request body, please add a test (e.g. test_get_and_post_list_runs_with_filters_return_same_results) that sends a JSON body including some filters (like run_id or task_id plus pagination) in both GET and POST requests and asserts identical responses. This helps catch regressions where GET might ignore or mis-handle body filters.
| async def test_get_and_post_list_runs_return_same_results(py_api: httpx.AsyncClient) -> None: | |
| """GET and POST /run/list with no filters must return identical results.""" | |
| get_resp = await py_api.get("/run/list") | |
| post_resp = await py_api.post("/run/list", json={}) | |
| assert get_resp.status_code == HTTPStatus.OK | |
| assert post_resp.status_code == HTTPStatus.OK | |
| assert get_resp.json() == post_resp.json() | |
| async def test_get_and_post_list_runs_return_same_results(py_api: httpx.AsyncClient) -> None: | |
| """GET and POST /run/list with no filters must return identical results.""" | |
| get_resp = await py_api.get("/run/list") | |
| post_resp = await py_api.post("/run/list", json={}) | |
| assert get_resp.status_code == HTTPStatus.OK | |
| assert post_resp.status_code == HTTPStatus.OK | |
| assert get_resp.json() == post_resp.json() | |
| async def test_get_and_post_list_runs_with_filters_return_same_results(py_api: httpx.AsyncClient) -> None: | |
| """GET and POST /run/list with filters in the request body must return identical results.""" | |
| payload = { | |
| "run_id": [24, 25], | |
| "limit": 10, | |
| "offset": 0, | |
| } | |
| get_resp = await py_api.get("/run/list", json=payload) | |
| post_resp = await py_api.post("/run/list", json=payload) | |
| assert get_resp.status_code == HTTPStatus.OK | |
| assert post_resp.status_code == HTTPStatus.OK | |
| assert get_resp.json() == post_resp.json() |
There was a problem hiding this comment.
Holding off on reviewing this. Let's focus on #277, and then maybe see how/if we should extract common functionality for these list functions.
The wontfix label is just to mean that the PR will not be processed for now. We do still want the feature.
Description
Adds the
GET /run/listandPOST /run/listendpoints for listing runs with optional filters.Fixes: #39
Filters supported (all optional, combinable with AND logic):
run_id— list of run IDstask_id— list of task IDsflow_id— list of flow IDssetup_id— list of setup IDsuploader— list of user IDstag— single tag string (validated viaSystemString64)Pagination is supported via the nested
paginationbody parameter (consistent with other list endpoints).Returns 404 with code 372 when no runs match the filters.
Deviation from PHP: The PHP API requires at least one filter (error 510). This endpoint follows the pattern of other list endpoints in this API and allows no-filter requests, returning all runs paginated.
Checklist