Feat: Implement GET /setup/{id} endpoint #280
Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review.
Walkthrough: Adds an async DB helper.
🚥 Pre-merge checks: ✅ 4 passed, ❌ 1 failed (1 warning).
Hey - I've found 3 issues
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location path="src/database/setups.py" line_range="38-42" />
<code_context>
+ t_input.dataType AS data_type,
+ t_input.defaultValue AS default_value,
+ t_setting.value AS value
+ FROM input_setting t_setting
+ JOIN input t_input ON t_setting.input_id = t_input.id
+ JOIN implementation t_impl ON t_input.implementation_id = t_impl.id
</code_context>
<issue_to_address>
**suggestion:** Consider adding an explicit ORDER BY clause to make parameter ordering deterministic.
Without an `ORDER BY`, the database may return parameters in any order, which can break callers that assume stable ordering (e.g., for diffs, cache keys, or UI). Please add an explicit sort, e.g. `ORDER BY t_impl.id, t_input.id`, to make the result deterministic.
```suggestion
FROM input_setting t_setting
JOIN input t_input ON t_setting.input_id = t_input.id
JOIN implementation t_impl ON t_input.implementation_id = t_impl.id
WHERE t_setting.setup = :setup_id
ORDER BY t_impl.id, t_input.id
""",
```
</issue_to_address>
### Comment 2
<location path="src/schemas/setups.py" line_range="17" />
<code_context>
+ name: str
+ data_type: str | None = None
+ default_value: str | None = None
+ value: str
+
+ model_config = ConfigDict(from_attributes=True)
</code_context>
<issue_to_address>
**issue:** Check whether `value` should be nullable to match the database and avoid validation issues.
Since `t_setting.value` is selected directly and may be NULL at the DB level, `value: str` will cause Pydantic validation failures for such rows. If NULL is valid, use `value: str | None = None` or normalize NULL to a non-null value (e.g., empty string) before model validation to avoid unexpected 500/422 responses.
</issue_to_address>
### Comment 3
<location path="src/routers/openml/setups.py" line_range="36" />
<code_context>
+
+ setup_parameters = await database.setups.get_parameters(setup_id, expdb_db)
+
+ params_model = SetupParameters(
+ setup_id=str(setup_id),
+ flow_id=str(setup.implementation_id),
</code_context>
<issue_to_address>
**issue (complexity):** Consider simplifying the new endpoint by always returning a list for `parameter` and letting Pydantic handle object conversion instead of manually building dicts.
You can simplify the new endpoint a bit without changing behavior.
1. **Always return a list for `parameter`**
Avoid the `None` vs `[]` duality and the extra conditional. This keeps the schema simpler for clients and reduces branching:
```python
params_model = SetupParameters(
setup_id=str(setup_id),
flow_id=str(setup.implementation_id),
parameter=[dict(param) for param in setup_parameters],
)
```
If `setup_parameters` is empty, `parameter` will just be `[]`.
2. **Avoid the dict conversion if possible**
Since you already have Pydantic models and likely `from_attributes=True` on `SetupParameter`, you can skip the `dict` round-trip and let Pydantic do the adaptation. For example, if `get_parameters` returns SQLAlchemy rows, you can validate them directly:
```python
from schemas.setups import SetupParameter
params_model = SetupParameters(
setup_id=str(setup_id),
flow_id=str(setup.implementation_id),
parameter=[
SetupParameter.model_validate(param, from_attributes=True)
for param in setup_parameters
],
)
```
If this doesn’t work with the current return type of `get_parameters`, consider updating `get_parameters` to return objects with attribute access (`.name`, `.value`, etc.) instead of dict-like mappings, so Pydantic can map them directly.
</issue_to_address>
Codecov Report
❌ Patch coverage is
Additional details and impacted files:

```
@@           Coverage Diff            @@
##             main      #280   +/-   ##
=========================================
  Coverage        ?    54.20%
=========================================
  Files           ?        38
  Lines           ?      1581
  Branches        ?       137
=========================================
  Hits            ?       857
  Misses          ?       722
  Partials        ?         2
```

☔ View full report in Codecov by Sentry.
🧹 Nitpick comments (2)
src/database/setups.py (1)

28-42: Stabilize parameter ordering in the SQL result.

At line 41, the query filters but does not sort. Without `ORDER BY`, parameter list order is not guaranteed and can become flaky for response comparisons.

Suggested diff:

```diff
 FROM input_setting t_setting
 JOIN input t_input ON t_setting.input_id = t_input.id
 JOIN implementation t_impl ON t_input.implementation_id = t_impl.id
 WHERE t_setting.setup = :setup_id
+ORDER BY t_input.id
 """,
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/database/setups.py` around lines 28-42, The SQL query selecting inputs/settings (the SELECT joining input_setting, input, implementation and filtering by :setup_id) lacks a deterministic ORDER BY causing flaky parameter ordering; update the query to append an ORDER BY clause (for example ORDER BY t_input.implementation_id, t_input.name or ORDER BY t_impl.name, t_input.name) so results are consistently sorted by flow and parameter name (refer to the SELECT block, tables input_setting/input/implementation, and the :setup_id filter).

tests/routers/openml/setups_test.py (1)

141-146: Strengthen the success-path assertions to lock the response contract.

Right now the test only checks `setup_id` and key presence. Adding a few schema-level assertions would catch more regressions.

Suggested diff:

```diff
 async def test_get_setup_success(py_api: httpx.AsyncClient) -> None:
     response = await py_api.get("/setup/1")
     assert response.status_code == HTTPStatus.OK
     data = response.json()["setup_parameters"]
     assert data["setup_id"] == "1"
-    assert "parameter" in data
+    assert "flow_id" in data
+    assert isinstance(data.get("parameter"), list)
+    assert data["parameter"], "Expected setup 1 to contain at least one parameter"
+    first = data["parameter"][0]
+    assert {"id", "flow_id", "flow_name", "full_name", "parameter_name", "name", "value"} <= set(first)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/routers/openml/setups_test.py` around lines 141 - 146, The test_get_setup_success should lock the response contract: assert that the top-level response is a dict and contains only the expected keys for the setup payload (verify set(data.keys()) == {'setup_id', 'parameter'}), assert type(data['setup_id']) is str and equals "1", assert that data['parameter'] is the expected container type (dict or list) and is non-empty, and if it's a list assert each item is a dict with at least 'name' and 'value' keys (if it's a dict assert it contains 'name' and 'value'); update the test_get_setup_success assertions accordingly to enforce these schema-level checks.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/database/setups.py`:
- Around line 28-42: The SQL query selecting inputs/settings (the SELECT joining
input_setting, input, implementation and filtering by :setup_id) lacks a
deterministic ORDER BY causing flaky parameter ordering; update the query to
append an ORDER BY clause (for example ORDER BY t_input.implementation_id,
t_input.name or ORDER BY t_impl.name, t_input.name) so results are consistently
sorted by flow and parameter name (refer to the SELECT block, tables
input_setting/input/implementation, and the :setup_id filter).
In `@tests/routers/openml/setups_test.py`:
- Around line 141-146: The test_get_setup_success should lock the response
contract: assert that the top-level response is a dict and contains only the
expected keys for the setup payload (verify set(data.keys()) == {'setup_id',
'parameter'}), assert type(data['setup_id']) is str and equals "1", assert that
data['parameter'] is the expected container type (dict or list) and is
non-empty, and if it's a list assert each item is a dict with at least 'name'
and 'value' keys (if it's a dict assert it contains 'name' and 'value'); update
the test_get_setup_success assertions accordingly to enforce these schema-level
checks.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 33a4cdd7-cfcd-4b24-9a38-8284b29f797c
📒 Files selected for processing (5)
- src/database/setups.py
- src/routers/openml/setups.py
- src/schemas/setups.py
- tests/routers/openml/migration/setups_migration_test.py
- tests/routers/openml/setups_test.py
@PGijsbers I intentionally built explicit `BaseModel` schemas under `src/schemas/setups.py`. The PR is ready for review. Thank you!
@PGijsbers Gentle ping on the PR. Thanks!
PGijsbers
left a comment
Thanks for the contribution! Looks pretty good already, we can merge this after some minor changes :)
Actionable comments posted: 1
♻️ Duplicate comments (1)
tests/routers/openml/migration/setups_migration_test.py (1)

303-305: ⚠️ Potential issue | 🟡 Minor — Validate error payload in the precondition early-return path too.

Lines 303-305 currently check only the status before returning. Add code/detail assertions there as well to preserve parity guarantees for missing setups in this parameterized run.

Proposed fix:

```diff
 if original.status_code == HTTPStatus.PRECONDITION_FAILED:
     assert new.status_code == HTTPStatus.NOT_FOUND
+    assert original.json()["error"]["code"] == new.json()["code"]
+    assert new.json()["detail"] == f"Setup {setup_id} not found."
     return
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/routers/openml/migration/setups_migration_test.py` around lines 303 - 305, The early-return branch that handles original.status_code == HTTPStatus.PRECONDITION_FAILED only asserts status codes; add assertions to validate the error payloads as well by parsing both responses (use original.json() and new.json()) and asserting the relevant error fields (e.g., error code/message/details or the exact payload) match or satisfy the same expectations used elsewhere in the test so parity is preserved for the PRECONDITION_FAILED -> NOT_FOUND path; update the block guarded by original and new and HTTPStatus.PRECONDITION_FAILED/HTTPStatus.NOT_FOUND accordingly.
🧹 Nitpick comments (1)
tests/routers/openml/setups_test.py (1)
141-146: Strengthen success-path assertions to lock the response contract.

Lines 145-146 currently check only `setup_id` and key presence. This can miss regressions in field types/shape (for example, int-vs-string changes in nested items).

Proposed test hardening:

```diff
 async def test_get_setup_success(py_api: httpx.AsyncClient) -> None:
     response = await py_api.get("/setup/1")
     assert response.status_code == HTTPStatus.OK
     data = response.json()["setup_parameters"]
     assert data["setup_id"] == 1
-    assert "parameter" in data
+    assert isinstance(data["flow_id"], int)
+    assert "parameter" in data
+    assert isinstance(data["parameter"], list)
+    assert len(data["parameter"]) > 0
+    first_param = data["parameter"][0]
+    assert isinstance(first_param["id"], int)
+    assert isinstance(first_param["flow_id"], int)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/routers/openml/setups_test.py` around lines 141 - 146, The test_get_setup_success success-path assertions are too loose; after obtaining data = response.json()["setup_parameters"] update the test to lock the response contract by asserting exact schema and types: assert that data is a dict, assert isinstance(data["setup_id"], int) and equals 1, assert that "parameter" exists and has the expected type (e.g., list or dict) and structure (check expected keys/length and types of nested items), and assert any other required fields/values returned by the endpoint so the test fails on type/shape regressions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/routers/openml/migration/setups_migration_test.py`:
- Around line 317-327: The test should treat an empty "parameter" list the same
as a missing field before doing deep equality: in the block that inspects
setup_params (the "if 'parameter' in setup_params:" branch), if
setup_params["parameter"] is an empty list remove the "parameter" key (del
setup_params["parameter"]); otherwise continue converting p["id"]/p["flow_id"]
to int and deleting empty-list fields (p["data_type"], p["default_value"],
p["value"]) as currently done so comparisons won't fail when PHP returns [] but
Python omits the field.
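The normalization described in that comment can be sketched as a small helper (hypothetical name; the real test inlines this logic, and the exact field set is taken from the comment above):

```python
def normalize_setup_params(setup_params: dict) -> dict:
    """Make the legacy PHP payload comparable to the Python one."""
    # PHP encodes a missing `parameter` field as an empty array [].
    if setup_params.get("parameter") == []:
        del setup_params["parameter"]
    else:
        for p in setup_params.get("parameter", []):
            # Legacy responses serialize ids as strings.
            p["id"] = int(p["id"])
            p["flow_id"] = int(p["flow_id"])
            # Empty-list fields stand in for values Python omits entirely.
            for field in ("data_type", "default_value", "value"):
                if p.get(field) == []:
                    del p[field]
    return setup_params
```

Applying this to both responses before the deep-equality assertion keeps the parity check from failing on pure encoding differences.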
---
Duplicate comments:
In `@tests/routers/openml/migration/setups_migration_test.py`:
- Around line 303-305: The early-return branch that handles original.status_code
== HTTPStatus.PRECONDITION_FAILED only asserts status codes; add assertions to
validate the error payloads as well by parsing both responses (use
original.json() and new.json()) and asserting the relevant error fields (e.g.,
error code/message/details or the exact payload) match or satisfy the same
expectations used elsewhere in the test so parity is preserved for the
PRECONDITION_FAILED -> NOT_FOUND path; update the block guarded by original and
new and HTTPStatus.PRECONDITION_FAILED/HTTPStatus.NOT_FOUND accordingly.
---
Nitpick comments:
In `@tests/routers/openml/setups_test.py`:
- Around line 141-146: The test_get_setup_success success-path assertions are
too loose; after obtaining data = response.json()["setup_parameters"] update the
test to lock the response contract by asserting exact schema and types: assert
that data is a dict, assert isinstance(data["setup_id"], int) and equals 1,
assert that "parameter" exists and has the expected type (e.g., list or dict)
and structure (check expected keys/length and types of nested items), and assert
any other required fields/values returned by the endpoint so the test fails on
type/shape regressions.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: a263b39f-cad0-4830-be47-6c9058e8c0f3
📒 Files selected for processing (6)
- docs/migration.md
- src/database/setups.py
- src/routers/openml/setups.py
- src/schemas/setups.py
- tests/routers/openml/migration/setups_migration_test.py
- tests/routers/openml/setups_test.py
✅ Files skipped from review due to trivial changes (1)
- docs/migration.md
🚧 Files skipped from review as they are similar to previous changes (3)
- src/database/setups.py
- src/routers/openml/setups.py
- src/schemas/setups.py
@PGijsbers You were right! Expanding the test to range(1, 125) caught edge cases where legacy PHP encoded missing parameter fields as empty arrays `[]`.
Thanks! Minor remark to improve code reusability for (likely) common operations. Otherwise changes are good.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/core/conversions.py`:
- Around line 49-55: The dict comprehension in nested_remove_nones repeatedly
calls nested_remove_nones(val, ...) multiple times and thus reprocesses the same
value and misses cases where recursion turns a value into [] when
remove_empty_list=True; fix by normalizing each value exactly once: iterate over
obj.items() (or use a comprehension that assigns a temp like normalized =
nested_remove_nones(val,...)), use that normalized result for the None check,
empty-list check, and as the dict value, and ensure remove_empty_list behavior
checks the normalized value (not the original) before including the key; locate
and update the comprehension in nested_remove_nones to compute and reuse the
normalized result.
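A sketch of the normalize-once pattern this comment asks for (a hypothetical reconstruction; the actual function in `src/core/conversions.py` may differ in signature and in which container types it handles):

```python
def nested_remove_nones(obj, remove_empty_list=False):
    if isinstance(obj, dict):
        result = {}
        for key, val in obj.items():
            # Normalize each value exactly once, then reuse it for every check.
            normalized = nested_remove_nones(val, remove_empty_list)
            if normalized is None:
                continue
            # This also catches lists that only *became* empty through recursion.
            if remove_empty_list and normalized == []:
                continue
            result[key] = normalized
        return result
    if isinstance(obj, list):
        return [
            nested_remove_nones(item, remove_empty_list)
            for item in obj
            if item is not None
        ]
    return obj
```

Binding `normalized` once avoids the repeated recursive calls of the original comprehension, and the empty-list check now observes the post-recursion value rather than the original.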
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 9b91d048-1fc0-40f0-a2f7-91ea1433ec7a
📒 Files selected for processing (2)
- src/core/conversions.py
- tests/routers/openml/migration/setups_migration_test.py
@PGijsbers During testing, I found …
I made a small change to how the function was modified to accommodate the removal of empty lists, but we're good now. Thanks!
Fixes: #61
Description
(`SetupResponse` and `SetupParameter`) under `src/schemas/setups.py`, strictly representing the returned topology.

Checklist
Always:
Required for code changes:
If applicable:
(/docs)
Extra context:
Screenshot:
