Test type constraints separately by PGijsbers · Pull Request #327 · openml/server-api

PGijsbers · 2026-05-08T08:05:39Z

Instead of testing type constraints through API calls, we directly invoke the pydantic validation to test them.
This much faster and provides a clearer signal on failure.

coderabbitai · 2026-05-08T08:05:54Z

Warning

Rate limit exceeded

@PGijsbers has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 39 minutes and 33 seconds before requesting another review.

You’ve run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: c1096104-2afa-443c-9b42-0530dee36e3f

📥 Commits

Reviewing files that changed from the base of the PR and between 53029d4 and 6cd637d.

📒 Files selected for processing (1)

tests/types_test.py

Walkthrough

This PR renames the tag type from SystemString64 to TagString and refactors CasualString/CasualString128, updates all tagging-related router parameters and TagInfo responses to use TagString, replaces many plain int/string IDs with Identifier in database helper signatures, adds TypeAdapter tests for Identifier/TagString/CasualString, and removes one endpoint-level invalid-tag API test.

Possibly related PRs

openml/server-api#246: Related changes to setup tagging endpoints that originally used SystemString64.
openml/server-api#323: Modifies dataset tagging helpers and untag endpoints that reference the same tag types.
openml/server-api#280: Changes to setups DB helpers related to get_parameters which were also updated here.

Suggested labels

Datasets

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'Test type constraints separately' clearly and directly matches the main objective of the PR: refactoring tests to validate type constraints through pydantic validation rather than API calls.
Description check	✅ Passed	The description explains the key change: testing type constraints via direct pydantic validation instead of API calls, highlighting the benefits of being faster and providing clearer failure signals.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch tests-types

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

sourcery-ai

Hey - I've found 2 issues, and left some high level feedback:

In tests/types_test.py, test_identifier_rejects_negative currently validates the string "0" (same as test_identifier_rejects_zero); change this to a true negative value (e.g. "-1") or consolidate the zero/negative cases to avoid duplicated/incorrect coverage.
The parametrization of _invalid_punctuation_tag uses a set, which leads to non-deterministic test order and IDs; consider converting it to a sorted list before passing it to pytest.mark.parametrize for stable, reproducible parametrization.

Prompt for AI Agents

Please address the comments from this code review:

## Overall Comments
- In `tests/types_test.py`, `test_identifier_rejects_negative` currently validates the string "0" (same as `test_identifier_rejects_zero`); change this to a true negative value (e.g. "-1") or consolidate the zero/negative cases to avoid duplicated/incorrect coverage.
- The parametrization of `_invalid_punctuation_tag` uses a `set`, which leads to non-deterministic test order and IDs; consider converting it to a sorted list before passing it to `pytest.mark.parametrize` for stable, reproducible parametrization.

## Individual Comments

### Comment 1
<location path="tests/types_test.py" line_range="23-25" />
<code_context>
+        _identifier.validate_strings("1.2")
+
+
+def test_identifier_rejects_negative() -> None:
+    with pytest.raises(ValidationError):
+        _identifier.validate_strings("0")
+
+
</code_context>
<issue_to_address>
**issue (testing):** The `test_identifier_rejects_negative` test currently checks `"0"`, making it duplicate `test_identifier_rejects_zero` and not actually testing negatives.

To match the test’s intent and avoid duplicating `test_identifier_rejects_zero`, update the value to an actual negative (e.g. `"-1"`) while keeping zero covered by the existing dedicated test.
</issue_to_address>

### Comment 2
<location path="tests/types_test.py" line_range="47-48" />
<code_context>
+    assert _tag_string.validate_strings(tag) == tag
+
+
+@pytest.mark.parametrize("tag", ["", "c" * 65, *_invalid_punctuation_tag])
+def test_tag_string_rejects_invalid(tag: str) -> None:
+    with pytest.raises(ValidationError):
+        _tag_string.validate_strings(tag)
</code_context>
<issue_to_address>
**suggestion (testing):** Add explicit invalid tag cases for whitespace (e.g. leading/trailing spaces) to mirror the previously tested edge cases.

The old API tests covered tags with whitespace (e.g. " a", "a b", "a\t"), but the new parametrization doesn’t. Since `string.punctuation` excludes whitespace, please add a few explicit whitespace-containing tags to maintain the previous validation coverage.

```suggestion
@pytest.mark.parametrize(
    "tag",
    [
        "",
        "c" * 65,
        *_invalid_punctuation_tag,
        " a",
        "a ",
        "a b",
        "a\t",
    ],
)
def test_tag_string_rejects_invalid(tag: str) -> None:
```
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨

_{Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.}

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/types_test.py`:
- Around line 23-31: The test test_identifier_rejects_negative is incorrectly
reusing the zero input "0" instead of a negative value; update the assertion in
test_identifier_rejects_negative to call _identifier.validate_strings with a
negative string (e.g., "-1") so it raises ValidationError and provides
negative-boundary coverage for _identifier.validate_strings.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: dbca1d9d-ab09-48a8-9cdc-f8bc5eb577cf

📥 Commits

Reviewing files that changed from the base of the PR and between 92c6f3b and 6bbf1f2.

📒 Files selected for processing (6)

src/routers/openml/datasets.py
src/routers/openml/setups.py
src/routers/openml/tasks.py
src/routers/types.py
tests/routers/openml/dataset_tag_test.py
tests/types_test.py

💤 Files with no reviewable changes (1)

tests/routers/openml/dataset_tag_test.py

codecov · 2026-05-08T08:14:55Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 94.48%. Comparing base (92c6f3b) to head (6cd637d).

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #327      +/-   ##
==========================================
+ Coverage   94.40%   94.48%   +0.07%     
==========================================
  Files          67       68       +1     
  Lines        3112     3157      +45     
  Branches      229      229              
==========================================
+ Hits         2938     2983      +45     
  Misses        111      111              
  Partials       63       63

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

coderabbitai

♻️ Duplicate comments (1)

tests/types_test.py (1)
23-30: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

test_identifier_rejects_negative never tests a negative value and duplicates test_identifier_rejects_zero.

Both tests call _identifier.validate_strings("0"). test_identifier_rejects_negative should use a negative input (e.g. "-1") to provide the coverage its name promises.
🐛 Proposed fix
 def test_identifier_rejects_negative() -> None:
     with pytest.raises(ValidationError):
-        _identifier.validate_strings("0")
+        _identifier.validate_strings("-1")
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/types_test.py` around lines 23 - 30, The test named
test_identifier_rejects_negative is incorrectly passing the string "0" and
duplicating test_identifier_rejects_zero; update
test_identifier_rejects_negative to call _identifier.validate_strings with a
negative value such as "-1" so it actually asserts ValidationError for negative
inputs and provides distinct coverage from test_identifier_rejects_zero.

🧹 Nitpick comments (4)

src/database/studies.py (1)

153-159: 💤 Low value

attach_tasks still uses int for study_id/task_ids while attach_task (line 118) was updated to Identifier.

✨ Proposed fix

 async def attach_tasks(
     *,
-    study_id: int,
-    task_ids: list[int],
+    study_id: Identifier,
+    task_ids: list[Identifier],
     user: User,
     connection: AsyncConnection,
 ) -> None:

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/database/studies.py` around lines 153 - 159, The attach_tasks function
signature and internal types should use the same Identifier type as attach_task:
change study_id: int to study_id: Identifier and task_ids: list[int] to
task_ids: list[Identifier], update any internal variables/annotations that
assume ints accordingly, and ensure Identifier is imported/available in this
module (mirror how attach_task declares and uses Identifier).

tests/types_test.py (1)

62-64: 💤 Low value

Minor: parametrize parameter "string" shadows the string stdlib import; assertion style is inconsistent with the TagString tests.

Two small nits:

Using "string" as the parametrize label (and function parameter name) shadows the import string at line 1 — use "value" or "s" to avoid confusion.
assert _casual_string.validate_strings(string) checks truthiness, while test_tag_string_accepts_valid (line 44) checks equality with == tag. Prefer == string for symmetry.

✨ Proposed fix

-@pytest.mark.parametrize("string", ["a", "a" * 1000, "_-.(),"])
-def test_casual_string_accepts_valid(string: str) -> None:
-    assert _casual_string.validate_strings(string)
+@pytest.mark.parametrize("value", ["a", "a" * 1000, "_-.(),"])
+def test_casual_string_accepts_valid(value: str) -> None:
+    assert _casual_string.validate_strings(value) == value


-@pytest.mark.parametrize("string", ["", *_invalid_punctuation_casual_string])
-def test_casual_string_rejects_invalid(string: str) -> None:
+@pytest.mark.parametrize("value", ["", *_invalid_punctuation_casual_string])
+def test_casual_string_rejects_invalid(value: str) -> None:
     with pytest.raises(ValidationError):
-        _casual_string.validate_strings(string)
+        _casual_string.validate_strings(value)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/types_test.py` around lines 62 - 64, The test
test_casual_string_accepts_valid uses the param name "string" which shadows the
stdlib import and asserts truthiness; rename the parametrize label and function
parameter to something like "value" (or "s") and update the assertion to compare
equality (assert _casual_string.validate_strings(value) == value) to match the
style of test_tag_string_accepts_valid and avoid the shadowing; locate uses of
_casual_string.validate_strings and the test function name to apply the change.

src/database/datasets.py (1)

98-119: ⚡ Quick win

tag is not updated while the symmetric delete_tag (line 71) was — parameters use int/str instead of Identifier/TagString.

✨ Proposed fix

-async def tag(id_: int, tag_: str, *, user_id: int, connection: AsyncConnection) -> None:
+async def tag(id_: Identifier, tag_: TagString, *, user_id: Identifier, connection: AsyncConnection) -> None:

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/database/datasets.py` around lines 98 - 119, The tag function is passing
raw id_/tag_ types (int/str) but should accept/unwrap the domain types
Identifier and TagString like delete_tag does; update the tag(...)
implementation to accept Identifier and TagString (or convert them) and pass
their primitive values into the SQL parameters (dataset_id, tag, user_id) so the
INSERT uses the same unwrapped values/keys as delete_tag; locate the tag
function and ensure you convert Identifier -> id value and TagString -> string
before building the parameters and executing the query, preserving the same
IntegrityError handling.

src/database/tasks.py (1)

55-69: ⚡ Quick win

get_input_for_task_type still uses int while the sibling get_task_type was updated to Identifier.

Both functions accept a task-type ID (ttid), so the annotation should be consistent.

✨ Proposed fix

-async def get_input_for_task_type(task_type_id: int, expdb: AsyncConnection) -> Sequence[Row]:
+async def get_input_for_task_type(task_type_id: Identifier, expdb: AsyncConnection) -> Sequence[Row]:

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/database/tasks.py` around lines 55 - 69, The get_input_for_task_type
function currently types the task_type_id parameter as int while the related
get_task_type uses Identifier; update get_input_for_task_type to accept
Identifier instead of int, adjust any type annotations/imports (e.g., ensure
Identifier is imported where get_input_for_task_type is defined) and keep the
SQL parameter key ("ttid") and parameters={"ttid": task_type_id} unchanged so
runtime behavior is identical; ensure the return typing (Sequence[Row]) remains
the same.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In `@tests/types_test.py`:
- Around line 23-30: The test named test_identifier_rejects_negative is
incorrectly passing the string "0" and duplicating test_identifier_rejects_zero;
update test_identifier_rejects_negative to call _identifier.validate_strings
with a negative value such as "-1" so it actually asserts ValidationError for
negative inputs and provides distinct coverage from
test_identifier_rejects_zero.

---

Nitpick comments:
In `@src/database/datasets.py`:
- Around line 98-119: The tag function is passing raw id_/tag_ types (int/str)
but should accept/unwrap the domain types Identifier and TagString like
delete_tag does; update the tag(...) implementation to accept Identifier and
TagString (or convert them) and pass their primitive values into the SQL
parameters (dataset_id, tag, user_id) so the INSERT uses the same unwrapped
values/keys as delete_tag; locate the tag function and ensure you convert
Identifier -> id value and TagString -> string before building the parameters
and executing the query, preserving the same IntegrityError handling.

In `@src/database/studies.py`:
- Around line 153-159: The attach_tasks function signature and internal types
should use the same Identifier type as attach_task: change study_id: int to
study_id: Identifier and task_ids: list[int] to task_ids: list[Identifier],
update any internal variables/annotations that assume ints accordingly, and
ensure Identifier is imported/available in this module (mirror how attach_task
declares and uses Identifier).

In `@src/database/tasks.py`:
- Around line 55-69: The get_input_for_task_type function currently types the
task_type_id parameter as int while the related get_task_type uses Identifier;
update get_input_for_task_type to accept Identifier instead of int, adjust any
type annotations/imports (e.g., ensure Identifier is imported where
get_input_for_task_type is defined) and keep the SQL parameter key ("ttid") and
parameters={"ttid": task_type_id} unchanged so runtime behavior is identical;
ensure the return typing (Sequence[Row]) remains the same.

In `@tests/types_test.py`:
- Around line 62-64: The test test_casual_string_accepts_valid uses the param
name "string" which shadows the stdlib import and asserts truthiness; rename the
parametrize label and function parameter to something like "value" (or "s") and
update the assertion to compare equality (assert
_casual_string.validate_strings(value) == value) to match the style of
test_tag_string_accepts_valid and avoid the shadowing; locate uses of
_casual_string.validate_strings and the test function name to apply the change.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 1eecdf68-3c97-4ef9-b03b-4bfcfad36c0a

📥 Commits

Reviewing files that changed from the base of the PR and between 6bbf1f2 and 53029d4.

📒 Files selected for processing (10)

src/database/datasets.py
src/database/flows.py
src/database/qualities.py
src/database/runs.py
src/database/setups.py
src/database/studies.py
src/database/tasks.py
src/database/users.py
src/routers/types.py
tests/types_test.py

✅ Files skipped from review due to trivial changes (2)

src/database/runs.py
src/database/qualities.py

🚧 Files skipped from review as they are similar to previous changes (1)

src/routers/types.py

PGijsbers added 2 commits May 8, 2026 10:01

Rename SystemString64 to TagString and test it separately

e0f3576

Test is now covered by dedicated test for TagString

6bbf1f2

PGijsbers added the maintenance improvements or changes to existing systems label May 8, 2026

sourcery-ai Bot reviewed May 8, 2026

View reviewed changes

Comment thread tests/types_test.py Outdated

Comment thread tests/types_test.py Outdated

coderabbitai Bot reviewed May 8, 2026

View reviewed changes

Comment thread tests/types_test.py

Constrain types

688d8e2

PGijsbers added 2 commits May 8, 2026 10:19

Narrow types

a7eb87b

write tests for CasualString

53029d4

coderabbitai Bot reviewed May 8, 2026

View reviewed changes

PGijsbers added 2 commits May 8, 2026 11:43

Fixed ordering of parameters for pytest-xdist compatibility

7bf8fdc

fix typo and expand test coverage for whitespace in input

6cd637d

PGijsbers merged commit 1d44c05 into main May 8, 2026
8 checks passed

PGijsbers deleted the tests-types branch May 8, 2026 09:55

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Test type constraints separately#327

Test type constraints separately#327
PGijsbers merged 7 commits into
mainfrom
tests-types

PGijsbers commented May 8, 2026

Uh oh!

coderabbitai Bot commented May 8, 2026 •

edited

Loading

Rate limit exceeded

❌ Failed checks (1 warning)

Uh oh!

sourcery-ai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

codecov Bot commented May 8, 2026 •

edited

Loading

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

PGijsbers commented May 8, 2026

Uh oh!

coderabbitai Bot commented May 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Rate limit exceeded

Walkthrough

Possibly related PRs

Suggested labels

❌ Failed checks (1 warning)

Uh oh!

sourcery-ai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

codecov Bot commented May 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

coderabbitai Bot commented May 8, 2026 •

edited

Loading

codecov Bot commented May 8, 2026 •

edited

Loading