Skip to content
Open
Show file tree
Hide file tree
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 4 additions & 0 deletions pyproject.toml
Original file line number Diff line number Diff line change
Expand Up @@ -32,6 +32,10 @@ dev = [
"hypothesis",
"deepdiff",
"pytest-xdist",
"testcontainers",
"pymysql",
"cryptography",

]
docs = [
"mkdocs-material",
Expand Down
63 changes: 56 additions & 7 deletions tests/conftest.py
Original file line number Diff line number Diff line change
@@ -1,24 +1,79 @@
import contextlib
import json
import os
import re
from collections.abc import Iterator
from pathlib import Path
from typing import Any, NamedTuple

import _pytest.mark
import httpx
import pytest
import sqlalchemy
from _pytest.config import Config
from _pytest.nodes import Item
from dotenv import load_dotenv
from fastapi.testclient import TestClient
from sqlalchemy import Connection, Engine, text
from testcontainers.mysql import LogMessageWaitStrategy, MySqlContainer

from database.setup import expdb_database, user_database
from database.setup import user_database
from main import create_api
from routers.dependencies import expdb_connection, userdb_connection

load_dotenv()

PHP_API_URL = "http://openml-php-rest-api:80/api/v1/json"


@pytest.fixture(scope="session", autouse=True)
def override_testcontainers_connect() -> None:
"""
Override MySqlContainer._connect once per test session.
Applied automatically everywhere.
"""

def _connect(self: MySqlContainer) -> None:
wait_strategy = LogMessageWaitStrategy(
re.compile(
r".*: ready for connections",
flags=re.DOTALL | re.MULTILINE,
)
)
wait_strategy.wait_until_ready(self)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major

🧩 Analysis chain

🌐 Web query:

testcontainers-python MySqlContainer wait strategy customize public API

💡 Result:

In testcontainers-python, MySqlContainer doesn’t expose a Java-style “wait strategy” hook on the class itself; the public API for customizing readiness is via the waiting utilities (log-based waiting) and/or by subclassing and overriding the connection check.

1) Log-based waiting (public API)

Use testcontainers.core.waiting_utils.wait_for_logs(container, predicate, timeout=None, interval=1) after the container is started (or inside a custom container’s startup flow). The predicate can be a regex string or a callable. The function supports timeout and interval. [1]

from testcontainers.mysql import MySqlContainer
from testcontainers.core.waiting_utils import wait_for_logs

with MySqlContainer("mysql:8.0") as mysql:
    wait_for_logs(mysql, r"ready for connections", timeout=60, interval=1)

MySqlContainer itself is the standard MySQL wrapper you’re already using. [2]

2) Connection-based waiting via @wait_container_is_ready (public API)

If you want “wait until I can connect” semantics (common for DBs), subclass MySqlContainer and decorate your _connect() with wait_container_is_ready(...) so failures are retried until success/timeout. [3]

import mysql.connector
from testcontainers.mysql import MySqlContainer
from testcontainers.core.waiting_utils import wait_container_is_ready

class MyReadyMySql(MySqlContainer):
    `@wait_container_is_ready`(mysql.connector.Error)
    def _connect(self):
        conn = mysql.connector.connect(
            host=self.get_container_host_ip(),
            port=int(self.get_exposed_port(3306)),
            user=self.username,
            password=self.password,
            database=self.dbname,
        )
        conn.close()

with MyReadyMySql("mysql:8.0") as mysql:
    ...

3) Tuning retry/timeout behavior

Retry limits/sleep/timeout for readiness waiting are controlled via testcontainers_config (global configuration). [3]

from testcontainers.core.config import testcontainers_config
testcontainers_config.timeout = 180
testcontainers_config.max_tries = 180
testcontainers_config.sleep_time = 1

Sources: [1] [2] [3]


Use wait_for_logs() or subclass with @wait_container_is_ready() instead of monkey-patching _connect.

Line 36 overrides a private method, which breaks on testcontainers-python upgrades if internals change. testcontainers-python provides public APIs:

  • wait_for_logs(): Call after container startup with a regex predicate for log-based readiness
  • Subclass + @wait_container_is_ready() decorator: Override _connect in a subclass with the decorator to handle retry logic safely

Example using wait_for_logs:

from testcontainers.mysql import MySqlContainer
from testcontainers.core.waiting_utils import wait_for_logs

with MySqlContainer("mysql:8.0") as mysql:
    wait_for_logs(mysql, r"ready for connections", timeout=60, interval=1)
🤖 Prompt for AI Agents
In `@tests/conftest.py` around lines 29 - 44, The fixture
override_testcontainers_connect is monkey-patching the private
MySqlContainer._connect; remove that override and instead use the public waiting
utilities—either call wait_for_logs(container, r"ready for connections",
timeout=60, interval=1) after container startup (use the
testcontainers.core.waiting_utils.wait_for_logs function) or create a subclass
(e.g., MySqlContainerWithWait) and implement _connect decorated with
`@wait_container_is_ready` to handle retry logic; in short: stop overriding
_connect directly in override_testcontainers_connect and switch to wait_for_logs
or a decorated subclass of MySqlContainer for log-based readiness.

MySqlContainer._connect = _connect # noqa: SLF001
Comment thread
coderabbitai[bot] marked this conversation as resolved.
Outdated


@pytest.fixture(scope="session")
def mysql_container() -> MySqlContainer:
container = MySqlContainer(
os.environ.get(
"OPENML_DATABASES_OPENML_URL",
"openml/test-database:20240105",
),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

rg -n "OPENML_DATABASES_OPENML_URL" --type py -C 3

Repository: openml/server-api

Length of output: 476


🏁 Script executed:

# Search for MySqlContainer definition and imports
rg -n "MySqlContainer" --type py -C 5

Repository: openml/server-api

Length of output: 3255


🏁 Script executed:

# Check the testcontainers library documentation or type hints
# Also search for any .md files that might document environment variables
find . -type f -name "*.md" | head -20

Repository: openml/server-api

Length of output: 429


🏁 Script executed:

# Check documentation files for env var documentation
cat docs/contributing/tests.md

Repository: openml/server-api

Length of output: 4491


🏁 Script executed:

# Check docker and installation documentation
cat docker/mysql/README.md

Repository: openml/server-api

Length of output: 2404


🏁 Script executed:

# Check the complete context of the mysql_container fixture
cat -n tests/conftest.py | sed -n '45,65p'

Repository: openml/server-api

Length of output: 732


🏁 Script executed:

# Search for any documentation or comments about OPENML_DATABASES env vars
rg -n "OPENML_DATABASES" --type py --type md -C 2

Repository: openml/server-api

Length of output: 1828


Rename OPENML_DATABASES_OPENML_URL to clarify it's a Docker image name, not a database connection URL.

The env var name OPENML_DATABASES_OPENML_URL misleadingly suggests a database connection string (e.g., mysql://host:3306/db), but it's actually passed as the Docker image name to MySqlContainer. This could cause confusion if someone sets it to an actual database URL, which would fail silently or with a confusing error when the container tries to start. The naming pattern is also inconsistent—adjacent env vars are clearly named USERNAME and PASSWORD. Consider renaming to OPENML_TEST_DATABASE_IMAGE to match its actual purpose.

🧰 Tools
🪛 GitHub Actions: Tests

[error] 48-48: FileNotFoundError during attempting to connect to Docker daemon. Docker API not reachable.

🤖 Prompt for AI Agents
In `@tests/conftest.py` around lines 48 - 52, Rename the misleading environment
variable OPENML_DATABASES_OPENML_URL to a name that reflects it holds a Docker
image (suggested OPENML_TEST_DATABASE_IMAGE) wherever it's used in
tests/conftest.py and related test configuration; update the call that passes
its value into MySqlContainer to read
os.environ.get("OPENML_TEST_DATABASE_IMAGE", "openml/test-database:20240105")
instead of the old name, and search/replace other references (tests, CI configs,
and docs) to the old variable so callers are updated and the naming is
consistent with OPENML_DATABASES_OPENML_USERNAME /
OPENML_DATABASES_OPENML_PASSWORD.

username=os.environ.get("OPENML_DATABASES_OPENML_USERNAME", ""),
password=os.environ.get("OPENML_DATABASES_OPENML_PASSWORD", ""),
dbname="openml_expdb",
)

container.start()
try:
yield container
finally:
container.stop()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

testcontainers-python MySqlContainer.start() Docker unavailable exception type

💡 Result:

In testcontainers-python, MySqlContainer.start() ultimately initializes the Docker SDK client (docker / “docker-py”). If Docker isn’t reachable (daemon not running, wrong DOCKER_HOST, etc.), the failure you’ll see is typically:

  • docker.errors.DockerException (often with a message like “Error while fetching server API version …”) [1][2]

So, the exception type to catch for “Docker unavailable” is generally docker.errors.DockerException. [1][2]


🏁 Script executed:

# Check if there are existing patterns for Docker error handling in the codebase
rg -i "docker|container" tests/conftest.py -A 3 -B 3

Repository: openml/server-api

Length of output: 1397


🏁 Script executed:

# Search for any existing pytest.skip usage patterns related to Docker or external services
rg "pytest.skip" --type py -B 2 -A 2

Repository: openml/server-api

Length of output: 2038


🏁 Script executed:

# Check imports in conftest.py to understand what's currently imported
head -30 tests/conftest.py

Repository: openml/server-api

Length of output: 840


🏁 Script executed:

# Check if there are any other exception handling patterns for testcontainers in the codebase
rg "testcontainers|MySqlContainer" --type py

Repository: openml/server-api

Length of output: 622


🏁 Script executed:

# Verify the fixture is used as a dependency in tests to understand impact scope
rg "mysql_container" tests/ --type py

Repository: openml/server-api

Length of output: 265


Skip tests gracefully when Docker isn't available (CI currently fails).

The pipeline reports a Docker connection error; container.start() will hard-fail the suite when Docker isn't reachable. Wrap the start call in a try-except to catch docker.errors.DockerException and skip tests in non-Docker environments.

🛠️ Proposed fix
+from docker.errors import DockerException
+
 `@pytest.fixture`(scope="session")
 def mysql_container() -> MySqlContainer:
     container = MySqlContainer(
         os.environ.get(
             "OPENML_DATABASES_OPENML_URL",
             "openml/test-database:20240105",
         ),
         username=os.environ.get("OPENML_DATABASES_OPENML_USERNAME", ""),
         password=os.environ.get("OPENML_DATABASES_OPENML_PASSWORD", ""),
         dbname="openml_expdb",
     )
 
-    container.start()
+    started = False
+    try:
+        container.start()
+        started = True
+    except DockerException as exc:
+        pytest.skip(f"Docker unavailable: {exc}")
     try:
         yield container
     finally:
-        container.stop()
+        if started:
+            container.stop()
🧰 Tools
🪛 GitHub Actions: Tests

[error] 50-50: Failed to initialize MySqlContainer due to docker connection error

🤖 Prompt for AI Agents
In `@tests/conftest.py` around lines 48 - 64, The mysql_container fixture
currently calls container.start() unguarded and will hard-fail when Docker isn't
reachable; wrap the container.start() call in a try/except that catches
docker.errors.DockerException and calls pytest.skip(...) to skip the session
tests when Docker is unavailable (ensure you import pytest and docker.errors if
not present), then proceed to yield container and keep the existing finally
container.stop() behavior so cleanup still runs if start succeeded.



@pytest.fixture
def expdb_test(mysql_container: MySqlContainer) -> Connection:
url = mysql_container.get_connection_url()
url = url.replace("mysql://", "mysql+pymysql://")

engine = sqlalchemy.create_engine(url)
with engine.begin() as connection:
yield connection
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (testing): The new expdb_test fixture no longer wraps connections in automatic_rollback, so test data will persist across tests and may break isolation.

With engine.begin() the transaction is committed on context exit, so writes from one test will persist into the next and can cause order-dependent, flaky tests when sharing the same database. To preserve isolation, either wrap the dynamic engine with automatic_rollback, explicitly manage a transaction and roll it back per test, or add a fixture that truncates/cleans the DB between tests while still using Testcontainers.


Comment thread
coderabbitai[bot] marked this conversation as resolved.
Outdated

@contextlib.contextmanager
def automatic_rollback(engine: Engine) -> Iterator[Connection]:
with engine.connect() as connection:
Expand All @@ -28,12 +83,6 @@ def automatic_rollback(engine: Engine) -> Iterator[Connection]:
transaction.rollback()


@pytest.fixture
def expdb_test() -> Connection:
with automatic_rollback(expdb_database()) as connection:
yield connection


@pytest.fixture
def user_test() -> Connection:
with automatic_rollback(user_database()) as connection:
Expand Down
Loading
Loading