Production-readiness refactor: drop MinIO, make Crate IDs more robust, add async validation and offline cache #186
base: develop
29beec3
a3aa2e7
837c74d
70ed8e7
ef93bdd
db4482c
fa1ef14
56913c1
ee62364
f900ae2
721dcb8
882fe57
4526f56
3289a9a
0b340ce
7a0d28a
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,29 @@ | ||
| name: Lint | ||
|
|
||
| on: | ||
| pull_request: | ||
| branches: [ develop ] | ||
|
|
||
| jobs: | ||
| ruff: | ||
| runs-on: ubuntu-latest | ||
|
|
||
| steps: | ||
| - name: Checkout code | ||
| uses: actions/checkout@v4 | ||
|
|
||
| - name: Set up Python | ||
| uses: actions/setup-python@v5 | ||
| with: | ||
| python-version: '3.11' | ||
|
|
||
| - name: Install ruff | ||
| run: | | ||
| python -m pip install --upgrade pip | ||
| pip install ruff | ||
| - name: Lint | ||
| run: ruff check . | ||
|
|
||
| - name: Format check | ||
| run: ruff format --check . | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -20,16 +20,15 @@ jobs: | |
| - name: Install dependencies | ||
| run: | | ||
| python -m pip install --upgrade pip | ||
| pip install pytest requests minio docker | ||
| pip install pytest requests boto3 | ||
|
Collaborator
Should we pin our pytest, requests, and boto3 versions here? The most recent tests for this PR ran with pytest-9.1.0, requests-2.33.1, and boto3-1.43.29.
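A minimal sketch of what pinning could look like, assuming the versions quoted in this comment are the ones to freeze. The file name `ci-requirements.txt` is hypothetical and not part of this PR; the install step would point pip at that file instead of listing bare package names.

```shell
# Hypothetical pin file (name is an assumption); the versions are the ones
# reported for the most recent test run of this PR.
cat > ci-requirements.txt <<'EOF'
pytest==9.1.0
requests==2.33.1
boto3==1.43.29
EOF

# The workflow's install step would then run the following (left commented
# out so the sketch does not need network access):
#   python -m pip install -r ci-requirements.txt
```

Keeping the pins in a file rather than inline in the workflow would let a dependency-update tool or a reviewer change them in one place.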
||
|
|
||
| - name: Build Docker Compose Containers | ||
| - name: Run integration tests (brings up the compose stack) | ||
| run: | | ||
| cp example.env .env | ||
| docker compose -f docker-compose-develop.yml build | ||
| pytest -s -v tests/test_integration.py | ||
|
|
||
| - name: Spin Up Docker Compose and Run Tests | ||
| run: pytest -s -v tests/test_integration.py | ||
|
|
||
| - name: Ensure that Docker Compose is Shutdown | ||
| - name: Ensure Docker Compose is shut down | ||
| if: always() | ||
| run: docker compose down | ||
| run: > | ||
| docker compose -f docker-compose-develop.yml -p cratey_integration | ||
|
Collaborator
Do we need the project name string here? (
||
| --profile objectstore down -v || true | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,17 +1,46 @@ | ||
| FROM python:3.11-slim | ||
|
|
||
| # Install required system packages, including git | ||
| RUN apt-get update && apt-get install -y git && rm -rf /var/lib/apt/lists/* | ||
| # git is needed by some dependencies; wget is only used when baking a profile. | ||
| RUN apt-get update && apt-get install -y git wget && rm -rf /var/lib/apt/lists/* | ||
|
|
||
| WORKDIR /app | ||
|
|
||
| COPY requirements.txt . | ||
| RUN pip install --upgrade pip | ||
| RUN pip install --no-cache-dir -r requirements.txt | ||
|
|
||
| COPY cratey.py LICENSE /app/ | ||
| COPY wsgi.py LICENSE /app/ | ||
| COPY app /app/app | ||
|
|
||
| # Optionally fetch an extra RO-Crate profile into a normal directory. It is | ||
| # *added* to the bundled profiles at runtime via EXTRA_PROFILES_PATH. | ||
| # A plain build leaves PROFILES_ARCHIVE_URL empty | ||
| # and skips this; the "with profiles" image build passes it as --build-arg. | ||
| ARG PROFILES_ARCHIVE_URL="" | ||
| ARG FIVE_SAFES_PROFILE_VERSION="" | ||
| # Set EXTRA_PROFILES_PATH only for the profiles build (passed as a build arg). | ||
| ARG EXTRA_PROFILES_PATH="" | ||
| ENV EXTRA_PROFILES_PATH=${EXTRA_PROFILES_PATH} | ||
| ENV CACHE_PATH=/app/.rocrate-cache | ||
| RUN if [ -n "$PROFILES_ARCHIVE_URL" ]; then \ | ||
| mkdir -p /app/extra-profiles && \ | ||
| wget -O /tmp/profiles.tar.gz "$PROFILES_ARCHIVE_URL" && \ | ||
| tar -xzf /tmp/profiles.tar.gz \ | ||
| -C /app/extra-profiles \ | ||
| --strip-components=3 \ | ||
| "rocrate-validator-${FIVE_SAFES_PROFILE_VERSION}/rocrate_validator/profiles/five-safes-crate" && \ | ||
| rm /tmp/profiles.tar.gz ; \ | ||
| fi | ||
|
Collaborator
We can, and should, organise this RUN command better:
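One possible restructuring, offered only as a sketch and not necessarily what the reviewer had in mind: move the conditional download into a small script that the Dockerfile copies in and runs. The function name `fetch_extra_profile` and the idea of a separate script are assumptions; the commands, paths, and build-arg names are taken from the RUN step in this diff.

```shell
#!/bin/sh
# Hypothetical helper script; not part of this PR.
set -eu

fetch_extra_profile() {
    url="${PROFILES_ARCHIVE_URL:-}"
    version="${FIVE_SAFES_PROFILE_VERSION:-}"

    # A plain build passes no URL, so there is nothing to fetch.
    if [ -z "$url" ]; then
        echo "PROFILES_ARCHIVE_URL is empty; skipping extra profile download"
        return 0
    fi

    # Download the archive and extract only the five-safes-crate profile.
    mkdir -p /app/extra-profiles
    wget -O /tmp/profiles.tar.gz "$url"
    tar -xzf /tmp/profiles.tar.gz \
        -C /app/extra-profiles \
        --strip-components=3 \
        "rocrate-validator-${version}/rocrate_validator/profiles/five-safes-crate"
    rm /tmp/profiles.tar.gz
}

fetch_extra_profile
```

With `set -eu`, a failed `wget` or `tar` still fails the build, as the `&&` chain in the original RUN step does, while the early return keeps the skip case easy to read.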
||
|
|
||
| # Pre-populate the HTTP cache so opt-in offline validation | ||
| # (VALIDATION_OFFLINE=true) works without network at runtime. | ||
| RUN if [ -n "$EXTRA_PROFILES_PATH" ]; then \ | ||
| rocrate-validator cache warm --all-profiles \ | ||
| --extra-profiles-path "$EXTRA_PROFILES_PATH" --cache-path "$CACHE_PATH" ; \ | ||
| else \ | ||
| rocrate-validator cache warm --all-profiles --cache-path "$CACHE_PATH" ; \ | ||
| fi | ||
|
Collaborator
Same as above, let's do this better:
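As with the previous comment, the following is only a sketch of one way to tidy the step, not the reviewer's own suggestion. It builds the optional `--extra-profiles-path` flag once so the `rocrate-validator cache warm` command is written a single time. The function name `warm_cache` is an assumption; the flags are the ones already used in this diff.

```shell
# Hypothetical helper; the RUN step would define it and then call `warm_cache`.
warm_cache() {
    # Reuse the function's positional parameters as an optional-argument list.
    if [ -n "${EXTRA_PROFILES_PATH:-}" ]; then
        set -- --extra-profiles-path "$EXTRA_PROFILES_PATH"
    else
        set --
    fi

    # "$@" expands to nothing when no extra profiles path is set.
    rocrate-validator cache warm --all-profiles "$@" --cache-path "$CACHE_PATH"
}
```

The argument order matches the original two branches, so the command the validator receives should be unchanged in both cases.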
||
|
|
||
| RUN useradd -ms /bin/bash flaskuser | ||
| RUN chown -R flaskuser:flaskuser /app | ||
|
|
||
|
|
@@ -21,4 +50,5 @@ EXPOSE 5000 | |
|
|
||
| CMD ["flask", "run", "--host=0.0.0.0"] | ||
|
|
||
| LABEL org.opencontainers.image.source="https://github.com/eScienceLab/Cratey-Validator" | ||
| LABEL org.opencontainers.image.source="https://github.com/eScienceLab/RO-Crate-Validation-Service" | ||
| LABEL org.ro-crate-validation-service.five-safes-profile-version="${FIVE_SAFES_PROFILE_VERSION}" | ||
This file was deleted.
Within the RSE team we're moving to specifying the exact commit hash of the action we want to use, rather than only the version number (cf. https://github.com/UoMResearchIT/Actions/blob/20365fa75ab643043ec188ef0b67fd5e537d326c/reuse/action.yml#L67). Should we do the same here?