
fixing docker build and Vertex AI tests #7833

Open
dmiltr3 wants to merge 8 commits into tensorflow:master from dmiltr3:dev_new_image

dmiltr3 commented Mar 30, 2026

  1. The Docker build now uses gcr.io/tfx-oss-public/tfx_base:py310-20260326 as its base image.
  2. Vulnerabilities are reduced from 180 to 68. The CVEs associated with pyarrow and Keras are not addressed: the pyarrow upgrade from 10.0.1 to 14.0.1 requires broader component support and is likely infeasible for now, and the Keras 2 to 3 upgrade is a significant undertaking that might be a breaking change for consumers.
  3. Unused Python and OS components, such as the Jupyter server and notebook, have been stripped from the image.
  4. The Dockerfile now uses a multi-stage build, with the wheel-building steps ordered to prevent unnecessary rebuilds.
  5. A lightweight Apache Beam version discovery mechanism removes the time previously spent building an extra container just to query the Beam version.
  6. The C++ wheel build now supports the --no-rebuild and --clean-cache options, available via build_docker_image.sh or the USE_CPP_WHEELS_FROM_TEMP=true Docker argument. Because the C++ build is resource-intensive, skipping redundant rebuilds while modifying sources makes Docker builds easier to debug, and image builds are significantly faster when troubleshooting Python dependencies.
  7. tfx.patch is no longer used to strip pinned versions from the TXT files (it remains in use for dependencies.py); that functionality has moved to build_docker_image.sh. The requirements.txt file changes a lot during Dockerfile tuning, so the patch kept becoming ineffective as its hunks fell out of context.
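
To illustrate item 7, here is a minimal sketch of what stripping pinned versions from a requirements-style TXT file can look like. It is not the logic in build_docker_image.sh; the function name `strip_pins` and the sample lines (other than the pyarrow 10.0.1 pin mentioned above) are hypothetical. Unlike a patch, a rule like this does not depend on surrounding line context, so it keeps working as the file changes.

```python
import re

# Hypothetical helper, not taken from this PR: matches a package name with
# optional "[extras]", followed by a version specifier list that runs up to
# an environment marker (";") or a trailing comment ("#").
_PINNED = re.compile(
    r"^(?P<name>[A-Za-z0-9][A-Za-z0-9._-]*(?:\[[^\]]*\])?)"
    r"\s*(?:===|==|~=|!=|<=|>=|<|>)[^;#]*"
    r"(?P<rest>.*)$"
)


def strip_pins(lines):
    """Return requirement lines with version specifiers removed."""
    result = []
    for line in lines:
        stripped = line.strip()
        # Leave blank lines, comments, pip options (-r, --index-url, ...),
        # and direct URL references untouched.
        if not stripped or stripped.startswith(("#", "-")) or "://" in stripped:
            result.append(stripped)
            continue
        match = _PINNED.match(stripped)
        if match is None:
            # Already unpinned, or a form this sketch does not recognize.
            result.append(stripped)
            continue
        unpinned = match.group("name")
        rest = match.group("rest").strip()
        if rest.startswith(";"):
            unpinned += rest          # keep the environment marker
        elif rest.startswith("#"):
            unpinned += "  " + rest   # keep the trailing comment
        result.append(unpinned)
    return result


if __name__ == "__main__":
    sample = [
        "# illustrative input only",
        "pyarrow==10.0.1",
        "apache-beam[gcp]>=2.50,<3",
        "numpy",
        "-r base.txt",
    ]
    print("\n".join(strip_pins(sample)))
```

The sketch deliberately passes through anything it does not recognize, so an unexpected line is left as written instead of being mangled.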

Testing - Vertex AI

python -m pytest -sv \
  tfx/orchestration/kubeflow/v2/e2e_tests/bigquery_integration_test.py \
  tfx/orchestration/kubeflow/v2/e2e_tests/csv_example_gen_integration_test.py \
  tfx/orchestration/kubeflow/v2/e2e_tests/exit_handler_e2e_test.py \
  tfx/orchestration/kubeflow/v2/e2e_tests/artifact_value_placeholder_integration_test.py

The following Docker build variations were tested:
- TFX_DEPENDENCY_SELECTOR=DEFAULT
- TFX_DEPENDENCY_SELECTOR=NIGHTLY
- TFX_DEPENDENCY_SELECTOR=UNCONSTRAINED
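
For anyone who wants to repeat the three builds, the sketch below generates one `docker build` command per selector. It assumes that TFX_DEPENDENCY_SELECTOR is passed as a Docker build argument and that the Dockerfile lives at tfx/tools/docker/Dockerfile; the image name `tfx-dev` and the helper `build_commands` are hypothetical. The script only prints the commands and does not run Docker.

```python
import shlex

# Selector values listed in this PR description.
SELECTORS = ["DEFAULT", "NIGHTLY", "UNCONSTRAINED"]


def build_commands(image="tfx-dev", dockerfile="tfx/tools/docker/Dockerfile"):
    """Build one `docker build` argument list per dependency selector.

    The image name and Dockerfile path are assumptions for illustration.
    """
    commands = []
    for selector in SELECTORS:
        commands.append([
            "docker", "build",
            "--build-arg", f"TFX_DEPENDENCY_SELECTOR={selector}",
            "-t", f"{image}:{selector.lower()}",
            "-f", dockerfile,
            ".",
        ])
    return commands


if __name__ == "__main__":
    # Print the commands so they can be reviewed before being run by hand.
    for command in build_commands():
        print(shlex.join(command))
```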

dmiltr3 marked this pull request as draft March 30, 2026 02:39
dmiltr3 marked this pull request as ready for review April 1, 2026 22:33
dmiltr3 marked this pull request as draft April 1, 2026 22:38
dmiltr3 marked this pull request as ready for review April 1, 2026 23:50