fix: remove predictable GCS bucket names to prevent bucket squatting #6399
Open
KevinZhao wants to merge 2 commits into googleapis:main from
Conversation
The fix for CVE-2026-2473 in v1.133.0 patched `metadata/_models.py` but missed two other locations in `utils/gcs_utils.py` that construct GCS bucket names from predictable inputs (project ID + region):

- `stage_local_data_in_gcs()`: `"{project}-vertex-staging-{location}"`
- `generate_gcs_directory_for_pipeline_artifacts()`: `"{project}-vertex-pipelines-{location}"`

An attacker who knows a victim's project ID and region can pre-register these bucket names, causing the SDK to silently upload model artifacts, training data, and pipeline outputs to attacker-controlled storage. Apply the same fix pattern: require explicit bucket configuration via `aiplatform.init(staging_bucket=...)` instead of auto-generating predictable names.
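The predictable construction and the fix pattern described above can be sketched as follows. These are simplified stand-ins for illustration, not the actual code in `utils/gcs_utils.py`; function names and the error message are hypothetical:

```python
from typing import Optional

# Simplified stand-ins for the two patterns described above -- not the
# actual SDK source in utils/gcs_utils.py.

def vulnerable_bucket_name(project: str, location: str) -> str:
    # Predictable: anyone who knows the project ID and region can
    # pre-register this globally-unique bucket name first.
    return f"{project}-vertex-staging-{location}"

def fixed_staging_dir(staging_gcs_dir: Optional[str]) -> str:
    # Fix pattern: never derive a bucket name from predictable inputs;
    # require explicit configuration and fail fast otherwise.
    if not staging_gcs_dir:
        raise RuntimeError(
            "No staging bucket set. Pass staging_gcs_dir or call "
            "aiplatform.init(staging_bucket='gs://my-bucket')."
        )
    return staging_gcs_dir
```

Failing fast with `RuntimeError` surfaces the missing configuration to the caller instead of silently writing to a bucket name an attacker may already control.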
- Update `test_generate_gcs_directory_for_pipeline_artifacts` to test both success (with staging_bucket set) and failure (RuntimeError)
- Update `test_create_gcs_bucket_for_pipeline_artifacts` to pass explicit `output_artifacts_gcs_dir` instead of relying on auto-generation
- Add `validate_gcs_path()` call in `generate_gcs_directory_for_pipeline_artifacts`
- Add Raises section to docstrings for new RuntimeError conditions
- Mark deprecated parameters in `generate_gcs_directory_for_pipeline_artifacts`
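The `validate_gcs_path()` call mentioned above is not shown in this excerpt; a minimal sketch of such a validator, assuming it only enforces the `gs://` prefix (the real helper may check more), could look like:

```python
def validate_gcs_path(path: str) -> None:
    """Raise ValueError unless `path` is a gs:// URI.

    Hypothetical sketch -- the actual helper in utils/gcs_utils.py may
    perform additional checks (bucket name syntax, empty components).
    """
    if not path.startswith("gs://"):
        raise ValueError(
            f"Invalid GCS path {path!r}: must start with 'gs://'."
        )
```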
Summary
The fix for CVE-2026-2473 (GHSA-wh2j-26j7-9728) in v1.133.0 patched `metadata/_models.py` but missed two other locations in `utils/gcs_utils.py` that construct GCS bucket names from predictable inputs (project ID + region):

- `stage_local_data_in_gcs()` line 206: `"{project}-vertex-staging-{location}"`
- `generate_gcs_directory_for_pipeline_artifacts()` line 254: `"{project}-vertex-pipelines-{location}"`

An attacker who knows a victim's project ID and region can pre-register these globally-unique bucket names in their own GCP project and configure public write access. When the victim's SDK auto-generates the same predictable name, `Bucket.exists()` returns True for the attacker's bucket, and the SDK silently uploads model artifacts, training data, and pipeline outputs to attacker-controlled storage.

Changes
Apply the same fix pattern as CVE-2026-2473:
- `stage_local_data_in_gcs()`: Require explicit `staging_gcs_dir` or `aiplatform.init(staging_bucket=...)`. Raise `RuntimeError` if neither is provided, instead of auto-generating a predictable bucket name.
- `generate_gcs_directory_for_pipeline_artifacts()`: Use `staging_bucket` from global config. Raise `RuntimeError` if not set. Add `validate_gcs_path()` to ensure proper `gs://` prefix.
- Docstrings: Add `Raises` sections and deprecated parameter notes.

Affected entry points
- `Model.upload()` when `staging_bucket` is not provided (calls `stage_local_data_in_gcs`)
- `PipelineJob()` when `pipeline_root` is not provided (calls `generate_gcs_directory_for_pipeline_artifacts`)

Test plan
- `test_generate_gcs_directory_for_pipeline_artifacts` — tests both success path (with staging_bucket) and RuntimeError path
- `test_create_gcs_bucket_for_pipeline_artifacts_if_it_does_not_exist` — passes explicit `output_artifacts_gcs_dir`
- `vertexai/` callers (distillation, language models)
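The two paths in the first test above could be exercised along these lines. `generate_gcs_directory_for_pipeline_artifacts` below is a local stand-in reflecting the patched behavior (the returned path suffix is assumed), not an import from the SDK:

```python
def generate_gcs_directory_for_pipeline_artifacts(staging_bucket=None):
    # Stand-in for the patched helper: fail fast instead of deriving a
    # predictable "{project}-vertex-pipelines-{location}" bucket name.
    # The "/pipeline_root" suffix is an assumption for this sketch.
    if not staging_bucket:
        raise RuntimeError(
            "staging_bucket not set; call aiplatform.init(staging_bucket=...)"
        )
    return f"{staging_bucket}/pipeline_root"

def test_success_path():
    # With staging_bucket set, the explicit bucket is used verbatim.
    assert (
        generate_gcs_directory_for_pipeline_artifacts("gs://my-bucket")
        == "gs://my-bucket/pipeline_root"
    )

def test_runtime_error_path():
    # Without a staging bucket, auto-generation must now fail.
    try:
        generate_gcs_directory_for_pipeline_artifacts()
    except RuntimeError:
        return
    raise AssertionError("expected RuntimeError")
```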