@@ -3,7 +3,7 @@ substitutions:
 options:
   logging: CLOUD_LOGGING_ONLY
 steps:
-  - name: us-central1-docker.pkg.dev/external-snap-ci-github-gigl/gigl-base-images/gigl-builder:7d3182eeb6446ce3e35910babba990c8e003879d.109.1
+  - name: us-central1-docker.pkg.dev/external-snap-ci-github-gigl/gigl-base-images/gigl-builder:6db83bdc98b2da65ac80243ce08cd8b37b3ee85f.110.1
     entrypoint: /bin/bash
     # Route sbt through Google's Maven Central mirror to avoid 429 rate limits from repo1.maven.org.
     # Intentionally set here (CI env) rather than in scala/.sbtopts or scala_spark35/.sbtopts to avoid
84 changes: 84 additions & 0 deletions CHANGELOG.md
@@ -6,6 +6,90 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).

## [Unreleased]

## [0.3.0] - Jun 1, 2026

### Deprecations

- Deprecate `ShardStrategy` and default distributed sharding to `CONTIGUOUS` by @kmontemayor2-sc in
https://github.com/Snapchat/GiGL/pull/582 and https://github.com/Snapchat/GiGL/pull/545
- Migrate `RESOURCE_CONFIG_PATH` to `GIGL_RESOURCE_CONFIG_URI` by @kmontemayor2-sc in
https://github.com/Snapchat/GiGL/pull/648

### Changed

- Replace mypy with [ty](https://github.com/astral-sh/ty) for static type checking and migrate formatting to Ruff by
@svij-sc in https://github.com/Snapchat/GiGL/pull/585 and https://github.com/Snapchat/GiGL/pull/583
- Consolidate distributed loader and sampler abstractions, including `BaseDistLoader`, `BaseDistNeighborSampler`,
sampler factory helpers, two-phase loader initialization, and shared sampling options by @mkolodner-sc and
@kmontemayor2-sc in https://github.com/Snapchat/GiGL/pull/495, https://github.com/Snapchat/GiGL/pull/532,
https://github.com/Snapchat/GiGL/pull/536, https://github.com/Snapchat/GiGL/pull/570,
https://github.com/Snapchat/GiGL/pull/576, https://github.com/Snapchat/GiGL/pull/579, and
https://github.com/Snapchat/GiGL/pull/561
- Merge `testing/` into `tests/`, migrate tests to GiGL test case utilities, and improve graph store integration test
coverage by @kmontemayor2-sc and @svij-sc in https://github.com/Snapchat/GiGL/pull/494,
https://github.com/Snapchat/GiGL/pull/479, https://github.com/Snapchat/GiGL/pull/480,
https://github.com/Snapchat/GiGL/pull/515, and https://github.com/Snapchat/GiGL/pull/547

### Added

- Enable GraphStore mode across storage and compute, including GiGL-owned `DistServer`, `DistABLPLoader` GraphStore
mode, multiple GraphStore loaders, and homogeneous and heterogeneous examples by @kmontemayor2-sc in
https://github.com/Snapchat/GiGL/pull/476, https://github.com/Snapchat/GiGL/pull/485,
https://github.com/Snapchat/GiGL/pull/493, https://github.com/Snapchat/GiGL/pull/514, and
https://github.com/Snapchat/GiGL/pull/526
- Add distributed and C++-based PPR sampling, including PPR sequence generation and new GiGL wheel builds with the
`gigl-core` C++/CUDA extension package, by @mkolodner-sc and @yliu2-sc in https://github.com/Snapchat/GiGL/pull/538,
https://github.com/Snapchat/GiGL/pull/560, https://github.com/Snapchat/GiGL/pull/558, and
https://github.com/Snapchat/GiGL/pull/556
- Add shared multi-channel graph store sampling backend, remote channels with pinned-memory bulk transfer, and two-phase
sampling APIs by @kmontemayor2-sc in https://github.com/Snapchat/GiGL/pull/577,
https://github.com/Snapchat/GiGL/pull/565, and https://github.com/Snapchat/GiGL/pull/578
- Add weighted sampling, positional encoding transforms, Graph Transformer encoder, degree tensor computation for
`DistDataset`, and max-label-per-anchor support in the data splitter by @mkolodner-sc and @yliu2-sc in
https://github.com/Snapchat/GiGL/pull/635, https://github.com/Snapchat/GiGL/pull/509,
https://github.com/Snapchat/GiGL/pull/537, https://github.com/Snapchat/GiGL/pull/517, and
https://github.com/Snapchat/GiGL/pull/589
- Add `CustomResourceConfig` shell-command launchers, custom launcher subprocess dispatch, and GiGL env var propagation
for custom and Vertex AI launchers by @kmontemayor2-sc in https://github.com/Snapchat/GiGL/pull/625,
https://github.com/Snapchat/GiGL/pull/626, https://github.com/Snapchat/GiGL/pull/642, and
https://github.com/Snapchat/GiGL/pull/653
- Add Vertex AI boot disk, reservation, and Data Preprocessor timeout controls by @zfan3-sc, @kmontemayor2-sc, and
@mkolodner-sc in https://github.com/Snapchat/GiGL/pull/521, https://github.com/Snapchat/GiGL/pull/590, and
https://github.com/Snapchat/GiGL/pull/524
- Add GBML config wrapper maps for node and edge type metadata, unified metadata extraction, BigQuery latest-table
utility, and SNC example code by @svij-sc, @mkolodner-sc, and @kmontemayor2-sc in
https://github.com/Snapchat/GiGL/pull/643, https://github.com/Snapchat/GiGL/pull/544,
https://github.com/Snapchat/GiGL/pull/516, and https://github.com/Snapchat/GiGL/pull/641

### Fixed

- Fix PPR sampler output edges and memory behavior by @mkolodner-sc in https://github.com/Snapchat/GiGL/pull/562,
https://github.com/Snapchat/GiGL/pull/566, and https://github.com/Snapchat/GiGL/pull/645
- Fix dataloading of multiple labels, missing anchor-node labels, dataset factory parallel tensor loading, test dataset
edge direction, and `PreprocessedMetadataPbWrapper` `LocalUri` kwargs by @mkolodner-sc, @kmontemayor2-sc, and @svij-sc
in https://github.com/Snapchat/GiGL/pull/612, https://github.com/Snapchat/GiGL/pull/571,
https://github.com/Snapchat/GiGL/pull/606, https://github.com/Snapchat/GiGL/pull/552, and
https://github.com/Snapchat/GiGL/pull/639
- Fix launcher image selection for graph store storage pools, default `should_use_glt_backend`, C++ installation, SBT
dependency resolution, and types-protobuf v7 `ParseDict` compatibility by @kmontemayor2-sc, @mkolodner-sc, and
@svij-sc in https://github.com/Snapchat/GiGL/pull/615, https://github.com/Snapchat/GiGL/pull/609,
https://github.com/Snapchat/GiGL/pull/619, https://github.com/Snapchat/GiGL/pull/631, and
https://github.com/Snapchat/GiGL/pull/638
- Make metric exporters fall back to NoOp when initialization fails and fix seeding utility behavior by @mkolodner-sc
and @svij-sc in https://github.com/Snapchat/GiGL/pull/550 and https://github.com/Snapchat/GiGL/pull/644

### Misc

- Add and refine agent/development guidance, including `CLAUDE.md`, `AGENTS.md`, `/all_test`, `/codex-review`, and
`/watch-action` workflow support by @kmontemayor2-sc in https://github.com/Snapchat/GiGL/pull/498,
https://github.com/Snapchat/GiGL/pull/504, https://github.com/Snapchat/GiGL/pull/543,
https://github.com/Snapchat/GiGL/pull/508, https://github.com/Snapchat/GiGL/pull/528, and
https://github.com/Snapchat/GiGL/pull/563
- Update graph store and in-memory SGS documentation, supervision-edge direction docs, Vertex AI Agent Platform links,
and presubmit whitespace checks by @mkolodner-sc and @kmontemayor2-sc in https://github.com/Snapchat/GiGL/pull/553,
https://github.com/Snapchat/GiGL/pull/593, https://github.com/Snapchat/GiGL/pull/595,
https://github.com/Snapchat/GiGL/pull/608, and https://github.com/Snapchat/GiGL/pull/607

## [0.2.0] - Jan 30, 2025

### Added
2 changes: 1 addition & 1 deletion gigl-core/pyproject.toml
@@ -2,7 +2,7 @@
 name = "gigl-core"
 description = "GiGL C++/CUDA kernels (pybind11 extensions)"
 readme = "README.md"
-version = "0.2.0"
+version = "0.3.0"
 requires-python = "==3.11.*"
 # Torch is resolved from the ambient environment. gigl-core wheels are ABI-bound
 # to the torch variant they were built against (cpu or cu128). The parent `gigl`
2 changes: 1 addition & 1 deletion gigl/__init__.py
@@ -1 +1 @@
-__version__ = "0.2.0"
+__version__ = "0.3.0"
16 changes: 8 additions & 8 deletions gigl/dep_vars.env
@@ -1,13 +1,13 @@
 # Note this file only supports static key value pairs so it can be loaded by make, bash, python, and sbt without any additional parsing.
-DOCKER_LATEST_BASE_CUDA_IMAGE_NAME_WITH_TAG=us-central1-docker.pkg.dev/external-snap-ci-github-gigl/public-gigl/gigl-cuda-base:7d3182eeb6446ce3e35910babba990c8e003879d.109.1
-DOCKER_LATEST_BASE_CPU_IMAGE_NAME_WITH_TAG=us-central1-docker.pkg.dev/external-snap-ci-github-gigl/public-gigl/gigl-cpu-base:7d3182eeb6446ce3e35910babba990c8e003879d.109.1
-DOCKER_LATEST_BASE_DATAFLOW_IMAGE_NAME_WITH_TAG=us-central1-docker.pkg.dev/external-snap-ci-github-gigl/public-gigl/gigl-dataflow-base:7d3182eeb6446ce3e35910babba990c8e003879d.109.1
+DOCKER_LATEST_BASE_CUDA_IMAGE_NAME_WITH_TAG=us-central1-docker.pkg.dev/external-snap-ci-github-gigl/public-gigl/gigl-cuda-base:6db83bdc98b2da65ac80243ce08cd8b37b3ee85f.110.1
+DOCKER_LATEST_BASE_CPU_IMAGE_NAME_WITH_TAG=us-central1-docker.pkg.dev/external-snap-ci-github-gigl/public-gigl/gigl-cpu-base:6db83bdc98b2da65ac80243ce08cd8b37b3ee85f.110.1
+DOCKER_LATEST_BASE_DATAFLOW_IMAGE_NAME_WITH_TAG=us-central1-docker.pkg.dev/external-snap-ci-github-gigl/public-gigl/gigl-dataflow-base:6db83bdc98b2da65ac80243ce08cd8b37b3ee85f.110.1

-DEFAULT_GIGL_RELEASE_SRC_IMAGE_CUDA=us-central1-docker.pkg.dev/external-snap-ci-github-gigl/public-gigl/src-cuda:0.2.0
-DEFAULT_GIGL_RELEASE_SRC_IMAGE_CPU=us-central1-docker.pkg.dev/external-snap-ci-github-gigl/public-gigl/src-cpu:0.2.0
-DEFAULT_GIGL_RELEASE_SRC_IMAGE_DATAFLOW_CPU=us-central1-docker.pkg.dev/external-snap-ci-github-gigl/public-gigl/src-cpu-dataflow:0.2.0
-DEFAULT_GIGL_RELEASE_DEV_WORKBENCH_IMAGE=us-central1-docker.pkg.dev/external-snap-ci-github-gigl/public-gigl/gigl-dev-workbench:0.2.0
-DEFAULT_GIGL_RELEASE_KFP_PIPELINE_PATH=gs://public-gigl/releases/pipelines/gigl-pipeline-0.2.0.yaml
+DEFAULT_GIGL_RELEASE_SRC_IMAGE_CUDA=us-central1-docker.pkg.dev/external-snap-ci-github-gigl/public-gigl/src-cuda:0.3.0
+DEFAULT_GIGL_RELEASE_SRC_IMAGE_CPU=us-central1-docker.pkg.dev/external-snap-ci-github-gigl/public-gigl/src-cpu:0.3.0
+DEFAULT_GIGL_RELEASE_SRC_IMAGE_DATAFLOW_CPU=us-central1-docker.pkg.dev/external-snap-ci-github-gigl/public-gigl/src-cpu-dataflow:0.3.0
+DEFAULT_GIGL_RELEASE_DEV_WORKBENCH_IMAGE=us-central1-docker.pkg.dev/external-snap-ci-github-gigl/public-gigl/gigl-dev-workbench:0.3.0
+DEFAULT_GIGL_RELEASE_KFP_PIPELINE_PATH=gs://public-gigl/releases/pipelines/gigl-pipeline-0.3.0.yaml

 SPARK_31_TFRECORD_JAR_GCS_PATH=gs://public-gigl/tools/scala/spark_packages/spark-custom-tfrecord_2.12-0.5.0.jar
 SPARK_35_TFRECORD_JAR_GCS_PATH=gs://public-gigl/tools/scala/spark_packages/spark_3.5.0-custom-tfrecord_2.12-0.6.1.jar
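
The note at the top of `gigl/dep_vars.env` says the file is limited to static key/value pairs so that make, bash, python, and sbt can all load it without extra parsing. A minimal sketch of what that buys on the Python side follows. `load_static_env` is a hypothetical helper written for this illustration, not GiGL's actual loader, and the sample text reuses two of the updated lines from this diff.

```python
def load_static_env(text: str) -> dict[str, str]:
    """Hypothetical helper: parse static KEY=VALUE lines.

    Blank lines and lines starting with '#' are skipped. No quoting,
    interpolation, or `export` handling is attempted, which is only safe
    because the file is restricted to static pairs.
    """
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first '=' only, so values such as URIs stay intact.
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"Not a KEY=VALUE line: {raw_line!r}")
        values[key.strip()] = value.strip()
    return values


SAMPLE = """\
# Static pairs only.
DEFAULT_GIGL_RELEASE_SRC_IMAGE_CPU=us-central1-docker.pkg.dev/external-snap-ci-github-gigl/public-gigl/src-cpu:0.3.0

DEFAULT_GIGL_RELEASE_KFP_PIPELINE_PATH=gs://public-gigl/releases/pipelines/gigl-pipeline-0.3.0.yaml
"""

env = load_static_env(SAMPLE)
print(env["DEFAULT_GIGL_RELEASE_KFP_PIPELINE_PATH"])
```

Anything beyond plain pairs (shell expansion, quoting, multi-line values) would need tool-specific handling, which is presumably why the file forbids it.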
4 changes: 2 additions & 2 deletions pyproject.toml
@@ -2,7 +2,7 @@
 name = "gigl"
 description = "GIgantic Graph Learning Library"
 readme = "README.md"
-version = "0.2.0"
+version = "0.3.0"
 classifiers = [
     "Programming Language :: Python",
     "Programming Language :: Python :: 3",
@@ -15,7 +15,7 @@ dependencies = [
     "chardet",
     # gigl-core hosts all C++ / CUDA / pybind11 extensions. Separate wheel per torch
     # variant (cpu/cu128). Version must match gigl exactly.
-    "gigl-core==0.2.0",
+    "gigl-core==0.3.0",
     "google-cloud-aiplatform",
     "google-cloud-dataproc",
     "google-cloud-logging",
8 changes: 6 additions & 2 deletions scripts/bump_version.py
@@ -99,7 +99,9 @@ def update_pyproject(version: str) -> None:
     path = f"{GIGL_ROOT_DIR}/pyproject.toml"
     with open(path, "r") as f:
         content = f.read()
-    content = re.sub(r'(version\s*)=\s*"[\d\.]+"', f'\\1= "{version}"', content)
+    content = re.sub(
+        r'(?m)^(version[ \t]*)=[ \t]*"[^"]+"', f'\\1= "{version}"', content
+    )
     # Keep the gigl-core pin in sync with the new version.
     content = re.sub(r'"gigl-core==[\d\.a-zA-Z]+"', f'"gigl-core=={version}"', content)
     with open(path, "w") as f:
@@ -110,7 +112,9 @@ def update_gigl_core_pyproject(version: str) -> None:
     path = f"{GIGL_ROOT_DIR}/gigl-core/pyproject.toml"
     with open(path, "r") as f:
         content = f.read()
-    content = re.sub(r'(version\s*)=\s*"[\d\.]+"', f'\\1= "{version}"', content)
+    content = re.sub(
+        r'(?m)^(version[ \t]*)=[ \t]*"[^"]+"', f'\\1= "{version}"', content
+    )
     with open(path, "w") as f:
         f.write(content)
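
A minimal sketch of how the old and new `version` patterns in `scripts/bump_version.py` behave differently. The two regexes are copied from the diff; the sample TOML text, the `min_version` key, the `0.3.0rc1` value, and the `bump` helper are assumptions made up for this illustration and do not come from the repository.

```python
import re

# Patterns copied from the diff in scripts/bump_version.py.
OLD_PATTERN = r'(version\s*)=\s*"[\d\.]+"'
NEW_PATTERN = r'(?m)^(version[ \t]*)=[ \t]*"[^"]+"'

# Hypothetical pyproject-like text, for illustration only.
SAMPLE = '''[project]
name = "gigl"
version = "0.3.0rc1"

[tool.example]
min_version = "1.2"
'''


def bump(pattern: str, text: str, version: str) -> str:
    """Apply the same substitution the script uses, with a given pattern."""
    return re.sub(pattern, f'\\1= "{version}"', text)


old_result = bump(OLD_PATTERN, SAMPLE, "0.4.0")
new_result = bump(NEW_PATTERN, SAMPLE, "0.4.0")

print(old_result)
print(new_result)
```

On this sample, the old pattern skips the pre-release string (its character class allows only digits and dots) and instead rewrites the unanchored `min_version` key. The new pattern is anchored to the start of a line with `(?m)^`, restricts surrounding whitespace to spaces and tabs so it cannot span lines, and accepts any quoted value, so only the top-level `version` line changes.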

37 changes: 37 additions & 0 deletions tests/integration/distributed/utils/networking_test.py
@@ -1,11 +1,15 @@
import uuid
from textwrap import dedent

from google.cloud.aiplatform_v1.types import env_var
from parameterized import param, parameterized

from gigl.common.constants import DEFAULT_GIGL_RELEASE_SRC_IMAGE_CPU
from gigl.common.services.vertex_ai import VertexAiJobConfig, VertexAIService
from gigl.common.utils.proto_utils import ProtoUtils
from gigl.env.constants import GIGL_RESOURCE_CONFIG_URI_ENV_KEY
from gigl.env.pipelines_config import get_resource_config
from gigl.src.common.utils.file_loader import FileLoader
from tests.test_assets.test_case import TestCase


@@ -24,8 +28,31 @@ def setUp(self):
            service_account=self._service_account,
            staging_bucket=self._staging_bucket,
        )

        # get_graph_store_info() (run on the launched workers) calls
        # get_resource_config() to build the readiness URI, so the workers need a
        # resource config they can read. The test runner's resource config URI may
        # be a local path that does not exist on the worker image, so we upload the
        # in-memory resource config to the regional bucket (which the workers can
        # read from GCS) and pass that URI via GIGL_RESOURCE_CONFIG_URI.
        self._file_loader = FileLoader()
        self._remote_resource_config_uri = (
            self._resource_config.temp_assets_regional_bucket_path
            / "gigl"
            / "integration_tests"
            / "networking"
            / f"resource_config_{uuid.uuid4()}.yaml"
        )
        ProtoUtils().write_proto_to_yaml(
            proto=self._resource_config.resource_config,
            uri=self._remote_resource_config_uri,
        )
        super().setUp()

    def tearDown(self):
        self._file_loader.delete_files([self._remote_resource_config_uri])
        super().tearDown()

    @parameterized.expand(
        [
            param(
@@ -63,12 +90,22 @@ def test_get_graph_store_info(self, _, storage_nodes, compute_nodes):
                """
            ),
        ]
        # launch_graph_store_job propagates the compute pool's environment_variables
        # to both the compute and storage container specs, so the uploaded resource
        # config URI is visible to every worker.
        resource_config_env_vars = [
            env_var.EnvVar(
                name=GIGL_RESOURCE_CONFIG_URI_ENV_KEY,
                value=self._remote_resource_config_uri.uri,
            )
        ]
        compute_cluster_config = VertexAiJobConfig(
            job_name=job_name,
            container_uri=DEFAULT_GIGL_RELEASE_SRC_IMAGE_CPU,
            replica_count=compute_nodes,
            command=command,
            machine_type="n2-standard-8",
            environment_variables=resource_config_env_vars,
        )
        storage_cluster_config = VertexAiJobConfig(
            job_name=job_name,
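
The test changes above follow a common pattern: write a uniquely named config to a location the workers can read, hand its URI to them through an environment variable, and delete it in `tearDown`. The sketch below reproduces that pattern with the standard library only. It is a local stand-in under stated assumptions: a temp directory plays the role of the regional GCS bucket, and `EXAMPLE_RESOURCE_CONFIG_URI` and `worker_read_config` are hypothetical names, not GiGL APIs.

```python
import os
import tempfile
import unittest
import uuid
from pathlib import Path
from unittest import mock

# Hypothetical env var name for this sketch; the PR itself uses
# GIGL_RESOURCE_CONFIG_URI_ENV_KEY from gigl.env.constants.
CONFIG_URI_ENV_KEY = "EXAMPLE_RESOURCE_CONFIG_URI"


def worker_read_config() -> str:
    """Stand-in for a worker that finds its config through the environment."""
    return Path(os.environ[CONFIG_URI_ENV_KEY]).read_text()


class SharedConfigTest(unittest.TestCase):
    def setUp(self) -> None:
        # A uuid in the file name keeps concurrent test runs from colliding.
        shared_dir = Path(tempfile.gettempdir())
        self._config_path = shared_dir / f"resource_config_{uuid.uuid4()}.yaml"
        self._config_path.write_text("project: example\n")
        super().setUp()

    def tearDown(self) -> None:
        # Remove the artifact even if the test body failed.
        self._config_path.unlink(missing_ok=True)
        super().tearDown()

    def test_worker_sees_config(self) -> None:
        env = {CONFIG_URI_ENV_KEY: str(self._config_path)}
        # patch.dict restores os.environ when the block exits.
        with mock.patch.dict(os.environ, env):
            self.assertEqual(worker_read_config(), "project: example\n")
```

In the real test the environment variable is attached to the Vertex AI job config rather than patched into the local process, so the sketch only illustrates the lifecycle (create, expose, clean up), not the job launch itself.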
4 changes: 2 additions & 2 deletions uv.lock

Some generated files are not rendered by default.