feat(cloud-storage): add pluggable cloud storage support for distributed HugeGraph persistence #3061

Draft
vaijosh wants to merge 4 commits into apache:master from vaijosh:RockDBCloud

@vaijosh vaijosh commented Jun 19, 2026

Contributor

Why

This PR introduces a unified cloud storage capability for HugeGraph’s distributed stack, so store data can be synchronized to cloud object storage for stronger durability and recovery options.

The focus is not a provider-specific implementation detail, but a cloud-capable architecture with a default S3-compatible provider and extension points for additional providers.

What this PR delivers

1) Cloud storage support in distributed runtime

  • Enables cloud sync for store-side RocksDB in distributed deployments (backend=hstore flow).
  • Supports both:
    • cloud-first mode: commit waits for cloud sync (stronger durability)
    • async mode: local/Raft commit first, periodic cloud reconciliation
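The difference between the two modes can be sketched as follows. This is only an illustration of the semantics described above, not the code in this PR: `SyncMode`, `CloudUploader`, `commit`, and `reconcile` are hypothetical names, the in-memory list stands in for the local/Raft commit, and a real deployment would run reconciliation on a timer rather than on demand.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Illustrative sketch only; every name here is hypothetical. */
class CloudSyncSketch {

    /** The PR describes a cloud-first mode and an async mode. */
    enum SyncMode { CLOUD_FIRST, ASYNC }

    /** Stand-in for a cloud object storage client. */
    interface CloudUploader {
        void upload(String batch);
    }

    private final SyncMode mode;
    private final CloudUploader uploader;
    private final List<String> localLog = new ArrayList<>();
    private final ConcurrentLinkedQueue<String> pending = new ConcurrentLinkedQueue<>();

    CloudSyncSketch(SyncMode mode, CloudUploader uploader) {
        this.mode = mode;
        this.uploader = uploader;
    }

    /** Returns once the durability step required by the configured mode has run. */
    synchronized void commit(String batch) {
        localLog.add(batch);            // stands in for the local/Raft commit
        if (mode == SyncMode.CLOUD_FIRST) {
            uploader.upload(batch);     // the caller waits for the cloud write
        } else {
            pending.add(batch);         // deferred to the next reconciliation pass
        }
    }

    /** One reconciliation pass; returns the number of batches uploaded. */
    int reconcile() {
        int uploaded = 0;
        String batch;
        while ((batch = pending.poll()) != null) {
            uploader.upload(batch);
            uploaded++;
        }
        return uploaded;
    }

    int pendingCount() {
        return pending.size();
    }
}
```

In this sketch, a cloud-first commit pays the upload latency on every call, while an async commit returns immediately and leaves a window in which batches exist only locally until `reconcile()` runs.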

2) Pluggable provider architecture

  • Introduces provider/client abstractions for cloud backends.
  • Uses ServiceLoader discovery so new cloud providers can be added via plugin JARs.
  • Keeps built-in s3 provider as default (S3-compatible API model), while allowing future providers without core rewrites.
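As a rough illustration of the ServiceLoader pattern mentioned above, the sketch below looks up a provider by name and falls back to a built-in default. The interface and class names (`CloudStorageProvider`, `S3Provider`, `ProviderRegistry`) are assumptions made for this example; the PR's actual SPI types may differ.

```java
import java.util.ServiceLoader;

/** Hypothetical SPI; the real interface in this PR may have a different shape. */
interface CloudStorageProvider {
    /** Short identifier used in configuration, e.g. "s3". */
    String name();
}

/** Stand-in for the built-in default provider. */
final class S3Provider implements CloudStorageProvider {
    @Override
    public String name() {
        return "s3";
    }
}

final class ProviderRegistry {

    private ProviderRegistry() {
    }

    /**
     * Finds a provider by name. Plugin JARs would register implementations in
     * META-INF/services/<fully qualified interface name>, which is the standard
     * java.util.ServiceLoader mechanism.
     */
    static CloudStorageProvider find(String wanted) {
        for (CloudStorageProvider p : ServiceLoader.load(CloudStorageProvider.class)) {
            if (p.name().equalsIgnoreCase(wanted)) {
                return p;
            }
        }
        // No plugin matched: fall back to the built-in provider if it fits.
        CloudStorageProvider builtin = new S3Provider();
        if (builtin.name().equalsIgnoreCase(wanted)) {
            return builtin;
        }
        throw new IllegalArgumentException("No cloud storage provider named " + wanted);
    }
}
```

With this arrangement, adding a provider means shipping a JAR with an implementation class and a service registration file; the lookup code does not change.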

3) Cloud-neutral configuration model

  • Standardizes configuration naming around cloud-neutral keys (cloud_*).
  • Aligns server-side propagation and store-side consumption of cloud settings.
  • Cleans up older S3-specific naming to keep the config surface consistent and provider-agnostic.
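One way to picture the server-to-store propagation is a filter that selects every key with the cloud-neutral prefix and forwards only those. The sketch below assumes the keys literally start with `cloud_`; the concrete key names used in the usage note (`cloud_provider`, `cloud_bucket`) are hypothetical and not taken from this PR.

```java
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

/** Illustrative sketch; the class name and prefix handling are assumptions. */
final class CloudConfigSketch {

    static final String PREFIX = "cloud_";

    private CloudConfigSketch() {
    }

    /** Returns only the cloud-neutral settings, sorted by key. */
    static Map<String, String> cloudSettings(Properties all) {
        Map<String, String> out = new TreeMap<>();
        for (String key : all.stringPropertyNames()) {
            if (key.startsWith(PREFIX)) {
                out.put(key, all.getProperty(key));
            }
        }
        return out;
    }
}
```

For example, given properties containing `cloud_provider=s3`, `cloud_bucket=graphs`, and `backend=hstore`, `cloudSettings` would return only the first two entries, so unrelated server settings are never forwarded to the store.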

4) Operational docs and examples

  • Updates architecture and usage docs to describe cloud storage behavior consistently.
  • Adds/updates sample plugin guidance and SPI wiring for provider developers.
  • Aligns docker/dev scripts and templates with cloud storage configuration.

User Impact

  • Resilience against ephemeral infrastructure: Treating cloud object storage as the decoupled, durable source of truth lets HugeGraph run in cloud-native environments (e.g., Kubernetes) where local state is disposable. If an instance or pod is unexpectedly terminated or rescheduled, or loses its local disk/EBS volume, data already synced to the cloud is not lost, and a new instance can rehydrate its state from the latest cloud checkpoint.
  • Flexible Durability SLAs: Operators running distributed deployments (backend=hstore) can choose their optimal trade-off between throughput and durability:
    • Cloud-First Mode (sync): A commit returns success to the caller only after both the local Raft commit and the cloud storage flush have completed, so acknowledged writes survive the loss of local storage.
    • Asynchronous Mode (async): Minimizes latency by performing background reconciliation to the cloud storage bucket within a bounded time horizon.

Validation

  • Maven compile/build verification on affected modules.
  • Manual validation and smoke tests; see RocksDB-cloud.md for details.

Reviewer focus areas

  • Cloud config key consistency across server/store/docs
  • Provider abstraction and ServiceLoader integration
  • Cloud-first vs async durability semantics
  • Recovery/rehydration behavior and operational clarity
