Skip to content

Feature/hotspot re allocation 2#317

Open
bootjp wants to merge 4 commits intomainfrom
feature/hotspot-re-allocation-2
Open

Feature/hotspot re allocation 2#317
bootjp wants to merge 4 commits intomainfrom
feature/hotspot-re-allocation-2

Conversation

@bootjp
Copy link
Owner

@bootjp bootjp commented Feb 17, 2026

No description provided.

- introduce RouteDescriptor, CatalogSnapshot, and RouteState
- add reserved catalog key helpers and route/version codecs
- implement CatalogStore Snapshot/Save with optimistic version checks
- add catalog tests for round-trip, validation, mismatch, and overflow cases
- extend Engine with catalogVersion and Version() accessor
- add ApplySnapshot(CatalogSnapshot) for atomic route replacement
- enrich Route with RouteID and RouteState metadata
- validate snapshot routes for duplicate IDs and invalid ordering/overlap
- preserve existing lookup behavior after snapshot application
- add engine tests for snapshot apply, stale version rejection, and routing correctness
Copilot AI review requested due to automatic review settings February 17, 2026 16:43
@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @bootjp, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request lays the groundwork for Elastickv's hotspot shard splitting feature by introducing a durable and versioned route catalog. It establishes the core data structures and persistence mechanisms for managing routing information and integrates this new catalog with the existing distribution engine. This foundational work enables consistent and reliable propagation of routing changes across the system, which is essential for future dynamic scaling capabilities.

Highlights

  • Durable Route Catalog Implementation: Introduced distribution/catalog.go which defines the CatalogStore for managing a durable, versioned route catalog. This includes RouteDescriptor and CatalogSnapshot structures, along with encoding/decoding logic and methods for saving and retrieving catalog state from the underlying MVCC store.
  • Engine Refactor for Versioned Snapshots: Modified distribution/engine.go to incorporate RouteID and State into the Route struct and added a catalogVersion to the Engine. New methods, Version() and ApplySnapshot(), enable the engine to atomically apply consistent, versioned route catalog snapshots, replacing previous append-only updates.
  • Comprehensive Testing for Catalog and Engine: Added distribution/catalog_test.go with extensive unit tests for the new catalog functionality, covering encoding/decoding, key management, and save/snapshot operations, including error handling. distribution/engine_test.go was updated with tests for the new ApplySnapshot method, ensuring correct behavior for version updates and route lookups.
  • Hotspot Shard Split Design Documentation: New documentation files, docs/hotspot_shard_split_design.md and docs/hotspot_split_milestone1_pr_plan.md, were added. These documents detail the overall design for automatic hotspot shard splitting and outline a phased PR plan for Milestone 1, providing crucial context for the introduced code changes.
Changelog
  • distribution/catalog.go
    • Added new file for route catalog model and persistence, including RouteDescriptor, CatalogSnapshot, and CatalogStore.
  • distribution/catalog_test.go
    • Added new file containing unit tests for CatalogVersionCodec, RouteDescriptorCodec, CatalogRouteKey helpers, and CatalogStore save/snapshot operations.
  • distribution/engine.go
    • Updated Route struct to include RouteID and State fields.
    • Added catalogVersion field to Engine struct.
    • Introduced Version() and ApplySnapshot() methods for managing route catalog snapshots.
    • Added new error types related to engine snapshot validation.
  • distribution/engine_test.go
    • Added test cases for ApplySnapshot to verify route replacement, version rejection, and lookup behavior.
  • docs/hotspot_shard_split_design.md
    • Added new design document outlining the architecture and plan for hotspot shard splitting.
  • docs/hotspot_split_milestone1_pr_plan.md
    • Added new document detailing the PR plan for Milestone 1 of the hotspot split feature.
Activity
  • bootjp created this pull request.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

if strconv.IntSize == 64 && v > (^uint64(0)>>1) {
return 0, errors.WithStack(ErrCatalogInvalidRouteRecord)
}
return int(v), nil

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚫 [golangci] reported by reviewdog 🐶
G115: integer overflow conversion uint64 -> int (gosec)

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

The pull request introduces a new catalog.go file and its corresponding test file catalog_test.go, which implements a durable route catalog for Elastickv. It also modifies engine.go and engine_test.go to integrate with this new catalog, allowing the Engine to apply snapshots of routes and manage route states. The changes are well-structured and align with the design document hotspot_shard_split_design.md and the PR plan hotspot_split_milestone1_pr_plan.md. The new CatalogStore provides persistence for route catalog state, including versioning and optimistic concurrency control. The Engine now supports applying these catalog snapshots, which is a crucial step towards implementing hotspot splitting. The changes are comprehensive and include necessary error handling and validation for route descriptors and catalog operations.

Comment on lines +245 to +246
left := Route{Start: r.Start, End: mid, GroupID: r.GroupID, State: RouteStateActive}
right := Route{Start: mid, End: r.End, GroupID: r.GroupID, State: RouteStateActive}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The RouteID and ParentRouteID fields are not being set for the newly split routes. While these are currently 0 for RouteStateActive routes, it's important to explicitly set them or consider if they should be inherited or generated. If RouteID is meant to be a durable identifier assigned by the catalog, then these newly created routes should probably not have a RouteID of 0 if they are derived from an existing catalog route. This could lead to inconsistencies when these routes are later persisted or processed by the catalog.

Suggested change
left := Route{Start: r.Start, End: mid, GroupID: r.GroupID, State: RouteStateActive}
right := Route{Start: mid, End: r.End, GroupID: r.GroupID, State: RouteStateActive}
left := Route{RouteID: 0, Start: r.Start, End: mid, GroupID: r.GroupID, State: RouteStateActive} // Consider how RouteID should be handled for split routes
right := Route{RouteID: 0, Start: mid, End: r.End, GroupID: r.GroupID, State: RouteStateActive} // Consider how RouteID should be handled for split routes


var buf bytes.Buffer
buf.WriteByte(catalogRouteCodecVersion)
_ = binary.Write(&buf, binary.BigEndian, route.RouteID)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The binary.Write function returns an error, but it's being ignored with _ =. While binary.Write on a bytes.Buffer typically doesn't return an error unless the buffer grows beyond available memory (which is rare), it's good practice to handle potential errors, especially in serialization logic where data integrity is critical. Consider checking the error return value.

Suggested change
_ = binary.Write(&buf, binary.BigEndian, route.RouteID)
if err := binary.Write(&buf, binary.BigEndian, route.RouteID); err != nil {
return nil, errors.WithStack(err)
}

var buf bytes.Buffer
buf.WriteByte(catalogRouteCodecVersion)
_ = binary.Write(&buf, binary.BigEndian, route.RouteID)
_ = binary.Write(&buf, binary.BigEndian, route.GroupID)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Similar to the previous comment, the error returned by binary.Write is ignored. It's best to handle this error to ensure robust serialization.

if err := binary.Write(&buf, binary.BigEndian, route.GroupID); err != nil {
		return nil, errors.WithStack(err)
	}

_ = binary.Write(&buf, binary.BigEndian, route.RouteID)
_ = binary.Write(&buf, binary.BigEndian, route.GroupID)
buf.WriteByte(byte(route.State))
_ = binary.Write(&buf, binary.BigEndian, route.ParentRouteID)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The error returned by binary.Write is ignored. Please handle this error for better error propagation and robustness.

if err := binary.Write(&buf, binary.BigEndian, route.ParentRouteID); err != nil {
		return nil, errors.WithStack(err)
	}

_ = binary.Write(&buf, binary.BigEndian, route.GroupID)
buf.WriteByte(byte(route.State))
_ = binary.Write(&buf, binary.BigEndian, route.ParentRouteID)
_ = binary.Write(&buf, binary.BigEndian, uint64(len(route.Start)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The error returned by binary.Write is ignored. It's good practice to handle potential errors, even if they are unlikely with bytes.Buffer.

if err := binary.Write(&buf, binary.BigEndian, uint64(len(route.Start))); err != nil {
		return nil, errors.WithStack(err)
	}

}

buf.WriteByte(1)
_ = binary.Write(&buf, binary.BigEndian, uint64(len(route.End)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The error returned by binary.Write is ignored. Please handle this error for consistency and robustness.

if err := binary.Write(&buf, binary.BigEndian, uint64(len(route.End))); err != nil {
		return nil, errors.WithStack(err)
	}


Elastickv already has shard boundaries, but it does not yet have the control-plane needed for safe automatic hotspot splitting.

Current implementation status (as of February 17, 2026):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The date "February 17, 2026" is in the future. While this is a design document, it's generally better to use current or past dates for "Current implementation status" to reflect actual progress. If this is a forward-looking statement, it should be rephrased to indicate a target date for this status.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request implements Milestone 1 of the hotspot shard split feature for Elastickv, delivering the control plane infrastructure for durable route catalog management. The implementation adds versioned, persistent route storage using the existing MVCC store, enabling route table durability across restarts and preparing the foundation for future manual and automatic shard splitting capabilities.

Changes:

  • Introduced durable route catalog with versioned snapshots and optimistic concurrency control
  • Extended distribution Engine to support atomic route snapshot replacement with version tracking
  • Added comprehensive test coverage for catalog persistence and engine snapshot operations
  • Documented complete design and implementation plan for all 5 milestones of the hotspot split feature

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated no comments.

Show a summary per file
File Description
docs/hotspot_split_milestone1_pr_plan.md Detailed 5-PR breakdown plan for Milestone 1 implementation with validation commands and merge order
docs/hotspot_shard_split_design.md Comprehensive design document covering all 4 milestones of hotspot detection, split execution, and data migration
distribution/catalog.go Core catalog persistence layer with RouteDescriptor encoding, CatalogStore for MVCC integration, and optimistic version-based updates
distribution/catalog_test.go Extensive test suite covering codec round-trips, version conflicts, duplicate detection, and catalog save/snapshot operations
distribution/engine.go Enhanced Engine with catalog version tracking, ApplySnapshot method for atomic route replacement, and route ordering validation
distribution/engine_test.go Test coverage for snapshot application, version rejection, and route lookup behavior after catalog updates

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant