fix: harden leader verification and optimize scans; add MVCC/route co…#315
Conversation
Summary of Changes

Hello @bootjp, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request focuses on enhancing the robustness and performance of the distributed key-value store. It centralizes and strengthens the leader verification logic to ensure operations are always directed to the active leader, preventing inconsistencies. Significant optimizations have been applied to key scanning mechanisms, particularly for Redis-like pattern matching and MVCC store iterations, to improve query efficiency. Additionally, list deletion now occurs in batches, and comprehensive test coverage has been added across critical components to validate these improvements and new functionalities.
Code Review
This pull request introduces several valuable improvements, including hardening Raft leader verification by centralizing the logic, optimizing scan performance in the LSM store, and significantly increasing test coverage for MVCC, routing, and HLC components. The race condition fix in mvccStore is a critical improvement.
However, I've identified a critical issue in the deleteList implementation in adapter/redis.go where the introduction of batched deletes breaks the atomicity of the DEL command for lists, violating MVCC principles. This could lead to data corruption. Please see my detailed comment on this.
adapter/redis.go
Outdated
```go
for {
	readTS := r.readTS()
	kvs, scanErr := r.store.ScanAt(ctx, start, end, listDeleteBatchSize, readTS)
	if scanErr != nil {
		return errors.WithStack(scanErr)
	}
	if len(kvs) == 0 {
		break
	}

	ops := make([]*kv.Elem[kv.OP], 0, len(kvs)+1)
	for _, kvp := range kvs {
		ops = append(ops, &kv.Elem[kv.OP]{Op: kv.Del, Key: kvp.Key})
	}
	// delete meta last
	ops = append(ops, &kv.Elem[kv.OP]{Op: kv.Del, Key: listMetaKey(key)})
	ops := make([]*kv.Elem[kv.OP], 0, len(kvs))
	for _, kvp := range kvs {
		ops = append(ops, &kv.Elem[kv.OP]{Op: kv.Del, Key: kvp.Key})
	}

	// ensure meta bounds consistent even if scan missed (in case of empty list)
	_ = meta
	group := &kv.OperationGroup[kv.OP]{IsTxn: true, Elems: ops}
	if _, dispatchErr := r.coordinator.Dispatch(ctx, group); dispatchErr != nil {
		return errors.WithStack(dispatchErr)
	}
	if len(kvs) < listDeleteBatchSize {
		break
	}
}

group := &kv.OperationGroup[kv.OP]{IsTxn: true, Elems: ops}
group := &kv.OperationGroup[kv.OP]{
	IsTxn: true,
	Elems: []*kv.Elem[kv.OP]{
		{Op: kv.Del, Key: listMetaKey(key)},
	},
}
_, err = r.coordinator.Dispatch(ctx, group)
return errors.WithStack(err)
```
This implementation of deleteList breaks the atomicity of the DEL command for lists, which is a critical regression. The previous implementation correctly performed the deletion in a single transaction.
The new batched approach introduces two major issues:
- Loss of Atomicity: By splitting the deletion into multiple transactions, the operation is no longer atomic. A failure during the process will leave the list in a corrupt, partially-deleted state.
- Inconsistent Snapshot Reads: `readTS` is fetched inside the loop, meaning each batch is scanned at a different point in time. This can lead to inconsistencies.
To preserve correctness, this operation must be atomic. I recommend reverting to a single-transaction approach. If transaction size is a concern, it should be addressed at a lower level (e.g., with atomic range deletes in the storage engine) rather than compromising command atomicity.
Here is a suggestion that restores the atomic behavior:
```go
readTS := r.readTS()
kvs, err := r.store.ScanAt(ctx, start, end, math.MaxInt, readTS)
if err != nil {
	return errors.WithStack(err)
}
ops := make([]*kv.Elem[kv.OP], 0, len(kvs)+1)
for _, kvp := range kvs {
	ops = append(ops, &kv.Elem[kv.OP]{Op: kv.Del, Key: kvp.Key})
}
// delete meta last
ops = append(ops, &kv.Elem[kv.OP]{Op: kv.Del, Key: listMetaKey(key)})
group := &kv.OperationGroup[kv.OP]{IsTxn: true, Elems: ops}
_, err = r.coordinator.Dispatch(ctx, group)
return errors.WithStack(err)
```

References
- In an MVCC system, all operations within a single transaction must share the same start timestamp to ensure atomicity and a consistent snapshot view.
Pull request overview
This PR strengthens raft leader verification across coordinators/stores, optimizes scan-based operations (notably Pebble/LSM scans and Redis KEYS-pattern scanning), and adds targeted test coverage for MVCC snapshots/concurrency and leader-routing behavior.
Changes:
- Centralize and harden raft leader verification via a shared `verifyRaftLeader` helper, with new tests for nil/missing-group cases.
- Refactor Pebble (LSM) scan implementation and improve Redis `KEYS` pattern scanning + list-key discovery.
- Add MVCC snapshot/restore, mutation error-path, concurrency, HLC, and leader-routed-store tests.
Reviewed changes
Copilot reviewed 16 out of 16 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| store/mvcc_store_snapshot_test.go | Adds MVCC snapshot/restore + ApplyMutations error-path tests. |
| store/mvcc_store_concurrency_test.go | Adds a concurrent Scan vs Put regression test. |
| store/mvcc_store.go | Makes readTS() race-safe via RLock. |
| store/lsm_store.go | Refactors Pebble scan logic to reduce seeks and simplify iteration. |
| kv/sharded_coordinator_leader_test.go | Adds ShardedCoordinator leader verification tests (leader nil/missing group). |
| kv/sharded_coordinator.go | Switches leader verification to the shared helper. |
| kv/shard_store.go | Uses shared helper for leader verification checks. |
| kv/raft_leader.go | Introduces verifyRaftLeader helper (nil-safe, stack-wrapped errors). |
| kv/leader_routed_store_test.go | Adds tests for local-vs-proxied reads and missing leader address behavior. |
| kv/leader_proxy.go | Uses shared helper before serving leader-only txn ops. |
| kv/hlc_test.go | Adds monotonicity/observe-overflow behavior tests for HLC. |
| kv/coordinator_leader_test.go | Adds Coordinator leader verification tests (including nil raft). |
| kv/coordinator.go | Switches leader verification to the shared helper. |
| adapter/redis_keys_pattern_test.go | Adds tests for pattern bounds, matching, and list key discovery. |
| adapter/redis.go | Improves KEYS pattern scanning bounds + list namespace scans; batches list deletion scans/deletes. |
| adapter/distribution_server_test.go | Adds DistributionServer route/timestamp monotonicity tests. |
store/lsm_store.go
Outdated
```go
if out == nil && version <= ts {
	kv, err := s.processFoundValue(iter, userKey, ts)
	if err != nil {
		return nil, err
	}
	out = kv
}
```
ScanAt for Pebble store can return incorrect values when a tombstone/expired version is the latest visible version at the read timestamp. In scanCurrentUserKey, out stays nil when processFoundValue returns nil (tombstone/expired), so the loop keeps scanning older versions and may incorrectly return an earlier live value that should be masked by the tombstone/expiry. Track that the first version <= ts has been evaluated (even if it yields nil) and stop considering older versions for that userKey (advance to the next userKey).
adapter/redis.go
Outdated
```go
group := &kv.OperationGroup[kv.OP]{IsTxn: true, Elems: ops}
if _, dispatchErr := r.coordinator.Dispatch(ctx, group); dispatchErr != nil {
	return errors.WithStack(dispatchErr)
}
```
Batching list item deletions into multiple Dispatch calls makes list deletion non-atomic: if a later batch fails after earlier batches succeeded, the list meta remains (deleted only after the loop) and the list can be left partially deleted/inconsistent (meta length no longer matches items). Consider deleting/invalidating the meta first (so the list becomes logically absent) and then best-effort cleaning up items, or otherwise ensuring the delete is atomic/retriable.
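A minimal sketch of the meta-first ordering described above, reusing the `ScanAt`/`Dispatch` calls and the `listMetaKey`/`listDeleteBatchSize` names already visible in this PR; the receiver type, function signature, and parameter list are assumptions made only for illustration:

```go
// Sketch only: delete the list meta first so the list becomes logically
// absent, then clean up item keys in batches. The adapter type name and
// signature are hypothetical; error handling for interrupted cleanup is
// assumed to be a retriable background concern and is not shown.
func (r *redisAdapter) deleteListMetaFirst(ctx context.Context, key, start, end []byte) error {
	metaGroup := &kv.OperationGroup[kv.OP]{
		IsTxn: true,
		Elems: []*kv.Elem[kv.OP]{{Op: kv.Del, Key: listMetaKey(key)}},
	}
	if _, err := r.coordinator.Dispatch(ctx, metaGroup); err != nil {
		return errors.WithStack(err)
	}
	// Best-effort item cleanup; safe to retry because the meta is already gone.
	for {
		kvs, err := r.store.ScanAt(ctx, start, end, listDeleteBatchSize, r.readTS())
		if err != nil {
			return errors.WithStack(err)
		}
		if len(kvs) == 0 {
			return nil
		}
		ops := make([]*kv.Elem[kv.OP], 0, len(kvs))
		for _, kvp := range kvs {
			ops = append(ops, &kv.Elem[kv.OP]{Op: kv.Del, Key: kvp.Key})
		}
		group := &kv.OperationGroup[kv.OP]{IsTxn: true, Elems: ops}
		if _, err := r.coordinator.Dispatch(ctx, group); err != nil {
			return errors.WithStack(err)
		}
	}
}
```

With this ordering a crash mid-cleanup leaves only orphaned item keys that are no longer reachable through the meta, rather than a list that still appears to exist with missing items.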
/gemini review
Code Review
This pull request introduces several valuable improvements, including hardening leader verification by centralizing the logic, optimizing scan performance in the LSM store, and significantly increasing test coverage for MVCC, routing, and Redis adapter functionality. The changes are well-implemented and enhance the system's correctness and robustness. I've identified one high-severity issue in the Redis adapter related to ensuring consistent snapshots for the KEYS command, for which I've provided a detailed comment and suggestion.
```go
start, end := patternScanBounds(pattern)
keyset := map[string][]byte{}

readTS := r.readTS()
keys, err := r.store.ScanAt(context.Background(), start, nil, math.MaxInt, readTS)
if err != nil {
	return nil, errors.WithStack(err)
mergeScannedKeys := func(scanStart, scanEnd []byte) error {
	readTS := r.readTS()
	keys, err := r.store.ScanAt(context.Background(), scanStart, scanEnd, math.MaxInt, readTS)
```
The KEYS command should operate on a consistent snapshot of the data. In the current implementation of localKeysPattern, r.readTS() is called inside the mergeScannedKeys closure. Since mergeScannedKeys is called multiple times for a single KEYS command, each scan could potentially use a different timestamp if writes are occurring concurrently. This can lead to inconsistent results.
To ensure atomicity and a consistent view, the read timestamp should be fetched once at the beginning of localKeysPattern and reused for all subsequent scans within that function.
```diff
-start, end := patternScanBounds(pattern)
-keyset := map[string][]byte{}
-readTS := r.readTS()
-keys, err := r.store.ScanAt(context.Background(), start, nil, math.MaxInt, readTS)
-if err != nil {
-	return nil, errors.WithStack(err)
-mergeScannedKeys := func(scanStart, scanEnd []byte) error {
-	readTS := r.readTS()
-	keys, err := r.store.ScanAt(context.Background(), scanStart, scanEnd, math.MaxInt, readTS)
+start, end := patternScanBounds(pattern)
+keyset := map[string][]byte{}
+readTS := r.readTS()
+mergeScannedKeys := func(scanStart, scanEnd []byte) error {
+	keys, err := r.store.ScanAt(context.Background(), scanStart, scanEnd, math.MaxInt, readTS)
```
References
- In an MVCC system, all operations within a single transaction must share the same start timestamp to ensure atomicity and a consistent snapshot view. While `KEYS` is not a transaction, it's a single command that should present a consistent view of the database, so the principle of using a single snapshot timestamp applies.
Pull request overview
Copilot reviewed 18 out of 18 changed files in this pull request and generated no new comments.
/gemini review

/gemini review
Code Review
This pull request introduces a set of significant improvements to the system's correctness, robustness, and performance. Key changes include hardening leader verification logic by centralizing it into a reusable helper, which is now used consistently across coordinators and proxies. The Redis KEYS command implementation has been substantially improved to correctly handle various wildcard patterns and to include list keys in results, along with performance optimizations for prefix-based scans. Critical correctness fixes have been made to the MVCC storage layers, including resolving a race condition in the in-memory mvccStore and fixing scan logic in the lsm_store to correctly handle tombstones. The PR also adds a wealth of test coverage for these changes, including concurrency tests, which greatly increases confidence in the stability of the system. Overall, this is an excellent set of changes that materially improves the quality of the codebase. Additionally, a specific comment regarding potential high memory usage in deleteList for large lists has been noted, suggesting a more scalable range deletion approach or a clarifying comment.
```go
kvs, err := r.store.ScanAt(ctx, start, end, math.MaxInt, startTS)
if err != nil {
	return errors.WithStack(err)
}
```
This implementation of deleteList scans all items of a list into memory before creating a transaction to delete them. For very large lists, this could lead to high memory consumption and potentially OOM errors.
A more scalable approach would be to use a range deletion at the storage layer if the underlying store supports it. For example, PebbleDB has a DeleteRange operation. This would avoid loading all keys into memory.
If implementing a range delete is not immediately feasible, it would be good to add a comment here noting the potential for high memory usage with large lists.
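For reference, a hedged sketch of what a storage-level range delete might look like with Pebble. Only `(*pebble.DB).DeleteRange` and `pebble.Sync` are real Pebble API; the `pebbleStore` field names, method name, and the idea that the bounds are already encoded storage keys are assumptions:

```go
// Sketch only: drop every key in [start, end) with a single range tombstone
// instead of loading all keys into memory. Assumes s.db is the underlying
// *pebble.DB and [start, end) are already encoded storage-level bounds.
func (s *pebbleStore) deleteUserKeyRange(start, end []byte) error {
	return s.db.DeleteRange(start, end, pebble.Sync)
}
```

Note that a physical range tombstone removes every version in the range, so it would only be appropriate where that is compatible with the store's MVCC visibility rules (e.g. for keys no snapshot can still read).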
Pull request overview
Copilot reviewed 18 out of 18 changed files in this pull request and generated no new comments.
Code Review
This pull request introduces a number of significant improvements across the codebase. The hardening of leader verification by centralizing the logic and adding nil checks is a great step towards a more robust system. The scan optimizations in both the Redis adapter and the LSM store are impressive, not only improving performance but also fixing potential correctness issues with MVCC visibility. The addition of comprehensive test coverage, including concurrency and snapshot tests, greatly increases confidence in the stability of the system. Finally, the fix for the data race on lastCommitTS in mvccStore is a critical improvement for concurrency safety. Overall, this is an excellent set of changes.
```go
func (s *mvccStore) readTS() uint64 {
	s.mtx.RLock()
	defer s.mtx.RUnlock()
	return s.lastCommitTS
}
```
This is a critical fix to prevent a data race. On 32-bit architectures, reading a 64-bit integer (lastCommitTS) is not an atomic operation and can result in a torn read if another goroutine is writing to it concurrently. Adding the read lock ensures that the read is atomic and prevents this race condition.
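An alternative that avoids taking the mutex on the read path is `sync/atomic`; this is only a sketch under the assumption that every writer of `lastCommitTS` (e.g. `alignCommitTS`) is converted at the same time, and the type name here is hypothetical rather than the actual `mvccStore` struct:

```go
// Sketch only: lastCommitTS as an atomic counter. atomic.Uint64 guarantees
// tear-free 64-bit loads and stores, including on 32-bit platforms.
type mvccStoreTS struct {
	lastCommitTS atomic.Uint64
}

func (s *mvccStoreTS) readTS() uint64 {
	return s.lastCommitTS.Load()
}

func (s *mvccStoreTS) alignCommitTS(ts uint64) {
	// Monotonically raise lastCommitTS to ts without a mutex.
	for {
		cur := s.lastCommitTS.Load()
		if ts <= cur || s.lastCommitTS.CompareAndSwap(cur, ts) {
			return
		}
	}
}
```

The RLock approach in the PR is equally correct; the atomic variant is only interesting if read-path lock contention ever becomes measurable.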
/gemini review
Code Review
This is a substantial pull request that brings significant improvements to leader verification, scan performance and correctness, and MVCC behavior. The introduction of the verifyRaftLeader helper hardens the system against panics from nil Raft instances. The optimizations and correctness fixes for Redis KEYS pattern matching and list deletion are excellent. The refactoring of the LSM store's scan logic to correctly handle tombstones is a critical correctness fix. I also appreciate the addition of comprehensive tests, which cover complex scenarios like cross-shard transactions and concurrency, greatly increasing confidence in these changes. I found one potential issue with timestamp handling during transaction rollbacks, which I've commented on.
kv/shard_store.go
Outdated
```go
	return pb.Phase_NONE, 0, errors.Wrapf(ErrTxnLocked, "key: %s (timestamp overflow)", string(key))
}
return pb.Phase_ABORT, abortTS, nil
return pb.Phase_ABORT, cleanupTSWithNow(lock.StartTS, cleanupNow), nil
```
The previous implementation had a defensive check for timestamp overflow when calculating the abort timestamp. This check seems to have been removed. If lock.StartTS is math.MaxUint64, cleanupTSWithNow could return a timestamp smaller than lock.StartTS, which would violate the FSM invariant that the resolution timestamp must be greater than the start timestamp. It's safer to reintroduce the check to prevent this potential issue.
```go
abortTS := cleanupTSWithNow(lock.StartTS, cleanupNow)
if abortTS <= lock.StartTS {
	return pb.Phase_NONE, 0, errors.Wrapf(ErrTxnLocked, "key: %s (timestamp overflow)", string(key))
}
return pb.Phase_ABORT, abortTS, nil
```
Pull request overview
Copilot reviewed 19 out of 19 changed files in this pull request and generated 2 comments.
```go
func cleanupTSWithNow(startTS, now uint64) uint64 {
	next := startTS + 1
	if now > next {
		return now
	}
	return next
}
```
cleanupTSWithNow can overflow when startTS == ^uint64(0) (next wraps to 0) and then returns now, which may be <= startTS. That can violate the FSM invariant abortTS > startTS (see fsm.go abort/commit checks) and cause lock-resolution aborts during scans to fail unexpectedly. Consider using the existing overflow-safe abortTSFrom(startTS, 0) (or add an explicit startTS==max guard) and return an error/0 when no strictly-greater timestamp is representable, similar to the previous defensive check logic.
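A small sketch of the guard this comment suggests; the `(uint64, bool)` return shape is an assumption made for illustration — the real fix may instead return an error or reuse the existing `abortTSFrom` helper:

```go
// Sketch only: overflow-safe variant. ok is false when no timestamp strictly
// greater than startTS is representable (startTS == math.MaxUint64), letting
// callers surface ErrTxnLocked instead of violating the abortTS > startTS
// invariant checked by the FSM.
func cleanupTSWithNowSafe(startTS, now uint64) (ts uint64, ok bool) {
	if startTS == math.MaxUint64 {
		return 0, false
	}
	next := startTS + 1
	if now > next {
		return now, true
	}
	return next, true
}
```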
store/lsm_store.go
Outdated
```go
func (s *pebbleStore) scanCurrentUserKey(iter *pebble.Iterator, userKey []byte, ts uint64) (*KVPair, error) {
	var out *KVPair
	evaluatedVisible := false

	for iter.Valid() {
		curKey, version, ok := decodeIterKey(iter)
		if !ok {
			if !iter.Next() {
				return out, nil
			}
			continue
		}
		if !bytes.Equal(curKey, userKey) {
			return out, nil
		}

		// Only the newest version visible at ts should determine scan visibility.
		// If that version is tombstoned/expired, older versions must stay masked.
		if !evaluatedVisible && version <= ts {
			kv, err := s.processFoundValue(iter, userKey, ts)
			if err != nil {
				return nil, err
			}
			out = kv
			evaluatedVisible = true
		}

		if !iter.Next() {
			return out, nil
		}
```
scanCurrentUserKey continues iterating through all remaining versions for the current userKey even after it has already evaluated the newest visible version at ts. For keys with many historical versions this turns ScanAt into O(versions-per-key) work and can be a significant regression compared to fast-forwarding to the next user key once visibility has been decided. After setting evaluatedVisible=true, consider seeking directly to the next user key (or reintroducing a helper like the previous skipToNextUserKey) so scans don't have to walk older versions unnecessarily.
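A possible shape for that fast-forward, hedged heavily: it assumes versions are encoded in descending order after the user key and that an `encodeKey(userKey, version)` helper mirroring the existing `decodeKey` exists; both are assumptions, not part of this PR:

```go
// Sketch only: once the newest visible version has been evaluated, seek to
// the oldest possible version of userKey and step once, landing on the next
// user key instead of walking every historical version.
func skipToNextUserKey(iter *pebble.Iterator, userKey []byte) bool {
	// Hypothetical encodeKey; version 0 sorts last under descending encoding.
	if !iter.SeekGE(encodeKey(userKey, 0)) {
		return false
	}
	// SeekGE may already land on the next user key if version 0 is absent.
	cur, _, ok := decodeIterKey(iter)
	if ok && bytes.Equal(cur, userKey) {
		return iter.Next()
	}
	return true
}
```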
/gemini review
Code Review
This pull request introduces several significant improvements to correctness, performance, and robustness. Key changes include hardening leader verification by centralizing the logic, fixing a critical data race in the in-memory MVCC store, and ensuring atomic list deletions in the Redis adapter by correctly setting transaction timestamps. The scan performance of the LSM store is also notably optimized, along with a correctness fix for handling tombstones. The addition of extensive test coverage, including for concurrency and distributed transactions, is a valuable contribution to the stability of the codebase.
```go
_ = meta

group := &kv.OperationGroup[kv.OP]{IsTxn: true, Elems: ops}
group := &kv.OperationGroup[kv.OP]{IsTxn: true, StartTS: startTS, Elems: ops}
```
This is a critical correctness fix. By adding StartTS to the OperationGroup, you ensure that all delete operations for the list items and metadata occur within a single atomic transaction at a consistent snapshot timestamp. This adheres to MVCC principles and prevents partial deletions or inconsistencies.
References
- In an MVCC system, all operations within a single transaction must share the same start timestamp to ensure atomicity and a consistent snapshot view.
```go
s.mtx.RLock()
defer s.mtx.RUnlock()
return s.lastCommitTS
```
This is a critical fix to prevent a data race. The lastCommitTS field is written under a write lock in alignCommitTS, so reading it must be protected by a read lock. Without this lock, concurrent reads and writes could lead to reading a partially written value or other undefined behavior. The new test TestMVCCStore_ScanConcurrentWithWrites is great for catching this type of issue.
```go
	return pb.Phase_COMMIT, state.commitTS, nil
case txnStatusRolledBack:
	abortTS := abortTSFrom(lock.StartTS, state.commitTS)
	abortTS := cleanupTSWithNow(lock.StartTS, cleanupNow)
```
Using cleanupTSWithNow to generate the abort timestamp is a good improvement for robustness. It uses the current wall time (cleanupNow) to ensure the abort timestamp is recent, which helps prevent potential timestamp conflicts when cleaning up old or expired transaction locks. The previous approach of using startTS + 1 was correct but could be brittle if new transactions were started with timestamps very close to the aborted one.
Pull request overview
Copilot reviewed 20 out of 20 changed files in this pull request and generated 2 comments.
```go
// User-key bounded scans like "foo*" do not naturally include internal list
// keys ("!lst|..."), so scan list namespaces separately with mapped bounds.
if start != nil || end != nil {
	metaStart, metaEnd := listPatternScanBounds(store.ListMetaPrefix, pattern)
	if err := mergeScannedKeys(metaStart, metaEnd); err != nil {
```
localKeysPattern now does extra scans to include internal list namespaces for wildcard patterns like foo*, but exact-match patterns (no *) are routed to localKeysExact, which only checks the raw user key and will never return existing list keys (stored under !lst|...). This makes KEYS mylist inconsistent with EXISTS mylist and with wildcard results. Consider updating the exact-match path to detect list keys (e.g., by checking ListMetaKey(pattern)), or routing exact patterns through the same list-aware collection logic.
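One way the exact-match path could become list-aware, sketched with the `ScanAt` and `listMetaKey` names already used in this PR; the method name, receiver type, and limit-1 probe are assumptions for illustration only:

```go
// Sketch only: after checking the raw user key, also probe the list meta key
// at the same read timestamp so KEYS mylist agrees with EXISTS mylist.
func (r *redisAdapter) exactKeyExists(ctx context.Context, pattern []byte, readTS uint64) (bool, error) {
	// Raw user key.
	kvs, err := r.store.ScanAt(ctx, pattern, nil, 1, readTS)
	if err != nil {
		return false, errors.WithStack(err)
	}
	if len(kvs) == 1 && bytes.Equal(kvs[0].Key, pattern) {
		return true, nil
	}
	// List meta key ("!lst|..." namespace).
	metaKey := listMetaKey(pattern)
	kvs, err = r.store.ScanAt(ctx, metaKey, nil, 1, readTS)
	if err != nil {
		return false, errors.WithStack(err)
	}
	return len(kvs) == 1 && bytes.Equal(kvs[0].Key, metaKey), nil
}
```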
store/lsm_store.go
Outdated
```go
func decodeIterKey(iter *pebble.Iterator) ([]byte, uint64, bool) {
	key, version := decodeKey(iter.Key())
	if key == nil {
		return nil, 0, false
	}
```
pebbleStore persists lastCommitTS under the special key _meta_last_commit_ts, which is not MVCC-encoded and whose value is only 8 bytes. The scan iterator path decodes every Pebble key via decodeKey and may pass this meta key into processFoundValue, which will fail decodeValue (expects at least 9 bytes) or expose an internal key during full-range scans (e.g., ScanAt(nil, nil, ...)). Consider moving this metadata out of the user keyspace (e.g., separate prefix + explicit skip in scans, or a dedicated Pebble property), or ensure meta keys use the same key/value encoding and are filtered from user-facing scans.
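If the metadata stays in the user keyspace, a narrow mitigation is to skip it explicitly during iteration. A sketch, assuming the key literal quoted above and a scan loop shaped like `scanCurrentUserKey`; the helper names are hypothetical:

```go
// Sketch only: filter the persisted lastCommitTS marker out of user-facing
// scans. The key literal comes from the review comment above.
var metaLastCommitTSKey = []byte("_meta_last_commit_ts")

func isInternalMetaKey(rawKey []byte) bool {
	return bytes.Equal(rawKey, metaLastCommitTSKey)
}

// Inside the scan loop, before decoding the key:
//	if isInternalMetaKey(iter.Key()) {
//		if !iter.Next() {
//			return out, nil
//		}
//		continue
//	}
```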
/gemini review
Pull request overview
Copilot reviewed 20 out of 20 changed files in this pull request and generated no new comments.
Code Review
This pull request introduces significant improvements to leader verification, scan performance, and test coverage. The leader verification logic has been centralized into a verifyRaftLeader helper, improving consistency. Scan operations are now more efficient and correct, especially for Redis KEYS patterns and list types. The added tests, particularly for MVCC, routing, and distributed transactions, greatly enhance the robustness of the system. A critical concurrency bug in mvccStore has also been fixed by correctly synchronizing access to lastCommitTS. Overall, these are excellent changes that harden the system and improve its performance. I've left a few minor suggestions to reduce code duplication.
```go
if st, ok := t.listStates[string(cmd.Args[1])]; ok {
	if st.metaExists {
		st.purge = true
		st.purgeMeta = st.meta
	}
	st.deleted = true
	st.appends = nil
	return redisResult{typ: resultInt, integer: 1}, nil
}
isList, err := t.server.isListKeyAt(context.Background(), cmd.Args[1], t.startTS)
if err != nil {
	return redisResult{}, err
} else if isList {
}
if isList {
	st, err := t.loadListState(cmd.Args[1])
	if err != nil {
		return redisResult{}, err
	}
	if st.metaExists {
		st.purge = true
		st.purgeMeta = st.meta
	}
	st.deleted = true
	st.appends = nil
	return redisResult{typ: resultInt, integer: 1}, nil
```
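The duplicated purge logic in the two branches above could be pulled into a small helper along these lines; the `listState` field names are taken from the snippet, while the helper itself is hypothetical:

```go
// Sketch only: shared "mark this list as deleted within the transaction"
// step, usable from both the cached-state and freshly-loaded-state branches
// (each caller still returns markListDeleted(st), nil).
func markListDeleted(st *listState) redisResult {
	if st.metaExists {
		st.purge = true
		st.purgeMeta = st.meta
	}
	st.deleted = true
	st.appends = nil
	return redisResult{typ: resultInt, integer: 1}
}
```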
```go
func (c *ShardedCoordinator) VerifyLeader() error {
	g, ok := c.groups[c.defaultGroup]
	if !ok || g.Raft == nil {
	if !ok {
		return errors.WithStack(ErrLeaderNotFound)
	}
	return errors.WithStack(g.Raft.VerifyLeader().Error())
	return verifyRaftLeader(g.Raft)
}

func (c *ShardedCoordinator) VerifyLeaderForKey(key []byte) error {
	g, ok := c.groupForKey(key)
	if !ok || g.Raft == nil {
	if !ok {
		return errors.WithStack(ErrLeaderNotFound)
	}
	return errors.WithStack(g.Raft.VerifyLeader().Error())
	return verifyRaftLeader(g.Raft)
}
```
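For context, the shared helper these call sites delegate to plausibly looks something like the following. This is a sketch reconstructed from the descriptions in this PR (nil-safe, stack-wrapped errors), not the actual contents of kv/raft_leader.go:

```go
// Sketch only: nil-safe leader verification shared by coordinators, shard
// stores, and the leader proxy. Uses hashicorp/raft's VerifyLeader future and
// pkg/errors-style stack wrapping, as elsewhere in this codebase.
func verifyRaftLeader(r *raft.Raft) error {
	if r == nil {
		return errors.WithStack(ErrLeaderNotFound)
	}
	if err := r.VerifyLeader().Error(); err != nil {
		return errors.WithStack(err)
	}
	return nil
}
```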
/gemini review
Code Review
This pull request introduces significant improvements across leader verification, scan performance, and transactional correctness, particularly for Redis list operations. The introduction of verifyRaftLeader centralizes and hardens leadership checks, which is a great enhancement for system stability. The scan optimizations in the Redis adapter and LSM store are well-thought-out. The fixes for transaction handling, including the race condition in mvccStore and the atomicity of list deletions, are crucial for correctness. The extensive test coverage added in this PR, including concurrency and edge case tests, is commendable and greatly increases confidence in the changes. I've found one potential race condition in adapter/redis.go and have provided a suggestion to address it. Overall, this is a high-quality contribution.
Pull request overview
Copilot reviewed 20 out of 20 changed files in this pull request and generated no new comments.