IGNITE-28105 Ensure new leader clock is not lower than last applied cmd #7731
EgorKuts wants to merge 3 commits into apache:main
modules/raft/src/integrationTest/java/org/apache/ignite/raft/server/ItNewLeaderClockTest.java
        this.currTerm
);
if (request.timestamp() != null) {
    clock.update(request.timestamp());
Propagating the HLC in each AppendEntry request creates additional contention on the HLC.
It was removed earlier for this reason.
Could we do it in RequestVoteRequest instead?
No. If the leader has the most advanced clock, it is possible for the new leader to have a stale clock value. This scenario is shown in the reproducer (2 out of 3 nodes have a lower HLC).
If we need to address the contention, I propose creating a separate clock per partition. This clock would be updated on each appendEntry and merged with the main clock when the node becomes a leader.
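The per-partition clock idea above could be sketched roughly as follows. This is only an illustration under assumptions: `NodeClock` and `PartitionClock` are hypothetical names, timestamps are modelled as plain `long` values, and the update is a simple max rather than a full HLC update with a logical counter. It is not Ignite's `HybridClock` API.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for the node-wide clock; not Ignite's HybridClock API. */
final class NodeClock {
    private final AtomicLong latest = new AtomicLong();

    /** Advances the clock to at least {@code ts} and returns the resulting value. */
    long update(long ts) {
        return latest.accumulateAndGet(ts, Math::max);
    }

    /** Returns the current clock value without advancing it. */
    long current() {
        return latest.get();
    }
}

/** Hypothetical per-partition clock that absorbs timestamps from appendEntry traffic. */
final class PartitionClock {
    private final AtomicLong observed = new AtomicLong();

    /** Called on every appendEntry; touches only partition-local state. */
    void onAppendEntry(long leaderTs) {
        observed.accumulateAndGet(leaderTs, Math::max);
    }

    /** Called once when this node becomes leader of the partition. */
    long mergeInto(NodeClock nodeClock) {
        return nodeClock.update(observed.get());
    }
}
```

With this split, the shared node clock is touched once per leadership change instead of once per appendEntry, which is where the contention concern comes from.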
The safe timestamp is already passed with each group command.
Can we just propagate its value to the node's clock to preserve the monotonicity invariant, like this:
if (safeTimestamp != null) {
    clock.update(safeTimestamp);
    try {
        safeTimeTracker.update(safeTimestamp, commandIndex, commandTerm, command);
    } catch (TrackerClosedException ignored) {
        // Ignored.
    }
}
Co-authored-by: Aleksei Scherbakov <alexey.scherbakoff@gmail.com>
A newly elected leader can have a lower clock value than the timestamp of the last applied command (see ItNewLeaderClockTest).
This breaks state machine invariants and causes all nodes in the replication group to fail.
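To make the failure mode concrete, here is a minimal model of the invariant. It assumes the state machine rejects any command whose timestamp is lower than the last applied one; `MonotonicStateMachine` is a hypothetical class with `long` timestamps, not the actual Ignite code, and the exact comparison used there may differ.

```java
/** Hypothetical model of the monotonicity invariant; not the actual Ignite state machine. */
final class MonotonicStateMachine {
    private long lastAppliedTs = Long.MIN_VALUE;

    /** Applies a command, rejecting any timestamp below the last applied one. */
    void apply(long safeTs) {
        if (safeTs < lastAppliedTs) {
            throw new IllegalStateException(
                    "Timestamp " + safeTs + " is lower than last applied " + lastAppliedTs);
        }
        lastAppliedTs = safeTs;
    }

    /** Returns the timestamp of the last applied command. */
    long lastAppliedTs() {
        return lastAppliedTs;
    }
}
```

In this model, if the old leader stamped its last command with 500 and the new leader's clock is still at 300, the new leader's first command is rejected. Had the new leader's clock been advanced to 500 while applying that command, its next timestamp would not fall below it.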