Upgrade microVersionId to timestamp-ordered format and add isReplica#2628
Upgrade microVersionId to timestamp-ordered format and add isReplica#2628maeldonn wants to merge 2 commits into
Conversation
Hello maeldonn,My role is to assist you with the merge of this Available options
Available commands
Status report is not available. |
|
LGTM |
caf9a51 to
868cae8
Compare
Review by Claude Code |
| * | ||
| * @return - the generated microVersionId | ||
| */ | ||
| export function generate(replicationGroupId: string): string { |
There was a problem hiding this comment.
discussed briefly in person with Maël already but, i had no idea id generation was this complicated
I think here its mostly inspired from the VersionID file to have something similar, but yeah this whole thing looks kinda smelly.
A good ID generation should be independent from basically anything,
- we shouldn't have to call wait,
- we shouldn't have to rely on a previous stored lastTimetamp
- we shouldn't have to do weird conditionnal + 1 increment on current timestamp
I mean at the end of the day, why don't we use uuidv7 ? It's globally unique, time ordered and sortable, and particularly recommended for distributed system
There was a problem hiding this comment.
Maybe we want uuids to be as small as possible, because I remember in other context we talked about how important it is to have metadata that doesn't occupy too much storage (especially important when users have many multiple small objects, we end up having metadata that is larger than the object itself).
Claude also tells me for mongodb calls ordering, it might be easier to have reversed ordered Version Id but not sure its relevant here (and not sure its true anyways)
There was a problem hiding this comment.
A few clarifications on the "smells":
- wait is called once per process lifetime, not per id
- lastTimestamp + 1 is the per-ms counter giving strict monotonicity per generator
- with uuid v7, two writers generating in the same ms are unordered
So we are keeping the version id format. We can plan a refactoring as follow-up if you want.
Replace the random 8-byte hex microVersionId with a 20-character timestamp-ordered identifier matching the versionId scheme, and add encode/decode/compare helpers in a new MicroVersionID module. Add an optional isReplica flag to ReplicationInfo to distinguish replica writes from user writes, enabling cascaded CRR. Issue: ARSN-578
868cae8 to
2eae0c5
Compare
|
LGTM |
| * sequence are stored in reversed form (MAX - value), followed by the | ||
| * replication group id normalized to {@link LENGTH_RG} characters | ||
| * (space-padded if shorter, truncated if longer). String comparison | ||
| * orders microVersionIds from newest to oldest, matching the |
There was a problem hiding this comment.
discussed briefly in person with Maël already but, i had no idea id generation was this complicated
I think here its mostly inspired from the VersionID file to have something similar, but yeah this whole looks rather smelly.
A good ID generation should be independent from basically anything,
- we shouldn't have to call wait,
- we should have to rely on a previous stored lastTimetamp
- we should have to do weird conditionnal + 1 increment on current timestamp
I mean at the end of the day, why don't we use uuidv7 ? It's globally unique, time ordered and sortable
There was a problem hiding this comment.
Re thinking some more 🌚
We want two things from this microVersionID :
- Avoid infinite loop cascading : uuidv7 is good for this : when we receive the request, we compare the uuidv7 received with whats currently stored, if its the same, it means we have a loop. uuidv7 does not suffer from collision (2 ids generated in the same millisecond have 1/2^74 collision chance)
- Deal with stale object : If we receive an object that is older than the currently stored object, we don't want to store it. uuidv7 can be compared and ordered with a millisecond precision (which is largely enough, imo even ~1 second would be enough)
Also, uuidv7 storage space is 128 bits, which I think is the same order of magnitude with this 27 chars microVersionId
There was a problem hiding this comment.
Uuid v7 doesn't actually solve the same-ms case across sites, so it's not a real win over what we have. Snowflake-style (ms timestamp + per-generator counter + node bits) gives us:
- Strict monotonicity per generator (the
wait/lastTimestamp/+1logic is the counter, not smell) - Deterministic cross-site tiebreak via the node bits
- Same equality check for loop detection
We need to keep the versionId format. But i'm sure we can do a little refactor to make the file clearer.
Waiting for approvalThe following approvals are needed before I can proceed with the merge:
|
| this._data.microVersionId = crypto.randomBytes(8).toString('hex'); | ||
| updateMicroVersionId(replicationGroupId: string) { | ||
| this._data.microVersionId = MicroVersionIdUtils.generate(replicationGroupId); | ||
| } |
There was a problem hiding this comment.
updateMicroVersionId is documented with @return itself but doesn't actually return this, unlike every other setter/updater in the class (e.g. setReplicationIsReplica, setReplicationIsNFS). Since this method is being rewritten anyway, this is a good opportunity to fix the mismatch.
| } | |
| updateMicroVersionId(replicationGroupId: string) { | |
| this._data.microVersionId = MicroVersionIdUtils.generate(replicationGroupId); | |
| return this; | |
| } |
— Claude Code
Review by Claude Code |
fbaf3fe to
3160d6b
Compare
| */ | ||
| updateMicroVersionId() { | ||
| this._data.microVersionId = crypto.randomBytes(8).toString('hex'); | ||
| updateMicroVersionId(replicationGroupId: string) { |
There was a problem hiding this comment.
Breaking API change: updateMicroVersionId() now requires a replicationGroupId argument. All downstream callers (CloudServer, Backbeat, etc.) that invoke this method without arguments will fail at compile time. Ensure coordinated updates across dependent repos.
— Claude Code
|
LGTM — one breaking change to flag: |
Issue: ARSN-578
3160d6b to
5f4839a
Compare
| this._data.microVersionId = crypto.randomBytes(8).toString('hex'); | ||
| updateMicroVersionId(replicationGroupId: string) { | ||
| this._data.microVersionId = MicroVersionIdUtils.generate(replicationGroupId); | ||
| } |
There was a problem hiding this comment.
@return itself in the JSDoc, but the method never returns this. Every other setter in this class (setReplicationIsNFS, setReplicationIsReplica, etc.) returns this for chaining. Since this PR is already editing the method and its JSDoc, either add the missing return this; or drop the @return tag.
| } | |
| updateMicroVersionId(replicationGroupId: string) { | |
| this._data.microVersionId = MicroVersionIdUtils.generate(replicationGroupId); | |
| return this; | |
| } |
— Claude Code
Review by Claude Code |
Replace the random 8-byte hex microVersionId with a 20-character timestamp-ordered identifier matching the versionId scheme, and add encode/decode/compare helpers in a new MicroVersionID module. Add an optional isReplica flag to ReplicationInfo to distinguish replica writes from user writes, enabling cascaded CRR.
Issue: ARSN-578