fix(pb): optimize actor key allocation#4610
Conversation
|
PR Review: fix(pb): optimize actor key allocation NOTE: This PR is already merged. Feedback is provided for future reference. SUMMARY This PR eliminates two separate workflow activities (GenerateReservationId and ResolveTargetReplicas) by folding their work into a single LookupKeyOptimistic activity. The optimization is sound -- the target replicas are needed to scope the optimistic read anyway, so fetching them there (and generating the reservation ID in the same pass) removes two unnecessary round trips. The supporting refactor moving resolve_quorum_members to a shared util is clean. POSITIVES Utility extraction (utils.rs): resolve_active_quorum_members is the right place for this logic. Both epoxy_propose and epoxy_kv_get_optimistic now share the same validated quorum resolution path, and the rename from resolve_quorum_members to resolve_active_quorum_members is more descriptive. Test migration: All 6 unit tests were correctly moved and renamed alongside the function. No coverage was lost. Documentation comment on target_replicas: The doc comment clearly explains the caller-side scope stability invariant, which is important for a subtle correctness property. Enum output design: LookupKeyOptimisticOutput with serde rename_all snake_case is idiomatic and the serialized form is stable. CONCERNS In-flight workflow compatibility: GenerateReservationId and ResolveTargetReplicas are removed entirely without a version guard. Workflows that already completed LookupKeyOptimistic (returning Option) but have not yet replayed through GenerateReservationId or ResolveTargetReplicas will fail on replay after deployment. The old ResolveTargetReplicas was added behind .v(2), so there is prior art for using version gates here. Worth confirming this codepath had no in-flight workflows at deploy time, or that the workflow engine handles missing activity types gracefully. Minor - observability loss: The removed ResolveTargetReplicas activity included timing instrumentation (tracing::debug! with op_duration_ms). The inlined replica fetch in lookup_key_optimistic has no equivalent timing. If this op is slow in production it will be invisible in traces. Minor - into_iter().next() vs .first().copied(): The forwarding branch now uses target_replicas.into_iter().next() where the old code used target_replicas.first().copied(). Both are correct here (the Vec is consumed before the early return), but .first().copied() reads more clearly as "get the first element" without the ownership implication. Not a bug, just a style note. OVERALL Clean optimization. The two activity round trips removed represent real distributed latency savings for every key reservation. The in-flight compatibility question is the only thing worth double-checking on a future similar change. |
c8e4e6b to
f31092e
Compare
94f4263 to
4064862
Compare
f31092e to
0351bf4
Compare
39d337d to
9198ffc
Compare
0351bf4 to
ccfb97a
Compare
9198ffc to
bd6ef15
Compare
ccfb97a to
4573f6b
Compare
bd6ef15 to
e2a69f1
Compare
4573f6b to
2efef31
Compare
engine/packages/epoxy/src/utils.rs
Outdated
| .collect() | ||
| } | ||
|
|
||
| /// Returns the quorum members to use for this proposal. This supports scoped proposals because |
c0cc63d to
d4d4a75
Compare
e2a69f1 to
40e5746
Compare
40e5746 to
be5f01c
Compare
d4d4a75 to
079bc44
Compare
Merge activity
|
079bc44 to
bebc8ac
Compare

Description
Please include a summary of the changes and the related issue. Please also include relevant motivation and context.
Type of change
How Has This Been Tested?
Please describe the tests that you ran to verify your changes.
Checklist: