fix: emit UpdateEvent::BroadcastEmitted telemetry events#3643
Conversation
The BroadcastEmitted event type was fully defined with serialization support but never emitted. Add NetEventLog::update_broadcast_emitted() constructor and emit it in both simulation (broadcast_state_to_peers) and production (handle_broadcast_state_change) broadcast paths. Fixes freenet#3622 Signed-off-by: majiayu000 <1835304752@qq.com>
Rule Review: fix: PR missing new
|
sanity
left a comment
There was a problem hiding this comment.
Comprehensive PR Review: #3643
Summary
- PR Title: fix: emit UpdateEvent::BroadcastEmitted telemetry events
- Type: fix (telemetry wiring)
- CI Status: ✅ 2/2 passing
- Linked Issues: #3622
Code-First Analysis
Independent Understanding: Adds a new NetEventLog::update_broadcast_emitted() constructor and emits UpdateEvent::BroadcastEmitted telemetry events in two broadcast code paths (production and simulation). The event type was fully defined but never emitted.
Stated Intent: Matches implementation. The PR correctly wires up the pre-existing but never-fired event.
Alignment: Mostly aligned, but the production path has accuracy issues that partially undermine the telemetry value.
Testing Assessment
Coverage Level: Insufficient for an untrusted contributor
| Test Type | Status | Notes |
|---|---|---|
| Unit | No unit test for update_broadcast_emitted() constructor |
|
| Integration | Simulation tests exercise the path but don't assert event emission | |
| Simulation | N/A | Existing StateVerifier tests validate event processing but not emission |
Missing Tests: No test verifies that BroadcastEmitted events are actually produced. The testing_impl.rs summary currently ignores this event variant (no-op match arm at line 2880). A future refactor could silently break emission with no test catching it.
Skeptical Findings
Risk Level: Medium (misleading telemetry, not behavioral)
| Concern | Severity | Location | Details |
|---|---|---|---|
| Fabricated Transaction ID | High | p2p_protoc.rs:~4263 (production path) |
Transaction::new() creates an orphaned ID uncorrelatable with the actual update operation. Downstream telemetry consumers joining on transaction ID will get no matches. The simulation path correctly reuses update_tx. |
Misleading broadcasted_to |
Medium | p2p_protoc.rs:~4269 (production path) |
Uses target_result.targets.len() (enqueued count), not actual send success count. Will always report 100% delivery success even when sends fail, misleading broadcast failure debugging. |
target_result.targets.clone() in hot path |
Low | p2p_protoc.rs:~4267 (production path) |
Clones up to 512 PeerKeyLocation elements (~50KB) purely for telemetry on every broadcast. Consider recording only the count or a sample. |
.clone() on new_state |
Low | p2p_protoc.rs:~4270 (production path) |
Not needed — new_state has no further uses in non-simulation builds. Could be a move. (Note: WrappedState is Arc-based, so clone is just a refcount bump — cheap.) |
Verified safe:
- No duplicate emission with deprecated
UpdateMsg::Broadcastingpath (different trigger conditions) Transaction::new()uses ULID — no collision risk with real transactions in state machine mapsTelemetryReporter::send_eventusestry_send(non-blocking);EventRegisteruses bounded.send().awaitbut this is a pre-existing patternWrappedState::clone()isArcrefcount bump, not deep copy
Big Picture Assessment
Goal Alignment: Partial — addresses #3622 but production telemetry quality is poor
Anti-Patterns Detected: Fabricated transaction ID
Removed Code Concerns: None (purely additive: +61/-0)
Scope Assessment: Focused, no creep
Recommendations
Must Fix (Blocking)
-
Document the fabricated transaction ID limitation. The current comment says "Best-effort: broadcasted_to = target count since actual delivery is async" but doesn't mention the uncorrelatable transaction ID. Add a clear comment that the production path's transaction ID is synthetic and cannot be joined with downstream
BroadcastReceived/BroadcastAppliedevents. This is critical so telemetry dashboard developers don't waste time debugging why joins fail. -
Clarify
broadcasted_tosemantics. Either:- Rename/document that it means "enqueued" not "sent" in the production path, OR
- Consider not emitting the event in the production path at all until
BroadcastQueuecan report actual delivery results
Should Fix (Important)
-
Add a basic test that verifies
BroadcastEmittedevents are actually produced in the simulation path. The existingtesting_impl.rsignores this variant. Without a test, a future refactor can silently break emission. -
Avoid cloning
target_result.targetsin the production path — record only the count (the individual peer list can be reconstructed fromBroadcastReceivedevents on the receiving end).
Consider (Suggestions)
-
Move
new_stateinstead of cloning in the production path (it's the last use in non-simulation builds). Minor since it'sArc-based. -
Long-term: Thread the original update operation's transaction through
NodeEvent::BroadcastStateChangeso the production path can emit a correlatable transaction ID.
Verdict
Not ready to merge — needs changes. The core implementation is sound and follows codebase patterns, but the production path emits telemetry that is actively misleading (fake TX ID + inaccurate delivery count). At minimum, these limitations must be clearly documented so downstream consumers know what they're getting. Ideally, the production path should either emit accurate data or defer emission until it can.
[AI-assisted - Claude]
- Document production path telemetry limitations: synthetic TX ID (uncorrelatable with downstream events), broadcasted_to = enqueued count (not delivery confirmation) - Stop cloning target_result.targets in production path, pass empty vec since peer list is reconstructable from BroadcastReceived events - Move new_state instead of cloning in production path (last use) - Track UpdateEvent::BroadcastEmitted in testing_impl.rs summary stats (was previously a no-op match arm) Signed-off-by: majiayu000 <1835304752@qq.com>
Add assertion in test_subscription_broadcast_propagation that verifies UpdateEvent::BroadcastEmitted telemetry events are actually produced during simulation. Without this, a refactor could silently break emission with no test catching it. Add EventKind::is_update_broadcast_emitted() helper since UpdateEvent is pub(crate) and not accessible from integration tests. Signed-off-by: majiayu000 <1835304752@qq.com>
|
Addressed in the last two commits: Must-fix items (both done in 0b5a55c):
Should-fix items: Consider items: |
sanity
left a comment
There was a problem hiding this comment.
All previous review feedback has been addressed:
- Synthetic TX ID — now clearly documented in block comment
broadcasted_tosemantics — documented as enqueued count, not delivery confirmation- Avoid cloning targets — production path passes empty vec, simulation path clones (acceptable for telemetry)
- Move instead of clone —
new_statemoved in production path - Test added —
test_subscription_broadcast_propagationassertsBroadcastEmitted > 0 - Summary stats —
broadcasts_emittedcounter tracked intesting_impl.rs
4-perspective review (code-first, testing, skeptical, big-picture) found no blocking issues. Purely additive telemetry wiring (+102/-3) that closes the observability gap identified in #3622.
[AI-assisted - Claude]
|
Adding [AI-assisted - Claude] |
|
Adding The Claude Rule Review warning asks for a dedicated new
A standalone test function would be strictly additive at this point. Overriding to unblock merge. [AI-assisted - Claude] |
Problem
The
UpdateEvent::BroadcastEmittedtelemetry event type is fully defined with JSON serialization, state verifier support, and WebSocket streaming, but is never actually emitted during broadcast operations. The old pathway viaUpdateMsg::Broadcastingis deprecated.Solution
Added
NetEventLog::update_broadcast_emitted()constructor intracing.rs, following the same pattern asupdate_broadcast_receivedandupdate_broadcast_applied. Populatesupstream/senderfrom own location, computesstate_hashviastate_hash_short().Emit in simulation path: In
broadcast_state_to_peers(), after the per-target send loop and before the existingBroadcastDeliverySummaryemission. Uses the actualsend_successcount forbroadcasted_to.Emit in production path: In
handle_broadcast_state_change(), inside the#[cfg(not(feature = "simulation_tests"))]block after the enqueue loop. Usestargets.len()as best-effortbroadcasted_tosince actual delivery is async viaBroadcastQueue.Testing
cargo check— compiles cleanlycargo fmt --check— no formatting issuescargo clippy --locked -- -D warnings— no warningscargo test -p freenet --lib— all 2008 tests passStateVerifiertests already validateBroadcastEmittedevent processingFixes
Closes #3622