analysis/reactive: redesign incremental fixpoint with delete-then-rederive strategy #8276
Open
cristianoc wants to merge 5 commits into master from
Conversation
Move the BFS reachability computation, incremental/full recompute, and state management out of Reactive.ml into a dedicated private module with a clean interface.

Signed-off-by: Cristiano Calcagno <cristianoc@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Redesign the incremental fixpoint algorithm for reactive analysis. The transitive-closure logic is extracted from Reactive.ml into a dedicated ReactiveFixpoint module, rebuilt around a delete-then-rederive strategy with predecessor tracking, and validated through comprehensive invariant assertions and a replay-based evaluation over 56 real commits (showing 83% edge-traversal reduction vs full recomputation).
Incremental Fixpoint for Reactive Analysis
Introduction
When a source file changes in a codebase, analysis results that depend on it may become stale. Recomputing the entire analysis from scratch is correct but wasteful: typically only a small region of the dependency graph is affected. The incremental fixpoint algorithm described here maintains the set of reachable nodes in a dynamic directed graph, updating it efficiently as roots and edges change. It is the core propagation engine behind the reactive analysis server.
This report covers the problem definition, the algorithm and its key design ideas, correctness instrumentation, and empirical results from a replay-based evaluation on a real codebase.
Problem Definition
The algorithm maintains transitive reachability over a directed graph that evolves over time.
State. The graph is defined by:

- R — the set of root nodes,
- E — the edge map, giving each node k its list of successors E(k),
- C — the current set of reachable nodes.

The fundamental invariant is: C = Reach(R, E), the set of nodes reachable from some root in R by following edges in E.
Updates. The graph changes in discrete waves. Each wave supplies:
- Root updates: `(k, unit option)` entries. `Some ()` adds a root; `None` removes one.
- Edge updates: `(k, k list option)` entries. `Some succs` sets the successors of k; `None` removes all edges from k.

Output. Each wave returns a list of delta entries describing how C changed:
- `(k, Some ())` — node k became reachable (was not in C before, is now).
- `(k, None)` — node k became unreachable (was in C before, is no longer).

The output represents the net effect of the wave: if a node is tentatively removed and then recovered within the same wave, no output is emitted for it.
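Assuming string node keys, the wave interface can be sketched with the following OCaml types; the names are illustrative, not taken from the real module:

```ocaml
type key = string

(* Root updates: Some () adds a root; None removes one. *)
type root_update = key * unit option

(* Edge updates: Some succs replaces k's successors; None removes all
   edges out of k. *)
type edge_update = key * key list option

(* Output deltas: Some () means k became reachable; None means it
   became unreachable. *)
type delta = key * unit option

(* One wave from the worked example later in the report: remove the
   edge A->B by setting A's successors to just [X]. *)
let wave_edges : edge_update list = [ ("A", Some [ "X" ]) ]
let expected_output : delta list = [ ("B", None) ]
```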
Data Structures
The algorithm maintains four mutable structures:
- `current` — the reachable set C
- `roots` — the root set R
- `edge_map` — the successor map E
- `pred_map` — the predecessor map

The predecessor map (`pred_map`) is the key auxiliary structure. It enables efficient support checking: given a tentatively deleted node, we can quickly determine whether any live predecessor still points to it, without scanning the entire graph. The cost of maintaining `pred_map` is proportional to edge updates: each edge addition or removal requires a constant-time insert or delete in the predecessor set of the target node. This is amortized across normal edge-update processing and adds no asymptotic overhead.

Algorithm
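Before describing the algorithm, here is a sketch of the state it manipulates. Hash-table representations and helper names are assumptions for illustration; the snippet also shows the constant-time `pred_map` maintenance described above:

```ocaml
type key = string

(* The four mutable structures described in the report. *)
type state = {
  current : (key, unit) Hashtbl.t;       (* reachable set C *)
  roots : (key, unit) Hashtbl.t;         (* root set R *)
  edge_map : (key, key list) Hashtbl.t;  (* successors E(k) *)
  pred_map : (key, (key, unit) Hashtbl.t) Hashtbl.t;  (* predecessors *)
}

let make () =
  {
    current = Hashtbl.create 16;
    roots = Hashtbl.create 16;
    edge_map = Hashtbl.create 16;
    pred_map = Hashtbl.create 16;
  }

(* Adding an edge updates edge_map and does a constant-time insert into
   the target's predecessor set. *)
let add_edge st src dst =
  let succs = Option.value (Hashtbl.find_opt st.edge_map src) ~default:[] in
  Hashtbl.replace st.edge_map src (dst :: succs);
  let preds =
    match Hashtbl.find_opt st.pred_map dst with
    | Some p -> p
    | None ->
      let p = Hashtbl.create 4 in
      Hashtbl.replace st.pred_map dst p;
      p
  in
  Hashtbl.replace preds src ()
```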
Design Intuition
When edges or roots are removed, some previously reachable nodes may lose all paths from the roots. The challenge is determining which nodes are still reachable without recomputing from scratch.
The algorithm uses a delete-then-rederive strategy:
Delete pessimistically — starting from invalidation points (removed roots, targets of removed edges), propagate tentative deletions forward through the old edges. This marks every node that might have lost reachability.
Rederive optimistically — scan the tentatively deleted nodes and recover those that still have support in the new state: either they are roots, or at least one live predecessor still points to them. Recovered nodes propagate support to their successors.
Expand forward — from newly added roots and nodes with new outgoing edges, discover newly reachable nodes via forward BFS.
This two-pass approach (pessimistic deletion followed by optimistic recovery) is both simple and correct: it avoids the complexity of trying to determine "true" deletions upfront, while ensuring the final state is exactly the least fixed point.
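The optimistic recovery pass (step 2) can be sketched as follows. The list-based representations and the function name are assumptions for illustration, not the real module's code:

```ocaml
(* Optimistic recovery: a tentatively deleted node is supported if it is
   a root, or some live predecessor still points to it. Recovered nodes
   propagate support forward to their successors. *)
let rederive ~roots ~edges ~pred_map
    ~(current : (string, unit) Hashtbl.t) ~(deleted : string list) =
  let succs k = try List.assoc k edges with Not_found -> [] in
  let preds k = try List.assoc k pred_map with Not_found -> [] in
  let tentatively_deleted = Hashtbl.create 16 in
  List.iter (fun k -> Hashtbl.replace tentatively_deleted k ()) deleted;
  let supported k =
    List.mem k roots
    || List.exists
         (fun p -> Hashtbl.mem current p && List.mem k (succs p))
         (preds k)
  in
  let rec recover k =
    if
      Hashtbl.mem tentatively_deleted k
      && (not (Hashtbl.mem current k))
      && supported k
    then begin
      Hashtbl.replace current k ();
      (* a recovered node may in turn support its successors *)
      List.iter recover (succs k)
    end
  in
  List.iter recover deleted
```

On the report's worked example (removing A->B), this pass recovers D through its live predecessor X while leaving B deleted.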
Phases in Detail
The `apply` function executes one wave in the following phases:

Phase 1 — Analyze changes. Before modifying any state, examine each edge update against the current edge map. For each updated source node, compute which successor edges were removed (`removed_targets`) and whether any new edges were added (`has_new_edge`). This pre-analysis drives the deletion and expansion phases.

Phase 2 — Tentative deletion. Seed a deletion queue with:

- removed roots that are currently in C, and
- targets of removed edges (`removed_targets`) that are currently in C.
Propagate forward through the old edges: for each node popped from the queue, enqueue its old successors that are still in C. This produces `deleted_nodes`, a set of nodes that may have lost reachability.

The deletion uses old edges deliberately: we need to follow the paths that used to provide reachability, since those are the paths that may have broken.
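Phase 2's forward propagation over the old edges might look like this; the representations are assumptions for illustration:

```ocaml
(* Tentative deletion: BFS forward from the seeds through the OLD edge
   map, marking every currently reachable node that might have lost
   reachability. *)
let tentative_delete ~old_edges ~(current : (string, unit) Hashtbl.t)
    ~(seeds : string list) =
  let succs k = try List.assoc k old_edges with Not_found -> [] in
  let deleted = Hashtbl.create 16 in
  let queue = Queue.create () in
  let mark k =
    if Hashtbl.mem current k && not (Hashtbl.mem deleted k) then begin
      Hashtbl.replace deleted k ();
      Queue.add k queue
    end
  in
  List.iter mark seeds;
  while not (Queue.is_empty queue) do
    let k = Queue.pop queue in
    List.iter mark (succs k)
  done;
  deleted
```

Seeding with B on the report's worked example marks both B and D as tentatively deleted, even though D still has support through X; Phase 3 is what recovers it.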
Next, apply root and edge updates to the state (updating `roots`, `edge_map`, and `pred_map`), then remove all deleted nodes from `current`.

Phase 3 — Re-derivation. Scan the deleted nodes. A deleted node k is supported if:

- k is in the updated root set, or
- some predecessor p of k is in `current` and k is in E(p).

Supported nodes are added back to
`current` and their successors are checked recursively. This phase only adds nodes back; it never removes them. When it converges, every deleted node that had an alternative reachability path has been recovered.

Phase 4 — Expansion. Starting from newly added roots and nodes whose edge updates introduced new successors, perform forward BFS to discover all newly reachable nodes. Each newly reached node is added to `current` and an add-delta is emitted — unless the node was tentatively deleted earlier in this same wave (in which case it is a no-op recovery, not a net change).

Phase 5 — Emit removals. For each node in `deleted_nodes` that is not in the final `current`, emit a remove-delta. These are the nodes that were genuinely lost.

Worked Example
Consider a graph with root A and edges A->B, A->X, B->D, X->D:
The reachable set is C = {A, B, X, D}.
Wave: remove the edge A->B.
Analyze: edge A->B is removed; `removed_targets = [B]`, the target of the dropped edge.

Tentative deletion: seed the delete queue with B (target of removed edge, currently in C). Propagate forward through old edges: B->D, so D is also tentatively deleted.
`deleted_nodes = {B, D}`. Remove B and D from `current`. Now `current = {A, X}`.

Re-derivation: check each deleted node for support.

- B is not a root, and after the update no predecessor of B remains, so B is not supported.
- D has predecessor X, which is in `current` with D in E(X), so D is supported and added back to `current`.

After re-derivation: `current = {A, X, D}`.

Expansion: no new roots or new edges added. Nothing to expand.
Emit removals: `deleted_nodes = {B, D}`. D is back in `current`, so no removal for D. B is not in `current`; emit `(B, None)`.

Output: `[(B, None)]` — only B was lost. D survived because it had an alternative path through X.

Complexity
Space is O(|C| + |E|) for the reachable set, edge map, and predecessor map.
Correctness Instrumentation
Debug invariants are implemented in the `Invariants` submodule and enabled by setting `RESCRIPT_REACTIVE_FIXPOINT_ASSERT=1`. They validate the algorithm's internal consistency after each phase:

- `removed_targets` and `has_new_edge` match the actual diff between old and new successor lists.
- `current` equals the pre-wave reachable set minus `deleted_nodes`.
- Re-derivation only adds nodes back to `current` after Phase 3.
- Emitted removals are exactly `deleted_nodes \ current` (nodes deleted but not recovered).
- The final `current` equals `Reach(R, E)` computed by a fresh BFS, and the full output (both adds and removes) matches the set difference between pre-wave and post-wave `current`.

These checks add significant overhead and are intended for testing and validation, not production use.
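The last check, comparing `current` against `Reach(R, E)` from a fresh BFS, can be illustrated on the worked example with a standalone sketch (not the module's code):

```ocaml
(* Reach(R, E): plain BFS from all roots over the edge map, returning a
   sorted list of reachable nodes. *)
let reach roots edges =
  let succs k = try List.assoc k edges with Not_found -> [] in
  let seen = Hashtbl.create 16 in
  let queue = Queue.create () in
  let visit k =
    if not (Hashtbl.mem seen k) then begin
      Hashtbl.replace seen k ();
      Queue.add k queue
    end
  in
  List.iter visit roots;
  while not (Queue.is_empty queue) do
    List.iter visit (succs (Queue.pop queue))
  done;
  List.sort compare (Hashtbl.fold (fun k () acc -> k :: acc) seen [])

let () =
  (* Before the wave: A->B, A->X, B->D, X->D. *)
  let before =
    reach [ "A" ] [ ("A", [ "B"; "X" ]); ("B", [ "D" ]); ("X", [ "D" ]) ]
  in
  assert (before = [ "A"; "B"; "D"; "X" ]);
  (* After removing A->B, D stays reachable through X. *)
  let after =
    reach [ "A" ] [ ("A", [ "X" ]); ("B", [ "D" ]); ("X", [ "D" ]) ]
  in
  assert (after = [ "A"; "D"; "X" ])
```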
Empirical Evaluation
Setup
The algorithm was evaluated using a replay-based benchmark. The workload replays 56 sequential commits from a real project (Hyperindex), with invariant assertions and metrics collection enabled throughout. For each commit, the script runs the incremental (server-backed) analysis and a cold baseline that recomputes the full analysis from scratch (with `RESCRIPT_REANALYZE_NO_SERVER=1`), comparing both for timing and failure behavior.

Average change size per commit: 1.3 files, 5.9 insertions, 47.2 deletions (range: 1–5 changed files per commit).
Results
Fixpoint internal metrics (aggregated over 56 waves):
Interpretation
Re-derivation is central, not exceptional. 61% of waves trigger the delete-then-rederive path. The worked example above (where D survives deletion because of an alternative path through X) is the common case in real dependency graphs, not a corner case.
Edge-work savings are substantial. The incremental algorithm traverses only 16.7% of the edges that a full recomputation would visit — an 83% reduction. This confirms that localized source changes produce localized graph updates.
Node-work savings are modest. Node-side bookkeeping (queue operations, hash table lookups) runs at 95.5% of the full baseline. This is likely because the node-work counter counts queue pops and hash table membership checks, which are O(1) operations that dominate BFS cost regardless of how many edges are skipped. In other words, the algorithm successfully avoids traversing most edges, but still touches most nodes during the deletion and re-derivation sweeps. Reducing this overhead would require more targeted invalidation — for example, bounding the deletion frontier using dominator information or topological depth — rather than optimizing individual operations.
Output is much smaller than input. 56,762 input entries produce only 5,241 output deltas — a 10.8x compression ratio. Most of the internal work cancels out, consistent with the high re-derivation rate.
Comparison to Alternatives
Several strategies exist for maintaining transitive reachability under updates:
Full recomputation: Rerun BFS/DFS from all roots after every change. Correct and simple, but wasteful when changes are localized. The empirical results show the incremental algorithm traverses only 16.7% of the edges that full recomputation would visit.
Reference counting: Track the number of paths reaching each node and remove it when the count drops to zero. This is efficient for pure deletions but fails in the presence of cycles (counts never reach zero) and does not naturally handle the case where a node loses one path but retains another through a different predecessor — precisely the re-derivation scenario that occurs in 61% of waves here.
DFS-based marking: Mark nodes as "dirty" and re-verify reachability via DFS from roots. This avoids false deletions but may revisit large portions of the graph when many nodes are marked dirty, offering less control over the re-derivation scope than the delete-then-rederive approach.
The delete-then-rederive strategy used here combines the simplicity of pessimistic deletion with efficient recovery via the predecessor map, avoiding the cycle problems of reference counting and the unbounded re-traversal of DFS marking.
Limitations
Conclusion
The incremental fixpoint algorithm maintains exact transitive closure through a delete-then-rederive strategy that is both simple to reason about and effective in practice. Assertion-enabled replay over 56 real commits confirms correctness, while metrics show an 83% reduction in edge traversals compared to full recomputation. The data also reveals that re-derivation is a routine operation (occurring in 61% of waves) rather than a rare edge case, validating the algorithm's two-pass design.
The primary optimization opportunity lies in reducing node-side bookkeeping overhead, which currently runs near the full-recomputation baseline despite the large edge-work savings.