fix(generation): stabilize prompt hashes across re-runs#62

Open
dmikushin wants to merge 1 commit into repowise-dev:main from dmikushin:fix/stabilize-prompt-hashes

Conversation

@dmikushin

Problem

Every repowise init re-generates all wiki pages from scratch, even when the codebase hasn't changed. The root cause is non-deterministic source_hash values: the SHA-256 is computed over the rendered Jinja2 prompt, and two context variables were unstable across runs.
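The failure mode is easy to reproduce in miniature: the hash is only as stable as the context fed into the template. A stdlib-only sketch (a format string stands in for the real Jinja2 prompt template; names are illustrative):

```python
import hashlib

# Stand-in for the rendered Jinja2 prompt (illustrative, not the real template).
TEMPLATE = "Dependencies: {deps}"

def source_hash(deps):
    prompt = TEMPLATE.format(deps=", ".join(deps))
    return hashlib.sha256(prompt.encode("utf-8")).hexdigest()

# Same dependency set, different iteration order -> different hash.
assert source_hash(["a.py", "b.py"]) != source_hash(["b.py", "a.py"])
# Sorting the context variable restores determinism.
assert source_hash(sorted(["b.py", "a.py"])) == source_hash(["a.py", "b.py"])
```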

Source of non-determinism 1: graph edge ordering

Files are parsed in parallel via ProcessPoolExecutor + as_completed, so the order in which nodes and edges are inserted into the NetworkX graph is non-deterministic. graph.predecessors() / graph.successors() return nodes in insertion order, so the dependents and dependencies lists in FilePageContext shuffled between runs, producing a different rendered prompt and therefore a different source_hash.

Fix: sort predecessors/successors before building FilePageContext in ContextAssembler.assemble_file_page.
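A minimal sketch of the fix (paths and variable names are illustrative, based on the description above):

```python
import networkx as nx

g = nx.DiGraph()
# Edges arrive in whatever order the parallel parsers happen to finish.
g.add_edge("src/b.py", "src/app.py")
g.add_edge("src/a.py", "src/app.py")
g.add_edge("src/app.py", "src/util.py")

# Sorting decouples the lists from graph insertion order.
dependents = sorted(g.predecessors("src/app.py"))
dependencies = sorted(g.successors("src/app.py"))

assert dependents == ["src/a.py", "src/b.py"]
assert dependencies == ["src/util.py"]
```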

Source of non-determinism 2: Louvain community IDs

nx.community.louvain_communities already receives seed=42, but the adjacency traversal order inside Louvain still depends on node insertion order (same root cause). Additionally, the community list returned by louvain_communities has no guaranteed order, so enumerate() assigned different integer IDs to the same community across runs.

Fix: before calling louvain_communities, rebuild a sorted copy of the undirected graph (g_stable) with nodes and edges added in alphabetical order. After the call, sort the returned community list by each community's lexicographically smallest member before enumerate().
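Put together, the stabilized community pass might look like this (a sketch; the function name and the exact canonicalization are assumptions, not the PR's literal code):

```python
import networkx as nx

def stable_communities(g: nx.DiGraph) -> dict:
    """Deterministic node -> community-ID mapping, independent of insertion order."""
    g_und = g.to_undirected()
    # Rebuild with nodes and edges in alphabetical order so Louvain's
    # adjacency traversal is reproducible. Endpoints are canonicalized
    # because undirected edges may be stored in either orientation.
    g_stable = nx.Graph()
    g_stable.add_nodes_from(sorted(g_und.nodes()))
    g_stable.add_edges_from(sorted(tuple(sorted((u, v))) for u, v in g_und.edges()))
    comms = nx.community.louvain_communities(g_stable, seed=42)
    # The returned list has no guaranteed order; sort by each community's
    # smallest member so enumerate() assigns the same integer IDs every run.
    comms = sorted(comms, key=min)
    return {node: i for i, comm in enumerate(comms) for node in comm}
```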

Impact

These two fixes make source_hash stable across re-runs for unchanged files, enabling the DB content cache (_db_content_cache keyed by source_hash) to skip redundant LLM calls and save API costs.

Testing

Added scripts/diagnose_hash_mismatch.py — a diagnostic script that:

  • Calls betweenness_centrality() and community_detection() twice and reports any differing values
  • For each cached file_page in wiki.db, renders the prompt fresh and compares SHA-256 with the stored source_hash
  • Reports whether a mismatch is caused by dep_summaries or another factor, with a unified diff

Run from the target repo directory:

python3 scripts/diagnose_hash_mismatch.py /path/to/repo --max-pages 20
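The stored-vs-fresh comparison at the core of the script boils down to the following (a hedged sketch; the actual script also re-renders the prompt from live context and attributes any drift to dep_summaries or other variables):

```python
import difflib
import hashlib

def compare(stored_hash: str, stored_prompt: str, fresh_prompt: str) -> bool:
    """Return True if the freshly rendered prompt still matches the cached hash."""
    fresh_hash = hashlib.sha256(fresh_prompt.encode("utf-8")).hexdigest()
    if fresh_hash == stored_hash:
        return True
    # On mismatch, a unified diff pinpoints which context variable drifted.
    print("\n".join(difflib.unified_diff(
        stored_prompt.splitlines(), fresh_prompt.splitlines(),
        "stored", "fresh", lineterm="")))
    return False

assert compare(hashlib.sha256(b"p").hexdigest(), "p", "p") is True
```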

Checklist

  • context_assembler.py: sort predecessors/successors
  • graph.py: sorted graph copy + sorted community list before enumerate
  • scripts/diagnose_hash_mismatch.py: diagnostic tool

Graph edge ordering and community IDs were non-deterministic because
files are parsed in parallel (ProcessPoolExecutor + as_completed),
causing NetworkX node insertion order to vary between runs.

Changes:
- context_assembler: sort predecessors/successors before including them
  in FilePageContext so dependents/dependencies lists are identical
  across runs regardless of graph construction order
- graph: rebuild a sorted copy of the undirected graph before passing it
  to louvain_communities so adjacency traversal order is reproducible;
  also sort the returned community list by each community's smallest
  member before assigning integer IDs via enumerate()

Adds scripts/diagnose_hash_mismatch.py to verify the fix and identify
any remaining sources of hash instability (dep_summaries, betweenness
sampling, etc.).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@dmikushin force-pushed the fix/stabilize-prompt-hashes branch from da425be to 6fdf4ee on April 10, 2026 at 12:32
@RaghavChamadiya
Collaborator

Nice analysis and clean fix. The sorted predecessors/successors and stabilized Louvain ordering both make sense, and the PR description is really well written.

A few things before I merge:

  1. Betweenness centrality: for large repos (above the threshold), betweenness_centrality uses k=500 random samples without a seed. If betweenness values feed into the rendered prompt, that's a third source of non-determinism you haven't addressed. Can you check whether that's the case? If so, adding seed=42 there too would complete the fix.

  2. Unit tests: the core changes (sorted edges, sorted communities) don't have tests that verify stability across different insertion orders. Something like building the same graph in two different orders and asserting identical community IDs would really lock this in and catch regressions in CI.

  3. Diagnostic script: scripts/diagnose_hash_mismatch.py imports private internals like _run_ingestion which will break on refactors. Fine to keep it as an unsupported debugging tool in scripts/, but a proper unit test would be more valuable long term.
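On point 1, NetworkX's sampled betweenness does accept a seed, which pins the pivot sample (the k and graph here are illustrative, not the project's actual threshold):

```python
import networkx as nx

g = nx.gnm_random_graph(50, 200, seed=1)  # any fixed test graph

# Unseeded k-sampling picks different pivots each run; seeding fixes them.
bc1 = nx.betweenness_centrality(g, k=20, seed=42)
bc2 = nx.betweenness_centrality(g, k=20, seed=42)
assert bc1 == bc2  # identical across runs once seeded
```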

Happy to merge once (1) and (2) are addressed.
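A regression test along the lines of item 2 could look like this (a sketch; communities_stable stands in for the project's fixed get_communities, whose name is an assumption):

```python
import networkx as nx

def communities_stable(g: nx.Graph) -> dict:
    # Stand-in for the fixed get_communities (naming is an assumption).
    gs = nx.Graph()
    gs.add_nodes_from(sorted(g.nodes()))
    gs.add_edges_from(sorted(tuple(sorted(e)) for e in g.edges()))
    comms = sorted(nx.community.louvain_communities(gs, seed=42), key=min)
    return {n: i for i, c in enumerate(comms) for n in c}

def test_ids_independent_of_insertion_order():
    edges = [("a", "b"), ("b", "c"), ("c", "a"),
             ("x", "y"), ("y", "z"), ("z", "x")]
    g1, g2 = nx.Graph(), nx.Graph()
    g1.add_edges_from(edges)
    g2.add_edges_from((v, u) for u, v in reversed(edges))  # scrambled order
    assert communities_stable(g1) == communities_stable(g2)

test_ids_independent_of_insertion_order()
```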

Collaborator

@swati510 left a comment


Good fix, the non-determinism was real. A couple of things worth looking at:

  1. In graph.py, you rebuild the graph from sorted nodes/edges before Louvain. That works for stability but g_stable = nx.Graph() drops any node/edge attributes that were on the original graph. If downstream code reads attrs off these nodes after get_communities returns, it'll silently lose them. Safer to use nx.relabel_nodes on a copy, or sort in-place via a canonical representation. If no attrs matter here, a comment noting that would help future readers.

  2. scripts/diagnose_hash_mismatch.py has hardcoded paths to ~/forge/free-code and ~/forge/repowise in the docstring usage example. Either parameterize via argparse (take --repo-root) or drop the script once the fix is verified. 226 lines is a lot of one-off debug tooling to keep in-tree.
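On point 1, attributes survive the rebuild if they are copied explicitly; one way to do that while keeping the sorted order (a sketch, not the PR's code):

```python
import networkx as nx

g_und = nx.Graph()
g_und.add_node("a.py", loc=120)
g_und.add_edge("a.py", "b.py", weight=3)

# Rebuild in sorted order while carrying node/edge attributes along.
g_stable = nx.Graph()
g_stable.add_nodes_from(sorted(g_und.nodes(data=True)))
g_stable.add_edges_from(
    sorted((min(u, v), max(u, v), d) for u, v, d in g_und.edges(data=True)))

assert g_stable.nodes["a.py"]["loc"] == 120
assert g_stable.edges["a.py", "b.py"]["weight"] == 3
```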

# via ProcessPoolExecutor + as_completed → non-deterministic insertion
# order in the main graph).
g_und = g.to_undirected()
g_stable = nx.Graph()
Collaborator


Confirm this is safe: nx.Graph() drops node/edge attributes present on g_und. If any downstream consumer reads attrs off these nodes after get_communities, this change silently loses them. If no attrs matter, worth a one-line comment saying so.

