Improve context extraction and test discovery performance#1846
Improve context extraction and test discovery performance#1846aseembits93 wants to merge 2 commits intomainfrom
Conversation
|
Claude finished @aseembits93's task in 7m 55s —— View job PR Review Summary
Prek Checks✅ Code Review1. Naming convention violation —
|
| File | Coverage |
|---|---|
codeflash/discovery/discover_unit_tests.py |
75% |
codeflash/languages/python/context/code_context_extractor.py |
71% |
New code paths with missing coverage:
discover_unit_tests.py:186–188— theOperationalError/ read-only DB fallback inflush()is not tested- The
pruned_source_by_keyandsource_by_filecache hit paths inprocess_file_context(lines ~292–297, ~279–284) appear to be exercised implicitly via the Jedi project reuse test, but there's no dedicated test for the source/pruned-source caching layers
Last updated: 2026-03-16
|
@claude review it |
|
Claude finished @aseembits93's task in 3m 19s —— View job PR Review: Improve context extraction and test discovery performance
Overall AssessmentThe PR achieves its stated goals with good test coverage. A few correctness concerns and design questions are worth addressing before merge.
|
| Severity | Issue |
|---|---|
| 🔴 | Recursive functions may now appear as their own helpers in get_function_sources_from_jedi |
| 🟡 | @cache on _get_jedi_project is a process-lifetime singleton with no eviction |
| 🟡 | writes_enabled = False on first flush failure kills all subsequent cache writes |
| 🟡 | `jedi_project: object |
| 🟢 | All other changes are correct and well-structured |
|
that |
|
@KRRT7 it's an experimental PR which I opened via the codex desktop app, i'm suspicious too. surprised though that it passes our CI (except the windows test) |
|
@claude fix the failing windows unit tests |
|
Claude finished @aseembits93's task in 5m 23s —— View job Fixing failing Windows unit tests
Root CauseThe new The Fix (commit
|
…dering On Windows, Path.resolve() expands 8.3 short names for directories but not always for files. After project_root_path.resolve() converts RUNNER~1 to runneradmin, file_path.resolve() may still return the 8.3 form, causing relative_to() to raise ValueError and silently returning empty target_code. Wrap the cosmetic reordering block in try/except to match the existing pattern in process_file_context, preventing the crash without any correctness loss. Co-authored-by: Aseem Saxena <aseembits93@users.noreply.github.com>
|
Claude finished @KRRT7's task in 2m 26s —— View job PR #1846 vs PR #1660: Deep Comparison
TL;DRThese two PRs attack the same performance bottleneck (redundant work in Scope
|
| Aspect | PR #1846 | PR #1660 |
|---|---|---|
| Mechanism | Memoization cache (runtime deduplication) | Structural refactor (architectural deduplication) |
| Correctness risk | Cache key might miss a relevant dimension (e.g., helper_functions not in key) |
Function signature changes propagate widely |
| Code complexity | Adds 3 dicts + cache-check boilerplate at every call site | Removes code overall (-1232 lines net in code_context_extractor.py) |
| Cross-call safety | ContextProcessingCache is scoped per get_code_optimization_context call (safe) |
cst.Module is passed directly (no cross-call leakage) |
| Jedi Project caching | @cache on _get_jedi_project (process-lifetime singleton) |
No equivalent (not part of PR #1660's context extraction refactor) |
discover_unit_tests.py — Different Purposes, No Overlap
| PR #1846 | PR #1660 | |
|---|---|---|
| What changed | Batch SQLite writes (flush()), O(1) name lookup dict, eliminate one script.get_names() call |
Added existing_unit_test_count() function (22 lines, 0 deleted) |
| Where | Inside process_test_files / TestsCache |
New standalone function |
| Purpose | Reduce I/O overhead during test discovery | Count tests per function for ranking boost |
| Conflict | None — completely different parts of the file | None |
These changes do not conflict. PR #1660's existing_unit_test_count() sits alongside the TestsCache class without touching it. PR #1846's batching of TestsCache.insert_test() is independent.
Merge Order Implications
If #1846 merges first, then #1660:
- PR feat: improve function ranking with reference graph and test-based boosting #1660's
extract_all_contexts_from_filesrefactor will structurally eliminate most of whatContextProcessingCachewas protecting against. The cache will still work correctly (it's purely additive), but much of it becomes dead weight — the cache will never be hit because each file is now processed only once per pass anyway. - PR feat: improve function ranking with reference graph and test-based boosting #1660 will need to rebase and decide whether to keep
ContextProcessingCachefor the remaining token-limit fallback paths (wherebuild_testgen_contextis still called multiple times), or remove it. - The
_get_jedi_projectcache from Improve context extraction and test discovery performance #1846 is the one piece that PR feat: improve function ranking with reference graph and test-based boosting #1660 doesn't replicate — that part would still provide value.
If #1660 merges first, then #1846:
- PR Improve context extraction and test discovery performance #1846 will have a large merge conflict in
code_context_extractor.py(both PRs heavily touch the same file with fundamentally different structures). - The
ContextProcessingCacheand its threading throughprocess_file_context/extract_code_markdown_context_from_fileswould need to be retrofitted onto PR feat: improve function ranking with reference graph and test-based boosting #1660's newextract_all_contexts_from_filesarchitecture. - Most of PR Improve context extraction and test discovery performance #1846's context caching work would become moot, though the SQLite batching in
discover_unit_tests.pyand the Jedi project cache would still apply cleanly.
Summary Verdict
PR #1846 is a lightweight, low-risk, purely additive performance improvement that works on the current architecture. It provides real wins (fewer disk reads, fewer Jedi Project allocations, fewer SQLite round-trips) with minimal churn.
PR #1660 is a deep architectural investment — it refactors the context extraction system to be structurally correct (single-pass, no redundant parsing), alongside a large ranking overhaul. Once merged, it supersedes most of PR #1846's caching in code_context_extractor.py.
The two PRs are not redundant today because they target different bottlenecks, but they will become architecturally incompatible in code_context_extractor.py if both are in-flight and merge independently. The discover_unit_tests.py changes in both PRs are independent and can both land without conflict.
| Branch
|
Claude finished @KRRT7's task in 2m 57s —— View job Claude finished @KRRT7's task —— Updated Review: Post-#1660 Merge
Status: Needs RebaseWith #1660 now in
What Became Obsolete After #1660
These functions no longer exist in main in the same form. Main's
The plumbing needs to be redone against the new architecture, but the concept still applies (see below). What Remains Valuable and Should Be Ported1.
|
| Change | Status |
|---|---|
flush() / batched executemany in TestsCache |
✅ Clean |
writes_enabled + OperationalError fallback |
✅ Clean |
functions_to_optimize_by_name O(1) lookup |
✅ Clean |
all_defs derived from all_names (saves 1 Jedi call) |
✅ Clean |
4. New tests — partially salvageable
test_unit_test_discovery.pyadditions: all applicable, no conflictstest_code_context_extractor.pyadditions: need to be rewritten against the newextract_all_contexts_from_files/re_extract_from_cacheAPI
Recommended Path Forward
- Immediately: Cherry-pick the Windows
try/except ValueErrorfix tomainas a standalone patch (it's a real bug currently live in production) - For this PR: Rebase onto current
main, then:- Drop
ContextProcessingCache(superseded byextract_all_contexts_from_files) - Port
get_jedi_projectcache to main's 4 call sites - Keep all
discover_unit_tests.pychanges as-is - Rewrite the
test_code_context_extractor.pyadditions against the new API
- Drop
Previous review comments above are now superseded by this analysis.
| Branch
|
@aseembits93 sorry about this, I had been working on a similar thing for the call + reference graph so I will close this |
…ve_to bug Port valuable improvements from #1846 that remain applicable after #1660: - Cache jedi.Project instances via @cache to avoid recreating across 5 call sites - Fix unguarded relative_to() in get_code_optimization_context (Windows 8.3 paths) - Pre-group references by parent function in get_function_sources_from_jedi for O(1) lookup - Batch TestsCache writes with flush() + executemany instead of per-row commit - Gracefully disable cache writes on sqlite3.OperationalError - Build functions_to_optimize_by_name dict for O(1) fallback lookup in process_test_files - Derive all_defs from all_names via is_definition() to save a redundant Jedi call
…ve_to bug Port valuable improvements from #1846 that remain applicable after #1660: - Cache jedi.Project instances via @cache to avoid recreating across 5 call sites - Fix unguarded relative_to() in get_code_optimization_context (Windows 8.3 paths) - Pre-group references by parent function in get_function_sources_from_jedi for O(1) lookup - Batch TestsCache writes with flush() + executemany instead of per-row commit - Gracefully disable cache writes on sqlite3.OperationalError - Build functions_to_optimize_by_name dict for O(1) fallback lookup in process_test_files - Derive all_defs from all_names via is_definition() to save a redundant Jedi call
Summary
Testing