Skip to content

refactor: extract PythonFunctionOptimizer subclass#1699

Merged
KRRT7 merged 25 commits intomainfrom
extract-python-optimizer
Mar 2, 2026
Merged

refactor: extract PythonFunctionOptimizer subclass#1699
KRRT7 merged 25 commits intomainfrom
extract-python-optimizer

Conversation

@KRRT7
Copy link
Copy Markdown
Collaborator

@KRRT7 KRRT7 commented Mar 2, 2026

Summary

  • Extracts Python-specific logic from FunctionOptimizer into a PythonFunctionOptimizer subclass in languages/python/function_optimizer.py
  • Base class gets no-op/default hooks; subclass overrides 6 methods with the current Python-specific implementations
  • Optimizer.create_function_optimizer dispatches to the subclass when is_python()

I want to implement some new optimization strategies for Python, and this separation makes it easier to add Python-specific behavior without further bloating the base class.

Test plan

  • uv run ruff check + uv run ruff format --check on all 3 files
  • Import verification: from codeflash.languages.python.function_optimizer import PythonFunctionOptimizer
  • Optimizer and JS optimization flow tests pass

KRRT7 added 3 commits March 1, 2026 20:38
…/optimizer.py

Move AST parsing, code normalization, and imported module validation out of
the language-agnostic Optimizer class into dedicated functions.
Consolidate Python AST resolution through the shared helper instead of
importing get_first_top_level_function_or_method_ast directly.
…izer

Move 6 Python-specific methods into PythonFunctionOptimizer in
languages/python/function_optimizer.py. Base class gets no-op defaults;
Optimizer.create_function_optimizer dispatches to the subclass when
is_python().
KRRT7 and others added 2 commits March 1, 2026 21:30
…e ast to TYPE_CHECKING

- Base class keeps the language-routing replacement logic (used by both
  Python and JS); Python subclass adds unused-helper revert on top via super()
- Tests that exercise Python-specific replace+revert use PythonFunctionOptimizer
- Move `ast` to TYPE_CHECKING in optimizer.py (fixes prek)
@codeflash-ai
Copy link
Copy Markdown
Contributor

codeflash-ai bot commented Mar 2, 2026

⚡️ Codeflash found optimizations for this PR

📄 3,869% (38.69x) speedup for PythonFunctionOptimizer._resolve_function_ast in codeflash/languages/python/function_optimizer.py

⏱️ Runtime : 83.9 milliseconds 2.11 milliseconds (best of 90 runs)

A dependent PR with the suggested changes has been created. Please review:

If you approve, it will be merged into this PR (branch extract-python-optimizer).

Static Badge

KRRT7 and others added 4 commits March 1, 2026 21:49
Move the Python-specific test instrumentation branch (~80 lines using
inject_profiling_into_existing_test) to the subclass. Base class keeps
the JS/TS branch using language_support.instrument_existing_test as the
default.
Move async decorator instrumentation (4 sites) and codeflash capture
instrumentation (2 sites) into overridable hooks. Base class no-ops;
PythonFunctionOptimizer provides the implementations.
PythonSupport.instrument_existing_test() already delegates to
inject_profiling_into_existing_test(), so the base class handles
Python correctly through the language_support abstraction.
@claude
Copy link
Copy Markdown
Contributor

claude bot commented Mar 2, 2026

PR Review Summary

Prek Checks

  • 1 issue found and fixed: Unused loop variable tests_in_file_list in function_optimizer.py:1616 — changed .items() iteration to iterate directly over dict keys.
  • Fix committed and pushed: 192cf0f8style: fix unused loop variable in function_optimizer.py
  • Prek now passes cleanly.

Mypy

  • 3 errors in codeflash/languages/python/function_optimizer.py:
    • Language import from codeflash.languages.base (pre-existing attr-defined issue — same error exists in the base function_optimizer.py)
    • dict missing type parameters on line 112 (follows base class pattern)
    • Return type mismatch on line 115 — caused by parse_line_profile_results declaring -> dict but actually returning tuple[dict, None] (pre-existing upstream issue)
  • ~200+ errors in function_optimizer.py base class — all pre-existing, not introduced by this PR

Code Review

1 bug found (inline comment posted):

  • codeflash/languages/python/function_optimizer.py:180-185 — Behavioral change in line_profiler_step: The return {"timings": {}, "unit": 0, "str_out": ""} on line 185 is inside the outer if isinstance(line_profile_results, TestResults) but outside the inner if not line_profile_results.test_results. This means all TestResults (even populated ones) return the empty dict, whereas the original _line_profiler_step_python only did this for empty results. Code suggestion provided.

No other critical issues, security vulnerabilities, breaking API changes, or documentation staleness found. Architecture docs were properly updated.

Test Coverage

File Stmts Miss Cover Status
languages/python/function_optimizer.py 82 44 46% ⚠️ NEW — below 75% threshold
languages/python/optimizer.py 30 21 30% ⚠️ NEW — below 75% threshold
optimization/function_optimizer.py 1104 894 19% Pre-existing low coverage
optimization/optimizer.py 429 341 21% Pre-existing low coverage
lsp/beta.py Not exercised by tests

Overall project coverage: 79% (2483 passed, 57 skipped, 8 failed in unrelated test_tracer.py)

Notes:

  • New files languages/python/function_optimizer.py (46%) and languages/python/optimizer.py (30%) are below the 75% coverage threshold for new code
  • The extracted code is largely integration-level logic (test execution, line profiling, AST resolution) that requires full optimization pipeline tests to exercise
  • Base class coverage (19-21%) is pre-existing and not affected by this refactor

Optimization PRs


Last updated: 2026-03-02T08:00:00Z

@codeflash-ai codeflash-ai deleted a comment from claude bot Mar 2, 2026
@codeflash-ai codeflash-ai deleted a comment from claude bot Mar 2, 2026
@codeflash-ai codeflash-ai deleted a comment from claude bot Mar 2, 2026
KRRT7 added 9 commits March 1, 2026 22:51
…izer

Replace 2 is_python() guard sites around coverage_critic() calls with a
should_check_coverage() hook, defaulting to False in the base class and
True in PythonFunctionOptimizer.
…timizer

The factory method was resolving the function AST from original_module_ast
before constructing PythonFunctionOptimizer, duplicating the _resolve_function_ast
hook. Now the hook handles resolution and the factory checks the result post-construction.
KRRT7 and others added 2 commits March 1, 2026 23:36
Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
- Add clarifying comment on shared replace_function_definitions_in_module import
- Remove misleading alias in test_unused_helper_revert.py, use PythonFunctionOptimizer directly
- Align base line_profiler_step return type to dict[str, Any]
- Fix latent bug: handle non-empty TestResults in line_profiler_step
@KRRT7
Copy link
Copy Markdown
Collaborator Author

KRRT7 commented Mar 2, 2026

@claude ensure codeflash/blob/main/.claude/rules/architecture.md Is up to date

Add languages/python/function_optimizer.py and languages/python/optimizer.py
to the directory tree and Key Entry Points table.

Co-authored-by: Kevin Turcios <KRRT7@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@KRRT7
Copy link
Copy Markdown
Collaborator Author

KRRT7 commented Mar 2, 2026

@claude why is topological-sort-worktree-optimization failing?

@KRRT7
Copy link
Copy Markdown
Collaborator Author

KRRT7 commented Mar 2, 2026

@claude fix #1699 (comment)

github-actions bot and others added 3 commits March 2, 2026 06:27
The refactoring changed the log message from 'Discovered' to 'Instrumented',
breaking the E2E test regex in end_to_end_test_utilities.py:225.

Co-authored-by: Kevin Turcios <KRRT7@users.noreply.github.com>
Log "Discovered N existing unit test files" after counting tests, and
"Instrumented N existing unit test files" after injecting profiling.
Python E2E harness matches "Discovered", JS harness matches "Instrumented".
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@KRRT7
Copy link
Copy Markdown
Collaborator Author

KRRT7 commented Mar 2, 2026

@claude double check that the style: fix unused loop variable in function_optimizer.py isn't a regression between our intention in this branch and what's on main

…y dict for all TestResults

The return statement was outside the inner if-not block, causing all TestResults
(even populated ones) to return an empty dict.

Co-authored-by: Kevin Turcios <KRRT7@users.noreply.github.com>
@KRRT7
Copy link
Copy Markdown
Collaborator Author

KRRT7 commented Mar 2, 2026

@claude work on the failures in js-esm-async-optimization

@claude
Copy link
Copy Markdown
Contributor

claude bot commented Mar 2, 2026

Claude finished @KRRT7's task in 13m 15s —— View job


Investigating js-esm-async-optimization CI failures

  • Get CI status and find the failing workflow run
  • Download and analyze job logs
  • Identify root cause of failures
  • Verify whether changes in this PR affect JS path

Analysis

The js-esm-async-optimization job was cancelled (not failed) — likely due to cancel-in-progress: true in the workflow's concurrency group when a newer push triggered a new run.

However, before cancellation, the test was indeed failing. Here's the detailed breakdown:

What happened during the run:

Candidate Behavior Tests Speedup
#1 Matched ✅ 0.999X (no improvement)
#2 Matched ✅ 1.001X (no improvement)
#3 Matched ✅ 1.000X (no improvement)
#4 Did NOT match ❌ N/A (used Promise.all → changed error semantics)

The AI generated 4 optimization candidates for processItemsSequential. Candidates 1-3 passed behavioral tests but showed zero speedup (~1.0X). Candidate 4 used Promise.all parallelization which is faster but changes error-handling behavior (sequential error propagation vs parallel), so it was correctly rejected. Result: "No optimizations found."

This is NOT caused by this PR. I verified that zero JS-related files were changed:

  • code_to_optimize/js/code_to_optimize_js_esm/ — no diff
  • tests/scripts/end_to_end_test_js_esm_async.py — no diff
  • tests/scripts/end_to_end_test_utilities_js.py — no diff
  • packages/codeflash/runtime/capture.js — no diff
  • codeflash/languages/javascript/ — no diff

I also audited the base class refactoring to confirm all JS code paths are preserved:

  • compare_candidate_results → same SQLite comparison logic for JS
  • should_skip_sqlite_cleanup → same BEHAVIOR or iteration==0 logic for non-Python
  • instrument_async_for_mode → correctly no-op for JS (was is_python() guarded before)
  • collect_async_metrics → correctly returns (None, None) for JS (was is_python() guarded before)
  • parse_line_profile_test_results → returns TestResults([]) for JS (same as before)

Root cause: processItemsSequential is inherently hard to optimize — the only meaningful optimization (parallelization via Promise.all) changes error semantics. The AI-generated candidates that preserve behavior show no speedup. This is a non-deterministic E2E test that depends on the AI finding a correct-and-faster optimization.

Recommendation: Re-run the workflow. This test is non-deterministic by nature (depends on AI-generated optimizations). The last run was cancelled before all 3 retries completed.

--- |

@KRRT7 KRRT7 merged commit 8a75fe1 into main Mar 2, 2026
24 of 28 checks passed
@KRRT7 KRRT7 deleted the extract-python-optimizer branch March 2, 2026 08:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant