title Trace & Optimize E2E Workflows
description End-to-end optimization of entire workflows with execution tracing
icon route
sidebarTitle Optimize E2E Workflows
keywords
tracing
workflow optimization
replay tests
end-to-end
script optimization
context manager
javascript
typescript
jest
vitest
java
jfr
maven
gradle

Codeflash can optimize an entire script or test suite end-to-end by tracing its execution and generating Replay Tests. Tracing follows your code as it runs, profiles it, and captures the inputs to every function it calls so that those calls can be replayed during optimization. Codeflash then uses these Replay Tests to optimize the most important functions called in the workflow.

Function Optimization

To optimize a script, `python myscript.py`, simply replace `python` with `codeflash optimize`:
codeflash optimize myscript.py

You can also optimize code called by pytest tests:

codeflash optimize -m pytest tests/

To trace and optimize your Jest or Vitest tests:
# Jest
codeflash optimize --jest

# Vitest
codeflash optimize --vitest

# Or trace a specific script
codeflash optimize --language javascript script.js

To trace and optimize a running Java program, replace your `java` command with `codeflash optimize java`:
cd /path/to/your/java/project

# Class with classpath (recommended — works with any compiled project)
codeflash optimize java -cp target/classes com.example.Main

# Executable JAR (requires maven-jar-plugin or equivalent with Main-Class manifest)
codeflash optimize java -jar target/my-app.jar

For long-running programs (servers, benchmarks), use --timeout to limit each tracing stage:

codeflash optimize --timeout 30 java -cp target/classes com.example.Main

The codeflash optimize command creates high-quality optimizations, making it ideal when you need to optimize a workflow or script. The initial tracing process can be slow, so try to limit your script's runtime to under 1 minute for best results.

The generated replay tests and the trace file are intended only for the immediate optimization run. Do not add them to git.

Codeflash optimize 1 min demo

<iframe width="640" height="400" src="https://www.youtube.com/embed/_nwliGzRIug" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen ></iframe>

What is the codeflash optimize command?

codeflash optimize tries to do everything that an expert engineer would do while optimizing a workflow. It profiles your code, traces the execution of your workflow and generates a set of test cases that are derived from how your code is actually run. Codeflash Tracer works by recording the inputs of your functions as they are called in your codebase, and generating regression tests with those inputs. We call these generated test cases "Replay Tests" because they replay the inputs that were recorded during the tracing phase. These replay tests are representative of the real-world usage of your functions.

Using Replay Tests, Codeflash can verify that an optimized function produces the same output as the original and measure its performance gain on real-world inputs. This way you can be sure that the optimized function does not change behavior for the traced workflow and that it is faster than the original. For more confidence in correctness, Codeflash also generates several LLM-written test cases and discovers any existing unit tests you may have.
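The check described above can be sketched as follows. This is an illustration of the record-then-replay idea only, not Codeflash's implementation; `record_calls`, `replay`, and the two `sum_of_squares` functions are hypothetical names introduced for the example.

```python
import time
from typing import Callable


def record_calls(func: Callable, log: list) -> Callable:
    """Wrap func so that every call appends its (args, kwargs) to log."""
    def wrapper(*args, **kwargs):
        log.append((args, kwargs))
        return func(*args, **kwargs)
    return wrapper


def replay(original: Callable, candidate: Callable, log: list, repeats: int = 5):
    """Replay recorded inputs; return (same_outputs, original_time, candidate_time)."""
    same = all(original(*a, **k) == candidate(*a, **k) for a, k in log)

    def best_time(fn: Callable) -> float:
        # Take the best of several runs to reduce timing noise.
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            for a, k in log:
                fn(*a, **k)
            best = min(best, time.perf_counter() - start)
        return best

    return same, best_time(original), best_time(candidate)


# Hypothetical function under optimization and a candidate rewrite.
def sum_of_squares(n: int) -> int:
    return sum(i * i for i in range(n))


def sum_of_squares_fast(n: int) -> int:
    # Closed form for 0^2 + 1^2 + ... + (n-1)^2.
    return (n - 1) * n * (2 * n - 1) // 6 if n > 0 else 0


log = []
traced = record_calls(sum_of_squares, log)
for n in (10, 1000, 50000):  # stands in for the traced workflow
    traced(n)

same, t_orig, t_fast = replay(sum_of_squares, sum_of_squares_fast, log)
print(f"same outputs: {same}, original: {t_orig:.6f}s, candidate: {t_fast:.6f}s")
```

A candidate would only be accepted when the outputs match on every recorded input and the measured time improves.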

Using codeflash optimize

The Codeflash script optimizer can be used in three ways:

  1. As an integrated command

    If you run a Python script as follows

    python path/to/your/file.py --your_options

    You can start tracing and optimizing your code with the following command

    codeflash optimize path/to/your/file.py --your_options

    The above command should suffice in most situations. To customize the trace file location, pass --output, for example codeflash optimize --output trace_file_path.trace. Otherwise, the trace is written to codeflash.trace in the current working directory.

  2. Trace and optimize as two separate steps

    If you want more control over the tracing and optimization process, you can trace first and optimize with the replay tests later. Each replay test is associated with a trace file.

    To create just the trace file first, run

    codeflash optimize --output trace_file.trace --trace-only path/to/your/file.py --your_options

    This will create a replay test file. To optimize with the replay test, run:

    codeflash --replay-test /path/to/test_replay_test_0.py

    More Options:

    • --timeout: The maximum time in seconds to trace the entire workflow. By default there is no limit. This is useful when tracing very long workflows.

  3. As a Context Manager

    To trace only specific sections of your code, you can use the Codeflash Tracer as a context manager. You can wrap the code you want to trace in a with statement as follows:

    from codeflash.tracer import Tracer
    
    with Tracer(output="codeflash.trace"):
        model.predict() # Your code here

    This is much faster than tracing the whole script. It can also help if tracing the whole script fails.

    After this finishes, you can optimize using the generated replay tests.

    codeflash --replay-test /path/to/test_replay_test_0.py

    More Options for the Tracer Context Manager:

    • disable: If set to True, the tracer will not trace the code. Default is False.
    • max_function_count: The maximum number of times to trace a single function. More calls to a function will not be traced. Default is 100.
    • timeout: The maximum time in seconds to trace the entire workflow. Default is indefinite. This is useful while tracing really long workflows, to not wait indefinitely.
    • output: The file to save the trace to. Default is codeflash.trace.
    • config_file_path: The path to the pyproject.toml file that stores the Codeflash config. This is auto-discovered by default.

    You can also disable the tracer in code by passing disable=True to the Tracer constructor.
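To illustrate how options such as disable, max_function_count, and timeout could behave, here is a toy context manager built on Python's `sys.setprofile`. The class `ToyTracer` is a hypothetical stand-in written for this page; it is not the Codeflash Tracer and makes no claim about how the real one is implemented.

```python
import sys
import time
from collections import defaultdict


class ToyTracer:
    """Illustrative stand-in (not the Codeflash Tracer) that captures call arguments."""

    def __init__(self, disable=False, max_function_count=100, timeout=None):
        self.disable = disable
        self.max_function_count = max_function_count
        self.timeout = timeout
        self.calls = defaultdict(list)  # function name -> list of captured argument dicts

    def _profile(self, frame, event, arg):
        if event != "call":
            return  # only Python-level function entries are of interest
        if self.timeout is not None and time.monotonic() - self._start > self.timeout:
            return  # past the time budget: stop capturing
        name = frame.f_code.co_name
        if name.startswith("__"):
            return  # skip dunder calls such as this tracer's own __exit__
        if len(self.calls[name]) < self.max_function_count:
            # At call time the frame's locals hold exactly the bound arguments.
            self.calls[name].append(dict(frame.f_locals))

    def __enter__(self):
        if not self.disable:
            self._start = time.monotonic()
            sys.setprofile(self._profile)
        return self

    def __exit__(self, *exc):
        if not self.disable:
            sys.setprofile(None)
        return False


def scale(x, factor=2):
    return x * factor


with ToyTracer(max_function_count=3) as tracer:
    for i in range(10):
        scale(i)

print(len(tracer.calls["scale"]))  # 3: calls beyond the cap are not captured
```

Capping captures per function keeps the trace small when a hot function is called thousands of times, which is the purpose the max_function_count option describes.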

The JavaScript tracer uses Babel instrumentation to capture function calls during your test suite execution.

  1. Trace your test suite

    # Jest projects
    codeflash optimize --jest
    
    # Vitest projects
    codeflash optimize --vitest
    
    # Trace a specific script
    codeflash optimize --language javascript src/main.js

  2. Trace specific functions only

    codeflash optimize --jest --only-functions processData,transformInput

  3. Trace and optimize as two separate steps

    # Step 1: Create trace file
    codeflash optimize --trace-only --jest --output trace_file.sqlite
    
    # Step 2: Optimize with replay tests
    codeflash --replay-test /path/to/test_replay_test_0.test.js

    More Options:

    • --timeout: Maximum tracing time in seconds.
    • --max-function-count: Maximum traces per function (default: 256).
    • --only-functions: Comma-separated list of function names to trace.

The Java tracer uses a two-stage approach: JFR (Java Flight Recorder) for accurate profiling, then a bytecode instrumentation agent for argument capture.

  1. Trace and optimize a Java program

    Replace your java command with codeflash optimize java:

    # Class with classpath (recommended — works with any compiled project)
    codeflash optimize java -cp target/classes com.example.Main
    
    # Executable JAR (requires maven-jar-plugin or equivalent with Main-Class manifest)
    codeflash optimize java -jar target/my-app.jar

    The -cp approach works with any project after mvn compile or gradle build. The -jar approach requires your project to produce an executable JAR with a Main-Class entry in the manifest — this is not the default Maven behavior.

    Codeflash will run your program twice (once for profiling, once for argument capture), generate JUnit replay tests, then optimize the most impactful functions.

  2. With Maven / Gradle test suites

    # Maven
    codeflash optimize mvn test
    
    # Gradle
    codeflash optimize ./gradlew :module:cleanTest :module:test

  3. Multi-module projects

    For multi-module builds, point Codeflash at the correct source and test roots:

    codeflash \
      --module-root src/main/java \
      --tests-root src/test/java \
      optimize ./gradlew :my-module:cleanTest :my-module:test

  4. Long-running programs

    For servers, benchmarks, or programs that don't terminate on their own, use --timeout to limit each tracing stage:

    codeflash optimize --timeout 30 java -cp target/classes com.example.Main

    Each stage runs for at most 30 seconds, then the program is terminated and captured data is processed.

  5. Trace only (no optimization)

    codeflash optimize --trace-only java -cp target/classes com.example.Main

    This generates replay tests in src/test/java/codeflash/replay/ without running the optimizer.

    Options:

    | Option | Description | Default |
    |---|---|---|
    | --timeout | Maximum time (seconds) for each tracing stage | Indefinite |
    | --max-function-count | Maximum captures per method | 100 |
    | --trace-only | Generate replay tests without running the optimizer | Off |
    | --no-pr | Skip PR creation and keep changes local | Off |
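The --timeout behavior described for long-running programs, where a stage is cut off after a fixed number of seconds, can be pictured with the following sketch. It shows one generic way a wrapper might enforce a time limit on a child process using Python's standard `subprocess` module; `run_stage` is a hypothetical helper, and nothing here is taken from Codeflash's source.

```python
import subprocess
import sys


def run_stage(cmd: list[str], timeout_s: float) -> bool:
    """Run cmd; return True if it exited on its own, False if it hit the time limit."""
    try:
        subprocess.run(cmd, timeout=timeout_s, check=False)
        return True
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising TimeoutExpired.
        return False


# A stand-in for a server that would otherwise keep running for a minute.
long_running = [sys.executable, "-c", "import time; time.sleep(60)"]
finished = run_stage(long_running, timeout_s=1)
print(finished)  # False: the program was stopped at the time limit
```

Whatever was captured before the cutoff can still be processed afterwards, which is why a short limit remains useful for programs that never exit on their own.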