Use a more capable "strong" model for the early planning phase of a session, then switch back to your default model once execution starts. This helps ensure high-quality initial analysis and planning without paying strong-model costs for the whole session.
The Planning-Phase Strong Model Overrides feature allows you to automatically route the initial turns of a session to a more powerful model for better planning and analysis, then seamlessly switch back to your default model for execution. This provides the best of both worlds: high-quality planning from stronger models and cost-effective execution from your standard model.
- Automatic Model Switching: Routes early requests to a configured strong model, then switches back based on turn count or file writes
- Parameter Overrides: Configure specific parameters (temperature, top_p, reasoning effort, thinking budget) for the strong model
- File-Write Detection: Automatically switches back when the model performs file-writing operations
- Turn-Based Switching: Configurable maximum number of planning turns before switching back
- Zero Overhead: Uses existing Tool Call Reactor for file-write detection, no duplicate logic
- Flexible Configuration: Configure via CLI, environment variables, or YAML config files
- Better Initial Planning: Early prompts often set the trajectory of an entire session. Stronger reasoning models can plan tasks more effectively, leading to better overall outcomes
- Cost and Speed Control: After planning, the system returns to your normal/default model to control costs and improve turnaround time
- Minimal Configuration: No arbiter or complex logic; switching is automatic based on simple counters (turns or file writes)
- Optimized Resource Usage: Pay for strong-model capabilities only during the planning phase, when they matter most
Configuration follows standard precedence: CLI > Environment Variables > YAML
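As an illustration, precedence resolution of this kind can be sketched as a simple first-match lookup across the three sources (the function and dict names here are hypothetical, not the proxy's actual API):

```python
def resolve_option(name, cli_args, env_vars, yaml_config):
    """Return the first configured value, checking CLI, then env, then YAML.

    `cli_args`, `env_vars`, and `yaml_config` are plain dicts standing in
    for the three configuration sources.
    """
    for source in (cli_args, env_vars, yaml_config):
        if source.get(name) is not None:
            return source[name]
    return None

# CLI wins when all three sources define the same option.
cli = {"max_turns": 8}
env = {"max_turns": 10, "strong_model": "openai:gpt-4o"}
yaml_cfg = {"max_turns": 5, "enabled": True}

print(resolve_option("max_turns", cli, env, yaml_cfg))     # CLI value: 8
print(resolve_option("strong_model", cli, env, yaml_cfg))  # falls back to env
print(resolve_option("enabled", cli, env, yaml_cfg))       # falls back to YAML
```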
Add to your `config.yaml`:

```yaml
session:
  planning_phase:
    enabled: true
    strong_model: "openai:gpt-4o"
    max_turns: 10
    max_file_writes: 1
    overrides:
      temperature: 0.2
      top_p: 0.9
      reasoning_effort: "high"
      thinking_budget: 8000
```

Or configure via environment variables:

```bash
# Enable/disable the feature
PLANNING_PHASE_ENABLED=true

# Specify the strong model (format: backend:model)
PLANNING_PHASE_STRONG_MODEL=openai:gpt-4o

# Set switching thresholds
PLANNING_PHASE_MAX_TURNS=10
PLANNING_PHASE_MAX_FILE_WRITES=1

# Override model parameters for the strong model
PLANNING_PHASE_TEMPERATURE=0.2
PLANNING_PHASE_TOP_P=0.9
PLANNING_PHASE_REASONING_EFFORT=high
PLANNING_PHASE_THINKING_BUDGET=8000
```

The corresponding CLI flags:

```
--enable-planning-phase
--planning-phase-strong-model BACKEND:MODEL
--planning-phase-max-turns N
--planning-phase-max-file-writes N
--planning-phase-temperature FLOAT
--planning-phase-top-p FLOAT
--planning-phase-reasoning-effort EFFORT
--planning-phase-thinking-budget TOKENS
```

Enable planning phase with GPT-4o for the first 8 turns:
```bash
python -m src.core.cli \
  --default-backend openai \
  --enable-planning-phase \
  --planning-phase-strong-model openai:gpt-4o \
  --planning-phase-max-turns 8
```

Use a strong model with specific parameters optimized for planning:
```bash
python -m src.core.cli \
  --default-backend openai \
  --enable-planning-phase \
  --planning-phase-strong-model openai:gpt-4o \
  --planning-phase-max-turns 8 \
  --planning-phase-max-file-writes 1 \
  --planning-phase-temperature 0.2 \
  --planning-phase-top-p 0.9 \
  --planning-phase-reasoning-effort high \
  --planning-phase-thinking-budget 8000
```

Set up planning phase via environment variables:
```bash
export PLANNING_PHASE_ENABLED=true
export PLANNING_PHASE_STRONG_MODEL=openai:gpt-4o
export PLANNING_PHASE_MAX_TURNS=10
export PLANNING_PHASE_TEMPERATURE=0.2
python -m src.core.cli --default-backend openai
```

Configure the proxy to switch back immediately after the first file-writing operation:
```bash
python -m src.core.cli \
  --default-backend openai \
  --enable-planning-phase \
  --planning-phase-strong-model openai:gpt-4o \
  --planning-phase-max-turns 20 \
  --planning-phase-max-file-writes 1
```

Use a strong model to analyze the codebase and plan the refactoring strategy, then switch to a faster model for implementing the changes:
```yaml
session:
  planning_phase:
    enabled: true
    strong_model: "openai:gpt-4o"
    max_turns: 5
    max_file_writes: 1
    overrides:
      temperature: 0.1  # Low temperature for careful analysis
      reasoning_effort: "high"
```

Let a strong model design the architecture and plan the implementation, then use a standard model for coding:
```bash
python -m src.core.cli \
  --enable-planning-phase \
  --planning-phase-strong-model anthropic:claude-3-opus \
  --planning-phase-max-turns 10 \
  --planning-phase-temperature 0.3
```

Use a strong model to analyze logs and plan the debugging approach, then switch to a faster model for fixes:
```yaml
session:
  planning_phase:
    enabled: true
    strong_model: "openai:gpt-4o"
    max_turns: 8
    max_file_writes: 0  # Don't switch on file writes; wait for the turn limit
    overrides:
      temperature: 0.2
      thinking_budget: 10000
```

Balance quality and cost by using premium models only for planning:
```bash
# Use GPT-4o for planning, GPT-3.5-turbo for execution
python -m src.core.cli \
  --default-backend openai \
  --default-model gpt-3.5-turbo \
  --enable-planning-phase \
  --planning-phase-strong-model openai:gpt-4o \
  --planning-phase-max-turns 5
```

The phase transitions can be visualized as follows:

```mermaid
stateDiagram-v2
    [*] --> PlanningPhase: Session Start

    state PlanningPhase {
        direction LR
        [*] --> CheckConditions
        CheckConditions --> UseStrongModel: Limits Not Reached
        UseStrongModel --> IncrementCounters: Request Complete
        IncrementCounters --> CheckConditions
    }

    CheckConditions --> ExecutionPhase: Max Turns Reached
    CheckConditions --> ExecutionPhase: File Write Detected

    state ExecutionPhase {
        direction LR
        [*] --> UseDefaultModel
        UseDefaultModel --> [*]
    }
```
- If enabled, the proxy routes early requests to the configured strong model
- The strong model is used unless the current model is already the strong model
- Configured parameter overrides (temperature, top_p, etc.) are applied to the strong model
The proxy automatically switches back to the default model when either condition is met:
- Maximum turns reached: The number of planning-phase turns reaches `max_turns`
- File write detected: The model performs a file-writing tool call (e.g., `write`, `edit`, `apply_diff`, `patch`)
- Requests use whatever the normal routing would select (typically your default model)
- No more parameter overrides are applied
- The session continues with standard behavior
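The switching behavior described above can be sketched as a small state tracker (the class and method names here are illustrative, not the proxy's internal implementation):

```python
# Tool names the docs list as file-writing operations.
FILE_WRITING_TOOLS = {"write", "edit", "apply_diff", "patch"}

class PlanningPhaseTracker:
    """Illustrative tracker for the planning -> execution transition."""

    def __init__(self, max_turns=10, max_file_writes=1):
        self.max_turns = max_turns
        self.max_file_writes = max_file_writes
        self.turns = 0
        self.file_writes = 0

    @property
    def in_planning_phase(self):
        # max_file_writes == 0 disables file-write switching entirely,
        # so only the turn limit ends the planning phase in that case.
        file_write_limit_hit = (
            self.max_file_writes > 0
            and self.file_writes >= self.max_file_writes
        )
        return self.turns < self.max_turns and not file_write_limit_hit

    def record_turn(self, tool_calls=()):
        """Count one completed turn and any file-writing tool calls in it."""
        self.turns += 1
        self.file_writes += sum(
            1 for name in tool_calls if name in FILE_WRITING_TOOLS
        )

tracker = PlanningPhaseTracker(max_turns=3, max_file_writes=1)
tracker.record_turn(["read"])     # planning continues
tracker.record_turn(["write"])    # file write detected -> switch back
print(tracker.in_planning_phase)  # False
```

Either limit ends the planning phase on its own; whichever is hit first triggers the switch.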
- File-write detection is handled by the existing Tool Call Reactor
- Supported file-writing tools: `write`, `edit`, `apply_diff`, `patch`, and similar operations
- No duplicate detection logic: reuses the existing infrastructure

Core configuration options:

- `enabled`: Enable or disable the planning phase feature (default: `false`)
- `strong_model`: The model to use during the planning phase (format: `backend:model`)
- `max_turns`: Maximum number of turns to use the strong model (default: `10`)
- `max_file_writes`: Maximum file writes before switching back (default: `1`)
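For illustration, these options map naturally onto a small configuration object; the sketch below assumes the documented defaults, and the class name is hypothetical rather than the proxy's actual code:

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class PlanningPhaseConfig:
    """Sketch of the planning-phase options with their documented defaults."""
    enabled: bool = False
    strong_model: Optional[str] = None  # "backend:model", e.g. "openai:gpt-4o"
    max_turns: int = 10
    max_file_writes: int = 1
    overrides: dict = field(default_factory=dict)

cfg = PlanningPhaseConfig(
    enabled=True,
    strong_model="openai:gpt-4o",
    overrides={"temperature": 0.2, "reasoning_effort": "high"},
)
print(cfg.max_turns)  # unset options keep their defaults: 10
```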
These parameters are applied only to the strong model during the planning phase:
- temperature: Controls randomness (0.0 to 2.0, lower is more deterministic)
- top_p: Nucleus sampling parameter (0.0 to 1.0)
- reasoning_effort: Reasoning effort level (e.g., "low", "medium", "high")
- thinking_budget: Token budget for thinking/reasoning (integer)
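Conceptually, applying these overrides amounts to merging them into the outgoing request for the strong model. A minimal sketch (the function name is hypothetical; how the real proxy maps each parameter depends on the backend):

```python
def apply_overrides(request: dict, overrides: dict) -> dict:
    """Return a copy of `request` with planning-phase overrides applied.

    Override values replace anything the request already carries;
    parameters the backend does not support are simply passed through.
    """
    merged = dict(request)  # copy so the original request is untouched
    merged.update(overrides)
    return merged

request = {"model": "gpt-4o", "temperature": 0.7, "messages": []}
overrides = {"temperature": 0.2, "top_p": 0.9, "reasoning_effort": "high"}
print(apply_overrides(request, overrides))
```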
Problem: The strong model is not being used even though planning phase is enabled.
Solutions:
- Verify `enabled` is set to `true`
- Check that `strong_model` is configured correctly (format: `backend:model`)
- Ensure the strong model is different from your default model
- Check logs for planning phase activation messages
Problem: The proxy continues using the strong model after planning should end.
Solutions:
- Verify `max_turns` is set to a reasonable value
- Check if file-write operations are being detected (review Tool Call Reactor logs)
- Ensure `max_file_writes` is not set to `0` (which disables file-write switching)
- Review session state to confirm turn counters are incrementing
Problem: The strong model is not using the configured parameter overrides.
Solutions:
- Verify the overrides are in the correct section of the config (under `session.planning_phase.overrides`)
- Check configuration precedence (CLI > Env > YAML)
- Review logs for parameter override messages
- Ensure the backend supports the parameters you're overriding
Problem: Planning phase conflicts with other model override features.
Solutions:
- Check if Edit Precision Tuning is also enabled and potentially conflicting
- Review the order of middleware in the request pipeline
- Consider disabling one feature if they interfere with each other
- Check logs for multiple override attempts
- Edit Precision Tuning - Fine-tune model parameters based on edit patterns
- Hybrid Backend - Use different backends for different phases
- Model Name Rewrites - Dynamically rewrite model names using regex rules
- URI Model Parameters - Override model parameters via URI
- Session Management - Intelligent session state management