This guide documents all available CLI parameters, environment variables, and configuration options for the LLM Interactive Proxy.
Configuration is resolved in the following order (highest to lowest priority):
- CLI Arguments - Command-line flags override everything
- Environment Variables - Environment variables override config files
- YAML Configuration File - Config file provides defaults
- Built-in Defaults - Hardcoded defaults if nothing else is specified
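
The precedence order above can be illustrated with a minimal sketch. The function name `resolve_setting` and the dictionary-per-source layout are assumptions made for illustration; they are not taken from the proxy's code.

```python
from typing import Any, Mapping

_MISSING = object()


def resolve_setting(
    name: str,
    cli: Mapping[str, Any],
    env: Mapping[str, Any],
    yaml_cfg: Mapping[str, Any],
    defaults: Mapping[str, Any],
) -> Any:
    """Return the first value found, checking sources from highest to lowest priority."""
    for source in (cli, env, yaml_cfg, defaults):
        value = source.get(name, _MISSING)
        # Treat a missing key or an explicit None as "not specified" (assumption).
        if value is not _MISSING and value is not None:
            return value
    return None


# Hypothetical example: the port is set in the environment and the YAML file,
# but not on the command line, so the environment value is used.
port = resolve_setting(
    "port",
    cli={},
    env={"port": 9000},
    yaml_cfg={"port": 8080},
    defaults={"port": 8000},
)
```

In this sketch a CLI value, when present, would shadow all three other sources, which matches the priority list above.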
| CLI Argument | Environment Variable | Description |
|---|---|---|
| --help, -h | N/A | Show help message and exit. |
| --config FILE | CONFIG_FILE | Path to persistent configuration file (YAML). |
| CLI Argument | Environment Variable | Description |
|---|---|---|
| --default-backend BACKEND | LLM_BACKEND | Default backend to use (e.g., openai, anthropic, gemini). |
| --static-route BACKEND:MODEL | STATIC_ROUTE | Force all requests to use this backend:model combination. |
| --disable-gemini-oauth-fallback | DISABLE_GEMINI_OAUTH_FALLBACK=1 | Disable automatic Gemini OAuth fallback to gemini-2.5-flash. |
| --disable-gemini-oauth-reasoning-prompt-injection | DISABLE_GEMINI_OAUTH_REASONING_PROMPT_INJECTION=1 | Disable automatic reasoning effort prompt injection for Gemini OAuth backends (enabled by default). |
| --disable-hybrid-backend | DISABLE_HYBRID_BACKEND=1 | Disable the hybrid backend (enabled by default). |
| --hybrid-backend-repeat-messages | HYBRID_BACKEND_REPEAT_MESSAGES=1 | Repeat reasoning output as an artificial message in the session. |
| --reasoning-injection-probability FLOAT (or --reasoning_injection_probability) | REASONING_INJECTION_PROBABILITY | Probability of using the reasoning model in the hybrid backend (0.0 to 1.0). |
| --hybrid-reasoning-model-timeout SECONDS | HYBRID_REASONING_MODEL_TIMEOUT | Timeout for the reasoning model call in hybrid scenarios (default: 60). |
| --hybrid-reasoning-force-initial-turns N | HYBRID_REASONING_FORCE_INITIAL_TURNS | Number of turns at start of session to force reasoning model usage (default: 4). |
| --model-alias PATTERN=REPLACEMENT | MODEL_ALIASES (JSON string) | Add a model name rewrite rule (regex pattern and replacement). Can be used multiple times. |
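
As a rough illustration of how a `--model-alias PATTERN=REPLACEMENT` rule could rewrite a model name, the sketch below applies regex rules with Python's `re` module. The helper names, the "first matching rule wins" behavior, the use of `re.search`, and the example model names are all assumptions; the proxy's actual matching semantics may differ.

```python
import re
from typing import List, Tuple


def parse_alias_flag(value: str) -> Tuple[str, str]:
    """Split a PATTERN=REPLACEMENT string on the first '=' (assumed convention)."""
    pattern, sep, replacement = value.partition("=")
    if not sep or not pattern:
        raise ValueError(f"expected PATTERN=REPLACEMENT, got {value!r}")
    return pattern, replacement


def apply_model_aliases(model: str, rules: List[Tuple[str, str]]) -> str:
    """Rewrite a model name using the first rule whose pattern matches (assumption)."""
    for pattern, replacement in rules:
        if re.search(pattern, model):
            return re.sub(pattern, replacement, model)
    return model


# Hypothetical rule: route any "gpt-*" model name to the "openai" backend.
rules = [parse_alias_flag(r"^gpt-(.*)$=openai:gpt-\1")]
rewritten = apply_model_aliases("gpt-4o", rules)
untouched = apply_model_aliases("gemini-pro", rules)
```

Note that splitting on the first `=` means a pattern that itself contains `=` would not parse correctly in this sketch.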
| CLI Argument | Environment Variable | Description |
|---|---|---|
| --host HOST | APP_HOST | Bind host (default: 127.0.0.1). |
| --port PORT | APP_PORT | Bind port (default: 8000). |
| --anthropic-port PORT | ANTHROPIC_PORT | Port for Anthropic-compatible endpoints (default: disabled/derived). |
| --timeout SECONDS | PROXY_TIMEOUT | Global request timeout in seconds (default: 120). |
| --command-prefix PREFIX | COMMAND_PREFIX | Command prefix for in-chat commands (default: !/). |
| --force-context-window TOKENS | FORCE_CONTEXT_WINDOW | Override context window size for all models. |
| --thinking-budget TOKENS | THINKING_BUDGET | Set max reasoning tokens for all requests. |
| CLI Argument | Environment Variable | Description |
|---|---|---|
| --disable-model-registry-download | N/A | Disable downloading updates from the external model registry. |
| --model-registry-url URL | N/A | Override the model registry URL (default: https://models.dev/api.json). |
| --model-registry-update-interval SECONDS | N/A | Update interval for registry downloads (default: 86400). |
| --disable-model-limit-enforcement | N/A | Disable automated model limit and modality enforcement based on registry metadata. |
| CLI Argument | Environment Variable | Description |
|---|---|---|
| --disable-auth | DISABLE_AUTH=1 | Disable client authentication (forces localhost binding). |
| --disable-sso-captcha | SSO_CAPTCHA_ENABLED=false | Disable SSO Captcha verification (overrides config). |
| --disable-redact-api-keys-in-prompts | REDACT_API_KEYS_IN_PROMPTS=false | Disable redaction of API keys in prompts. |
| --openrouter-api-key KEY | OPENROUTER_API_KEY | OpenRouter API Key. |
| --openrouter-api-base-url URL | OPENROUTER_API_BASE_URL | OpenRouter API Base URL. |
| --gemini-api-key KEY | GEMINI_API_KEY | Google Gemini API Key. |
| --gemini-api-base-url URL | GEMINI_API_BASE_URL | Google Gemini API Base URL. |
| --zai-api-key KEY | ZAI_API_KEY | ZAI API Key. |
| --zenmux-api-base-url URL | ZENMUX_API_BASE_URL | ZenMux API Base URL. |
| N/A | ANTHROPIC_API_KEY | Anthropic API Key. |
| N/A | ANTHROPIC_API_BASE_URL | Anthropic API Base URL. |
| N/A | AUTH_TOKEN | Shared secret token for client authentication. |
| CLI Argument | Environment Variable | Description |
|---|---|---|
| --enable-brute-force-protection | BRUTE_FORCE_PROTECTION_ENABLED=true | Enable API key brute-force protection. |
| --disable-brute-force-protection | BRUTE_FORCE_PROTECTION_ENABLED=false | Disable API key brute-force protection. |
| --auth-max-failed-attempts N | BRUTE_FORCE_MAX_FAILED_ATTEMPTS | Max failed attempts before blocking (default: 5). |
| --auth-brute-force-ttl SECONDS | BRUTE_FORCE_TTL_SECONDS | Time window for tracking failed attempts (default: 900). |
| --auth-brute-force-initial-block SECONDS | BRUTE_FORCE_INITIAL_BLOCK_SECONDS | Initial block duration (default: 30). |
| --auth-brute-force-multiplier FLOAT | BRUTE_FORCE_BLOCK_MULTIPLIER | Multiplier for subsequent blocks (default: 2.0). |
| --auth-brute-force-max-block SECONDS | BRUTE_FORCE_MAX_BLOCK_SECONDS | Max block duration (default: 3600). |
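
To show how the initial block, multiplier, and max block settings might interact, here is a small sketch of an escalating block schedule. It assumes each successive block is the previous one multiplied by the multiplier and capped at the maximum; the function name `block_duration` is hypothetical and the proxy's exact formula is not documented here.

```python
def block_duration(
    block_number: int,
    initial: float = 30.0,
    multiplier: float = 2.0,
    max_block: float = 3600.0,
) -> float:
    """Return the assumed block length in seconds for the Nth consecutive block."""
    if block_number < 1:
        raise ValueError("block_number must be >= 1")
    # Geometric growth from the initial duration, capped at the configured maximum.
    return min(initial * multiplier ** (block_number - 1), max_block)


# Durations for the first nine blocks under the documented defaults.
schedule = [block_duration(n) for n in range(1, 10)]
```

Under these assumptions and the default values, the cap of 3600 seconds is reached at the eighth consecutive block.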
| CLI Argument | Environment Variable | Description |
|---|---|---|
| --trusted-ip IP | N/A | IP address to trust for bypassing authorization. |
| --allow-admin | N/A | Allow running with administrative privileges. |
| CLI Argument | Environment Variable | Description |
|---|---|---|
| --hybrid-execution-model-timeout SECONDS | HYBRID_EXECUTION_MODEL_TIMEOUT | Timeout for execution model in hybrid scenarios. |
| N/A | HYBRID_REASONING_LATENCY_THRESHOLD | Latency threshold for adaptive reasoning backoff. |
| N/A | HYBRID_REASONING_BACKOFF_TURNS | Turns to skip reasoning after latency threshold exceeded. |
| CLI Argument | Environment Variable | Description |
|---|---|---|
| N/A | OPENROUTER_TIMEOUT | Timeout for OpenRouter requests. |
| N/A | GEMINI_TIMEOUT | Timeout for Gemini requests. |
| N/A | ANTHROPIC_TIMEOUT | Timeout for Anthropic requests. |
| N/A | ZAI_TIMEOUT | Timeout for ZAI requests. |
| N/A | ZENMUX_TIMEOUT | Timeout for ZenMux requests. |
| N/A | OPENAI_TIMEOUT | Timeout for OpenAI requests. |
| N/A | MINIMAX_TIMEOUT | Timeout for Minimax requests. |
| CLI Argument | Environment Variable | Description |
|---|---|---|
| --log-level LEVEL | LOG_LEVEL | Logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL). |
| --log-stream {stdout,stderr} | N/A | Write console logs to stdout or stderr (default: stderr). |
| --log FILE | LOG_FILE | Path to log file. |
| --log-colors | LOG_COLORS=true | Enable colored log output. |
| --no-log-colors | LOG_COLORS=false | Disable colored log output. |
| --capture-file FILE | CAPTURE_FILE | Write raw LLM requests/replies to this file (JSON). |
| --capture-max-bytes N | CAPTURE_MAX_BYTES | Max size of capture file before rotation. |
| --capture-truncate-bytes N | CAPTURE_TRUNCATE_BYTES | Truncate captures to N bytes per entry. |
| --capture-max-files N | CAPTURE_MAX_FILES | Max number of capture files to retain. |
| --capture-rotate-interval SECONDS | CAPTURE_ROTATE_INTERVAL_SECONDS | Time-based rotation period. |
| --capture-total-max-bytes N | CAPTURE_TOTAL_MAX_BYTES | Total disk cap across capture files. |
| --cbor-capture-dir DIR | N/A | Directory for CBOR byte-precise capture files. |
| --cbor-capture-session ID | N/A | Fixed session ID for CBOR capture. |
| N/A | REQUEST_LOGGING | Enable detailed request logging (boolean). |
| N/A | RESPONSE_LOGGING | Enable detailed response logging (boolean). |
| N/A | CAPTURE_BUFFER_SIZE | Buffer size for wire capture writes (bytes). |
| N/A | CAPTURE_FLUSH_INTERVAL | Flush interval for wire capture (seconds). |
| N/A | CAPTURE_MAX_ENTRIES_PER_FLUSH | Max entries buffered before forced flush. |
| CLI Argument | Environment Variable | Description |
|---|---|---|
| --disable-interactive-mode | DEFAULT_INTERACTIVE_MODE=false | Disable interactive mode by default. |
| --force-set-project | FORCE_SET_PROJECT=true | Require project name to be set before sending prompts. |
| --project-dir-resolution-model BACKEND:MODEL | PROJECT_DIR_RESOLUTION_MODEL | Model used to detect absolute project directory. |
| --project-dir-resolution-mode MODE | PROJECT_DIR_RESOLUTION_MODE | Strategy: 'deterministic', 'llm', or 'hybrid'. |
| --disable-interactive-commands | N/A | Disable all in-chat command processing. |
| --disable-accounting | DISABLE_ACCOUNTING=true | Disable LLM usage tracking. |
| --no-accounting | DISABLE_ACCOUNTING=true | Disable LLM usage tracking (alias for --disable-accounting). |
| --strict-command-detection | STRICT_COMMAND_DETECTION | Require commands to be at the start of messages. |
| --enable-sandboxing | ENABLE_SANDBOXING=true | Restrict file operations to the project directory. |
| --daemon | N/A | Run server as a daemon (background process). |
| --enable-end-of-session | N/A | Enable end-of-session detection and event emission. |
| --disable-end-of-session | N/A | Disable end-of-session detection and event emission. |
| --end-of-session-emit-events | N/A | Enable event emission (default when EoS is enabled). |
| --end-of-session-detect-only | N/A | Enable detect-only mode (no events emitted). |
| --end-of-session-dispatch-timeout SECONDS | N/A | Maximum time to wait for event dispatch (default: 5.0, 0 for fire-and-forget). |
| --enable-b2bua-session-handling | SESSION_B2BUA_ENABLED=true | Enable B2BUA A-leg/B-leg session identity separation. |
| --disable-b2bua-session-handling | SESSION_B2BUA_ENABLED=false | Disable B2BUA mode and keep legacy session behavior. |
| --b2bua-continuity-max-age-seconds SECONDS | SESSION_B2BUA_CONTINUITY_MAX_AGE_SECONDS | Maximum age for (auth_scope_id, client_session_id) continuity mappings. |
| --b2bua-continuity-sliding-expiration | SESSION_B2BUA_CONTINUITY_SLIDING_EXPIRATION=true | Extend continuity mapping expiry on activity. |
| --b2bua-continuity-fixed-expiration | SESSION_B2BUA_CONTINUITY_SLIDING_EXPIRATION=false | Use fixed continuity mapping expiry without sliding updates. |
| --enable-b2bua-persistent-mapping-store | SESSION_B2BUA_PERSISTENT_MAPPING_STORE_ENABLED=true | Persist continuity mapping and B-leg sequence state across restarts. |
| --disable-b2bua-persistent-mapping-store | SESSION_B2BUA_PERSISTENT_MAPPING_STORE_ENABLED=false | Use in-memory continuity mapping store only. |
| --enable-b2bua-session-echo | SESSION_B2BUA_ECHO_ENABLED=true | Emit A-leg session echo response header (diagnostic only). |
| --disable-b2bua-session-echo | SESSION_B2BUA_ECHO_ENABLED=false | Disable A-leg session echo response header emission. |
| --b2bua-session-echo-header-name HEADER | SESSION_B2BUA_ECHO_HEADER_NAME | Configure response header name used for A-leg echo. |
| --enable-unsafe-legacy-session-inference | SESSION_B2BUA_ENABLE_UNSAFE_HEURISTIC_SESSION_INFERENCE=true | Allow unsafe fallback continuity inference when client_session_id is absent. |
| --disable-unsafe-legacy-session-inference | SESSION_B2BUA_ENABLE_UNSAFE_HEURISTIC_SESSION_INFERENCE=false | Disable unsafe legacy continuity inference (recommended default). |
| --b2bua-deployment-mode {single-process,multi-worker} | SESSION_B2BUA_DEPLOYMENT_MODE | Set deployment assumptions used for B2BUA startup validation and store selection. |
| N/A | SESSION_CLEANUP_ENABLED | Enable session cleanup (boolean). |
| N/A | SESSION_CLEANUP_INTERVAL | Cleanup interval in seconds. |
| N/A | SESSION_MAX_AGE | Max session age in seconds. |
| N/A | SANDBOXING_STRICT_MODE | Enable strict mode for sandboxing. |
| N/A | SANDBOXING_ALLOW_PARENT_ACCESS | Allow access to parent directories in sandbox. |
| CLI Argument | Environment Variable | Description |
|---|---|---|
| N/A | TOOL_CALL_REPAIR_ENABLED | Enable tool call repair. |
| N/A | TOOL_CALL_REPAIR_BUFFER_CAP_BYTES | Buffer cap for tool call repair. |
| N/A | JSON_REPAIR_ENABLED | Enable JSON repair. |
| N/A | JSON_REPAIR_BUFFER_CAP_BYTES | Buffer cap for JSON repair. |
| N/A | JSON_REPAIR_SCHEMA | JSON schema for repair. |
| N/A | FORCE_REPROCESS_TOOL_CALLS | Force reprocessing of tool calls. |
| N/A | LOG_SKIPPED_TOOL_CALLS | Log skipped tool calls. |
| CLI Argument | Environment Variable | Description |
|---|---|---|
| N/A | STREAMING_SAMPLER_ENABLED | Enable streaming sampler. |
| N/A | STREAMING_SAMPLER_RATE | Sampling rate (0.0 to 1.0). |
| N/A | STREAMING_SAMPLER_MAX_SAMPLES | Max samples to retain. |
| CLI Argument | Environment Variable | Description |
|---|---|---|
| --enable-context-compaction | ENABLE_CONTEXT_COMPACTION=true | Enable history compaction to reduce stale tool outputs (default: disabled). |
| --compaction-min-tokens N | COMPACTION_MIN_TOKENS | Minimum tokens required to trigger compaction (default: 100,000). |
| CLI Argument | Environment Variable | Description |
|---|---|---|
| --memory-available | MEMORY_AVAILABLE=true | Enable the Memory feature globally. |
| --memory-default-enabled | MEMORY_DEFAULT_ENABLED=true | Enable Memory by default for new sessions. |
| --memory-summary-model BACKEND:MODEL | MEMORY_SUMMARY_MODEL | Model to use for generating session summaries. |
| --memory-context-model BACKEND:MODEL | MEMORY_CONTEXT_MODEL | Model to use for retrieving context. |
| --memory-summary-prompt FILE | MEMORY_SUMMARY_PROMPT | Path to custom summary prompt file. |
| --memory-context-prompt FILE | MEMORY_CONTEXT_PROMPT | Path to custom context prompt file. |
| --memory-database-path FILE | MEMORY_DATABASE_PATH | Path to SQLite database for memory storage. |
| --memory-session-timeout MINUTES | MEMORY_SESSION_TIMEOUT_MINUTES | Timeout in minutes for session inactivity. |
| --memory-summarization-delay SECONDS | MEMORY_SUMMARIZATION_DELAY_SECONDS | Delay before summarizing completed sessions. |
| --memory-max-sessions-to-consider N | MEMORY_MAX_SESSIONS_TO_CONSIDER | Max recent sessions to consider for context. |
| --memory-retention-days DAYS | MEMORY_RETENTION_DAYS | Days to retain memory data. |
| --memory-max-context-tokens N | MEMORY_MAX_CONTEXT_TOKENS | Max tokens for injected context. |
| --memory-max-summary-tokens N | MEMORY_MAX_SUMMARY_TOKENS | Max tokens for summary prompt context. |
| --memory-max-transcript-chars N | MEMORY_MAX_TRANSCRIPT_CHARS | Max transcript length before chunking. |
| --memory-summary-completion-tokens N | MEMORY_SUMMARY_COMPLETION_TOKENS | Max completion tokens for summary generation. |
| --memory-context-relevance-threshold FLOAT | MEMORY_CONTEXT_RELEVANCE_THRESHOLD | Minimum relevance score for context retrieval. |
| --memory-max-buffer-size-bytes N | MEMORY_MAX_BUFFER_SIZE_BYTES | Max capture buffer size per session. |
| --memory-analysis-queue-maxsize N | MEMORY_ANALYSIS_QUEUE_MAXSIZE | Max size of analysis queue. |
| --memory-analysis-timeout SECONDS | MEMORY_ANALYSIS_TIMEOUT_SECONDS | Timeout per summary generation. |
| --memory-max-concurrent-analyses N | MEMORY_MAX_CONCURRENT_ANALYSES | Max concurrent analyses. |
| --memory-context-template TEMPLATE | MEMORY_CONTEXT_TEMPLATE | Template for injected context (use {context}). |
| --memory-single-user-mode | MEMORY_SINGLE_USER_MODE=true | Enable single-user mode (ignores user IDs). |
| --memory-fixed-user-id ID | MEMORY_FIXED_USER_ID | Fixed user ID to use in single-user mode. |
| --memory-persist-transcript | MEMORY_PERSIST_TRANSCRIPT=true | Persist transcripts for summaries. |
| --memory-redaction-pattern PATTERN | MEMORY_REDACTION_PATTERNS | Add a regex pattern for redaction. Can be used multiple times. |
| --memory-disable-user ID | MEMORY_DISABLED_USERS | Disable memory for specific user ID. Can be used multiple times. |
| --memory-disable-client ID | MEMORY_DISABLED_CLIENTS | Disable memory for specific client ID. Can be used multiple times. |
| --memory-summary-prompt-version VERSION | MEMORY_SUMMARY_PROMPT_VERSION | Summary prompt version identifier. |
| --memory-summary-schema-version VERSION | MEMORY_SUMMARY_SCHEMA_VERSION | Summary schema version identifier. |
| --memory-require-project-discovery | MEMORY_REQUIRE_PROJECT_DISCOVERY=true | Require project discovery before context injection. |
| --memory-allow-missing-project | MEMORY_REQUIRE_PROJECT_DISCOVERY=false | Allow context injection without project discovery. |
| --memory-project-discovery-mode MODE | MEMORY_PROJECT_DISCOVERY_MODE | Project discovery mode. |
See Also: ProxyMem: Cross-Session Memory for detailed documentation on the memory feature.
| CLI Argument | Environment Variable | Description |
|---|---|---|
| --enable-planning-phase | PLANNING_PHASE_ENABLED=true | Enable planning phase model routing. |
| --planning-phase-strong-model BACKEND:MODEL | PLANNING_PHASE_STRONG_MODEL | Strong model for planning phase. |
| --planning-phase-max-turns N | PLANNING_PHASE_MAX_TURNS | Max turns before switching from strong model. |
| --planning-phase-max-file-writes N | PLANNING_PHASE_MAX_FILE_WRITES | Max file writes before switching. |
| --planning-phase-temperature FLOAT | PLANNING_PHASE_TEMPERATURE | Temperature override for planning. |
| --planning-phase-top-p FLOAT | PLANNING_PHASE_TOP_P | Top-p override for planning. |
| --planning-phase-reasoning-effort EFFORT | PLANNING_PHASE_REASONING_EFFORT | Reasoning effort override for planning. |
| --planning-phase-thinking-budget TOKENS | PLANNING_PHASE_THINKING_BUDGET | Thinking budget override for planning. |
| CLI Argument | Environment Variable | Description |
|---|---|---|
| --enable-edit-precision | EDIT_PRECISION_ENABLED=true | Enable automated edit-precision tuning. |
| --disable-edit-precision | EDIT_PRECISION_ENABLED=false | Disable automated edit-precision tuning. |
| --edit-precision-temperature FLOAT | EDIT_PRECISION_TEMPERATURE | Target temperature (default: 0.1). |
| --edit-precision-min-top-p FLOAT | EDIT_PRECISION_MIN_TOP_P | Minimum top_p (default: 0.3). |
| --edit-precision-override-top-p | EDIT_PRECISION_OVERRIDE_TOP_P | Enable top_p override. |
| --edit-precision-target-top-k N | EDIT_PRECISION_TARGET_TOP_K | Target top_k value. |
| --edit-precision-override-top-k | EDIT_PRECISION_OVERRIDE_TOP_K | Enable top_k override. |
| --edit-precision-exclude-agents REGEX | EDIT_PRECISION_EXCLUDE_AGENTS_REGEX | Exclude agents matching regex. |
Real-time connection activity tracking for debugging and monitoring. Disabled by default for performance.
| CLI Argument | Environment Variable | Config File | Description |
|---|---|---|---|
| --enable-activity-tracking | ENABLE_ACTIVITY_TRACKING=1 | enable_activity_tracking: true | Enable connection activity tracking (RX/TX counters per session). |
| CLI Argument | Environment Variable | Description |
|---|---|---|
| --quality-verifier-model BACKEND:MODEL | QUALITY_VERIFIER_MODEL | Enable Quality Verifier with model. |
| --quality-verifier-frequency N | QUALITY_VERIFIER_FREQUENCY | Run verification every N user turns (default: 1). |
| --quality-verifier-max-history N | QUALITY_VERIFIER_MAX_HISTORY | Truncate history for Quality Verifier to last N messages (optional). |
| --quality-verifier-max-consecutive-failures N | QUALITY_VERIFIER_MAX_CONSECUTIVE_FAILURES | Trip the Angel circuit breaker after N consecutive failures (default: 5). |
| --quality-verifier-cooldown-seconds N | QUALITY_VERIFIER_COOLDOWN_SECONDS | Cooldown period before Quality Verifier can retry (default: 300s). |
| --quality-verifier-ttft-timeout-seconds SECONDS | QUALITY_VERIFIER_TTFT_TIMEOUT_SECONDS | Timeout for first token from Quality Verifier model before skipping (default: 30s). |
| --quality-verifier-tool-followup-weight WEIGHT | QUALITY_VERIFIER_TOOL_FOLLOWUP_WEIGHT | Eligible-turn increment per tool-result follow-up, 0.0–1.0 (default: 0.2). |
| CLI Argument | Environment Variable | Description |
|---|---|---|
| --allowed-tools PATTERNS | N/A | Comma-separated regex for allowed tools. |
| --blocked-tools PATTERNS | N/A | Comma-separated regex for blocked tools. |
| --default-policy POLICY | N/A | Default policy: 'allow' or 'deny'. |
| CLI Argument | Environment Variable | Description |
|---|---|---|
| --disable-routing-with-backend-ids | DISABLE_ROUTING_WITH_BACKEND_IDS=true | Disable routing using explicit backend instance IDs (e.g. openai.1:gpt-4). |
| --disable-routing-with-backend_names | DISABLE_ROUTING_WITH_BACKEND_NAMES=true | Disable routing using backend names (e.g. openai:gpt-4). Implies disabling IDs. |
| --disable-routing-with-only-model-names | DISABLE_ROUTING_WITH_ONLY_MODEL_NAMES=true | Disable routing using only model names (e.g. gpt-4). |
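
The three selector shapes named above (`openai.1:gpt-4`, `openai:gpt-4`, and `gpt-4`) can be told apart with a short classifier. This is a sketch under assumptions: it treats a backend part ending in `.<digits>` as an instance ID, and the function names are hypothetical rather than the proxy's own.

```python
import re


def classify_selector(selector: str) -> str:
    """Classify a model selector as 'backend-id', 'backend-name', or 'model-name'."""
    backend, sep, _model = selector.partition(":")
    if not sep:
        return "model-name"
    # Assumption: instance IDs look like "<backend>.<number>".
    if re.fullmatch(r".+\.\d+", backend):
        return "backend-id"
    return "backend-name"


def is_selector_allowed(
    selector: str,
    disable_ids: bool = False,
    disable_names: bool = False,
    disable_model_only: bool = False,
) -> bool:
    """Apply the three routing switches; disabling names also disables IDs."""
    kind = classify_selector(selector)
    if kind == "backend-id":
        return not (disable_ids or disable_names)
    if kind == "backend-name":
        return not disable_names
    return not disable_model_only
```

A limitation of this sketch is that a bare model name containing a colon would be classified as a backend-qualified selector.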
Routes auxiliary requests (title/summary generation) to alternative backends to reduce rate limiting pressure on the primary backend.
| CLI Argument | Environment Variable | Description |
|---|---|---|
| --enable-auxiliary-routing | AUXILIARY_ROUTING_ENABLED=true | Enable routing of auxiliary requests (title/summary generation) to an alternative backend. |
| --auxiliary-routing-model MODEL | AUXILIARY_ROUTING_MODEL | Model to use for auxiliary requests. Can be specified as model or fully qualified backend:model (e.g. openrouter:gemini-1.5-flash). |
| --auxiliary-routing-max-messages N | AUXILIARY_ROUTING_MAX_MESSAGES | Maximum message count for a request to be considered auxiliary (default: 3). |
| CLI Argument | Environment Variable | Description |
|---|---|---|
| --enable-pytest-compression | PYTEST_COMPRESSION_ENABLED=true | Enable pytest output compression. |
| --disable-pytest-compression | PYTEST_COMPRESSION_ENABLED=false | Disable pytest output compression. |
| --enable-pytest-full-suite-steering | PYTEST_FULL_SUITE_STEERING_ENABLED=true | Enable steering for full pytest suite. |
| --disable-pytest-full-suite-steering | PYTEST_FULL_SUITE_STEERING_ENABLED=false | Disable steering for full pytest suite. |
| --enable-pytest-context-saving | N/A | Enable context saving rewrites. |
| --test-execution-reminder-enabled | TEST_EXECUTION_REMINDER_ENABLED=true | Enable test execution reminder. |
| --no-test-execution-reminder-enabled | TEST_EXECUTION_REMINDER_ENABLED=false | Disable test execution reminder. |
| N/A | PYTEST_COMPRESSION_MIN_LINES | Min lines for compression. |
| N/A | PYTEST_FULL_SUITE_STEERING_MESSAGE | Custom steering message. |
| N/A | TEST_EXECUTION_REMINDER_MESSAGE | Custom reminder message. |
| CLI Argument | Environment Variable | Description |
|---|---|---|
| N/A | EMPTY_RESPONSE_HANDLING_ENABLED | Enable empty response handling. |
| N/A | EMPTY_RESPONSE_MAX_RETRIES | Max retries for empty responses. |
| CLI Argument | Environment Variable | Description |
|---|---|---|
| N/A | REWRITING_ENABLED | Enable content rewriting. |
| N/A | REWRITING_CONFIG_PATH | Path to rewriting configuration. |
See Random Model Replacement Feature Guide for detailed documentation.
| CLI Argument | Environment Variable | Description |
|---|---|---|
| --enable-replacement | REPLACEMENT_ENABLED=true | Enable random model replacement. |
| --disable-replacement | REPLACEMENT_ENABLED=false | Disable random model replacement. |
| --replacement-probability FLOAT | REPLACEMENT_PROBABILITY | Probability of replacement (0.0 to 1.0). |
| --random-model-replacement-from-to FROM=TO | REPLACEMENT_RULES | Conditional replacement rule. Can be specified multiple times. Format: <from-model-name>=<to-model-name>. <from-model-name> can be * (wildcard), model-name (partial match), or backend:model (exact match). <to-model-name> must be backend:model. |
| --replacement-backend-model BACKEND:MODEL | REPLACEMENT_BACKEND_MODEL | Deprecated: Use --random-model-replacement-from-to instead. Backend and model to use for replacement (converted to wildcard rule). |
| --replacement-turn-count N | REPLACEMENT_TURN_COUNT | Number of turns to stay on replacement. |
| --allow-oauth-auto-replacement | ALLOW_OAUTH_AUTO_REPLACEMENT=true | Allow random model replacement for multi-account oauth-auto rotating backends (disabled by default for safety). |
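
The FROM=TO rule format described above (wildcard, partial model name, or exact backend:model on the left; backend:model on the right) can be sketched as a parser plus a matcher. The helper names are hypothetical, "partial match" is interpreted here as a substring test, and the model names are illustrative; the proxy may implement these details differently.

```python
from typing import Tuple


def parse_replacement_rule(rule: str) -> Tuple[str, str]:
    """Split FROM=TO and check that TO is written as backend:model."""
    source, sep, target = rule.partition("=")
    if not sep or not source:
        raise ValueError(f"expected FROM=TO, got {rule!r}")
    if ":" not in target:
        raise ValueError(f"replacement target must be backend:model, got {target!r}")
    return source, target


def rule_matches(source: str, backend: str, model: str) -> bool:
    """Check whether a rule's FROM part applies to the requested backend and model."""
    if source == "*":
        return True  # wildcard: applies to every request
    if ":" in source:
        return source == f"{backend}:{model}"  # exact backend:model match
    return source in model  # assumption: partial match means substring


# Hypothetical rule: replace any model whose name contains "gpt-4".
source, target = parse_replacement_rule("gpt-4=openrouter:some-model")
applies = rule_matches(source, backend="openai", model="gpt-4-turbo")
```

Validating the target at parse time surfaces a malformed rule at startup rather than when a replacement is first attempted.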
Configure automatic retry and failover behavior for backend errors. See Failure Handling for detailed documentation.
| CLI Argument | Environment Variable | Description |
|---|---|---|
| --disable-failure-handling | DISABLE_FAILURE_HANDLING=1 | Disable automatic failure handling (retry/failover). |
| --max-silent-wait SECONDS | FAILURE_HANDLING_MAX_SILENT_WAIT | Max seconds to wait before failover (default: 30). |
| --total-timeout-budget SECONDS | FAILURE_HANDLING_TOTAL_TIMEOUT_BUDGET | Total timeout budget across failover attempts (default: 90). |
| --keepalive-interval SECONDS | FAILURE_HANDLING_KEEPALIVE_INTERVAL | SSE keepalive interval during waits (default: 8). |
| --max-failover-hops N | FAILURE_HANDLING_MAX_FAILOVER_HOPS | Max backend instances to try (default: 5). |
| --min-retry-wait SECONDS | FAILURE_HANDLING_MIN_RETRY_WAIT | Minimum retry wait time (default: 1). |
Control whether rate-limit and cooldown state is shared or isolated per client. See Resilience Scoping for detailed documentation.
| CLI Argument | Environment Variable | Description |
|---|---|---|
| --resilience-personal-backends BACKEND[,BACKEND...] | RESILIENCE_PERSONAL_BACKEND_TYPES | Force personal scoping for listed backend types (comma-separated or repeat the flag). |
| --resilience-shared-backends BACKEND[,BACKEND...] | RESILIENCE_SHARED_BACKEND_TYPES | Force shared scoping for listed backend types (comma-separated or repeat the flag). |
Prevent duplicate requests from exhausting rate limits. See Request Deduplication for detailed documentation.
| CLI Argument | Environment Variable | Description |
|---|---|---|
| --request-dedup-window SECONDS | LLM_REQUEST_DEDUP_WINDOW | Time window for duplicate detection (default: 3.0, set to 0 to disable). |
| --disable-request-dedup | N/A | Disable request deduplication entirely. |
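
The idea of a time window for duplicate detection can be illustrated with a small sketch. How the proxy actually fingerprints requests is not documented here, so hashing the JSON-serialized payload is an assumption, as are the class and method names.

```python
import hashlib
import json
import time
from typing import Any, Callable, Dict


class RequestDeduplicator:
    """Flags a request as a duplicate if an identical one was seen within the window."""

    def __init__(self, window_seconds: float = 3.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.window = window_seconds
        self.clock = clock
        self._seen: Dict[str, float] = {}

    def is_duplicate(self, payload: Dict[str, Any]) -> bool:
        if self.window <= 0:
            return False  # a window of 0 disables deduplication
        now = self.clock()
        # Drop fingerprints that have aged out of the window.
        self._seen = {k: t for k, t in self._seen.items() if now - t < self.window}
        # Assumption: two requests are "the same" if their JSON bodies are identical.
        key = hashlib.sha256(
            json.dumps(payload, sort_keys=True).encode("utf-8")
        ).hexdigest()
        if key in self._seen:
            return True
        self._seen[key] = now
        return False
```

The clock is injectable so the window logic can be exercised without real delays, and a window of 0 short-circuits the check, mirroring the "set to 0 to disable" behavior described in the table.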
| CLI Argument | Environment Variable | Description |
|---|---|---|
| --fix-think-tags | FIX_THINK_TAGS_ENABLED=true | Enable correction of <think> tags. |
| --disable-binary-file-edit-steering | N/A | Disable binary file edit steering (overrides config). |
| --disable-dangerous-git-commands-protection | DANGEROUS_COMMAND_PREVENTION_ENABLED=false | Disable dangerous command protection. |
| N/A | DANGEROUS_COMMAND_STEERING_MESSAGE | Custom message for dangerous commands. |
| N/A | FIX_THINK_TAGS_STREAMING_BUFFER_SIZE | Buffer size for think tag fix. |
| N/A | GCP_PROJECT_ID | Google Cloud Project ID (GOOGLE_CLOUD_PROJECT). |
| N/A | GEMINI_CREDENTIALS_PATH | Path to Gemini credentials JSON. |
| N/A | DISABLE_HEALTH_CHECKS | Disable health check endpoints. |
| N/A | API_KEYS | Comma-separated list of allowed API keys. |
| CLI Argument | Environment Variable | Description |
|---|---|---|
| --enable-sso | SSO_ENABLED=true | Enable SSO authentication. |
| --sso-config PATH | SSO_CONFIG_FILE | Path to SSO configuration file. |
| --sso-provider NAME | SSO_PROVIDER | Provider name (google, microsoft, github, linkedin, aws). |
| --sso-auth-mode MODE | SSO_AUTH_MODE | Authorization mode (single_user, enterprise). |
| --disable-sso-captcha | SSO_CAPTCHA_ENABLED=false | Disable SSO captcha protection. |
| CLI Argument | Environment Variable | Description |
|---|---|---|
| --identity-user-agent VALUE | APP_USER_AGENT | Override User-Agent header. |
| --identity-url URL | APP_URL | Override HTTP-Referer header. |
| --identity-title TITLE | APP_TITLE | Override X-Title header. |
| N/A | APP_USER_AGENT_MODE | Mode for User-Agent override. |
| N/A | APP_URL_MODE | Mode for URL override. |
| N/A | APP_TITLE_MODE | Mode for Title override. |
Restricted for internal development.
| CLI Argument | Description |
|---|---|
| --enable-cline-backend-debugging-override | Enable Cline backend debugging. |
| --enable-antigravity-backend-debugging-override | Enable Antigravity backend debugging. |
| --enable-gemini-oauth-free-backend-debugging-override | Enable Gemini OAuth Free debugging. |
| --enable-gemini-oauth-plan-backend-debugging-override | Enable Gemini OAuth Plan debugging. |
| --enable-qwen-oauth-backend-debugging-override | Enable Qwen OAuth debugging. |
| --enable-droid-path-fix | Enable automatic path fixing for Droid agent with Antigravity OAuth backend. |