CLI Parameters and Configuration Reference

This guide documents all available CLI parameters, environment variables, and configuration options for the LLM Interactive Proxy.

Configuration Precedence

Configuration is resolved in the following order (highest to lowest priority):

CLI Arguments - Command-line flags override everything
Environment Variables - Environment variables override config files
YAML Configuration File - Config file provides defaults
Built-in Defaults - Hardcoded defaults if nothing else is specified

General

CLI Argument	Environment Variable	Description
`--help`, `-h`	N/A	Show help message and exit.
`--config FILE`	`CONFIG_FILE`	Path to persistent configuration file (YAML).

Backend Selection

CLI Argument	Environment Variable	Description
`--default-backend BACKEND`	`LLM_BACKEND`	Default backend to use (e.g., `openai`, `anthropic`, `gemini`).
`--static-route BACKEND:MODEL`	`STATIC_ROUTE`	Force all requests to use this backend:model combination.
`--disable-gemini-oauth-fallback`	`DISABLE_GEMINI_OAUTH_FALLBACK=1`	Disable automatic Gemini OAuth fallback to `gemini-2.5-flash`.
`--disable-gemini-oauth-reasoning-prompt-injection`	`DISABLE_GEMINI_OAUTH_REASONING_PROMPT_INJECTION=1`	Disable automatic reasoning effort prompt injection for Gemini OAuth backends (enabled by default).
`--disable-hybrid-backend`	`DISABLE_HYBRID_BACKEND=1`	Disable the hybrid backend (enabled by default).
`--hybrid-backend-repeat-messages`	`HYBRID_BACKEND_REPEAT_MESSAGES=1`	Repeat reasoning output as an artificial message in the session.
`--reasoning-injection-probability FLOAT` (or `--reasoning_injection_probability`)	`REASONING_INJECTION_PROBABILITY`	Probability of using the reasoning model in the hybrid backend (0.0 to 1.0).
`--hybrid-reasoning-model-timeout SECONDS`	`HYBRID_REASONING_MODEL_TIMEOUT`	Timeout for the reasoning model call in hybrid scenarios (default: 60).
`--hybrid-reasoning-force-initial-turns N`	`HYBRID_REASONING_FORCE_INITIAL_TURNS`	Number of turns at start of session to force reasoning model usage (default: 4).
`--model-alias PATTERN=REPLACEMENT`	`MODEL_ALIASES` (JSON string)	Add a model name rewrite rule (regex pattern and replacement). Can be used multiple times.

Server Configuration

CLI Argument	Environment Variable	Description
`--host HOST`	`APP_HOST`	Bind host (default: `127.0.0.1`).
`--port PORT`	`APP_PORT`	Bind port (default: `8000`).
`--anthropic-port PORT`	`ANTHROPIC_PORT`	Port for Anthropic-compatible endpoints (default: disabled/derived).
`--timeout SECONDS`	`PROXY_TIMEOUT`	Global request timeout in seconds (default: 120).
`--command-prefix PREFIX`	`COMMAND_PREFIX`	Command prefix for in-chat commands (default: `!/`).
`--force-context-window TOKENS`	`FORCE_CONTEXT_WINDOW`	Override context window size for all models.
`--thinking-budget TOKENS`	`THINKING_BUDGET`	Set max reasoning tokens for all requests.

Model Registry & Limits

CLI Argument	Environment Variable	Description
`--disable-model-registry-download`	N/A	Disable downloading updates from the external model registry.
`--model-registry-url URL`	N/A	Override the model registry URL (default: `https://models.dev/api.json`).
`--model-registry-update-interval SECONDS`	N/A	Update interval for registry downloads (default: 86400).
`--disable-model-limit-enforcement`	N/A	Disable automated model limit and modality enforcement based on registry metadata.

Authentication & Security

API Keys & Tokens

CLI Argument	Environment Variable	Description
`--disable-auth`	`DISABLE_AUTH=1`	Disable client authentication (forces localhost binding).
`--disable-sso-captcha`	`SSO_CAPTCHA_ENABLED=false`	Disable SSO Captcha verification (overrides config).
`--disable-redact-api-keys-in-prompts`	`REDACT_API_KEYS_IN_PROMPTS=false`	Disable redaction of API keys in prompts.
`--openrouter-api-key KEY`	`OPENROUTER_API_KEY`	OpenRouter API Key.
`--openrouter-api-base-url URL`	`OPENROUTER_API_BASE_URL`	OpenRouter API Base URL.
`--gemini-api-key KEY`	`GEMINI_API_KEY`	Google Gemini API Key.
`--gemini-api-base-url URL`	`GEMINI_API_BASE_URL`	Google Gemini API Base URL.
`--zai-api-key KEY`	`ZAI_API_KEY`	ZAI API Key.
`--zenmux-api-base-url URL`	`ZENMUX_API_BASE_URL`	ZenMux API Base URL.
N/A	`ANTHROPIC_API_KEY`	Anthropic API Key.
N/A	`ANTHROPIC_API_BASE_URL`	Anthropic API Base URL.
N/A	`AUTH_TOKEN`	Shared secret token for client authentication.

Brute Force Protection

CLI Argument	Environment Variable	Description
`--enable-brute-force-protection`	`BRUTE_FORCE_PROTECTION_ENABLED=true`	Enable API key brute-force protection.
`--disable-brute-force-protection`	`BRUTE_FORCE_PROTECTION_ENABLED=false`	Disable API key brute-force protection.
`--auth-max-failed-attempts N`	`BRUTE_FORCE_MAX_FAILED_ATTEMPTS`	Max failed attempts before blocking (default: 5).
`--auth-brute-force-ttl SECONDS`	`BRUTE_FORCE_TTL_SECONDS`	Time window for tracking failed attempts (default: 900).
`--auth-brute-force-initial-block SECONDS`	`BRUTE_FORCE_INITIAL_BLOCK_SECONDS`	Initial block duration (default: 30).
`--auth-brute-force-multiplier FLOAT`	`BRUTE_FORCE_BLOCK_MULTIPLIER`	Multiplier for subsequent blocks (default: 2.0).
`--auth-brute-force-max-block SECONDS`	`BRUTE_FORCE_MAX_BLOCK_SECONDS`	Max block duration (default: 3600).

Access Control

CLI Argument	Environment Variable	Description
`--trusted-ip IP`	N/A	IP address to trust for bypassing authorization.
`--allow-admin`	N/A	Allow running with administrative privileges.

Advanced Backend Settings

CLI Argument	Environment Variable	Description
`--hybrid-execution-model-timeout SECONDS`	`HYBRID_EXECUTION_MODEL_TIMEOUT`	Timeout for execution model in hybrid scenarios.
N/A	`HYBRID_REASONING_LATENCY_THRESHOLD`	Latency threshold for adaptive reasoning backoff.
N/A	`HYBRID_REASONING_BACKOFF_TURNS`	Turns to skip reasoning after latency threshold exceeded.

Backend Timeouts

CLI Argument	Environment Variable	Description
N/A	`OPENROUTER_TIMEOUT`	Timeout for OpenRouter requests.
N/A	`GEMINI_TIMEOUT`	Timeout for Gemini requests.
N/A	`ANTHROPIC_TIMEOUT`	Timeout for Anthropic requests.
N/A	`ZAI_TIMEOUT`	Timeout for ZAI requests.
N/A	`ZENMUX_TIMEOUT`	Timeout for ZenMux requests.
N/A	`OPENAI_TIMEOUT`	Timeout for OpenAI requests.
N/A	`MINIMAX_TIMEOUT`	Timeout for Minimax requests.

Logging & Capture

CLI Argument	Environment Variable	Description
`--log-level LEVEL`	`LOG_LEVEL`	Logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL).
`--log-stream {stdout,stderr}`	N/A	Write console logs to stdout or stderr (default: stderr).
`--log FILE`	`LOG_FILE`	Path to log file.
`--log-colors`	`LOG_COLORS=true`	Enable colored log output.
`--no-log-colors`	`LOG_COLORS=false`	Disable colored log output.
`--capture-file FILE`	`CAPTURE_FILE`	Write raw LLM requests/replies to this file (JSON).
`--capture-max-bytes N`	`CAPTURE_MAX_BYTES`	Max size of capture file before rotation.
`--capture-truncate-bytes N`	`CAPTURE_TRUNCATE_BYTES`	Truncate captures to N bytes per entry.
`--capture-max-files N`	`CAPTURE_MAX_FILES`	Max number of capture files to retain.
`--capture-rotate-interval SECONDS`	`CAPTURE_ROTATE_INTERVAL_SECONDS`	Time-based rotation period.
`--capture-total-max-bytes N`	`CAPTURE_TOTAL_MAX_BYTES`	Total disk cap across capture files.
`--cbor-capture-dir DIR`	N/A	Directory for CBOR byte-precise capture files.
`--cbor-capture-session ID`	N/A	Fixed session ID for CBOR capture.
N/A	`REQUEST_LOGGING`	Enable detailed request logging (boolean).
N/A	`RESPONSE_LOGGING`	Enable detailed response logging (boolean).
N/A	`CAPTURE_BUFFER_SIZE`	Buffer size for wire capture writes (bytes).
N/A	`CAPTURE_FLUSH_INTERVAL`	Flush interval for wire capture (seconds).
N/A	`CAPTURE_MAX_ENTRIES_PER_FLUSH`	Max entries buffered before forced flush.

Session Management

CLI Argument	Environment Variable	Description
`--disable-interactive-mode`	`DEFAULT_INTERACTIVE_MODE=false`	Disable interactive mode by default.
`--force-set-project`	`FORCE_SET_PROJECT=true`	Require project name to be set before sending prompts.
`--project-dir-resolution-model BACKEND:MODEL`	`PROJECT_DIR_RESOLUTION_MODEL`	Model used to detect absolute project directory.
`--project-dir-resolution-mode MODE`	`PROJECT_DIR_RESOLUTION_MODE`	Strategy: 'deterministic', 'llm', or 'hybrid'.
`--disable-interactive-commands`	N/A	Disable all in-chat command processing.
`--disable-accounting`	`DISABLE_ACCOUNTING=true`	Disable LLM usage tracking.
`--no-accounting`	`DISABLE_ACCOUNTING=true`	Disable LLM usage tracking (alias for --disable-accounting).
`--strict-command-detection`	`STRICT_COMMAND_DETECTION`	Require commands to be at the start of messages.
`--enable-sandboxing`	`ENABLE_SANDBOXING=true`	Restrict file operations to the project directory.
`--daemon`	N/A	Run server as a daemon (background process).
`--enable-end-of-session`	N/A	Enable end-of-session detection and event emission.
`--disable-end-of-session`	N/A	Disable end-of-session detection and event emission.
`--end-of-session-emit-events`	N/A	Enable event emission (default when EoS is enabled).
`--end-of-session-detect-only`	N/A	Enable detect-only mode (no events emitted).
`--end-of-session-dispatch-timeout SECONDS`	N/A	Maximum time to wait for event dispatch (default: 5.0, 0 for fire-and-forget).
`--enable-b2bua-session-handling`	`SESSION_B2BUA_ENABLED=true`	Enable B2BUA A-leg/B-leg session identity separation.
`--disable-b2bua-session-handling`	`SESSION_B2BUA_ENABLED=false`	Disable B2BUA mode and keep legacy session behavior.
`--b2bua-continuity-max-age-seconds SECONDS`	`SESSION_B2BUA_CONTINUITY_MAX_AGE_SECONDS`	Maximum age for (`auth_scope_id`, `client_session_id`) continuity mappings.
`--b2bua-continuity-sliding-expiration`	`SESSION_B2BUA_CONTINUITY_SLIDING_EXPIRATION=true`	Extend continuity mapping expiry on activity.
`--b2bua-continuity-fixed-expiration`	`SESSION_B2BUA_CONTINUITY_SLIDING_EXPIRATION=false`	Use fixed continuity mapping expiry without sliding updates.
`--enable-b2bua-persistent-mapping-store`	`SESSION_B2BUA_PERSISTENT_MAPPING_STORE_ENABLED=true`	Persist continuity mapping and B-leg sequence state across restarts.
`--disable-b2bua-persistent-mapping-store`	`SESSION_B2BUA_PERSISTENT_MAPPING_STORE_ENABLED=false`	Use in-memory continuity mapping store only.
`--enable-b2bua-session-echo`	`SESSION_B2BUA_ECHO_ENABLED=true`	Emit A-leg session echo response header (diagnostic only).
`--disable-b2bua-session-echo`	`SESSION_B2BUA_ECHO_ENABLED=false`	Disable A-leg session echo response header emission.
`--b2bua-session-echo-header-name HEADER`	`SESSION_B2BUA_ECHO_HEADER_NAME`	Configure response header name used for A-leg echo.
`--enable-unsafe-legacy-session-inference`	`SESSION_B2BUA_ENABLE_UNSAFE_HEURISTIC_SESSION_INFERENCE=true`	Allow unsafe fallback continuity inference when `client_session_id` is absent.
`--disable-unsafe-legacy-session-inference`	`SESSION_B2BUA_ENABLE_UNSAFE_HEURISTIC_SESSION_INFERENCE=false`	Disable unsafe legacy continuity inference (recommended default).
`--b2bua-deployment-mode {single-process,multi-worker}`	`SESSION_B2BUA_DEPLOYMENT_MODE`	Set deployment assumptions used for B2BUA startup validation and store selection.
N/A	`SESSION_CLEANUP_ENABLED`	Enable session cleanup (boolean).
N/A	`SESSION_CLEANUP_INTERVAL`	Cleanup interval in seconds.
N/A	`SESSION_MAX_AGE`	Max session age in seconds.
N/A	`SANDBOXING_STRICT_MODE`	Enable strict mode for sandboxing.
N/A	`SANDBOXING_ALLOW_PARENT_ACCESS`	Allow access to parent directories in sandbox.

Tool Call & JSON Repair

CLI Argument	Environment Variable	Description
N/A	`TOOL_CALL_REPAIR_ENABLED`	Enable tool call repair.
N/A	`TOOL_CALL_REPAIR_BUFFER_CAP_BYTES`	Buffer cap for tool call repair.
N/A	`JSON_REPAIR_ENABLED`	Enable JSON repair.
N/A	`JSON_REPAIR_BUFFER_CAP_BYTES`	Buffer cap for JSON repair.
N/A	`JSON_REPAIR_SCHEMA`	JSON schema for repair.
N/A	`FORCE_REPROCESS_TOOL_CALLS`	Force reprocessing of tool calls.
N/A	`LOG_SKIPPED_TOOL_CALLS`	Log skipped tool calls.

Streaming Sampler

CLI Argument	Environment Variable	Description
N/A	`STREAMING_SAMPLER_ENABLED`	Enable streaming sampler.
N/A	`STREAMING_SAMPLER_RATE`	Sampling rate (0.0 to 1.0).
N/A	`STREAMING_SAMPLER_MAX_SAMPLES`	Max samples to retain.

History Compaction

CLI Argument	Environment Variable	Description
`--enable-context-compaction`	`ENABLE_CONTEXT_COMPACTION=true`	Enable history compaction to reduce stale tool outputs. (Default: Disabled)
`--compaction-min-tokens N`	`COMPACTION_MIN_TOKENS`	Minimum tokens required to trigger compaction. (Default: 100,000)

Features

Memory (ProxyMem)

CLI Argument	Environment Variable	Description
`--memory-available`	`MEMORY_AVAILABLE=true`	Enable the Memory feature globally.
`--memory-default-enabled`	`MEMORY_DEFAULT_ENABLED=true`	Enable Memory by default for new sessions.
`--memory-summary-model BACKEND:MODEL`	`MEMORY_SUMMARY_MODEL`	Model to use for generating session summaries.
`--memory-context-model BACKEND:MODEL`	`MEMORY_CONTEXT_MODEL`	Model to use for retrieving context.
`--memory-summary-prompt FILE`	`MEMORY_SUMMARY_PROMPT`	Path to custom summary prompt file.
`--memory-context-prompt FILE`	`MEMORY_CONTEXT_PROMPT`	Path to custom context prompt file.
`--memory-database-path FILE`	`MEMORY_DATABASE_PATH`	Path to SQLite database for memory storage.
`--memory-session-timeout MINUTES`	`MEMORY_SESSION_TIMEOUT_MINUTES`	Timeout in minutes for session inactivity.
`--memory-summarization-delay SECONDS`	`MEMORY_SUMMARIZATION_DELAY_SECONDS`	Delay before summarizing completed sessions.
`--memory-max-sessions-to-consider N`	`MEMORY_MAX_SESSIONS_TO_CONSIDER`	Max recent sessions to consider for context.
`--memory-retention-days DAYS`	`MEMORY_RETENTION_DAYS`	Days to retain memory data.
`--memory-max-context-tokens N`	`MEMORY_MAX_CONTEXT_TOKENS`	Max tokens for injected context.
`--memory-max-summary-tokens N`	`MEMORY_MAX_SUMMARY_TOKENS`	Max tokens for summary prompt context.
`--memory-max-transcript-chars N`	`MEMORY_MAX_TRANSCRIPT_CHARS`	Max transcript length before chunking.
`--memory-summary-completion-tokens N`	`MEMORY_SUMMARY_COMPLETION_TOKENS`	Max completion tokens for summary generation.
`--memory-context-relevance-threshold FLOAT`	`MEMORY_CONTEXT_RELEVANCE_THRESHOLD`	Minimum relevance score for context retrieval.
`--memory-max-buffer-size-bytes N`	`MEMORY_MAX_BUFFER_SIZE_BYTES`	Max capture buffer size per session.
`--memory-analysis-queue-maxsize N`	`MEMORY_ANALYSIS_QUEUE_MAXSIZE`	Max size of analysis queue.
`--memory-analysis-timeout SECONDS`	`MEMORY_ANALYSIS_TIMEOUT_SECONDS`	Timeout per summary generation.
`--memory-max-concurrent-analyses N`	`MEMORY_MAX_CONCURRENT_ANALYSES`	Max concurrent analyses.
`--memory-context-template TEMPLATE`	`MEMORY_CONTEXT_TEMPLATE`	Template for injected context (use `{context}`).
`--memory-single-user-mode`	`MEMORY_SINGLE_USER_MODE=true`	Enable single-user mode (ignores user IDs).
`--memory-fixed-user-id ID`	`MEMORY_FIXED_USER_ID`	Fixed user ID to use in single-user mode.
`--memory-persist-transcript`	`MEMORY_PERSIST_TRANSCRIPT=true`	Persist transcripts for summaries.
`--memory-redaction-pattern PATTERN`	`MEMORY_REDACTION_PATTERNS`	Add a regex pattern for redaction. Can be used multiple times.
`--memory-disable-user ID`	`MEMORY_DISABLED_USERS`	Disable memory for specific user ID. Can be used multiple times.
`--memory-disable-client ID`	`MEMORY_DISABLED_CLIENTS`	Disable memory for specific client ID. Can be used multiple times.
`--memory-summary-prompt-version VERSION`	`MEMORY_SUMMARY_PROMPT_VERSION`	Summary prompt version identifier.
`--memory-summary-schema-version VERSION`	`MEMORY_SUMMARY_SCHEMA_VERSION`	Summary schema version identifier.
`--memory-require-project-discovery`	`MEMORY_REQUIRE_PROJECT_DISCOVERY=true`	Require project discovery before context injection.
`--memory-allow-missing-project`	`MEMORY_REQUIRE_PROJECT_DISCOVERY=false`	Allow context injection without project discovery.
`--memory-project-discovery-mode MODE`	`MEMORY_PROJECT_DISCOVERY_MODE`	Project discovery mode.

See Also: ProxyMem: Cross-Session Memory for detailed documentation on the memory feature.

Planning Phase

CLI Argument	Environment Variable	Description
`--enable-planning-phase`	`PLANNING_PHASE_ENABLED=true`	Enable planning phase model routing.
`--planning-phase-strong-model BACKEND:MODEL`	`PLANNING_PHASE_STRONG_MODEL`	Strong model for planning phase.
`--planning-phase-max-turns N`	`PLANNING_PHASE_MAX_TURNS`	Max turns before switching from strong model.
`--planning-phase-max-file-writes N`	`PLANNING_PHASE_MAX_FILE_WRITES`	Max file writes before switching.
`--planning-phase-temperature FLOAT`	`PLANNING_PHASE_TEMPERATURE`	Temperature override for planning.
`--planning-phase-top-p FLOAT`	`PLANNING_PHASE_TOP_P`	Top-p override for planning.
`--planning-phase-reasoning-effort EFFORT`	`PLANNING_PHASE_REASONING_EFFORT`	Reasoning effort override for planning.
`--planning-phase-thinking-budget TOKENS`	`PLANNING_PHASE_THINKING_BUDGET`	Thinking budget override for planning.

Edit Precision Tuning

CLI Argument	Environment Variable	Description
`--enable-edit-precision`	`EDIT_PRECISION_ENABLED=true`	Enable automated edit-precision tuning.
`--disable-edit-precision`	`EDIT_PRECISION_ENABLED=false`	Disable automated edit-precision tuning.
`--edit-precision-temperature FLOAT`	`EDIT_PRECISION_TEMPERATURE`	Target temperature (default: 0.1).
`--edit-precision-min-top-p FLOAT`	`EDIT_PRECISION_MIN_TOP_P`	Minimum top_p (default: 0.3).
`--edit-precision-override-top-p`	`EDIT_PRECISION_OVERRIDE_TOP_P`	Enable top_p override.
`--edit-precision-target-top-k N`	`EDIT_PRECISION_TARGET_TOP_K`	Target top_k value.
`--edit-precision-override-top-k`	`EDIT_PRECISION_OVERRIDE_TOP_K`	Enable top_k override.
`--edit-precision-exclude-agents REGEX`	`EDIT_PRECISION_EXCLUDE_AGENTS_REGEX`	Exclude agents matching regex.

Activity Tracking

Real-time connection activity tracking for debugging and monitoring. Disabled by default for performance.

CLI Argument	Environment Variable	Config File	Description
`--enable-activity-tracking`	`ENABLE_ACTIVITY_TRACKING=1`	`enable_activity_tracking: true`	Enable connection activity tracking (RX/TX counters per session).

CLI Argument	Environment Variable	Description

Quality Verifier

CLI Argument	Environment Variable	Description
`--quality-verifier-model BACKEND:MODEL`	`QUALITY_VERIFIER_MODEL`	Enable Quality Verifier with model.
`--quality-verifier-frequency N`	`QUALITY_VERIFIER_FREQUENCY`	Run verification every N user turns (default: 1).
`--quality-verifier-max-history N`	`QUALITY_VERIFIER_MAX_HISTORY`	Truncate history for Quality Verifier to last N messages (optional).
`--quality-verifier-max-consecutive-failures N`	`QUALITY_VERIFIER_MAX_CONSECUTIVE_FAILURES`	Trip the Angel circuit breaker after N consecutive failures (default: 5).
`--quality-verifier-cooldown-seconds N`	`QUALITY_VERIFIER_COOLDOWN_SECONDS`	Cooldown period before Quality Verifier can retry (default: 300s).
`--quality-verifier-ttft-timeout-seconds SECONDS`	`QUALITY_VERIFIER_TTFT_TIMEOUT_SECONDS`	Timeout for first token from Quality Verifier model before skipping (default: 30s).
`--quality-verifier-tool-followup-weight WEIGHT`	`QUALITY_VERIFIER_TOOL_FOLLOWUP_WEIGHT`	Eligible-turn increment per tool-result follow-up, 0.0–1.0 (default: 0.2).

Tool Access Control

CLI Argument	Environment Variable	Description
`--allowed-tools PATTERNS`	N/A	Comma-separated regex for allowed tools.
`--blocked-tools PATTERNS`	N/A	Comma-separated regex for blocked tools.
`--default-policy POLICY`	N/A	Default policy: 'allow' or 'deny'.

Routing Control

CLI Argument	Environment Variable	Description
`--disable-routing-with-backend-ids`	`DISABLE_ROUTING_WITH_BACKEND_IDS=true`	Disable routing using explicit backend instance IDs (e.g. `openai.1:gpt-4`).
`--disable-routing-with-backend_names`	`DISABLE_ROUTING_WITH_BACKEND_NAMES=true`	Disable routing using backend names (e.g. `openai:gpt-4`). Implies disabling IDs.
`--disable-routing-with-only-model-names`	`DISABLE_ROUTING_WITH_ONLY_MODEL_NAMES=true`	Disable routing using only model names (e.g. `gpt-4`).

Auxiliary Request Routing

Routes auxiliary requests (title/summary generation) to alternative backends to reduce rate limiting pressure on the primary backend.

CLI Argument	Environment Variable	Description
`--enable-auxiliary-routing`	`AUXILIARY_ROUTING_ENABLED=true`	Enable routing of auxiliary requests (title/summary generation) to an alternative backend.
`--auxiliary-routing-model MODEL`	`AUXILIARY_ROUTING_MODEL`	Model to use for auxiliary requests. Can be specified as `model` or fully qualified `backend:model` (e.g. `openrouter:gemini-1.5-flash`).
`--auxiliary-routing-max-messages N`	`AUXILIARY_ROUTING_MAX_MESSAGES`	Maximum message count for a request to be considered auxiliary (default: 3).

Pytest Integration

CLI Argument	Environment Variable	Description
`--enable-pytest-compression`	`PYTEST_COMPRESSION_ENABLED=true`	Enable pytest output compression.
`--disable-pytest-compression`	`PYTEST_COMPRESSION_ENABLED=false`	Disable pytest output compression.
`--enable-pytest-full-suite-steering`	`PYTEST_FULL_SUITE_STEERING_ENABLED=true`	Enable steering for full pytest suite.
`--disable-pytest-full-suite-steering`	`PYTEST_FULL_SUITE_STEERING_ENABLED=false`	Disable steering for full pytest suite.
`--enable-pytest-context-saving`	N/A	Enable context saving rewrites.
`--test-execution-reminder-enabled`	`TEST_EXECUTION_REMINDER_ENABLED=true`	Enable test execution reminder.
`--no-test-execution-reminder-enabled`	`TEST_EXECUTION_REMINDER_ENABLED=false`	Disable test execution reminder.
N/A	`PYTEST_COMPRESSION_MIN_LINES`	Min lines for compression.
N/A	`PYTEST_FULL_SUITE_STEERING_MESSAGE`	Custom steering message.
N/A	`TEST_EXECUTION_REMINDER_MESSAGE`	Custom reminder message.

Empty Response Handling

CLI Argument	Environment Variable	Description
N/A	`EMPTY_RESPONSE_HANDLING_ENABLED`	Enable empty response handling.
N/A	`EMPTY_RESPONSE_MAX_RETRIES`	Max retries for empty responses.

Rewriting

CLI Argument	Environment Variable	Description
N/A	`REWRITING_ENABLED`	Enable content rewriting.
N/A	`REWRITING_CONFIG_PATH`	Path to rewriting configuration.

Random Model Replacement

See Random Model Replacement Feature Guide for detailed documentation.

CLI Argument	Environment Variable	Description
`--enable-replacement`	`REPLACEMENT_ENABLED=true`	Enable random model replacement.
`--disable-replacement`	`REPLACEMENT_ENABLED=false`	Disable random model replacement.
`--replacement-probability FLOAT`	`REPLACEMENT_PROBABILITY`	Probability of replacement (0.0 to 1.0).
`--random-model-replacement-from-to FROM=TO`	`REPLACEMENT_RULES`	Conditional replacement rule. Can be specified multiple times. Format: `<from-model-name>=<to-model-name>`. `<from-model-name>` can be `*` (wildcard), `model-name` (partial match), or `backend:model` (exact match). `<to-model-name>` must be `backend:model`.
`--replacement-backend-model BACKEND:MODEL`	`REPLACEMENT_BACKEND_MODEL`	Deprecated: Use `--random-model-replacement-from-to` instead. Backend and model to use for replacement (converted to wildcard rule).
`--replacement-turn-count N`	`REPLACEMENT_TURN_COUNT`	Number of turns to stay on replacement.
`--allow-oauth-auto-replacement`	`ALLOW_OAUTH_AUTO_REPLACEMENT=true`	Allow random model replacement for multi-account `oauth-auto` rotating backends (disabled by default for safety).

Failure Handling

Configure automatic retry and failover behavior for backend errors. See Failure Handling for detailed documentation.

CLI Argument	Environment Variable	Description
`--disable-failure-handling`	`DISABLE_FAILURE_HANDLING=1`	Disable automatic failure handling (retry/failover).
`--max-silent-wait SECONDS`	`FAILURE_HANDLING_MAX_SILENT_WAIT`	Max seconds to wait before failover (default: 30).
`--total-timeout-budget SECONDS`	`FAILURE_HANDLING_TOTAL_TIMEOUT_BUDGET`	Total timeout budget across failover attempts (default: 90).
`--keepalive-interval SECONDS`	`FAILURE_HANDLING_KEEPALIVE_INTERVAL`	SSE keepalive interval during waits (default: 8).
`--max-failover-hops N`	`FAILURE_HANDLING_MAX_FAILOVER_HOPS`	Max backend instances to try (default: 5).
`--min-retry-wait SECONDS`	`FAILURE_HANDLING_MIN_RETRY_WAIT`	Minimum retry wait time (default: 1).

Resilience Scoping

Control whether rate-limit and cooldown state is shared or isolated per client. See Resilience Scoping for detailed documentation.

CLI Argument	Environment Variable	Description
`--resilience-personal-backends BACKEND[,BACKEND...]`	`RESILIENCE_PERSONAL_BACKEND_TYPES`	Force personal scoping for listed backend types (comma-separated or repeat the flag).
`--resilience-shared-backends BACKEND[,BACKEND...]`	`RESILIENCE_SHARED_BACKEND_TYPES`	Force shared scoping for listed backend types (comma-separated or repeat the flag).

Request Deduplication

Prevent duplicate requests from exhausting rate limits. See Request Deduplication for detailed documentation.

CLI Argument	Environment Variable	Description
`--request-dedup-window SECONDS`	`LLM_REQUEST_DEDUP_WINDOW`	Time window for duplicate detection (default: 3.0, set to 0 to disable).
`--disable-request-dedup`	N/A	Disable request deduplication entirely.

Other Features

CLI Argument	Environment Variable	Description
`--fix-think-tags`	`FIX_THINK_TAGS_ENABLED=true`	Enable correction of `<think>` tags.
`--disable-binary-file-edit-steering`	N/A	Disable binary file edit steering (overrides config).
`--disable-dangerous-git-commands-protection`	`DANGEROUS_COMMAND_PREVENTION_ENABLED=false`	Disable dangerous command protection.
N/A	`DANGEROUS_COMMAND_STEERING_MESSAGE`	Custom message for dangerous commands.
N/A	`FIX_THINK_TAGS_STREAMING_BUFFER_SIZE`	Buffer size for think tag fix.
N/A	`GCP_PROJECT_ID`	Google Cloud Project ID (`GOOGLE_CLOUD_PROJECT`).
N/A	`GEMINI_CREDENTIALS_PATH`	Path to Gemini credentials JSON.
N/A	`DISABLE_HEALTH_CHECKS`	Disable health check endpoints.
N/A	`API_KEYS`	Comma-separated list of allowed API keys.

Single Sign-On (SSO)

CLI Argument	Environment Variable	Description
`--enable-sso`	`SSO_ENABLED=true`	Enable SSO authentication.
`--sso-config PATH`	`SSO_CONFIG_FILE`	Path to SSO configuration file.
`--sso-provider NAME`	`SSO_PROVIDER`	Provider name (google, microsoft, github, linkedin, aws).
`--sso-auth-mode MODE`	`SSO_AUTH_MODE`	Authorization mode (single_user, enterprise).
`--disable-sso-captcha`	`SSO_CAPTCHA_ENABLED=false`	Disable SSO captcha protection.

Client Identity Override

CLI Argument	Environment Variable	Description
`--identity-user-agent VALUE`	`APP_USER_AGENT`	Override User-Agent header.
`--identity-url URL`	`APP_URL`	Override HTTP-Referer header.
`--identity-title TITLE`	`APP_TITLE`	Override X-Title header.
N/A	`APP_USER_AGENT_MODE`	Mode for User-Agent override.
N/A	`APP_URL_MODE`	Mode for URL override.
N/A	`APP_TITLE_MODE`	Mode for Title override.

Backend Debugging Overrides

Restricted for internal development.

CLI Argument	Description
`--enable-cline-backend-debugging-override`	Enable Cline backend debugging.
`--enable-antigravity-backend-debugging-override`	Enable Antigravity backend debugging.
`--enable-gemini-oauth-free-backend-debugging-override`	Enable Gemini OAuth Free debugging.
`--enable-gemini-oauth-plan-backend-debugging-override`	Enable Gemini OAuth Plan debugging.
`--enable-qwen-oauth-backend-debugging-override`	Enable Qwen OAuth debugging.
`--enable-droid-path-fix`	Enable automatic path fixing for Droid agent with Antigravity OAuth backend.

FilesExpand file tree

cli-parameters.md

Latest commit

History

cli-parameters.md

File metadata and controls

CLI Parameters and Configuration Reference

Configuration Precedence

General

Backend Selection

Server Configuration

Model Registry & Limits

Authentication & Security

API Keys & Tokens

Brute Force Protection

Access Control

Advanced Backend Settings

Backend Timeouts

Logging & Capture

Session Management

Tool Call & JSON Repair

Streaming Sampler

History Compaction

Features

Memory (ProxyMem)

Planning Phase

Edit Precision Tuning

Activity Tracking

Quality Verifier

Tool Access Control

Routing Control

Auxiliary Request Routing

Auxiliary Request Routing

Pytest Integration

Empty Response Handling

Rewriting

Random Model Replacement

Failure Handling

Resilience Scoping

Request Deduplication

Other Features

Single Sign-On (SSO)

Client Identity Override

Backend Debugging Overrides