This guide covers common issues and their solutions when using the LLM Interactive Proxy.
### Error: 401 Unauthorized response from proxy

Cause: Missing or invalid `Authorization` header when proxy authentication is enabled.

Solutions:

- Check whether authentication is enabled:

  ```bash
  # Look for auth configuration in your config file
  grep -A 5 "auth:" config.yaml
  ```

- Provide a valid API key:

  ```bash
  curl -X POST http://localhost:8000/v1/chat/completions \
    -H "Authorization: Bearer YOUR_PROXY_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{"model": "openai:gpt-4", "messages": [...]}'
  ```

- Disable authentication for testing:

  ```yaml
  # config.yaml
  auth:
    enabled: false
  ```
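When calling the proxy from code rather than curl, the same header rules apply. A minimal Python sketch of building an authenticated request; the URL, key, and model are placeholders for your deployment:

```python
import json

# Assumption: the proxy is listening on its default local address.
PROXY_URL = "http://localhost:8000/v1/chat/completions"

def build_chat_request(api_key: str, model: str, user_message: str):
    """Build (headers, body) for a proxy chat-completion call.

    A missing 'Bearer ' prefix on the Authorization header is a common
    cause of 401 responses.
    """
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    })
    return headers, body

headers, body = build_chat_request("YOUR_PROXY_API_KEY", "openai:gpt-4", "Hello")
```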
### Error: 403 Forbidden response from proxy

Cause: The API key is recognized but lacks the required permissions.

Solutions:

- Verify API key permissions in your configuration
- Check IP-based restrictions, if configured
- Review brute-force protection: you may be temporarily blocked
### Error: 400 Bad Request response

Cause: Malformed request payload.

Solutions:

- Verify the request format matches the API you're using:
  - OpenAI: `/v1/chat/completions` expects OpenAI format
  - Anthropic: `/anthropic/v1/messages` expects Anthropic format
  - Gemini: `/v1beta/models` expects Gemini format

- Check required fields:

  ```jsonc
  {
    "model": "openai:gpt-4",   // Required
    "messages": [              // Required
      {"role": "user", "content": "Hello"}
    ]
  }
  ```

- Validate JSON syntax:

  ```bash
  # Use jq to validate JSON
  echo '{"model": "test"}' | jq .
  ```
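The required-field checks above can also run client-side before a request is sent. A small Python sketch of a pre-flight validator, following the OpenAI-style payload shape shown above:

```python
def validate_chat_payload(payload: dict) -> list[str]:
    """Return a list of problems that would likely trigger a 400 response."""
    problems = []
    if "model" not in payload:
        problems.append("missing required field: model")
    messages = payload.get("messages")
    if not isinstance(messages, list) or not messages:
        problems.append("'messages' must be a non-empty list")
    else:
        for i, msg in enumerate(messages):
            if not isinstance(msg, dict) or "role" not in msg or "content" not in msg:
                problems.append(f"messages[{i}] needs 'role' and 'content'")
    return problems
```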
### Error: 422 Unprocessable Entity response

Cause: Request validation failed.

Solutions:

- Check the error details in the response body:

  ```json
  {
    "error": {
      "message": "Validation error",
      "details": {
        "field": "temperature",
        "issue": "must be between 0 and 2"
      }
    }
  }
  ```

- Verify parameter values:
  - `temperature`: 0.0 to 2.0
  - `top_p`: 0.0 to 1.0
  - `max_tokens`: positive integer

- Check the model name format:

  ```text
  Valid:   openai:gpt-4
  Valid:   anthropic:claude-3-opus
  Invalid: gpt-4 (missing backend prefix)
  ```
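A client-side check for the backend-prefix rule can catch this before the request is sent. A Python sketch, assuming the three backends named in this guide (adjust the set to your deployment):

```python
# Assumption: these are the backends configured in your proxy.
KNOWN_BACKENDS = {"openai", "anthropic", "gemini"}

def check_model_name(model: str):
    """Return an error string for an invalid model name, or None if it looks OK."""
    backend, sep, name = model.partition(":")
    if not sep:
        return f"'{model}' is missing a backend prefix (e.g. 'openai:{model}')"
    if backend not in KNOWN_BACKENDS:
        return f"unknown backend '{backend}'"
    if not name:
        return "model name after ':' is empty"
    return None
```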
### Error: 400 Bad Request with input_limit_exceeded error code

Cause: The request exceeds the model's context window limits.

Solutions:

- Check the error details for token counts:

  ```json
  {
    "error": {
      "code": "input_limit_exceeded",
      "message": "Request exceeds context window",
      "details": {
        "measured_tokens": 150000,
        "limit_tokens": 128000,
        "model": "openai:gpt-4"
      }
    }
  }
  ```

- Reduce the input size:
  - Shorten messages
  - Remove unnecessary context
  - Split into multiple requests

- Use a model with a larger context window:

  ```text
  openai:gpt-4-turbo-128k
  anthropic:claude-3-opus  (200k context)
  ```

- Enable context window enforcement:

  ```yaml
  session:
    context_window_enforcement_enabled: true
  ```
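Trimming history to fit a context window can be automated on the client side. A rough Python sketch; the 4-characters-per-token ratio is a crude assumption, so use a real tokenizer when you need accurate counts:

```python
def estimate_tokens(text: str) -> int:
    """Crude heuristic: roughly 4 characters per token for English text."""
    return max(1, len(text) // 4)

def trim_history(messages: list[dict], limit_tokens: int) -> list[dict]:
    """Drop the oldest non-system messages until the estimate fits the limit."""
    # Preserve a leading system message, if present.
    system = messages[:1] if messages and messages[0].get("role") == "system" else []
    rest = messages[len(system):]

    def total(msgs):
        return sum(estimate_tokens(m["content"]) for m in msgs)

    while rest and total(system + rest) > limit_tokens:
        rest.pop(0)  # discard the oldest turn first
    return system + rest
```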
### Error: 503 Service Unavailable response

Cause: The upstream LLM provider is unreachable.

Solutions:

- Check backend connectivity:

  ```bash
  # Test OpenAI
  curl https://api.openai.com/v1/models \
    -H "Authorization: Bearer $OPENAI_API_KEY"

  # Test Anthropic
  curl https://api.anthropic.com/v1/messages \
    -H "x-api-key: $ANTHROPIC_API_KEY"
  ```

- Verify API keys are set:

  ```bash
  echo $OPENAI_API_KEY
  echo $ANTHROPIC_API_KEY
  ```

- Try another backend:

  ```bash
  # Switch from OpenAI to Anthropic
  curl -X POST http://localhost:8000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{"model": "anthropic:claude-3-opus", "messages": [...]}'
  ```

- Enable failover (if configured):

  ```yaml
  backends:
    openai:
      failover_to: anthropic
  ```
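If failover is not configured on the proxy, the same idea can be approximated in the client. A Python sketch of trying backends in order; the `ConnectionError` type here is a placeholder for whatever your HTTP client raises:

```python
def call_with_fallback(send, models):
    """Try each model in order; 'send' is any callable that raises on failure.

    Returns the first successful response; raises only if every model fails.
    """
    last_exc = None
    for model in models:
        try:
            return send(model)
        except ConnectionError as exc:  # adapt to your client's error type
            last_exc = exc
    raise RuntimeError(f"all backends failed; last error: {last_exc}")
```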
### Error: 404 Not Found or "Model not found" error

Cause: The model name doesn't exist for the selected backend.

Solutions:

- Verify the model name for your backend:

  ```text
  # OpenAI models
  openai:gpt-4
  openai:gpt-4-turbo
  openai:gpt-3.5-turbo

  # Anthropic models
  anthropic:claude-3-opus-20240229
  anthropic:claude-3-sonnet-20240229
  anthropic:claude-3-haiku-20240307
  ```

- Check the backend documentation for available models

- Use model name rewrites to map to available models:

  ```yaml
  model_rewrites:
    - pattern: "gpt-4"
      replacement: "openai:gpt-4-turbo"
  ```
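The rewrite rules behave like a first-match regex substitution. A Python sketch of the same idea, using the example pattern from the config above (the rule list is illustrative, not the proxy's internal implementation):

```python
import re

# Mirrors the model_rewrites example above; patterns are illustrative.
REWRITES = [
    (re.compile(r"^gpt-4$"), "openai:gpt-4-turbo"),
]

def rewrite_model(name: str) -> str:
    """Return the replacement for the first matching rule, or the name unchanged."""
    for pattern, replacement in REWRITES:
        if pattern.match(name):
            return replacement
    return name
```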
### Error: 429 Too Many Requests response

Cause: A rate limit was exceeded (on the proxy or the backend).

Solutions:

- Check whether it is the proxy's brute-force protection. The response includes a `Retry-After` header:

  ```text
  Retry-After: 30
  ```

  Wait for the specified time before retrying.

- Check whether it is backend rate limiting:
  - Review your API plan limits
  - Upgrade your API plan
  - Use multiple backend instances (e.g. `openai.1`, `openai.2`) for load balancing

- Enable API key rotation (multi-instance load balancing):

  ```bash
  # Configure multiple backend instances in environment or config files
  # Environment:
  # OPENAI_API_KEY_1=sk-...
  # OPENAI_API_KEY_2=sk-...
  ```

- Adjust brute-force protection (if proxy-side):

  ```yaml
  auth:
    brute_force_protection:
      max_failed_attempts: 10
      ttl_seconds: 900
  ```
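A client that honors `Retry-After` recovers from 429s without hammering the proxy. A Python sketch; the response dict used here is a stand-in for a real HTTP response object:

```python
import time

def retry_on_429(send, max_attempts: int = 3):
    """Call 'send' and honor the Retry-After header on 429 responses.

    'send' returns a dict with 'status' and 'headers' keys, standing in
    for a real HTTP client response.
    """
    for _ in range(max_attempts):
        resp = send()
        if resp["status"] != 429:
            return resp
        delay = float(resp.get("headers", {}).get("Retry-After", 1))
        time.sleep(delay)
    raise RuntimeError("still rate-limited after retries")
```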
For tricky issues, enable wire capture to see exact requests and responses:

```bash
python -m src.core.cli \
  --capture-file logs/debug.log \
  --default-backend openai
```

Then analyze the capture:

```bash
# View all requests
jq 'select(.direction=="outbound_request")' logs/debug.log

# View all errors
jq 'select(.direction=="inbound_response" and .payload.error)' logs/debug.log
```

Test different backends and models without restarting:

```text
# In your LLM client, send these commands:
!/backend(anthropic)
!/model(claude-3-opus)
!/temperature(0.5)
```

Verify all required environment variables are set:
```bash
# List all LLM-related environment variables
env | grep -E "(OPENAI|ANTHROPIC|GEMINI|API_KEY)"

# Check specific backend
echo "OpenAI: $OPENAI_API_KEY"
echo "Anthropic: $ANTHROPIC_API_KEY"
```

Check proxy logs for detailed error information:

```bash
# View recent logs
tail -f logs/proxy.log

# Search for errors
grep ERROR logs/proxy.log

# Search for specific session
grep "session-123" logs/proxy.log
```

Isolate issues by testing with curl:
```bash
# Basic test
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer test-key" \
  -d '{
    "model": "openai:gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "Hello"}]
  }' | jq .

# Test streaming
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer test-key" \
  -d '{
    "model": "openai:gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "Hello"}],
    "stream": true
  }'
```

### Problem: Configuration changes don't take effect
Solutions:

- Verify the configuration file path:

  ```bash
  python -m src.core.cli --config config.yaml
  ```

- Check YAML syntax:

  ```bash
  # Validate YAML
  python -c "import yaml; yaml.safe_load(open('config.yaml'))"
  ```

- Check configuration precedence:
  - CLI arguments override environment variables
  - Environment variables override the config file
  - The config file has the lowest priority

- Restart the proxy after configuration changes
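The precedence rules above amount to a layered dictionary merge in which later sources win. A Python sketch of that merge order:

```python
def effective_config(file_cfg: dict, env_cfg: dict, cli_cfg: dict) -> dict:
    """Merge config sources; later sources win:
    config file < environment variables < CLI arguments."""
    merged = dict(file_cfg)
    merged.update(env_cfg)
    merged.update(cli_cfg)
    return merged
```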
### Problem: Environment variables are not being recognized

Solutions:

- Export variables in the same shell:

  ```bash
  export OPENAI_API_KEY="sk-..."
  python -m src.core.cli
  ```

- Use a `.env` file:

  ```bash
  # Create .env file
  echo "OPENAI_API_KEY=sk-..." > .env

  # Load with python-dotenv
  python -m src.core.cli
  ```

- Check that variable names match the expected format:

  ```bash
  # Correct
  OPENAI_API_KEY=sk-...

  # Incorrect
  OPENAI_KEY=sk-...
  ```
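A quick startup check for required variables can catch this class of problem early. A Python sketch; which names are actually required depends on the backends you use:

```python
import os

def missing_env(required):
    """Return the names of required variables that are unset or empty."""
    return [name for name in required if not os.environ.get(name)]
```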
### Problem: Requests take too long to complete

Solutions:

- Check backend latency:

  ```bash
  # Enable wire capture and analyze timing
  jq -r 'select(.direction=="inbound_response") | "\(.timestamp_iso) \(.model)"' logs/wire_capture.log
  ```

- Reduce request size:
  - Shorter prompts
  - Fewer messages in history
  - Lower `max_tokens`

- Use faster models:

  ```text
  Fast: openai:gpt-3.5-turbo
  Fast: anthropic:claude-3-haiku
  Slow: openai:gpt-4
  Slow: anthropic:claude-3-opus
  ```

- Check network connectivity:

  ```bash
  ping api.openai.com
  traceroute api.openai.com
  ```
### Problem: Proxy consumes too much memory

Solutions:

- Reduce buffer sizes:

  ```yaml
  logging:
    capture_buffer_size: 32768  # 32KB instead of 64KB
  ```

- Disable wire capture when not needed

- Limit session history:

  ```yaml
  session:
    max_history_turns: 50
  ```

- Restart the proxy periodically for long-running instances
### Problem: LLM assessment is not running

Solutions:

- Verify that assessment is enabled

- Check the turn threshold:

  ```yaml
  turn_threshold: 30  # Lower for more frequent checks
  ```

- Verify the assessment backend is configured:

  ```yaml
  backend: openai
  model: gpt-4o-mini
  ```

- Check logs for assessment activity:

  ```bash
  grep "LLM Assessment" logs/proxy.log
  ```
### Problem: Tool calls are not being blocked by access control

Solutions:

- Verify tool access control is enabled:

  ```yaml
  session:
    tool_call_reactor:
      enabled: true
  ```

- Check policy patterns:

  ```yaml
  access_policies:
    - name: block_dangerous
      model_pattern: ".*"   # Matches all models
      blocked_patterns:
        - "delete_.*"
        - "rm_.*"
  ```

- Review policy priority:
  - Higher-priority policies override lower ones
  - Allowed patterns override blocked patterns

- Check logs for policy evaluation:

  ```bash
  grep "Tool Access Control" logs/proxy.log
  ```
### Problem: Dangerous git commands are not being blocked

Solutions:

- Verify protection is enabled:

  ```bash
  # Should NOT have this flag
  python -m src.core.cli
  # (without --disable-dangerous-git-commands-protection)
  ```

- Check the environment variable:

  ```bash
  echo $DANGEROUS_COMMAND_PREVENTION_ENABLED
  # Should be "true" or unset
  ```

- Review the configuration:

  ```yaml
  session:
    dangerous_command_prevention_enabled: true
  ```

- Check logs for blocked commands:

  ```bash
  grep "Dangerous command blocked" logs/proxy.log
  ```
When reporting issues, include:

- Proxy version:

  ```bash
  python -m src.core.cli --version
  ```

- Configuration (sanitized):

  ```bash
  # Remove API keys before sharing
  cat config.yaml | grep -v "api_key"
  ```

- Error messages from logs:

  ```bash
  grep ERROR logs/proxy.log | tail -20
  ```

- Wire capture (if applicable):

  ```bash
  # Sanitize and share relevant entries
  jq 'del(.payload.messages[].content)' logs/wire_capture.log
  ```

- Steps to reproduce the issue
- GitHub Issues: Open an issue
- Documentation: Check other guides in this documentation
- Wire Capture: Use wire capture to diagnose complex issues
- Wire Capture - Capture and analyze HTTP traffic
- CBOR Capture - Binary capture for regression testing
- Security - Authentication and security features
- Configuration - Configuration guide