This guide covers common issues and their solutions when using the LLM Interactive Proxy.
### Error: 401 Unauthorized response from proxy

Cause: Missing or invalid `Authorization` header when proxy authentication is enabled.

Solutions:

- Check whether authentication is enabled:

  ```bash
  # Look for auth configuration in your config file
  grep -A 5 "auth:" config.yaml
  ```

- Provide a valid API key:

  ```bash
  curl -X POST http://localhost:8000/v1/chat/completions \
    -H "Authorization: Bearer YOUR_PROXY_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{"model": "openai:gpt-4", "messages": [...]}'
  ```

- Disable authentication for testing:

  ```yaml
  # config.yaml
  auth:
    enabled: false
  ```
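When calling the proxy from code rather than curl, the same header rules apply. A minimal Python sketch of building an authenticated request; the URL, key, and model are placeholders for your deployment:

```python
import json

# Assumption: the proxy is listening on its default local address.
PROXY_URL = "http://localhost:8000/v1/chat/completions"

def build_chat_request(api_key: str, model: str, user_message: str):
    """Build (headers, body) for a proxy chat-completion call.

    A missing 'Bearer ' prefix on the Authorization header is a common
    cause of 401 responses.
    """
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    })
    return headers, body

headers, body = build_chat_request("YOUR_PROXY_API_KEY", "openai:gpt-4", "Hello")
```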
### Error: 403 Forbidden response from proxy

Cause: The API key is recognized but lacks the required permissions.

Solutions:

- Verify API key permissions in your configuration
- Check IP-based restrictions, if configured
- Review brute-force protection: you may be temporarily blocked
### Error: 400 Bad Request response

Cause: Malformed request payload.

Solutions:

- Verify the request format matches the API you're using:
  - OpenAI: `/v1/chat/completions` expects OpenAI format
  - Anthropic: `/anthropic/v1/messages` expects Anthropic format
  - Gemini: `/v1beta/models` expects Gemini format

- Check required fields:

  ```jsonc
  {
    "model": "openai:gpt-4",   // Required
    "messages": [              // Required
      {"role": "user", "content": "Hello"}
    ]
  }
  ```

- Validate JSON syntax:

  ```bash
  # Use jq to validate JSON
  echo '{"model": "test"}' | jq .
  ```
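The required-field checks above can also run client-side before a request is sent. A small Python sketch of a pre-flight validator, following the OpenAI-style payload shape shown above:

```python
def validate_chat_payload(payload: dict) -> list[str]:
    """Return a list of problems that would likely trigger a 400 response."""
    problems = []
    if "model" not in payload:
        problems.append("missing required field: model")
    messages = payload.get("messages")
    if not isinstance(messages, list) or not messages:
        problems.append("'messages' must be a non-empty list")
    else:
        for i, msg in enumerate(messages):
            if not isinstance(msg, dict) or "role" not in msg or "content" not in msg:
                problems.append(f"messages[{i}] needs 'role' and 'content'")
    return problems
```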
### Error: 422 Unprocessable Entity response

Cause: Request validation failed.

Solutions:

- Check the error details in the response body:

  ```json
  {
    "error": {
      "message": "Validation error",
      "details": {
        "field": "temperature",
        "issue": "must be between 0 and 2"
      }
    }
  }
  ```

- Verify parameter values:
  - `temperature`: 0.0 to 2.0
  - `top_p`: 0.0 to 1.0
  - `max_tokens`: positive integer

- Check the model name format:

  ```text
  Valid:   openai:gpt-4
  Valid:   anthropic:claude-3-opus
  Invalid: gpt-4 (missing backend prefix)
  ```
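A client-side check for the backend-prefix rule can catch this before the request is sent. A Python sketch, assuming the three backends named in this guide (adjust the set to your deployment):

```python
# Assumption: these are the backends configured in your proxy.
KNOWN_BACKENDS = {"openai", "anthropic", "gemini"}

def check_model_name(model: str):
    """Return an error string for an invalid model name, or None if it looks OK."""
    backend, sep, name = model.partition(":")
    if not sep:
        return f"'{model}' is missing a backend prefix (e.g. 'openai:{model}')"
    if backend not in KNOWN_BACKENDS:
        return f"unknown backend '{backend}'"
    if not name:
        return "model name after ':' is empty"
    return None
```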
### Error: 400 Bad Request with input_limit_exceeded error code

Cause: The request exceeds the model's context window limits.

Solutions:

- Check the error details for token counts:

  ```json
  {
    "error": {
      "code": "input_limit_exceeded",
      "message": "Request exceeds context window",
      "details": {
        "measured_tokens": 150000,
        "limit_tokens": 128000,
        "model": "openai:gpt-4"
      }
    }
  }
  ```

- Reduce the input size:
  - Shorten messages
  - Remove unnecessary context
  - Split into multiple requests

- Use a model with a larger context window:

  ```text
  openai:gpt-4-turbo-128k
  anthropic:claude-3-opus  (200k context)
  ```

- Enable context window enforcement:

  ```yaml
  session:
    context_window_enforcement_enabled: true
  ```
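Trimming history to fit a context window can be automated on the client side. A rough Python sketch; the 4-characters-per-token ratio is a crude assumption, so use a real tokenizer when you need accurate counts:

```python
def estimate_tokens(text: str) -> int:
    """Crude heuristic: roughly 4 characters per token for English text."""
    return max(1, len(text) // 4)

def trim_history(messages: list[dict], limit_tokens: int) -> list[dict]:
    """Drop the oldest non-system messages until the estimate fits the limit."""
    # Preserve a leading system message, if present.
    system = messages[:1] if messages and messages[0].get("role") == "system" else []
    rest = messages[len(system):]

    def total(msgs):
        return sum(estimate_tokens(m["content"]) for m in msgs)

    while rest and total(system + rest) > limit_tokens:
        rest.pop(0)  # discard the oldest turn first
    return system + rest
```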
### Error: 503 Service Unavailable response

Cause: The upstream LLM provider is unreachable.

Solutions:

- Check backend connectivity:

  ```bash
  # Test OpenAI
  curl https://api.openai.com/v1/models \
    -H "Authorization: Bearer $OPENAI_API_KEY"

  # Test Anthropic
  curl https://api.anthropic.com/v1/messages \
    -H "x-api-key: $ANTHROPIC_API_KEY"
  ```

- Verify API keys are set:

  ```bash
  echo $OPENAI_API_KEY
  echo $ANTHROPIC_API_KEY
  ```

- Try another backend:

  ```bash
  # Switch from OpenAI to Anthropic
  curl -X POST http://localhost:8000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{"model": "anthropic:claude-3-opus", "messages": [...]}'
  ```

- Enable failover (if configured):

  ```yaml
  backends:
    openai:
      failover_to: anthropic
  ```
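If failover is not configured on the proxy, the same idea can be approximated in the client. A Python sketch of trying backends in order; the `ConnectionError` type here is a placeholder for whatever your HTTP client raises:

```python
def call_with_fallback(send, models):
    """Try each model in order; 'send' is any callable that raises on failure.

    Returns the first successful response; raises only if every model fails.
    """
    last_exc = None
    for model in models:
        try:
            return send(model)
        except ConnectionError as exc:  # adapt to your client's error type
            last_exc = exc
    raise RuntimeError(f"all backends failed; last error: {last_exc}")
```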
### Error: 404 Not Found or "Model not found" error

Cause: The model name doesn't exist for the selected backend.

Solutions:

- Verify the model name for your backend:

  ```text
  # OpenAI models
  openai:gpt-4
  openai:gpt-4-turbo
  openai:gpt-3.5-turbo

  # Anthropic models
  anthropic:claude-3-opus-20240229
  anthropic:claude-3-sonnet-20240229
  anthropic:claude-3-haiku-20240307
  ```

- Check the backend documentation for available models

- Use model name rewrites to map to available models:

  ```yaml
  model_rewrites:
    - pattern: "gpt-4"
      replacement: "openai:gpt-4-turbo"
  ```
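The rewrite rules behave like a first-match regex substitution. A Python sketch of the same idea, using the example pattern from the config above (the rule list is illustrative, not the proxy's internal implementation):

```python
import re

# Mirrors the model_rewrites example above; patterns are illustrative.
REWRITES = [
    (re.compile(r"^gpt-4$"), "openai:gpt-4-turbo"),
]

def rewrite_model(name: str) -> str:
    """Return the replacement for the first matching rule, or the name unchanged."""
    for pattern, replacement in REWRITES:
        if pattern.match(name):
            return replacement
    return name
```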
### Error: 429 Too Many Requests response

Cause: A rate limit was exceeded (on the proxy or the backend).

Solutions:

- Check whether it is the proxy's brute-force protection. The response includes a `Retry-After` header:

  ```text
  Retry-After: 30
  ```

  Wait for the specified time before retrying.

- Check whether it is backend rate limiting:
  - Review your API plan limits
  - Upgrade your API plan
  - Use multiple backend instances (e.g. `openai.1`, `openai.2`) for load balancing

- Enable API key rotation (multi-instance load balancing):

  ```bash
  # Configure multiple backend instances in environment or config files
  # Environment:
  # OPENAI_API_KEY_1=sk-...
  # OPENAI_API_KEY_2=sk-...
  ```

- Adjust brute-force protection (if proxy-side):

  ```yaml
  auth:
    brute_force_protection:
      max_failed_attempts: 10
      ttl_seconds: 900
  ```
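A client that honors `Retry-After` recovers from 429s without hammering the proxy. A Python sketch; the response dict used here is a stand-in for a real HTTP response object:

```python
import time

def retry_on_429(send, max_attempts: int = 3):
    """Call 'send' and honor the Retry-After header on 429 responses.

    'send' returns a dict with 'status' and 'headers' keys, standing in
    for a real HTTP client response.
    """
    for _ in range(max_attempts):
        resp = send()
        if resp["status"] != 429:
            return resp
        delay = float(resp.get("headers", {}).get("Retry-After", 1))
        time.sleep(delay)
    raise RuntimeError("still rate-limited after retries")
```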
For tricky issues, enable wire capture to see exact requests and responses:

```bash
python -m src.core.cli \
  --capture-file logs/debug.log \
  --default-backend openai
```

Then analyze the capture:

```bash
# View all requests
jq 'select(.direction=="outbound_request")' logs/debug.log

# View all errors
jq 'select(.direction=="inbound_response" and .payload.error)' logs/debug.log
```

Test different backends and models without restarting:

```text
# In your LLM client, send these commands:
!/backend(anthropic)
!/model(claude-3-opus)
!/temperature(0.5)
```

Verify all required environment variables are set:
```bash
# List all LLM-related environment variables
env | grep -E "(OPENAI|ANTHROPIC|GEMINI|API_KEY)"

# Check specific backend
echo "OpenAI: $OPENAI_API_KEY"
echo "Anthropic: $ANTHROPIC_API_KEY"
```

Check proxy logs for detailed error information:

```bash
# View recent logs
tail -f logs/proxy.log

# Search for errors
grep ERROR logs/proxy.log

# Search for specific session
grep "session-123" logs/proxy.log
```

Isolate issues by testing with curl:
```bash
# Basic test
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer test-key" \
  -d '{
    "model": "openai:gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "Hello"}]
  }' | jq .

# Test streaming
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer test-key" \
  -d '{
    "model": "openai:gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "Hello"}],
    "stream": true
  }'
```

### Problem: Configuration changes don't take effect
Solutions:

- Verify the configuration file path:

  ```bash
  python -m src.core.cli --config config.yaml
  ```

- Check YAML syntax:

  ```bash
  # Validate YAML
  python -c "import yaml; yaml.safe_load(open('config.yaml'))"
  ```

- Check configuration precedence:
  - CLI arguments override environment variables
  - Environment variables override the config file
  - The config file has the lowest priority

- Restart the proxy after configuration changes
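The precedence rules above amount to a layered dictionary merge in which later sources win. A Python sketch of that merge order:

```python
def effective_config(file_cfg: dict, env_cfg: dict, cli_cfg: dict) -> dict:
    """Merge config sources; later sources win:
    config file < environment variables < CLI arguments."""
    merged = dict(file_cfg)
    merged.update(env_cfg)
    merged.update(cli_cfg)
    return merged
```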
### Problem: Environment variables are not being recognized

Solutions:

- Export variables in the same shell:

  ```bash
  export OPENAI_API_KEY="sk-..."
  python -m src.core.cli
  ```

- Use a `.env` file:

  ```bash
  # Create .env file
  echo "OPENAI_API_KEY=sk-..." > .env

  # Load with python-dotenv
  python -m src.core.cli
  ```

- Check that variable names match the expected format:

  ```bash
  # Correct
  OPENAI_API_KEY=sk-...

  # Incorrect
  OPENAI_KEY=sk-...
  ```
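A quick startup check for required variables can catch this class of problem early. A Python sketch; which names are actually required depends on the backends you use:

```python
import os

def missing_env(required):
    """Return the names of required variables that are unset or empty."""
    return [name for name in required if not os.environ.get(name)]
```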
### Problem: Requests take too long to complete

Solutions:

- Check backend latency:

  ```bash
  # Enable wire capture and analyze timing
  jq -r 'select(.direction=="inbound_response") | "\(.timestamp_iso) \(.model)"' logs/wire_capture.log
  ```

- Reduce request size:
  - Shorter prompts
  - Fewer messages in history
  - Lower `max_tokens`

- Use faster models:

  ```text
  Fast: openai:gpt-3.5-turbo
  Fast: anthropic:claude-3-haiku
  Slow: openai:gpt-4
  Slow: anthropic:claude-3-opus
  ```

- Check network connectivity:

  ```bash
  ping api.openai.com
  traceroute api.openai.com
  ```
### Problem: Proxy consumes too much memory

Solutions:

- Reduce buffer sizes:

  ```yaml
  logging:
    capture_buffer_size: 32768  # 32KB instead of 64KB
  ```

- Disable wire capture when not needed

- Limit session history:

  ```yaml
  session:
    max_history_turns: 50
  ```

- Restart the proxy periodically for long-running instances
### Problem: LLM assessment is not running

Solutions:

- Verify that assessment is enabled

- Check the turn threshold:

  ```yaml
  turn_threshold: 30  # Lower for more frequent checks
  ```

- Verify the assessment backend is configured:

  ```yaml
  backend: openai
  model: gpt-4o-mini
  ```

- Check logs for assessment activity:

  ```bash
  grep "LLM Assessment" logs/proxy.log
  ```
### Problem: Tool calls are not being blocked by access control

Solutions:

- Verify tool access control is enabled:

  ```yaml
  session:
    tool_call_reactor:
      enabled: true
  ```

- Check policy patterns:

  ```yaml
  access_policies:
    - name: block_dangerous
      model_pattern: ".*"   # Matches all models
      blocked_patterns:
        - "delete_.*"
        - "rm_.*"
  ```

- Review policy priority:
  - Higher-priority policies override lower ones
  - Allowed patterns override blocked patterns

- Check logs for policy evaluation:

  ```bash
  grep "Tool Access Control" logs/proxy.log
  ```
### Problem: Dangerous git commands are not being blocked

Solutions:

- Verify protection is enabled:

  ```bash
  # Should NOT have this flag
  python -m src.core.cli
  # (without --disable-dangerous-git-commands-protection)
  ```

- Check the environment variable:

  ```bash
  echo $DANGEROUS_COMMAND_PREVENTION_ENABLED
  # Should be "true" or unset
  ```

- Review the configuration:

  ```yaml
  session:
    dangerous_command_prevention_enabled: true
  ```

- Check logs for blocked commands:

  ```bash
  grep "Dangerous command blocked" logs/proxy.log
  ```
When reporting issues, include:

- Proxy version:

  ```bash
  python -m src.core.cli --version
  ```

- Configuration (sanitized):

  ```bash
  # Remove API keys before sharing
  cat config.yaml | grep -v "api_key"
  ```

- Error messages from logs:

  ```bash
  grep ERROR logs/proxy.log | tail -20
  ```

- Wire capture (if applicable):

  ```bash
  # Sanitize and share relevant entries
  jq 'del(.payload.messages[].content)' logs/wire_capture.log
  ```

- Steps to reproduce the issue
- GitHub Issues: Open an issue
- Documentation: Check other guides in this documentation
- Wire Capture: Use wire capture to diagnose complex issues
- Wire Capture - Capture and analyze HTTP traffic
- CBOR Capture - Binary capture for regression testing
- Security - Authentication and security features
- Configuration - Configuration guide