This document explains the security features of the SSO authentication system and provides best practices for secure deployment.
The SSO authentication system implements multiple layers of security:
- Authentication Layer: SSO via trusted identity providers
- Authorization Layer: Confirmation codes or external API
- Token Layer: Secure token generation and storage
- Session Layer: Sandbox isolation and session management
- Rate Limiting Layer: Brute-force protection
- Bot Defense Layer: Optional Cloudflare Turnstile on the public login form
The system is designed to protect against:
- Unauthorized access: Unauthenticated users cannot access the proxy
- Token theft: Tokens are stored only as hashes, so a stolen database does not reveal usable tokens
- Brute-force attacks: Rate limiting and exponential backoff
- Session hijacking: Sandbox isolation prevents session continuation
- Timing attacks: Constant-time comparison for token verification
- Replay attacks: Time-limited confirmation codes and SSO sessions
- Automated login abuse: Invisible Turnstile challenge before SSO initiation
When sso.captcha.enabled is set, the /auth/login page requires a Cloudflare Turnstile response before redirecting to any identity provider. Turnstile allows invisible challenges and does not require pre-registering the URL of your login page, so the endpoint stays unlisted in third-party dashboards while still being protected against bots.
Agent tokens are hashed using Argon2id, a variant of Argon2 (the winner of the Password Hashing Competition) that OWASP recommends for password storage.
Algorithm: Argon2id (hybrid of Argon2i and Argon2d)
Parameters (2025 recommendations):
- Memory: 64 MiB minimum (65536 KiB, the m=65536 value in the encoded hash)
- Iterations: 3 minimum
- Parallelism: 4 threads minimum
- Salt: 16 bytes, cryptographically random
- Output: 32 bytes
Why Argon2id?
- Memory-hard: Resistant to GPU/ASIC attacks
- Side-channel resistant: Argon2i component protects against timing attacks
- Brute-force resistant: Argon2d component maximizes resistance
- Configurable: Parameters can be tuned for security/performance balance
Example Hash:
$argon2id$v=19$m=65536,t=3,p=4$random_salt$hash_output
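The encoded string above carries its own parameters, which is what lets a verifier re-derive the hash later. The following is a minimal sketch of how the fields could be read back out; the function name parse_argon2_hash and the returned key names are illustrative assumptions, not part of the SSO system.

```python
def parse_argon2_hash(encoded: str) -> dict:
    """Split an encoded Argon2 hash string into its labeled fields."""
    parts = encoded.split("$")
    # Expected layout: ['', variant, 'v=..', 'm=..,t=..,p=..', salt, hash]
    if len(parts) != 6 or parts[0] != "":
        raise ValueError("unexpected Argon2 hash layout")
    _, variant, version, params, salt, digest = parts
    costs = dict(item.split("=", 1) for item in params.split(","))
    return {
        "variant": variant,
        "version": int(version.split("=", 1)[1]),
        "memory_kib": int(costs["m"]),
        "iterations": int(costs["t"]),
        "parallelism": int(costs["p"]),
        "salt": salt,
        "hash": digest,
    }


fields = parse_argon2_hash("$argon2id$v=19$m=65536,t=3,p=4$random_salt$hash_output")
```

Here fields["memory_kib"] is 65536, fields["iterations"] is 3, and fields["parallelism"] is 4, matching the parameter list above.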
What is stored:
- Token hash (Argon2id output)
- User identity (email, user ID)
- Metadata (timestamps, provider, status)
What is NOT stored:
- Plaintext tokens
- SSO credentials
- Identity provider secrets
Database Record:
CREATE TABLE agent_tokens (
    id TEXT PRIMARY KEY,
    token_hash TEXT NOT NULL, -- Argon2id hash, not plaintext
    user_id TEXT NOT NULL,
    user_email TEXT NOT NULL,
    provider TEXT NOT NULL,
    is_authenticated INTEGER NOT NULL,
    is_active INTEGER NOT NULL,
    created_at TEXT NOT NULL,
    last_authenticated_at TEXT,
    auth_expires_at TEXT
);
Token verification uses constant-time comparison to prevent timing attacks:
import hmac

def verify_token(provided_token: str, stored_hash: str) -> bool:
    # Compute hash of provided token
    provided_hash = hash_token(provided_token)
    # Constant-time comparison (prevents timing attacks)
    return hmac.compare_digest(provided_hash, stored_hash)
Why constant-time?
- Prevents attackers from learning information about the hash through timing
- Standard string comparison (==) leaks information via execution time
- hmac.compare_digest takes the same time regardless of where the inputs differ
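One detail worth spelling out: because each token hash uses a random salt, the candidate hash has to be derived with the salt that was stored for that record, otherwise the comparison can never succeed. The sketch below illustrates that flow under loud assumptions: scrypt from Python's standard library stands in for Argon2id purely so the example runs without third-party packages, and the names derive_hash, create_record, and verify_with_salt are hypothetical.

```python
import hashlib
import hmac
import secrets
from typing import Tuple


def derive_hash(token: str, salt: bytes) -> bytes:
    """Salted, memory-hard hash. scrypt is a stand-in here, NOT Argon2id."""
    return hashlib.scrypt(token.encode("utf-8"), salt=salt, n=2**14, r=8, p=1, dklen=32)


def create_record(token: str) -> Tuple[bytes, bytes]:
    """Return (salt, hash) to store; the plaintext token is never kept."""
    salt = secrets.token_bytes(16)
    return salt, derive_hash(token, salt)


def verify_with_salt(provided_token: str, salt: bytes, stored_hash: bytes) -> bool:
    # Re-derive with the STORED salt, then compare in constant time
    candidate = derive_hash(provided_token, salt)
    return hmac.compare_digest(candidate, stored_hash)
```

With Argon2id the salt and parameters travel inside the encoded hash string, so a real verifier would read them from there rather than from a separate column.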
File Permissions:
# Database file is created with restrictive permissions
chmod 600 /path/to/sso_auth.db
# Only owner can read/write
ls -l /path/to/sso_auth.db
-rw------- 1 user user 12345 Jan 15 10:30 sso_auth.db
Encryption at Rest (optional):
# Use encrypted filesystem or database encryption
# Example: LUKS, dm-crypt, or SQLCipher
Backup Security:
- Encrypt database backups
- Store backups in secure locations
- Restrict access to backup files
- Regularly test backup restoration
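The 600 file permissions described earlier in this section can also be applied and checked from code, for example when creating the database or a backup file. This is a minimal sketch for POSIX systems; the helper names create_private_file and is_owner_only are illustrative assumptions, not functions of the proxy.

```python
import os
import stat


def create_private_file(path: str) -> None:
    """Create the file if needed and restrict it to owner read/write (0600)."""
    fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o600)
    os.close(fd)
    # chmod also tightens a file that already existed with looser permissions
    os.chmod(path, 0o600)


def is_owner_only(path: str) -> bool:
    """True when the permission bits are exactly rw------- (0600)."""
    return stat.S_IMODE(os.stat(path).st_mode) == 0o600
```

A startup check along the lines of is_owner_only could refuse to run when the database file is readable by other users.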
Sandbox mode is a restricted state where unauthenticated users receive only a login banner instead of accessing the proxy.
Purpose:
- Prevent information leakage to unauthenticated users
- Ensure authentication state doesn't leak into conversations
- Force explicit authentication before access
1. Unauthenticated request arrives
|
v
2. Proxy detects no valid token
|
v
3. Proxy returns sandbox response (login banner)
|
v
4. User authenticates and receives token
|
v
5. User configures agent with token
|
v
6. New request with token is authenticated
|
v
7. Proxy routes to backend (normal operation)
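The decision at the heart of this flow can be sketched in a few lines. Everything here is an assumption made for illustration: the function name handle_request, the banner text, the response shape, and the is_valid_token callback are not taken from the proxy's code.

```python
from typing import Callable, Optional

SANDBOX_BANNER = "Authentication required. Visit the login page to obtain a token."


def handle_request(token: Optional[str], is_valid_token: Callable[[str], bool]) -> dict:
    """Return a sandbox response unless the request carries a valid token."""
    if token is None or not is_valid_token(token):
        # Steps 2-3: no valid token, so only the login banner is returned
        return {"mode": "sandbox", "body": SANDBOX_BANNER}
    # Steps 6-7: the authenticated request is routed to the backend
    return {"mode": "proxy", "body": None}
```

The important property is that the unauthenticated branch never touches the backend and returns nothing beyond the banner.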
The proxy detects and rejects requests containing sandbox content in conversation history:
def detect_sandbox_history(messages: list[dict]) -> bool:
    """Check if conversation history contains sandbox login banner."""
    for message in messages:
        # content may be missing, None, or non-string (e.g. structured parts)
        content = str(message.get("content") or "")
        if "authentication required" in content.lower():
            return True
        if "http://localhost:8080/auth/login" in content:
            return True
    return False
Why?
- Prevents authentication state from leaking into unauthenticated sessions
- Ensures users start fresh after authentication
- Prevents session continuation attacks
- No state carryover: Sandbox sessions cannot continue after authentication
- History rejection: Requests with sandbox content are rejected
- Fresh start: Users must configure agent with token for new session
- No information leakage: Sandbox responses contain no sensitive information
Attempt Limits:
- Maximum 3 attempts per authorization session
- After 3 failures, must re-authenticate via SSO
Exponential Backoff:
Attempt 1: No delay
Attempt 2: 2 second delay
Attempt 3: 4 second delay
After 3 failures: Must re-authenticate (exponential backoff on SSO attempts)
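The attempt schedule above could be expressed as a small lookup. This is a sketch only; the name confirmation_delay, the 1-based attempt numbering, and the choice of exceptions are assumptions rather than the proxy's actual behavior.

```python
MAX_ATTEMPTS = 3


def confirmation_delay(attempt: int) -> int:
    """Seconds to wait before the given 1-based confirmation attempt."""
    if attempt < 1:
        raise ValueError("attempt numbers start at 1")
    if attempt > MAX_ATTEMPTS:
        # Attempts exhausted: the user has to go back through SSO
        raise PermissionError("attempts exhausted; re-authenticate via SSO")
    # Attempt 1 -> 0s, attempt 2 -> 2s, attempt 3 -> 4s
    return 0 if attempt == 1 else 2 ** (attempt - 1)
```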
Per-IP Rate Limiting:
- Track failed attempts by IP address
- Exponential backoff increases with each failure
- Slows brute-force attacks by throttling each source address
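A per-IP tracker of this kind might look like the sketch below. The class name IpFailureTracker, its in-memory dictionaries, and the 2 second base and 300 second cap (borrowed from the SSO backoff figures in this document) are assumptions for illustration; a real deployment would likely need shared or persistent state.

```python
import time
from typing import Callable, Dict


class IpFailureTracker:
    """Track failed attempts per IP and apply exponential backoff."""

    def __init__(self, base_delay: float = 2.0, max_delay: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._base_delay = base_delay
        self._max_delay = max_delay
        self._clock = clock
        self._failures: Dict[str, int] = {}         # ip -> consecutive failures
        self._blocked_until: Dict[str, float] = {}  # ip -> earliest next attempt

    def is_allowed(self, ip: str) -> bool:
        return self._clock() >= self._blocked_until.get(ip, float("-inf"))

    def record_failure(self, ip: str) -> float:
        """Register a failure and return the delay now imposed on this IP."""
        count = self._failures.get(ip, 0) + 1
        self._failures[ip] = count
        delay = min(self._base_delay * 2 ** (count - 1), self._max_delay)
        self._blocked_until[ip] = self._clock() + delay
        return delay

    def record_success(self, ip: str) -> None:
        """Clear the history for an IP after a successful attempt."""
        self._failures.pop(ip, None)
        self._blocked_until.pop(ip, None)
```

The clock is injectable so the backoff can be tested without sleeping.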
Code Expiry:
- Confirmation codes expire after 10 minutes (configurable)
- Expired codes require re-authentication
- Prevents replay attacks
After exhausting confirmation code attempts:
1st SSO failure: 2 second wait
2nd SSO failure: 4 second wait
3rd SSO failure: 8 second wait
4th SSO failure: 16 second wait
...
Max wait: 300 seconds (5 minutes)
Implementation:
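def calculate_backoff(attempts: int) -> int:
    """Calculate exponential backoff in seconds after `attempts` SSO failures."""
    base_delay = 2
    max_delay = 300
    if attempts < 1:
        return 0
    # 1st failure -> 2s, 2nd -> 4s, 3rd -> 8s, capped at max_delay
    delay = base_delay * (2 ** (attempts - 1))
    return min(delay, max_delay)
Timeout Protection: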
- Default timeout: 5 seconds
- Prevents hanging on slow/unresponsive APIs
- Fails closed (denies access on timeout)
Retry Logic:
- No automatic retries (fail fast)
- User must re-authenticate if API fails
- Prevents amplification attacks
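Failing closed means that any timeout or error from the authorization API is treated as a denial, with no retry. The sketch below shows the shape of that logic; the authorize function and the call_api callback (which is expected to perform the actual HTTP request with the given timeout) are hypothetical names, not the proxy's API.

```python
from typing import Callable


def authorize(call_api: Callable[[float], bool], timeout_seconds: float = 5.0) -> bool:
    """Ask the external API once; deny on timeout or any other failure."""
    try:
        # Only an explicit True grants access; anything else is a denial
        return call_api(timeout_seconds) is True
    except Exception:
        # Fail closed: no retry, access is denied
        return False
```

Requiring an explicit True also guards against a malformed response being mistaken for approval.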
Default: 24 hours (configurable)
Rationale:
- Balance between security and convenience
- Shorter than typical password sessions
- Long enough to avoid frequent re-authentication
Configuration:
authorization:
  session_lifetime_hours: 24  # Adjust based on security requirements
When a session expires:
- Detection: Proxy checks auth_expires_at timestamp
- Response: Returns sandbox with re-authentication URL
- User action: User re-authenticates via SSO
- Restoration: Session is restored with same token
- No reconfiguration: Agent continues with same token
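The detection step could be sketched as follows. It assumes auth_expires_at is stored as an ISO 8601 string and that timestamps without an offset are UTC; neither is confirmed by this document, and the function name is_session_active is illustrative.

```python
from datetime import datetime, timezone
from typing import Optional


def is_session_active(auth_expires_at: Optional[str], now: datetime) -> bool:
    """True while the stored expiry timestamp lies in the future."""
    if auth_expires_at is None:
        # Revoked or never authenticated
        return False
    expires = datetime.fromisoformat(auth_expires_at)
    if expires.tzinfo is None:
        # Assumption: naive timestamps are stored in UTC
        expires = expires.replace(tzinfo=timezone.utc)
    return now < expires
```

A NULL value, as written by the revocation statement in this document, is treated as expired.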
Immediate revocation:
UPDATE agent_tokens
SET is_authenticated = 0, auth_expires_at = NULL
WHERE token_hash = ?;
Soft delete (for audit):
UPDATE agent_tokens
SET is_active = 0
WHERE token_hash = ?;
State Parameter:
- Cryptographically random state parameter
- Prevents CSRF attacks
- Validated on callback
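Generating and validating a state value needs only the standard library. This is a minimal sketch; the function names new_state and state_matches are assumptions, and where the expected value is kept between redirect and callback (for example a server-side session) is left out.

```python
import hmac
import secrets


def new_state() -> str:
    """Unpredictable value to send with the authorization request."""
    return secrets.token_urlsafe(32)


def state_matches(expected: str, received: str) -> bool:
    """Compare the callback's state with the one issued, in constant time."""
    return hmac.compare_digest(expected.encode("utf-8"), received.encode("utf-8"))
```

A callback whose state does not match should be rejected before the authorization code is exchanged.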
PKCE (Proof Key for Code Exchange):
- Recommended for public clients
- Prevents authorization code interception
- Supported by most modern IdPs
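For reference, the S256 challenge defined in RFC 7636 is the unpadded base64url encoding of the SHA-256 digest of the verifier. The sketch below shows that transformation; the function names are illustrative, and how the verifier is stored until the token exchange is not covered.

```python
import base64
import hashlib
import secrets


def make_code_verifier() -> str:
    # 32 random bytes give a 43-character URL-safe string,
    # the minimum length RFC 7636 allows (43 to 128 characters)
    return secrets.token_urlsafe(32)


def code_challenge_s256(verifier: str) -> str:
    """base64url(SHA-256(verifier)) without '=' padding."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
```

The client sends the challenge with the authorization request and the original verifier with the token request, so an intercepted authorization code is useless on its own.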
Redirect URI Validation:
- Exact match required by IdPs
- No wildcards or partial matches
- Prevents open redirect attacks
Storage:
- Never commit to version control
- Use environment variables or secret managers
- Restrict access to secrets
Rotation:
- Rotate secrets periodically (e.g., every 90 days)
- Update configuration after rotation
- Revoke old secrets after transition period
Example (using environment variables):
providers:
  google:
    client_secret: "${GOOGLE_CLIENT_SECRET}"
export GOOGLE_CLIENT_SECRET=GOCSPX-actual-secret
Principle of least privilege:
- Request only necessary scopes
- Review scopes regularly
- Remove unused scopes
Example:
# GOOD - minimal scopes
scopes: ["openid", "email"]
# BAD - excessive scopes
scopes: ["openid", "email", "profile", "calendar", "drive", "contacts"]
Always use HTTPS for production deployments:
# GOOD - HTTPS
server:
  host: "0.0.0.0"
  port: 443
  tls:
    cert: "/path/to/cert.pem"
    key: "/path/to/key.pem"
# BAD - HTTP in production
server:
  host: "0.0.0.0"
  port: 80
Why HTTPS?
- Encrypts token transmission
- Prevents man-in-the-middle attacks
- Required by most IdPs for production
- Industry best practice
Restrict access to proxy:
# Allow only specific IPs
iptables -A INPUT -p tcp --dport 8080 -s 192.168.1.0/24 -j ACCEPT
iptables -A INPUT -p tcp --dport 8080 -j DROP
# Or use firewall-cmd
firewall-cmd --add-rich-rule='rule family="ipv4" source address="192.168.1.0/24" port port="8080" protocol="tcp" accept'
VPN Access:
- Require VPN for remote access
- Use corporate VPN or WireGuard
- Restrict proxy to VPN subnet
Use reverse proxy for additional security:
# Nginx configuration
server {
    listen 443 ssl;
    server_name proxy.company.com;
    ssl_certificate /path/to/cert.pem;
    ssl_certificate_key /path/to/key.pem;
    # Security headers
    add_header Strict-Transport-Security "max-age=31536000" always;
    add_header X-Frame-Options "DENY" always;
    add_header X-Content-Type-Options "nosniff" always;
    location / {
        proxy_pass http://localhost:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
What to log:
- Authentication attempts (success/failure)
- Authorization decisions (granted/denied)
- Token generation and revocation
- SSO session expiry and renewal
- Rate limit violations
- Configuration changes
Example log entries:
2025-01-15 10:30:45 INFO SSO authentication successful: user=alice@example.com provider=google
2025-01-15 10:30:46 INFO Authorization granted: user=alice@example.com mode=single_user
2025-01-15 10:30:47 INFO Token generated: user=alice@example.com token_id=abc123
2025-01-15 14:30:00 WARNING Rate limit exceeded: ip=192.168.1.100 attempts=5
2025-01-15 18:00:00 INFO Session expired: user=alice@example.com token_id=abc123
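Entries in this shape can be produced with Python's standard logging module. The sketch below is one possible setup, not the proxy's actual logging configuration; the logger name llm_proxy.auth and the helper build_auth_logger are assumptions.

```python
import logging
from typing import TextIO


def build_auth_logger(stream: TextIO) -> logging.Logger:
    """Logger that writes 'YYYY-MM-DD HH:MM:SS LEVEL message' lines to stream."""
    logger = logging.getLogger("llm_proxy.auth")
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s",
                          datefmt="%Y-%m-%d %H:%M:%S")
    )
    # Replace any existing handlers so repeated calls do not duplicate output
    logger.handlers = [handler]
    return logger
```

A call such as logger.info("Token generated: user=%s token_id=%s", email, token_id) then yields a line in the format shown above. Note that it logs the token ID, never the token itself.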
Protect log files:
# Restrictive permissions
chmod 600 /var/log/llm-proxy/auth.log
# Rotate logs regularly
logrotate -f /etc/logrotate.d/llm-proxy
What NOT to log:
- Plaintext tokens
- Client secrets
- Confirmation codes (except in single-user mode)
- Full request/response bodies (may contain sensitive data)
Monitor for:
- Unusual authentication patterns
- High rate of failed attempts
- Authorization API failures
- Token generation spikes
- Session expiry anomalies
Alert on:
- Repeated authentication failures from same IP
- Authorization API downtime
- Database errors
- Configuration changes
- Suspicious access patterns
Data minimization:
- Store only necessary user data
- Don't store unnecessary IdP information
- Implement data retention policies
Right to erasure:
- Support token revocation
- Implement user data deletion
- Maintain audit trail of deletions
Data portability:
- Allow users to export their data
- Provide token usage history
- Support data format standards
Access controls:
- Implement role-based access control
- Audit access to sensitive data
- Restrict administrative access
Logging and monitoring:
- Comprehensive audit logs
- Real-time monitoring and alerting
- Log retention and archival
Incident response:
- Document security incidents
- Implement incident response procedures
- Regular security reviews
Encryption:
- Encrypt data at rest and in transit
- Use FIPS 140-2 compliant algorithms
- Implement key management
Access controls:
- Unique user identification
- Automatic logoff after inactivity
- Audit controls
- Use HTTPS: Always use HTTPS in production
- Restrict access: Use firewall rules and VPN
- Rotate secrets: Regularly rotate client secrets and tokens
- Monitor logs: Implement comprehensive logging and monitoring
- Update regularly: Keep proxy and dependencies updated
- Backup securely: Encrypt and secure database backups
- Minimize scopes: Request only necessary OAuth2 scopes
- Strong sessions: Use appropriate session lifetime (24-48 hours)
- Rate limiting: Enable rate limiting and exponential backoff
- Secure storage: Use environment variables or secret managers
- Validate input: Validate all configuration inputs
- Regular audits: Review access logs and authorization decisions
- Incident response: Have a plan for security incidents
- User training: Educate users on token security
- Access review: Regularly review who has access
- Penetration testing: Conduct regular security assessments
- HTTPS configured with valid certificate
- Firewall rules configured
- Client secrets stored securely (not in version control)
- Database file permissions set to 600
- Logging configured and tested
- Monitoring and alerting set up
- Backup strategy implemented
- Incident response plan documented
- Regular log reviews scheduled
- Secret rotation schedule established
- User access reviewed quarterly
- Security updates applied promptly
- Backup restoration tested
- Penetration testing scheduled
- Compliance requirements met
- Documentation kept up to date
If a token is compromised:
- Immediate action: Revoke the token
- Investigation: Review logs for unauthorized access
- Notification: Notify affected users
- Remediation: Generate new token for user
- Review: Analyze how compromise occurred
- Prevention: Implement additional controls
If authorization API is compromised:
- Immediate action: Disable enterprise mode or switch to single-user
- Investigation: Assess scope of breach
- Notification: Notify all users
- Remediation: Secure API and rotate credentials
- Review: Conduct security audit
- Prevention: Implement additional API security
If database is compromised:
- Immediate action: Revoke all tokens
- Investigation: Determine what data was accessed
- Notification: Notify all users and authorities (if required)
- Remediation: Secure database and restore from backup
- Review: Conduct comprehensive security review
- Prevention: Implement encryption at rest and additional controls
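For the "revoke all tokens" step, the single-token revocation statement shown earlier in this document can be applied without a WHERE clause. The following sketch assumes a SQLite database with the agent_tokens schema from this document; the function name revoke_all_tokens is illustrative.

```python
import sqlite3


def revoke_all_tokens(conn: sqlite3.Connection) -> int:
    """Mark every token as unauthenticated; returns the number of rows updated."""
    with conn:  # commits on success, rolls back on error
        cursor = conn.execute(
            "UPDATE agent_tokens SET is_authenticated = 0, auth_expires_at = NULL"
        )
    return cursor.rowcount
```

After this runs, every user has to re-authenticate via SSO before the proxy accepts their token again.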
- Troubleshooting - Common issues and solutions
- Configuration Options - Complete configuration reference
- Agent Configuration - Configure AI agents with tokens