Label-Based Access Control for the LGTM Stack (Loki, Grafana, Tempo, Mimir/Prometheus)
This project originated from multena-proxy by Gepardec, which provided the foundational architecture for a multi-tenancy proxy with label-based access control.
Original Repository: https://github.com/gepaplexx/multena-proxy
We extend our sincere gratitude to the Gepardec team for their pioneering work in bringing label-based access control to observability stacks. Their original implementation established excellent patterns that influenced our approach.
LGTM LBAC Proxy has evolved as an independent project with significant architectural changes and new capabilities that go beyond the original scope:
| Aspect | Original (multena-proxy) | This Project (lgtm-lbac-proxy) |
|---|---|---|
| Target Stack | OpenShift + Thanos/Loki | Complete LGTM stack (L+G+T+M) |
| Tracing Support | ❌ Not implemented | ✅ Full TraceQL enforcement |
| Label Model | Single-label per user | Multi-label policies with operators |
| Label Storage | MySQL + ConfigMap | File-based (ConfigMap) only |
| Configuration | Fixed auth claims | Configurable JWT claims for any OAuth provider |
| Performance | Standard reverse proxy | Optimized with connection pooling (1000-2000 req/s) |
| Architecture | Simple label mappings | Policy-based with AND/OR logic |
| Development | Upstream maintenance | Independent feature development |
New Capabilities (not in original):
- ✅ Grafana Tempo Integration: Complete TraceQL query enforcement with resource attribute injection
- ✅ Multi-Label Enforcement: Support multiple labels per user (namespace AND team AND environment)
- ✅ Flexible Operators: Exact match (`=`), regex (`=~`), negation (`!=`, `!~`)
- ✅ Policy Logic: AND/OR combinations for complex access rules
- ✅ Configurable JWT Claims: Support any OAuth provider (Keycloak, Azure AD, Auth0, Google, Okta)
- ✅ High-Performance Proxy: Dedicated transports with connection pooling and per-upstream tuning
- ✅ Production Helm Chart: Full Kubernetes deployment with HPA, security contexts, and ServiceMonitor
Architectural Changes (breaking compatibility):
- 🔄 Removed MySQL Label Store: File-based ConfigMap only (v0.9.0)
- 🔄 Removed Simple Label Format: Extended format with `_rules`/`_logic` required (v0.12.0)
- 🔄 New Label Store Interface: `GetLabelPolicy()` instead of `GetLabels()`
- 🔄 Policy-Based Enforcement: Rules and operators instead of simple key-value labels
Use multena-proxy if you need:
- OpenShift-specific integrations
- Simple single-label enforcement (namespace only)
- MySQL label storage
- Minimal changes to existing deployments
Use lgtm-lbac-proxy if you need:
- Complete LGTM stack support (Loki + Grafana + Tempo + Mimir)
- Distributed tracing with Grafana Tempo
- Multi-label policies (namespace + team + environment)
- High-throughput production deployment (1000+ req/s)
- Flexible OAuth provider integration
- Advanced access control with regex and negation
Switching from multena-proxy also means accepting:
- Configuration format changes (extended label format required)
- Different label storage mechanism (no MySQL support)
- New authentication configuration structure
Migration from multena-proxy requires configuration updates. See MIGRATION.md for detailed instructions.
LGTM LBAC Proxy is a multi-tenancy authorization proxy designed specifically for the LGTM observability stack. Built with Label-Based Access Control (LBAC) at its core, it ensures secure and granular access to your observability data.
The proxy intercepts queries to Prometheus/Thanos, Loki, and Tempo, validates JWT tokens, enforces tenant label restrictions, and forwards authorized queries to upstream servers.
| Feature | Description |
|---|---|
| Complete LGTM Stack | Full support for Metrics (Prometheus/Thanos/Mimir), Logs (Loki), and Traces (Tempo) |
| Multi-Label Policies | Enforce multiple labels per user with AND/OR logic (namespace AND team AND environment) |
| Flexible Operators | Support for exact (=), regex (=~), not-equal (!=), and negative regex (!~) matching |
| Configurable JWT Claims | Compatible with any OAuth provider - configure username, email, and groups claims |
| High-Performance Proxy | Handle 1000-2000 req/s with connection pooling and per-upstream optimization |
| Query Enforcement | Automatic injection of tenant filters into PromQL, LogQL, and TraceQL queries |
| TraceQL Support | Full distributed tracing support with Grafana Tempo integration |
| File-Based Label Store | Simple and portable ConfigMap/file-based label storage (no database required) |
| Admin Bypass | Optional admin group with `#cluster-wide` access for unrestricted queries |
| Production-Ready Helm Chart | Kubernetes deployment with HPA, security contexts, and ServiceMonitor |
| Secure Communication | JWT/JWKS authentication with mTLS support for upstream connections |
| OAuth Provider Support | Pre-configured examples for Keycloak, Azure AD, Auth0, Google, and Okta |
- ✅ Metrics: Prometheus, Thanos, Mimir (PromQL enforcement)
- ✅ Logging: Loki (LogQL enforcement)
- ✅ Traces: Tempo (TraceQL enforcement) 🆕
- ⏳ Profiles: Planned for future release
```mermaid
graph TD
    A[Receive Request] -->|Extract JWT Token| B{Validate Token?}
    B -->|Invalid| E[403 Forbidden]
    B -->|Valid| D{Validate Labels?}
    D -->|No Labels| E
    D -->|Has #cluster-wide| F[Forward to Upstream]
    D -->|Has Labels| G{Enforce Query?}
    G -->|Unauthorized Label| E
    G -->|Success| H[Inject Tenant Filter]
    H --> F
    F --> I[Stream Response]
```
Authorization Flow:
- Request Reception: Proxy receives a query request (PromQL, LogQL, or TraceQL)
- JWT Validation: Extracts and validates JWT token from Authorization header
- Label Retrieval: Fetches allowed tenant labels for user/groups from label store
- Admin Bypass Check: Users with the `#cluster-wide` label skip enforcement
- Query Parsing: Parses query using appropriate parser (PromQL/LogQL/TraceQL)
- Label Validation: Checks if existing tenant labels in query are authorized
- Label Injection: Injects tenant label filters if missing (e.g., `{namespace="prod"}`)
- Proxy Forward: Forwards modified query to upstream (Prometheus/Loki/Tempo)
- Response Stream: Streams response back to client
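To make the flow concrete, here is what a request looks like from the client's side. A minimal sketch using curl, assuming a user whose policy allows only `namespace="prod"` (the host, port, and token are placeholders):

```bash
# Send a PromQL query through the proxy with a JWT in the Authorization header.
TOKEN="<jwt-for-a-prod-only-user>"   # placeholder; obtain from your OAuth provider
curl -sG \
  -H "Authorization: Bearer ${TOKEN}" \
  --data-urlencode 'query=rate(http_requests_total[5m])' \
  "http://lgtm-lbac-proxy:8080/api/v1/query"

# Before forwarding, the proxy rewrites the query to:
#   rate(http_requests_total{namespace="prod"}[5m])
# A request with an invalid token, or one referencing an unauthorized
# tenant label, is rejected with 403 Forbidden instead.
```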
```bash
# Download latest release
wget https://github.com/binhnguyenduc/lgtm-lbac-proxy/releases/latest/download/lgtm-lbac-proxy

# Make executable
chmod +x lgtm-lbac-proxy

# Run with config
./lgtm-lbac-proxy
```

```bash
docker run -d \
  -p 8080:8080 \
  -p 8081:8081 \
  -v $(pwd)/configs:/etc/config/config:ro \
  ghcr.io/binhnguyenduc/lgtm-lbac-proxy:latest
```

```bash
# Install from local chart
helm install lgtm-lbac-proxy ./helm/lgtm-lbac-proxy \
--namespace observability \
--create-namespace \
--set proxy.web.jwksCertUrl=https://your-oauth.com/certs \
--set proxy.thanos.url=https://thanos-query:9091 \
--set proxy.loki.url=https://loki-query:3100 \
--set proxy.tempo.url=https://tempo-query:3200
# Create labels ConfigMap
kubectl create configmap lgtm-lbac-proxy-labels \
--from-file=labels.yaml=./configs/labels.yaml \
  --namespace observability
```

See Helm Chart Documentation for detailed configuration options.
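After installing, a quick smoke test confirms the proxy is reachable and enforcing. A sketch assuming the chart's default service name `lgtm-lbac-proxy` (adjust if you overrode it) and a token for a test user:

```bash
# Forward the proxy's listen port locally.
kubectl -n observability port-forward svc/lgtm-lbac-proxy 8080:8080 &

# Issue a trivial query; a 200 response with enforced results means
# JWKS validation and label enforcement are wired up.
TOKEN="<jwt-for-a-test-user>"   # placeholder
curl -sG -H "Authorization: Bearer ${TOKEN}" \
  --data-urlencode 'query=up' \
  "http://localhost:8080/api/v1/query"
```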
Create `config.yaml`:

```yaml
# Authentication configuration
auth:
jwks_cert_url: "https://your-oauth-provider.com/certs"
auth_header: "Authorization"
claims:
username: "preferred_username" # JWT claim for username
email: "email" # JWT claim for email
groups: "groups" # JWT claim for groups
admin:
bypass: true
group: "cluster-admins"
thanos:
url: "https://thanos-querier:9091"
tenant_label: "namespace"
use_mutual_tls: false
loki:
url: "https://loki-query-frontend:3100"
tenant_label: "kubernetes_namespace_name"
actor_header: "X-Loki-Actor"
tempo:
url: "https://tempo-query-frontend:3100"
tenant_label: "resource.namespace"
actor_header: "X-Tempo-User"Different OAuth providers use different claim names for user identity. You can now configure which JWT claims to use:
auth:
claims:
username: "preferred_username" # Keycloak, Okta
# username: "unique_name" # Azure AD
# username: "nickname" # Auth0
# username: "email" # Google
email: "email" # Most providers
# email: "upn" # Azure AD User Principal Name
groups: "groups" # Keycloak, Okta
# groups: "roles" # Azure AD
    # groups: "https://example.com/groups" # Auth0 (namespaced)
```

Common OAuth Providers:
| Provider | Username Claim | Email Claim | Groups Claim |
|---|---|---|---|
| Keycloak | `preferred_username` | `email` | `groups` |
| Azure AD | `unique_name`, `upn` | `email`, `upn` | `roles` |
| Auth0 | `nickname`, `name` | `email` | `https://domain.com/groups` |
| Google | `email`, `sub` | `email` | `hd` (domain) |
| Okta | `preferred_username` | `email` | `groups` |
See Configuration Examples for provider-specific setup guides.
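If you are unsure which claims your provider actually emits, decoding a token's payload locally is the quickest check. A rough sketch with standard tools (the token variable is a placeholder; base64url padding can make `base64 -d` complain, hence the stderr suppression):

```bash
# Print the JWT payload (the second dot-separated segment) as JSON.
TOKEN="<paste-a-token-here>"
echo "${TOKEN}" | cut -d. -f2 | tr '_-' '/+' | base64 -d 2>/dev/null | jq .
# Look for the fields carrying username, email, and group membership,
# then set auth.claims accordingly.
```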
Configure proxy performance settings for high-throughput deployments:

```yaml
# Global proxy defaults (optional - sensible defaults provided)
proxy:
request_timeout: 60s # Maximum request duration
idle_conn_timeout: 90s # Keep-alive for idle connections
tls_handshake_timeout: 10s # TLS handshake timeout
max_idle_conns: 500 # Total idle connections across all upstreams
max_idle_conns_per_host: 100 # Idle connections per upstream
force_http2: true # Enable HTTP/2 when available
# Per-upstream overrides (customize for workload characteristics)
loki:
url: "https://loki-query-frontend:3100"
tenant_label: "kubernetes_namespace_name"
proxy:
request_timeout: 120s # Log queries can be slow
max_idle_conns_per_host: 150 # High log volume needs more connections
tempo:
url: "https://tempo-query-frontend:3100"
tenant_label: "resource.namespace"
proxy:
request_timeout: 300s # Trace queries need longer timeout
max_idle_conns_per_host: 50 # Lower volume, fewer connections
thanos:
url: "https://thanos-querier:9091"
tenant_label: "namespace"
proxy:
request_timeout: 60s
    max_idle_conns_per_host: 100
```

Performance Characteristics:
- Throughput: 1000-2000 req/s with connection pooling
- Connection Reuse: >95% under steady load
- Latency Impact: <1 µs added per request (negligible)
- Configuration Precedence: upstream-specific > global > built-in defaults
When to Tune (a quick verification run is sketched below):
- High request rate (>500 req/s): Increase `max_idle_conns_per_host`
- Slow queries: Increase `request_timeout` for the specific upstream
- HTTP/2-capable upstreams: Enable `force_http2` for multiplexing
- Connection exhaustion: Increase the `max_idle_conns` total pool size
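One way to sanity-check a tuning change is a short load run against a cheap query, comparing throughput before and after. A sketch using the `hey` load generator (any HTTP benchmarking tool works; host and token are placeholders):

```bash
# 100 concurrent clients for 30 seconds against a trivial PromQL query.
TOKEN="<jwt-for-a-test-user>"   # placeholder
hey -z 30s -c 100 \
  -H "Authorization: Bearer ${TOKEN}" \
  "http://lgtm-lbac-proxy:8080/api/v1/query?query=up"
# Watch requests/sec and the error count; rising connection errors under
# load usually point at max_idle_conns_per_host being too low.
```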
Create `labels.yaml` using the extended multi-label format (required as of v0.12.0):

```yaml
# Simple single-label policy
alice@example.com:
_rules:
- name: namespace
operator: '='
values: ['prod', 'staging']
_logic: AND
# Multi-label policy with AND logic
bob@example.com:
_rules:
- name: namespace
operator: '='
values: ['prod']
- name: team
operator: '=~'
values: ['backend.*']
- name: environment
operator: '!='
values: ['test']
_logic: AND # All three conditions must be satisfied
# Multi-label policy with OR logic
charlie@example.com:
_rules:
- name: namespace
operator: '='
values: ['dev']
- name: namespace
operator: '='
values: ['staging']
_logic: OR # Either dev OR staging
# Regex matching for complex patterns
engineering-team:
_rules:
- name: namespace
operator: '=~'
values: ['prod-.*', 'staging-.*'] # Matches prod-app1, staging-db, etc.
- name: team
operator: '='
values: ['engineering']
_logic: AND
# Admin access (cluster-wide bypass)
cluster-admins:
_rules:
- name: '#cluster-wide'
operator: '='
values: ['true']
  _logic: AND
```

Extended Format Features:
- Multi-label enforcement: Combine multiple labels with AND/OR logic (e.g., namespace AND team AND environment)
- Rich operators:
  - `=` - Exact match
  - `!=` - Not equal (negation)
  - `=~` - Regex match (e.g., `backend.*` matches backend-api, backend-worker)
  - `!~` - Negative regex (exclude patterns)
- Logical combinations:
  - `AND` - All rules must be satisfied (default)
  - `OR` - Any rule can be satisfied
- Per-user policies: Different users can have completely different label enforcement rules
Real-World Examples:

```yaml
# Backend team: only prod namespace + backend services
backend-developers:
_rules:
- name: namespace
operator: '='
values: ['prod']
- name: service
operator: '=~'
values: ['backend-.*', 'api-.*']
_logic: AND
# QA team: staging and test environments, but not production
qa-team:
_rules:
- name: environment
operator: '='
values: ['staging', 'test']
- name: environment
operator: '!='
values: ['production']
_logic: AND
# Multi-region access: only specific regions
us-team:
_rules:
- name: region
operator: '='
values: ['us-east-1', 'us-west-2']
_logic: OR
# Cost center tracking with exclusions
finance-team:
_rules:
- name: cost_center
operator: '='
values: ['finance']
- name: namespace
operator: '!~'
values: ['temp-.*', 'test-.*'] # Exclude temporary/test namespaces
  _logic: AND
```

If you're upgrading from a version that used the MySQL label store:
Step 1: Export labels from MySQL

```sql
-- Export user labels
SELECT username, allowed_labels FROM label_mappings;
```

Step 2: Convert to extended format
Create `labels.yaml` using the extended multi-label format:

```yaml
# Convert MySQL rows to extended format
username1:
_rules:
- name: namespace
operator: '='
values: ['namespace1', 'namespace2']
_logic: AND
group1:
_rules:
- name: namespace
operator: '='
values: ['namespace3']
  _logic: AND
```

Step 3: Update configuration
Remove these fields from `config.yaml`:
- `web.label_store_kind` (no longer needed)
- `db:` section (entire section)
Step 4: Deploy labels ConfigMap

```bash
kubectl create configmap lgtm-lbac-proxy-labels \
--from-file=labels.yaml=./labels.yaml \
  --namespace observability
```

Note: The label store now automatically uses file-based ConfigMap mode. For custom label store implementations (e.g., external databases, LDAP), see the `contrib/labelstores/` directory.
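For bulk conversions, a small script can generate the extended format from a CSV export of the MySQL table. A minimal sketch, assuming a hypothetical `username,ns1;ns2` CSV layout (adjust the export query and delimiters to your actual schema; no escaping of special characters is handled here):

```bash
#!/usr/bin/env bash
set -euo pipefail

INPUT=${1:-label_mappings.csv}   # e.g. "alice@example.com,prod;staging" per line
OUTPUT=${2:-labels.yaml}

: > "$OUTPUT"
while IFS=, read -r user labels; do
  {
    echo "${user}:"
    echo "  _rules:"
    echo "    - name: namespace"
    echo "      operator: '='"
    # Turn "prod;staging" into a quoted YAML flow list: ['prod', 'staging']
    printf "      values: ['%s']\n" "$(echo "$labels" | sed "s/;/', '/g")"
    echo "  _logic: AND"
  } >> "$OUTPUT"
done < "$INPUT"
```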
BREAKING CHANGE in v0.12.0: The simple label format has been removed. If you're upgrading from v0.11 or earlier, you must migrate to the extended format.
Old Simple Format (DEPRECATED, removed in v0.12.0):

```yaml
# ❌ This format is no longer supported
user@example.com:
namespace1: true
  namespace2: true
```

New Extended Format (REQUIRED as of v0.12.0):

```yaml
# ✅ Use this format
user@example.com:
_rules:
- name: namespace
operator: '='
values: ['namespace1', 'namespace2']
  _logic: AND
```

Migration Tool:
A migration tool is provided to automatically convert the simple format to the extended format:

```bash
# Download the migration tool
wget https://github.com/binhnguyenduc/lgtm-lbac-proxy/releases/latest/download/migrate-labels
# Make it executable
chmod +x migrate-labels
# Convert your labels file
./migrate-labels -input labels.yaml -output labels-extended.yaml -tenant-label namespace
# Verify the output
cat labels-extended.yaml
# Deploy the new format
kubectl create configmap lgtm-lbac-proxy-labels \
--from-file=labels.yaml=./labels-extended.yaml \
--namespace observability \
  --dry-run=client -o yaml | kubectl apply -f -
```

Migration Steps:
- Backup: Save your current `labels.yaml`
- Convert: Use the migration tool to convert to the extended format
- Test: Validate the new format works in a test environment (a quick check is sketched below)
- Deploy: Update your ConfigMap with the new format
- Upgrade: Deploy v0.12.0+ with the extended format
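Before the Deploy step, it is worth checking that the converted file parses and that every entry carries both `_rules` and `_logic`. A sketch assuming mikefarah's `yq` v4 (the file name matches the migration tool output above):

```bash
# Count top-level entries missing _rules or _logic; expect 0.
yq eval 'to_entries | map(select(.value._rules == null or .value._logic == null)) | length' \
  labels-extended.yaml
```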
Why This Change?
The extended format provides:
- Multi-label enforcement: Support for complex policies (namespace AND team)
- Flexible operators: Exact match, regex, negation
- Logical combinations: AND/OR logic for rules
- Future-proof: Foundation for advanced RBAC features
User Policy: `namespace="prod"`

```
# Original Query
rate(http_requests_total[5m])

# Enforced Query
rate(http_requests_total{namespace="prod"}[5m])
```

User Policy: `kubernetes_namespace_name="prod"`

```
# Original Query
{app="nginx"} |= "error"

# Enforced Query
{kubernetes_namespace_name="prod",app="nginx"} |= "error"
```

User Policy: `resource.namespace="prod"`

```
# Original Query
{ span.http.status_code >= 500 }

# Enforced Query
{ resource.namespace="prod" && span.http.status_code >= 500 }
```

User Policy: `namespace="prod" AND team=~"backend.*" AND environment!="test"`

```
# Original Query
sum(rate(http_requests_total[5m])) by (service)

# Enforced Query
sum(rate(http_requests_total{namespace="prod",team=~"backend.*",environment!="test"}[5m])) by (service)
```

User Policy: `namespace="prod" AND service=~"api-.*"`

```
# Original Query
{job="kubernetes-pods"} | json | line_format "{{.message}}"

# Enforced Query
{namespace="prod",service=~"api-.*",job="kubernetes-pods"} | json | line_format "{{.message}}"
```

User Policy: `resource.namespace="prod" AND resource.team="backend"`

```
# Original Query
{ duration > 1s && span.http.status_code = 500 }

# Enforced Query
{ resource.namespace="prod" && resource.team="backend" && duration > 1s && span.http.status_code = 500 }
```

User Policy: `namespace=~"prod-.*"` (matches prod-app1, prod-db, etc.)

```
# Enforced Query
rate(http_requests_total{namespace=~"prod-.*"}[5m])
```

User Policy: `environment!="test"` (exclude test environment)

```
# Enforced Query
{environment!="test",app="nginx"} |= "error"
```

User Policy: `namespace!~"temp-.*"` (exclude temporary namespaces)

```
# Enforced Query
sum(container_memory_usage_bytes{namespace!~"temp-.*"}) by (pod)
```
The project includes a production-ready Helm chart for Kubernetes deployment.
Chart Version: 1.8.0+ | App Version: 0.14.0
- ✅ Complete LGTM Stack: Full support for Prometheus/Thanos, Loki, and Tempo
- ✅ Multi-Label Policies: Advanced access control with AND/OR logic and operators
- ✅ High Performance: Connection pooling and per-upstream optimization (1000-2000 req/s)
- ✅ Configurable Auth: Support for Keycloak, Azure AD, Auth0, Google, Okta
- ✅ Production Ready: Resource limits, security contexts, health probes, HPA
- ✅ Security Hardened: Non-root execution, dropped capabilities, seccomp profiles
- ✅ Auto-scaling: Horizontal Pod Autoscaler with CPU/memory metrics
- ✅ Monitoring: ServiceMonitor for Prometheus metrics collection
- ✅ Vanilla Kubernetes: No platform-specific dependencies
```bash
# Install the chart with new auth configuration (v0.14.0+)
helm install lgtm-lbac-proxy ./helm/lgtm-lbac-proxy \
--namespace observability \
--create-namespace \
--set proxy.auth.jwksCertUrl=https://oauth.example.com/certs \
--set proxy.auth.claims.username=preferred_username \
--set proxy.auth.claims.email=email \
--set proxy.auth.claims.groups=groups \
--set proxy.thanos.url=https://thanos-querier.monitoring.svc:9091 \
--set proxy.loki.url=https://loki-query-frontend.logging.svc:3100 \
--set proxy.tempo.url=https://tempo-query-frontend.tracing.svc:3200
# Create labels ConfigMap with extended format
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
name: lgtm-lbac-proxy-labels
namespace: observability
data:
labels.yaml: |
# Single-label policy
alice@example.com:
_rules:
- name: namespace
operator: '='
values: ['prod', 'staging']
_logic: AND
# Multi-label policy with regex
backend-team:
_rules:
- name: namespace
operator: '='
values: ['prod']
- name: team
operator: '=~'
values: ['backend-.*']
_logic: AND
# Admin with cluster-wide access
admin-group:
_rules:
- name: '#cluster-wide'
operator: '='
values: ['true']
_logic: AND
EOF
```

Minimal Production (v0.14.0+):

```yaml
# values.yaml
replicas: 2
resources:
requests:
cpu: 100m
memory: 128Mi
limits:
cpu: 500m
memory: 256Mi
proxy:
# New auth configuration (v0.14.0+)
auth:
jwksCertUrl: https://oauth.example.com/certs
claims:
username: preferred_username
email: email
groups: groups
# Upstream configuration
thanos:
url: https://thanos-querier.monitoring.svc:9091
tenantLabel: namespace
loki:
url: https://loki-query-frontend.logging.svc:3100
tenantLabel: kubernetes_namespace_name
tempo:
url: https://tempo-query-frontend.tracing.svc:3200
    tenantLabel: resource.namespace
```

High Availability with Performance Tuning:

```yaml
replicas: 3
# HPA for auto-scaling
autoscaling:
enabled: true
minReplicas: 3
maxReplicas: 10
targetCPUUtilizationPercentage: 70
targetMemoryUtilizationPercentage: 80
# Resource limits for high throughput
resources:
requests:
cpu: 500m
memory: 512Mi
limits:
cpu: 2000m
memory: 1Gi
proxy:
auth:
jwksCertUrl: https://oauth.example.com/certs
claims:
username: preferred_username
email: email
groups: groups
# Global proxy performance settings (v0.13.0+)
proxyConfig:
request_timeout: 60s
max_idle_conns: 500
max_idle_conns_per_host: 100
idle_conn_timeout: 90s
force_http2: true
# Per-upstream tuning
loki:
url: https://loki-query-frontend.logging.svc:3100
tenantLabel: kubernetes_namespace_name
proxy:
request_timeout: 120s
max_idle_conns_per_host: 150
tempo:
url: https://tempo-query-frontend.tracing.svc:3200
tenantLabel: resource.namespace
proxy:
request_timeout: 300s
max_idle_conns_per_host: 50
# Topology spread for high availability
topologySpreadConstraints:
- maxSkew: 1
topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: DoNotSchedule
```

Azure AD Integration:

```yaml
proxy:
auth:
jwksCertUrl: https://login.microsoftonline.com/YOUR_TENANT_ID/discovery/v2.0/keys
claims:
username: unique_name # Azure AD specific
email: upn # User Principal Name
      groups: roles # Azure AD uses 'roles' for groups
```

For complete configuration options and deployment examples, see the Helm Chart Documentation.
Configure Grafana datasources to point to the proxy:
- Prometheus/Thanos: `http://lgtm-lbac-proxy:8080/api/v1`
- Loki: `http://lgtm-lbac-proxy:8080/loki/api/v1`
- Tempo: `http://lgtm-lbac-proxy:8080/tempo/api`

All requests must include a valid JWT token in the Authorization header.
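Each datasource path can be exercised directly before wiring Grafana up, which separates proxy problems from dashboard problems. A sketch with curl; the exact Tempo search path under the proxy's `/tempo/api` prefix is an assumption based on Tempo's standard `/api/search` endpoint:

```bash
TOKEN="<jwt-for-a-test-user>"   # placeholder
BASE="http://lgtm-lbac-proxy:8080"

# Metrics (PromQL)
curl -sG -H "Authorization: Bearer ${TOKEN}" \
  --data-urlencode 'query=up' "${BASE}/api/v1/query"

# Logs (LogQL)
curl -sG -H "Authorization: Bearer ${TOKEN}" \
  --data-urlencode 'query={app="nginx"}' "${BASE}/loki/api/v1/query"

# Traces (TraceQL search; path assumed, see note above)
curl -sG -H "Authorization: Bearer ${TOKEN}" \
  --data-urlencode 'q={ span.http.status_code >= 500 }' "${BASE}/tempo/api/search"
```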
```bash
# Clone repository
git clone https://github.com/binhnguyenduc/lgtm-lbac-proxy.git
cd lgtm-lbac-proxy
# Build binary
make build
# Run tests
make test
# Build Docker image
make docker-build-full
```

See BUILD.md for detailed build instructions.
If you're migrating from the original multena-proxy, see MIGRATION.md for a comprehensive guide.
- Binary name: `multena-proxy` → `lgtm-lbac-proxy`
- Docker image: `ghcr.io/gepaplexx/multena-proxy` → `ghcr.io/binhnguyenduc/lgtm-lbac-proxy`
- Helm chart: `gp-multena` → `lgtm-lbac-proxy`
- Label Format (v0.12.0): Simple label format removed
  - Old: `user: {namespace: true}`
  - New: Extended format with `_rules` and `_logic` required
  - Migration: Use the `migrate-labels` tool to convert existing labels
- MySQL Label Store (v0.9.0): MySQL support removed
  - Old: `label_store_kind: mysql` with `db:` configuration
  - New: File-based ConfigMap only
  - Migration: Export MySQL data and convert to YAML format
- Authentication Configuration (v0.14.0): New auth section (backward compatible)
  - Recommended: Move `web.jwks_cert_url` → `auth.jwks_cert_url`
  - New: Configurable JWT claims (`auth.claims`)
  - Note: Legacy fields still work with deprecation warnings
- ✅ API endpoints: No changes to routing or request handling
- ✅ JWT authentication: JWKS validation logic unchanged
- ✅ Core enforcement: Query enforcement patterns preserved
Contributions are welcome! This is an independent project focused on LGTM stack support.
- Fork the repository
- Create a feature branch
- Make your changes
- Submit a pull request
This project is licensed under the GNU Affero General Public License v3.0 - see the LICENSE file for details.
- Original Project: multena-proxy by Gepardec
- Contributors: Thank you to everyone who has contributed to making LGTM stack observability more secure
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Documentation: GitHub Wiki