OpenTelemetry Tracing Documentation

This MCP server includes comprehensive distributed tracing using OpenTelemetry (OTEL), providing production-ready observability for tool executions and HTTP requests.

⚠️ Important: Stdio Transport Only

This server uses stdio transport exclusively. Only OTLP exporters are supported for tracing.

Console output is incompatible with stdio and will corrupt JSON-RPC communication. All diagnostic logging is disabled by default to ensure reliable operation.

Features

Configuration Loading Tracing

  • Environment Variable Loading: Automatic tracing of .env file loading
  • Load Metrics: Number of variables loaded, file existence, load success
  • Error Tracking: Detailed errors if configuration loading fails
  • Startup Visibility: See configuration issues at server startup

Tool Execution Tracing

  • Automatic Instrumentation: All tool executions are automatically traced
  • Comprehensive Attributes: Input size, output size, success/failure status, timing
  • Error Tracking: Detailed error information with error types and messages
  • Session Context: Support for session ID, user ID, account ID, chat session ID, and prompt ID
  • JWT Validation: Basic JWT format validation for tracing context
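A structural JWT check of this kind can be sketched as follows. This is a hypothetical helper, not the server's actual implementation: it only verifies that the token has three dot-separated base64url segments, without verifying the signature or any claims.

```typescript
// Hypothetical sketch of a structural JWT format check: three
// dot-separated base64url segments. Does NOT verify signature or claims.
function looksLikeJwt(token: string): boolean {
  const parts = token.split('.');
  if (parts.length !== 3) return false;
  // base64url alphabet: A-Z, a-z, 0-9, '-', '_'
  return parts.every((p) => p.length > 0 && /^[A-Za-z0-9_-]+$/.test(p));
}

console.log(looksLikeJwt('eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiIxIn0.abc-123')); // true
console.log(looksLikeJwt('not-a-jwt')); // false
```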

HTTP Request Instrumentation

  • Full Request Lifecycle: Complete HTTP request/response tracing with retry logic
  • Performance Metrics: Request duration, payload sizes, retry attempts
  • Response Details: Status codes, response sizes, content types
  • CloudFront Correlation: Automatic capture of CloudFront IDs for Mapbox API requests
  • Cache Monitoring: CloudFront cache hit/miss tracking via x-cache headers
  • Geographic Insights: CloudFront PoP location for geographic distribution analysis
  • Error Classification: Detailed error information with error types

Security & Performance

  • Sensitive Data Protection: Input parameters logged by size only, not content
  • Minimal Overhead: <1% CPU impact, ~10MB memory for trace buffers
  • Configurable Sampling: Support for production trace volume management
  • Graceful Fallback: No impact on functionality when tracing is disabled
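The size-only logging rule above can be pictured with a small sketch (an illustrative helper, not the server's actual code): the input is serialized only to measure its byte length, and the content itself never reaches the span.

```typescript
// Illustrative sketch: record only the serialized size of tool input as a
// span attribute; the content itself is discarded.
function inputSizeAttribute(input: unknown): { 'tool.input.size': number } {
  // Byte length of the JSON-serialized input (Node's Buffer).
  const size = Buffer.byteLength(JSON.stringify(input ?? null), 'utf8');
  return { 'tool.input.size': size };
}

console.log(inputSizeAttribute({ styleId: 'abc', token: 'secret' }));
```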

Configuration

Environment Variables

The tracing system supports several configuration options through environment variables:

Basic Configuration

# OTLP HTTP endpoint (required to enable tracing)
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318

# Optional OTLP headers for authentication
OTEL_EXPORTER_OTLP_HEADERS='{"Authorization": "Bearer your-token"}'

# Optional: OTEL diagnostic log level (default: NONE)
# Only use for troubleshooting OTEL configuration issues
OTEL_LOG_LEVEL=ERROR

Service Configuration

OTEL_SERVICE_NAME=mapbox-mcp-devkit-server
OTEL_SERVICE_VERSION=0.4.5
OTEL_RESOURCE_ATTRIBUTES=service.name=mapbox-mcp-devkit-server,service.version=0.4.5

Sampling Configuration

# Sample rate (0.0 to 1.0) for high-volume environments
OTEL_TRACES_SAMPLER=traceidratio
OTEL_TRACES_SAMPLER_ARG=0.1  # Sample 10% of traces
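Conceptually, a trace-ID-ratio sampler makes its decision as a pure function of the trace ID, so every span in a trace gets the same keep/drop verdict. The simplified sketch below illustrates the idea; the real OTEL SDK sampler differs in detail.

```typescript
// Simplified sketch of trace-ID-ratio sampling (the real OTEL SDK differs
// in detail): the decision depends only on the trace ID, so all spans of
// one trace agree.
function shouldSample(traceId: string, ratio: number): boolean {
  // Interpret the low 8 hex digits of the 128-bit trace ID as a 32-bit number.
  const slice = parseInt(traceId.slice(-8), 16);
  return slice < ratio * 0x100000000;
}

// At ratio 1.0 every trace is kept; at 0.1 roughly 10% of trace IDs pass.
console.log(shouldSample('4bf92f3577b34da6a3ce929d0e0e4736', 1.0)); // true
```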

Enabling Tracing

Tracing is opt-in and disabled by default. To enable tracing, you must configure an OTLP endpoint:

# Enable tracing by setting the OTLP endpoint
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318

# Optionally customize the service name
OTEL_SERVICE_NAME=mapbox-mcp-devkit-server

# For debugging OTEL configuration issues only
# OTEL_LOG_LEVEL=ERROR

Note: Console exporters are not supported due to stdio transport limitations.
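The opt-in behavior can be pictured as a small gate. This is a hypothetical sketch combining the rules stated in this document (endpoint required; tracing disabled when `NODE_ENV === 'test'` or `VITEST` is set), not the server's actual code:

```typescript
// Hypothetical sketch of the opt-in gate: tracing initializes only when an
// OTLP endpoint is configured and the process is not running under tests.
function tracingEnabled(env: Record<string, string | undefined>): boolean {
  if (!env.OTEL_EXPORTER_OTLP_ENDPOINT) return false; // opt-in: no endpoint, no tracing
  if (env.NODE_ENV === 'test' || env.VITEST) return false; // disabled in test runs
  return true;
}

console.log(tracingEnabled({ OTEL_EXPORTER_OTLP_ENDPOINT: 'http://localhost:4318' })); // true
console.log(tracingEnabled({})); // false
```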

Verification

To verify tracing is working correctly, see the Tracing Verification Guide which shows how to:

  • Set up Jaeger locally to view traces
  • Test tracing with the MCP inspector
  • Troubleshoot common issues

Configuration File

Use the provided .env.example file which includes:

  • Required settings (Mapbox API token)
  • OpenTelemetry tracing configuration (Jaeger/OTLP)
  • Optional AWS X-Ray, Azure, GCP, Datadog, New Relic, and Honeycomb configurations
  • All MCP server settings

Setup:

# Copy the example configuration
cp .env.example .env

# Edit .env to:
#   1. Add your MAPBOX_ACCESS_TOKEN
#   2. Uncomment tracing settings for Jaeger/OTLP
#   3. Or uncomment cloud provider settings if using AWS/Azure/GCP

The server automatically loads configuration from .env at startup, eliminating the need for inline environment variables in npm scripts.

Supported Backends

The server supports any OTLP-compatible observability backend. Configuration examples are provided in .env.example for:

Development

  • Jaeger: Local development with Docker (see Quick Start)
    • npm run tracing:jaeger:start
    • Endpoint: http://localhost:4318
    • UI: http://localhost:16686

Production Cloud Providers

  • AWS X-Ray: AWS-native distributed tracing

    • Endpoint: AWS Distro for OpenTelemetry Collector
    • Auth: IAM credentials
    • Setup Guide
  • Azure Monitor: Azure Application Insights

    • Endpoint: https://<region>.livediagnostics.monitor.azure.com
    • Auth: Connection string or AAD token
    • Setup Guide
  • Google Cloud Trace: GCP-native tracing

    • Endpoint: https://cloudtrace.googleapis.com
    • Auth: Application Default Credentials
    • Setup Guide

Production SaaS Observability Platforms

  • Datadog: Full-stack observability platform

    • Endpoint: https://api.datadoghq.com/api/v2/traces or local agent
    • Auth: API key
    • Setup Guide
  • New Relic: Application performance monitoring

    • Endpoint: https://otlp.nr-data.net:4318 (US) or https://otlp.eu01.nr-data.net:4318 (EU)
    • Auth: License key
    • Setup Guide
  • Honeycomb: Observability for complex systems

    • Endpoint: https://api.honeycomb.io:443
    • Auth: API key + dataset name
    • Setup Guide

Backend Configuration

All backends are configured via .env file. See .env.example for complete configuration examples for each platform.

Quick Start

1. Setup .env Configuration

# Copy the example configuration
cp .env.example .env

# Edit .env to:
#   1. Add your MAPBOX_ACCESS_TOKEN
#   2. The OTEL_EXPORTER_OTLP_ENDPOINT is already set to http://localhost:4318
#   3. Customize OTEL_SERVICE_NAME if needed

2. Local Development with Jaeger

# Start Jaeger (Docker)
npm run tracing:jaeger:start

# Build and run the server with MCP inspector
npm run inspect:build

# View traces at http://localhost:16686

# Stop Jaeger when done
npm run tracing:jaeger:stop

3. AWS X-Ray Integration

# Edit .env and uncomment AWS X-Ray settings:
#   - AWS_REGION=us-east-1
#   - Update OTEL_RESOURCE_ATTRIBUTES to include aws.region
#   - Uncomment OTEL_EXPORTER_OTLP_HEADERS for X-Ray trace IDs

# Ensure AWS credentials are configured
# (via IAM role, AWS CLI profile, or environment variables)

# Start AWS Distro for OpenTelemetry Collector
# See: https://aws-otel.github.io/docs/getting-started/collector

# Start the server
npm run inspect:build

Trace Structure

Configuration Loading Spans

The server traces configuration loading at startup:

config.load_env
├── config.file.path: "/path/to/.env"
├── config.file.exists: true
├── config.vars.loaded: 5
├── operation.type: "config_load"
└── config.load.success: true

This span captures:

  • Whether the .env file exists
  • Number of environment variables loaded
  • Any errors during configuration loading
  • Overall success/failure status
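The attributes in the span above could be derived roughly as follows. This is an illustrative sketch, not the server's actual code; it assumes dotenv-style `KEY=VALUE` lines and skips blanks and comments when counting loaded variables.

```typescript
// Illustrative sketch: derive the config.load_env span attributes shown
// above from raw .env file content (null means the file does not exist).
function configLoadAttributes(envFileContent: string | null, path: string) {
  // Count dotenv-style KEY=VALUE lines, skipping blanks and comments.
  const loaded =
    envFileContent === null
      ? 0
      : envFileContent
          .split('\n')
          .filter((line) => /^\s*[A-Za-z_][A-Za-z0-9_]*\s*=/.test(line)).length;
  return {
    'config.file.path': path,
    'config.file.exists': envFileContent !== null,
    'config.vars.loaded': loaded,
    'operation.type': 'config_load',
    'config.load.success': true
  };
}
```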

Tool Execution Spans

tool.list_styles_tool
├── tool.name: "list_styles_tool"
├── tool.input.size: 256
├── tool.output.size: 4096
├── tool.success: true
├── session.id: "session-123"
├── user.id: "user-456"
├── account.id: "account-789"
├── chat.session.id: "chat-123"
└── prompt.id: "prompt-456"

HTTP Request Spans

http.get
├── http.method: "GET"
├── http.url: "https://api.mapbox.com/..."
├── http.status_code: 200
├── http.status_text: "OK"
├── http.user_agent: "mapbox-mcp-devkit-server/0.4.5"
├── http.request.content_length: 512
├── http.response.content_length: 2048
├── http.response.content_type: "application/json"
├── http.response.header.x_amz_cf_id: "HsL_E2ZgW72g4tg_ppvpljSFWa2yYcWziQjZ4d7_1czoC7-53UkAdg=="
├── http.response.header.x_amz_cf_pop: "IAD55-P3"
├── http.response.header.x_cache: "Miss from cloudfront"
└── http.response.header.etag: "W/\"21fe5-88gHkqbxd+dMWiCvnvxi2sikhUs\""

CloudFront Correlation IDs

For Mapbox API requests, the tracing system automatically captures CloudFront correlation headers:

  • x-amz-cf-id: CloudFront request ID for correlation with AWS support
  • x-amz-cf-pop: CloudFront Point of Presence location
  • x-cache: Cache hit/miss status from CloudFront
  • etag: Entity tag for cache validation

These headers enable:

  • Correlation with Mapbox API logs and support tickets
  • Geographic distribution analysis (via PoP location)
  • Cache performance monitoring
  • End-to-end request tracing through CloudFront
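The header capture described above can be sketched like this (an illustrative helper, not the server's actual code): each CloudFront header of interest is copied onto the span under the `http.response.header.*` naming shown earlier, with dashes mapped to underscores.

```typescript
// Illustrative sketch: map CloudFront response headers onto the span
// attribute names shown in the HTTP span example above.
const CAPTURED_HEADERS = ['x-amz-cf-id', 'x-amz-cf-pop', 'x-cache', 'etag'];

function cloudfrontAttributes(headers: Record<string, string>) {
  const attrs: Record<string, string> = {};
  for (const name of CAPTURED_HEADERS) {
    const value = headers[name];
    if (value !== undefined) {
      // "x-amz-cf-id" becomes "http.response.header.x_amz_cf_id", etc.
      attrs[`http.response.header.${name.replace(/-/g, '_')}`] = value;
    }
  }
  return attrs;
}
```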

DevKit-Specific Tools Traced

All developer tools are automatically traced:

  • Style Management: list_styles_tool, create_style_tool, update_style_tool, retrieve_style_tool, delete_style_tool
  • Style Building: style_builder_tool, preview_style_tool, style_comparison_tool
  • Token Management: list_tokens_tool, create_token_tool
  • Local Processing: geojson_preview_tool, coordinate_conversion_tool, bounding_box_tool
  • Tilesets: tilequery_tool

Each tool execution creates a complete trace showing:

  • Tool invocation timing
  • API requests to Mapbox services
  • Data processing steps
  • Success/failure status
  • Error details if applicable
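The per-tool flow above can be sketched with a minimal wrapper. This is a hedged illustration using a stand-in `record` callback rather than the real OpenTelemetry span API, and it is not the server's actual instrumentation:

```typescript
// Hedged sketch of the per-tool tracing flow: wrap a tool run, record
// size/success/error attributes, and always emit the span-like record.
type Attributes = Record<string, string | number | boolean>;

function traceToolExecution<T>(
  toolName: string,
  input: unknown,
  run: () => T,
  record: (spanName: string, attrs: Attributes) => void
): T {
  const attrs: Attributes = {
    'tool.name': toolName,
    'tool.input.size': JSON.stringify(input ?? null).length
  };
  try {
    const output = run();
    attrs['tool.output.size'] = JSON.stringify(output ?? null).length;
    attrs['tool.success'] = true;
    return output;
  } catch (err) {
    attrs['tool.success'] = false;
    attrs['error.type'] = err instanceof Error ? err.name : 'UnknownError';
    throw err;
  } finally {
    // In the real server this would be span.setAttributes(...) + span.end().
    record(`tool.${toolName}`, attrs);
  }
}
```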

Troubleshooting

Tracing Not Working

  1. Check initialization: Look for "OpenTelemetry tracing: enabled" in logs
  2. Verify environment: Ensure NODE_ENV !== 'test' and VITEST is not set
  3. Check configuration: Verify OTLP endpoint is accessible
  4. Network connectivity: Test that the OTLP endpoint is reachable

High Memory Usage

  1. Reduce sampling rate: Set OTEL_TRACES_SAMPLER_ARG to a lower value (e.g., 0.1)
  2. Check batch size: Default batch processing should handle most cases
  3. Monitor buffers: Trace buffers are automatically flushed

Performance Impact

  1. Expected overhead: <1% CPU impact under normal load
  2. Memory usage: ~10MB for trace buffers
  3. Network impact: Traces sent in batches to minimize network calls

Advanced Configuration

Custom Sampling

For high-volume production environments, configure sampling:

export OTEL_TRACES_SAMPLER=traceidratio
export OTEL_TRACES_SAMPLER_ARG=0.1  # Sample 10% of traces

Custom Resource Attributes

Add custom resource attributes for better trace organization:

export OTEL_RESOURCE_ATTRIBUTES="service.name=mapbox-mcp-devkit-server,service.version=0.4.5,environment=production,datacenter=us-east-1"

Disabling Specific Instrumentations

The tracing system automatically disables noisy instrumentations (fs, dns), but you can adjust others as needed:

// In tracing.ts, modify the getNodeAutoInstrumentations call
getNodeAutoInstrumentations({
  '@opentelemetry/instrumentation-fs': { enabled: false },
  '@opentelemetry/instrumentation-dns': { enabled: false },
  '@opentelemetry/instrumentation-http': { enabled: true } // Keep HTTP
});

Security Considerations

Data Privacy

  • Input sanitization: Only input/output sizes are logged, not content
  • JWT validation: Basic format validation only, no secret verification
  • Error messages: Error details are logged but sensitive data is protected

Authentication

  • OTLP headers: Support for authentication headers
  • TLS: Use HTTPS endpoints for production
  • IAM roles: Use IAM roles for AWS X-Ray integration

Integration Examples

Docker Compose with Jaeger

version: '3.8'
services:
  jaeger:
    image: jaegertracing/all-in-one:latest
    ports:
      - '16686:16686'
      - '4318:4318'
    environment:
      - COLLECTOR_OTLP_ENABLED=true

  mcp-devkit-server:
    build: .
    environment:
      - OTEL_EXPORTER_OTLP_ENDPOINT=http://jaeger:4318
      - MAPBOX_ACCESS_TOKEN=${MAPBOX_ACCESS_TOKEN}
    depends_on:
      - jaeger

Kubernetes Deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: mapbox-mcp-devkit-server
spec:
  template:
    spec:
      containers:
        - name: mcp-devkit-server
          image: mapbox-mcp-devkit-server:latest
          env:
            - name: OTEL_EXPORTER_OTLP_ENDPOINT
              value: 'http://otel-collector:4318'
            - name: OTEL_SERVICE_NAME
              value: 'mapbox-mcp-devkit-server'
            - name: OTEL_RESOURCE_ATTRIBUTES
              value: 'service.name=mapbox-mcp-devkit-server,k8s.namespace=default'

Monitoring and Alerting

Key Metrics to Monitor

  1. Tool Success Rate: Percentage of successful tool executions
  2. HTTP Error Rate: Percentage of failed HTTP requests
  3. Response Times: P95/P99 latencies for tools and HTTP requests
  4. Error Types: Most common error types and patterns

Sample Queries

For Jaeger or compatible systems:

# Find slow tool executions
operation="tool.*" AND duration:>5s

# Find HTTP errors
operation="http.*" AND error=true

# Find specific tool failures
operation="tool.create_style_tool" AND error=true

Support

For tracing-related questions or issues:

  1. Check the troubleshooting section above
  2. Verify your configuration against the examples
  3. Test with a local Jaeger instance first before using remote backends (console exporters are not supported with stdio transport)
  4. Check that your OTLP endpoint is accessible and properly configured