
DIARCHERS-1396: MCP-dedicated API layer with rate limiting, auth, and audit #610

Open
sap-yuan wants to merge 7 commits into master from diarchers-1396-mcp-api-layer

sap-yuan commented Jun 5, 2026

Summary

Implements DIARCHERS-1396: isolates all MCP/AI-agent API calls under a dedicated /api/v1/mcp/* namespace, with independent rate limiting, scoped bearer tokens, and audit logging — mirroring the dhaas-control-center pattern.

  • New DB migration (00046.sql): mcp_token and mcp_access_log tables
  • src/api/handlers/mcp/auth.py: ib_mcp_* bearer token validation; check_project_access_mcp and check_trigger_access_mcp helpers
  • src/api/handlers/mcp/rate_limit.py: Redis sliding-window rate limiter (per-user per-endpoint, fail-open on Redis outage)
  • src/api/handlers/mcp/audit.py: fire-and-forget audit logging into mcp_access_log
  • src/api/handlers/mcp/token_routes.py: token CRUD at POST/GET/PATCH/DELETE /api/v1/mcp/tokens/* (session auth)
  • src/api/handlers/mcp/routes/: /api/v1/mcp/* endpoints for projects, builds, jobs, logs, artifacts, trigger
  • infrabox/test/api/mcp_test.py: 21 unit tests covering token hash, access checks, rate limiter (allow/deny/fail-open)
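
The sliding-window limiter with fail-open behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code in rate_limit.py: the function name `allow_request`, the key format, the `RPM_LIMITS` keys, and the `InMemorySortedSets` stand-in are all assumptions. The stand-in only mimics the four redis-py sorted-set calls used here (`zremrangebyscore`, `zcard`, `zadd`, `expire`) so the sketch runs without a Redis server.

```python
import time
import uuid


class InMemorySortedSets:
    """Hypothetical stand-in exposing the four redis-py calls used below."""

    def __init__(self):
        self._sets = {}

    def zremrangebyscore(self, name, min, max):
        zset = self._sets.get(name, {})
        stale = [m for m, score in zset.items() if min <= score <= max]
        for member in stale:
            del zset[member]
        return len(stale)

    def zcard(self, name):
        return len(self._sets.get(name, {}))

    def zadd(self, name, mapping):
        self._sets.setdefault(name, {}).update(mapping)
        return len(mapping)

    def expire(self, name, time):
        return True  # TTL handling is omitted in this stand-in


# RPM values come from the PR description; the dictionary keys are assumptions.
RPM_LIMITS = {"log": 10, "artifact": 10, "trigger": 5}
DEFAULT_RPM = 30
WINDOW_SECONDS = 60


def allow_request(client, user_id, endpoint, now=None):
    """Return True if this call fits in the user's 60-second window."""
    now = time.time() if now is None else now
    limit = RPM_LIMITS.get(endpoint, DEFAULT_RPM)
    key = "mcp:rl:{}:{}".format(user_id, endpoint)  # hypothetical key layout
    try:
        # Drop entries that fell out of the window, then count what is left.
        client.zremrangebyscore(key, 0, now - WINDOW_SECONDS)
        if client.zcard(key) >= limit:
            return False
        client.zadd(key, {uuid.uuid4().hex: now})
        client.expire(key, WINDOW_SECONDS)
        return True
    except Exception:
        # Fail open: a Redis outage should not block MCP traffic.
        return True
```

The check-then-add sequence here is not atomic; a production limiter would probably wrap it in a pipeline or a Lua script, which this sketch leaves out for brevity.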

Changes

  • Zero impact on existing /api/v1/* endpoints — MCP tokens are blocked from all non-MCP paths
  • Per-user Redis sliding window with configurable RPM limits per endpoint (log/artifact: 10, trigger: 5, default: 30)
  • Project-scoped tokens with per-project expiry in JSONB; trigger requires explicit allow_trigger=true
  • Middleware order: auth → rate limit → project access → trigger access → handler → audit
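
To make the scoping rules above concrete, here is a small sketch of the three checks: path isolation, per-project expiry, and the explicit trigger flag. The JSONB shape (`{project_id: {"expires_at": ..., "allow_trigger": ...}}`) and all function names are assumptions chosen for illustration; the actual helpers in auth.py (check_project_access_mcp, check_trigger_access_mcp) may be structured differently.

```python
from datetime import datetime, timezone

MCP_PREFIX = "/api/v1/mcp/"


def path_allowed_for_mcp_token(path):
    """MCP tokens may only reach paths under the MCP namespace."""
    return path.startswith(MCP_PREFIX)


def has_project_access(scopes, project_id, now=None):
    """scopes is the decoded JSONB column (hypothetical shape, see lead-in)."""
    now = now or datetime.now(timezone.utc)
    entry = scopes.get(project_id)
    if entry is None:
        return False  # token is not scoped to this project at all
    # Each project carries its own expiry timestamp.
    return datetime.fromisoformat(entry["expires_at"]) > now


def has_trigger_access(scopes, project_id, now=None):
    """Trigger needs project access plus an explicit allow_trigger=true."""
    if not has_project_access(scopes, project_id, now):
        return False
    return scopes[project_id].get("allow_trigger") is True
```

Because a missing `allow_trigger` key is treated as a denial, a token that only lists a project and an expiry stays read-only by default.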

Testing

cd infrabox/test/api
PYTHONPATH=../../src python -m pytest mcp_test.py -v   # 21 passed
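
The token-hash tests themselves are not shown in this description. As a rough illustration of what such a check could exercise, the sketch below assumes tokens carry the ib_mcp_ prefix and are stored as a SHA-256 hex digest; the hashing scheme and the function names are assumptions, not taken from auth.py.

```python
import hashlib
import hmac
import secrets

TOKEN_PREFIX = "ib_mcp_"


def new_token():
    """Generate a random token with the MCP prefix (hypothetical helper)."""
    return TOKEN_PREFIX + secrets.token_urlsafe(32)


def hash_token(token):
    """Hash the raw token so only the digest needs to be stored."""
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


def token_matches(presented, stored_hash):
    """Reject tokens without the prefix, then compare digests in constant time."""
    if not presented.startswith(TOKEN_PREFIX):
        return False
    return hmac.compare_digest(hash_token(presented), stored_hash)
```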

JIRA

DIARCHERS-1396

Yuan Huang added 6 commits May 25, 2026 14:30
The global_token table (migration 00045) defines expires_at as NOT NULL,
but the test was inserting rows without providing this column, causing
NotNullViolation errors that cascaded into all 14 test failures.

Add expires_at = NOW() + INTERVAL '30 days' to the three direct INSERT
statements in global_tokens_test.py.

The package.json engines field requires node>=20.0.0 and npm>=10.0.0,
but the CI Dockerfile was still using node:8.9-alpine. The build.js
version check rejects the old runtime, causing the build to fail.

The cp -r of ~50,000 small files in node_modules was causing the CI job
to time out (1 hour). Using a tar pipe for sequential bulk I/O reduces the
copy time from minutes to seconds.

The real bottleneck was not cp -r in build.sh but job.py's post-build
step that compresses and uploads /infrabox/cache via snappy+tar to the
API server. With node:20's much larger node_modules (~200MB+), this
compression/upload exceeds the 1-hour job timeout.

Solution: stop writing node_modules back to /infrabox/cache. Instead,
use mv to restore cached node_modules at the start (fast), and don't
write it back (npm install with warm cache only takes ~20s anyway).
This eliminates the expensive cache upload entirely.

…hang

webpack 3.12.0 leaves open handles (internal timers/fs watchers) under
Node.js 20, causing the process to never exit naturally after a
successful build. The container hangs, docker run waits forever, and
the InfraBox job hits the 1-hour timeout.

The failure path already calls process.exit(1) explicitly; mirror that
for the success path with process.exit(0).

Root cause introduced by 34ade8f (node:8.9 -> node:20-alpine upgrade).

… and audit

- DB migration 00046: mcp_token and mcp_access_log tables
- api/handlers/mcp/auth.py: ib_mcp_* bearer token validation, project/trigger access checks
- api/handlers/mcp/rate_limit.py: Redis sliding-window per-user per-endpoint rate limiter (fail-open)
- api/handlers/mcp/audit.py: fire-and-forget audit logging to mcp_access_log
- api/handlers/mcp/token_routes.py: token CRUD at /api/v1/mcp/tokens/*
- api/handlers/mcp/routes/: /api/v1/mcp/* endpoints for projects, builds, jobs, artifacts, trigger
- infrabox/test/api/mcp_test.py: 21 unit tests covering hash, access checks, rate limiter
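
As a hedged illustration of the fire-and-forget audit logging mentioned in the list above, the sketch below hands the row to a daemon thread and swallows any failure so the request path is never delayed or broken by auditing. The `audit_async` name and the `write_row` callback (standing in for an INSERT into mcp_access_log) are hypothetical; audit.py may use a different mechanism.

```python
import threading


def audit_async(write_row, row):
    """Write one audit row on a daemon thread; never raise to the caller."""

    def _run():
        try:
            write_row(row)
        except Exception:
            # Audit failures are deliberately swallowed (fire-and-forget).
            pass

    worker = threading.Thread(target=_run, daemon=True)
    worker.start()
    # Returned only so callers or tests can join() if they wish.
    return worker
```

The trade-off is that rows can be lost if the writer fails or the process exits first, which is usually acceptable for access logs but not for anything that must be durable.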
sap-yuan self-assigned this Jun 5, 2026