This guide covers production operations for Site Audit: scheduled audits, property alerts, access control, database migrations, and test execution.
Related documentation: README.md · Documentation index
Site Audit exposes HTTP endpoints suitable for cron and monitoring systems. By default, these routes accept requests from localhost only. When exposing the application beyond a single host, place the endpoints behind your own authentication and network controls.
| Capability | Endpoint | Typical schedule |
|---|---|---|
| Scheduled audits | POST /api/schedule/check | Weekly or daily |
| Property alerts | POST /api/alerts/check?propertyId={id} | Daily |
Configure per-property schedules and webhooks under Integrations → Scheduled audits & alerts.
POST /api/schedule/check
The endpoint invokes schedule_runner.py, which:
- Evaluates each property's schedule_cron expression (UTC, five-field cron syntax) against the current minute.
- Spawns a full audit (python -m src) with WP_PROPERTY_ID and WP_SCHEDULED_SPAWN=1.
- Reads pipeline_config for shared integration keys (Google, and similar) only. Crawl settings are derived from the property's site_url and default_crawl_preset (starter, spa, ecommerce, or performance).
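The first step above, matching a five-field cron expression against the current UTC minute, can be sketched as follows. This is an illustration only, not the code in schedule_runner.py: the function names `cron_matches` and `_field_matches` are hypothetical, and the sketch assumes classic cron semantics (Sunday is 0 or 7, and when both day fields are restricted either one may match).

```python
from datetime import datetime, timezone


def _field_matches(field: str, value: int, lo: int, hi: int) -> bool:
    """Return True if one cron field (e.g. '*/15', '1-5', '0,30') matches value."""
    for part in field.split(","):
        rng, _, step_text = part.partition("/")
        step = int(step_text) if step_text else 1
        if rng == "*":
            start, end = lo, hi
        elif "-" in rng:
            a, b = rng.split("-", 1)
            start, end = int(a), int(b)
        else:
            start = int(rng)
            # 'N/step' runs from N to the field maximum; a bare 'N' matches only N.
            end = hi if step_text else start
        if start <= value <= end and (value - start) % step == 0:
            return True
    return False


def cron_matches(expr: str, now: datetime) -> bool:
    """Check a five-field cron expression against `now`, converted to UTC.

    Pass a timezone-aware datetime; astimezone() treats naive values as local time.
    """
    minute, hour, dom, month, dow = expr.split()
    now = now.astimezone(timezone.utc)
    cron_dow = (now.weekday() + 1) % 7  # Python: Monday=0; cron: Sunday=0
    dom_ok = _field_matches(dom, now.day, 1, 31)
    # Accept 7 as an alias for Sunday.
    dow_ok = _field_matches(dow, cron_dow, 0, 7) or (
        cron_dow == 0 and _field_matches(dow, 7, 0, 7)
    )
    # Classic cron rule: if both day fields are restricted, either may match.
    if dom != "*" and dow != "*":
        day_ok = dom_ok or dow_ok
    else:
        day_ok = dom_ok and dow_ok
    return (
        _field_matches(minute, now.minute, 0, 59)
        and _field_matches(hour, now.hour, 0, 23)
        and _field_matches(month, now.month, 1, 12)
        and day_ok
    )
```

Because the comparison is per minute, an external scheduler that calls the endpoint less often than a property's expression fires would presumably miss some runs under this model; aligning the crontab entry with the property schedule avoids that.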
Important: Scheduled runs never write to or overwrite pipeline_config. Manual Run audit actions from the web UI also use saved pipeline_config without modification.
Run scheduled audits every Monday at 06:00 UTC:
# crontab -e
0 6 * * 1 curl -fsS -X POST http://127.0.0.1:3000/api/schedule/check
The response includes:
- output: runner log
- gscLinksStale: properties that require a Google Search Console Links CSV re-import
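For monitoring scripts that need more than curl's exit status, the response fields can be read programmatically. The sketch below uses only the Python standard library. The helper names are hypothetical, and it assumes `output` is text and `gscLinksStale` is a list; the element type of that list is not documented here.

```python
import json
import urllib.request

SCHEDULE_URL = "http://127.0.0.1:3000/api/schedule/check"


def trigger_schedule_check(url: str = SCHEDULE_URL, timeout: float = 300.0) -> dict:
    """POST to the schedule endpoint and return the decoded JSON body."""
    request = urllib.request.Request(url, data=b"", method="POST")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def summarize_schedule_response(payload: dict) -> str:
    """Build a one-line summary from the documented response fields."""
    output = payload.get("output") or ""
    # Assumption: gscLinksStale is a list (possibly empty or absent).
    stale = payload.get("gscLinksStale") or []
    line_count = len(str(output).splitlines())
    return f"runner log: {line_count} line(s); stale GSC Links imports: {len(stale)}"
```

A cron wrapper could call `summarize_schedule_response(trigger_schedule_check())` and forward the summary to a log or chat channel.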
POST /api/alerts/check?propertyId={id}
Evaluates health-score changes and stale GSC Links imports for the specified property. When alert_webhook_url is configured on the property, sends a POST notification to that URL. When alert_email is set and SMTP is configured on the server, sends a plain-text email summary.
Response JSON includes alerts, webhook_sent, and email_sent.
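A caller can use these three fields to tell whether a notification actually went out. The following is a minimal sketch with hypothetical helper names. It assumes `alerts` is a list and that `webhook_sent` and `email_sent` are booleans.

```python
from urllib.parse import urlencode


def alerts_check_url(property_id: int, base_url: str = "http://127.0.0.1:3000") -> str:
    """Build the alerts endpoint URL for one property."""
    query = urlencode({"propertyId": property_id})
    return f"{base_url}/api/alerts/check?{query}"


def describe_delivery(payload: dict) -> list[str]:
    """Turn the response flags into short, human-readable notes."""
    alerts = payload.get("alerts") or []  # assumption: a list of alert entries
    if not alerts:
        return ["no alerts"]
    notes = [f"{len(alerts)} alert(s)"]
    for channel, key in (("webhook", "webhook_sent"), ("email", "email_sent")):
        state = "sent" if payload.get(key) else "not sent"
        notes.append(f"{channel}: {state}")
    return notes
```

An alert with both channels reported as not sent is worth surfacing in monitoring, since it likely means neither alert_webhook_url nor working SMTP settings are in place for that property.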
Set on the host running the web app (Docker: web service environment):
| Variable | Required | Default | Purpose |
|---|---|---|---|
| SMTP_HOST | Yes (with SMTP_FROM) | none | SMTP server hostname |
| SMTP_FROM | Yes (with SMTP_HOST) | none | From address |
| SMTP_PORT | No | 587 | SMTP port |
| SMTP_USER | No | none | Login user (if auth required) |
| SMTP_PASS | No | none | Login password |
| SMTP_USE_TLS | No | true | Use STARTTLS |
If SMTP is not configured, alert checks still succeed; email_sent is false.
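To illustrate how these variables fit together, the sketch below loads them with the documented defaults and sends a plain-text message through Python's `smtplib`. It is not the application's own mailer: `SmtpConfig`, `smtp_config_from_env`, and `send_summary` are hypothetical names, and the set of values accepted as true for SMTP_USE_TLS is an assumption.

```python
import smtplib
from dataclasses import dataclass
from email.message import EmailMessage
from typing import Mapping, Optional


@dataclass
class SmtpConfig:
    host: str
    sender: str
    port: int = 587
    user: Optional[str] = None
    password: Optional[str] = None
    use_tls: bool = True


def smtp_config_from_env(env: Mapping[str, str]) -> Optional[SmtpConfig]:
    """Return a config, or None when SMTP_HOST or SMTP_FROM is missing."""
    host, sender = env.get("SMTP_HOST"), env.get("SMTP_FROM")
    if not host or not sender:
        return None  # SMTP is treated as not configured
    return SmtpConfig(
        host=host,
        sender=sender,
        port=int(env.get("SMTP_PORT", "587")),
        user=env.get("SMTP_USER") or None,
        password=env.get("SMTP_PASS") or None,
        # Assumption: "1", "true", and "yes" (any case) enable STARTTLS.
        use_tls=env.get("SMTP_USE_TLS", "true").strip().lower() in {"1", "true", "yes"},
    )


def send_summary(config: SmtpConfig, recipient: str, subject: str, body: str) -> None:
    """Send one plain-text email using the given configuration."""
    message = EmailMessage()
    message["From"] = config.sender
    message["To"] = recipient
    message["Subject"] = subject
    message.set_content(body)
    with smtplib.SMTP(config.host, config.port, timeout=30) as client:
        if config.use_tls:
            client.starttls()
        if config.user:
            client.login(config.user, config.password or "")
        client.send_message(message)
```

Calling `smtp_config_from_env(os.environ)` on the web host (after `import os`) is a quick way to confirm that both required variables are visible to the process before relying on email alerts.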
Check alerts daily at 07:00 UTC for property ID 1:
# crontab -e
0 7 * * * curl -fsS -X POST "http://127.0.0.1:3000/api/alerts/check?propertyId=1"
When AUTH_SECRET (or SESSION_SECRET) is set, the application requires login. Roles (web/src/server/auth.ts):
| Role | Mutations | AI Chat |
|---|---|---|
| analyst (default) | Allowed | Allowed |
| editor | Allowed | Allowed |
| admin | Allowed | Allowed |
| client-readonly | Blocked (403) | Allowed |
| viewer | Blocked (403) | Blocked (403) |
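The role matrix above can be expressed as a small lookup, which is handy when writing tests or scripts against the API. The real checks live in web/src/server/auth.ts (TypeScript); the Python sketch below only mirrors the table, uses hypothetical names, and assumes that unknown roles are denied.

```python
# Mirrors the documented role table; not the application's own implementation.
ROLE_PERMISSIONS = {
    "analyst": {"mutations": True, "chat": True},
    "editor": {"mutations": True, "chat": True},
    "admin": {"mutations": True, "chat": True},
    "client-readonly": {"mutations": False, "chat": True},
    "viewer": {"mutations": False, "chat": False},
}


def is_allowed(role: str, capability: str) -> bool:
    """Return True if the role may use the capability ('mutations' or 'chat')."""
    # Assumption: roles missing from the table are denied everything.
    return ROLE_PERMISSIONS.get(role, {}).get(capability, False)


def expected_status(role: str, capability: str) -> int:
    """HTTP status a client should expect: 200 if allowed, 403 if blocked."""
    return 200 if is_allowed(role, capability) else 403
```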
Set the default role for new sessions:
AUTH_DEFAULT_ROLE=client-readonly
Production also requires AUTH_SECRET; AUTH_USER and AUTH_PASSWORD are optional (see docker-compose.prod.yml).
The mcp service in docker-compose.prod.yml exposes read-only audit tools over HTTP at /mcp. Configure on Secrets → Remote MCP (/secrets) or via environment variables (env overrides saved values):
| Variable | Purpose |
|---|---|
| WP_MCP_TOKEN | Bearer token for MCP clients (Authorization: Bearer …) |
| WP_MCP_ALLOWED_HOSTS | Public hostname allowlist (e.g. audit.example.com) |
| WP_MCP_ALLOWED_ORIGINS | Optional Origin allowlist |
| WP_MCP_DOMAIN | Tool bundle (core recommended for remote) |
| MCP_PORT | Host port mapped to container 8000 (default 8000) |
Terminate TLS at your reverse proxy; do not expose plain HTTP publicly. Configure token and allowed hostnames on Secrets → Remote MCP (/secrets, Remote MCP section).
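On the client side, the bearer token and the TLS advice combine naturally: a script should attach the token only to HTTPS requests. The sketch below builds (but does not send) such a request with the standard library. The helper name is hypothetical, the localhost exception is an assumption for local testing, and the MCP message body itself is out of scope here.

```python
import urllib.request
from urllib.parse import urlsplit


def build_mcp_request(base_url: str, token: str) -> urllib.request.Request:
    """Prepare a request to the /mcp endpoint with a bearer token attached."""
    parts = urlsplit(base_url)
    # Assumption: plain HTTP is tolerated only for local testing.
    if parts.scheme != "https" and parts.hostname not in {"localhost", "127.0.0.1"}:
        raise ValueError("refusing to send a bearer token over plain HTTP")
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/mcp",
        headers={"Authorization": f"Bearer {token}"},
    )
```

The token passed in would be the value configured as WP_MCP_TOKEN, and the hostname should be one listed in WP_MCP_ALLOWED_HOSTS.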
Set AUTH_DEFAULT_ROLE=client-readonly so session logins cannot run audits or save settings. The API returns 403 on mutations; the UI hides Run audit and disables save controls. Use viewer instead if chat access should also be blocked.
Apply schema changes after pulling updates. Current Alembic head: 015_crawl_page_html (per-URL HTML storage). Recent migrations: 013 (link edges, discovery mode), 014 (pipeline job log truncation).
./local-run migrate
If PostgreSQL is already running:
alembic upgrade head
Migrations run automatically at container start. Use one of the following so Postgres and the application share a network:
docker compose up # build from source
docker compose -f docker-compose.pull.yml up # pre-built WEB_IMAGE
Do not run the application container in isolation with docker run unless you provide a reachable DATABASE_URL.
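Whether a DATABASE_URL is reachable can be checked before starting a standalone container. The sketch below parses the URL and attempts a plain TCP connection. The helper names are hypothetical, the default port 5432 is the usual PostgreSQL port, and a successful connection only shows that something is listening, not that the credentials or database name are valid.

```python
import socket
from urllib.parse import urlsplit


def database_endpoint(database_url: str) -> tuple[str, int]:
    """Extract (host, port) from a PostgreSQL-style URL."""
    parts = urlsplit(database_url)
    if not parts.hostname:
        raise ValueError("DATABASE_URL has no host")
    return parts.hostname, parts.port or 5432  # 5432: standard PostgreSQL port


def is_reachable(database_url: str, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the database host succeeds."""
    host, port = database_endpoint(database_url)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Note that a hostname such as a Compose service name resolves only inside the Compose network, so this check is meaningful only when run from the same network context as the application container.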
For CI parity, run from the repository root:
./local-test # Python + web (matches CI python and web jobs)
./local-test python # Backend gates + browser pytest + CLI smoke
CI also runs a Docker job (image build, browser pytest in container, compose smoke). See .github/workflows/ci.yml.
Python (core coverage gate — 100%):
export DATABASE_URL=postgres://profiling:profiling@localhost:5432/website_profiling
alembic upgrade head
pytest tests/ -m "not browser"
Integration tests marked @pytest.mark.integration skip when DATABASE_URL is unset.
Browser crawl end-to-end:
pytest tests/test_crawler_browser_e2e.py -m browser
Reporting and tools coverage gates:
./local-test python
Test file lists for reporting and tools gates are maintained in scripts/local-test.sh and .github/workflows/ci.yml. Update all three locations when adding coverage tests.
Web (Vitest):
cd web && npm test