feat: add log aggregation and request correlation for Playwright tests#298
Merged
Conversation
e702c16 to
31aba4e
Compare
There was a problem hiding this comment.
Pull request overview
This PR introduces a comprehensive logging and correlation system for Playwright tests, enabling backend logs to be aggregated and correlated with test reports through unique identifiers.
Changes:
- Implemented a new log-aggregator service for centralized log collection and querying
- Added requestID and testID fields throughout the system for log correlation
- Enhanced all services (control, file, worker) to capture and forward logs with correlation IDs
- Updated e2e tests to automatically attach aggregated logs to test reports
Reviewed changes
Copilot reviewed 22 out of 22 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| worker-python/Dockerfile | Standardized Dockerfile AS keyword capitalization |
| worker-javascript/Dockerfile | Standardized Dockerfile AS keyword capitalization |
| worker-java/Dockerfile | Standardized Dockerfile AS keyword capitalization |
| worker-csharp/Dockerfile | Standardized Dockerfile AS keyword capitalization |
| log-aggregator/main.go | New in-memory log aggregation service with HTTP API |
| log-aggregator/Dockerfile | Dockerfile for building the log-aggregator service |
| k8/log-aggregator-service.yaml | Kubernetes service configuration for log-aggregator |
| k8/log-aggregator-deployment.yaml.tpl | Kubernetes deployment template for log-aggregator |
| k8/file-deployment.yaml.tpl | Added LOG_AGGREGATOR environment variables |
| k8/control-deployment.yaml.tpl | Added LOG_AGGREGATOR environment variables |
| internal/workertypes/types.go | Added RequestID and TestID fields to worker request/response payloads |
| internal/worker/worker.go | Integrated structured logging with log aggregation |
| internal/logagg/logagg.go | Core log aggregation client library |
| internal/logagg/hook.go | Logrus hook for automatic log forwarding |
| file-service/main.go | Added request-scoped logging with correlation IDs |
| file-service/Dockerfile | Updated to copy full internal directory |
| e2e/tests/visual.spec.ts | Added log attachment functionality |
| e2e/tests/api.spec.ts | Added log attachment with correlation IDs |
| control-service/workers.go | Updated worker to pass correlation IDs |
| control-service/main.go | Enhanced with request-scoped logging and correlation |
| control-service/Dockerfile | Standardized Dockerfile AS keyword capitalization |
| .github/workflows/nodejs.yml | Configured log-aggregator in CI pipeline |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
cfe2d94 to
d63519e
Compare
Critical fixes: - Replace deprecated io/ioutil with io.ReadAll in control-service - Fix race condition in worker logger by using instance logger instead of modifying global logger - Add error handling for http.ResponseWriter.Write() in log-aggregator Medium priority improvements: - Create custom HTTP client with proper timeouts in logagg package to ensure context cancellation is respected - Configure connection pooling and timeout settings for better reliability This ensures: 1. Go 1.16+ compatibility (no deprecated APIs) 2. Thread-safe logging in worker processes 3. Proper error reporting when writing log responses fails 4. HTTP requests properly respect context deadlines Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
227b1af to
b79fd06
Compare
defer logagg.DeferPost(...) deferred the call to DeferPost itself and discarded the returned closure, so the end-of-request buffer flush never ran. Logs already reach the aggregator via the per-entry logrus hook, so the deferred buffer flush was redundant (and would double-send if fixed with a trailing call). Remove DeferPost and the now write-only logBuffer in file-service and worker; control-service keeps its buffer for error response bodies.
alpine:3.23 now ships squid-7.5-r0; the exact pin squid=7.3-r0 no longer resolves (apk: breaks world[squid=7.3-r0]), failing the build (squid) CI job and blocking the e2e test job. Verified the image builds with 7.5-r0.
The video-recording example crawled the live news.ycombinator.com, clicking specific links in a loop. When HN was slow or rate-limited the CI runner, a .click() hit the 30s timeout, the snippet threw, and no video was produced - failing 'Examples > 4: should be able to record a video' (visual.spec.ts:86, ~50% flip rate on main). The failing link varied run-to-run, confirming an external-dependency flake rather than a selector bug. Navigate the stable playwright.dev docs instead (unique exact-match links to avoid strict-mode violations). Verified locally: snippet runs 3/3 producing ~100KB .webm files, frontend builds, and the new snippet is bundled via the ?raw import. Also fix the stale example description.
handleRun set the X-Request-ID/X-Test-ID response headers twice: once unconditionally near the top and again right before the success response. Keep the early set (it covers all response paths) and remove the redundant pre-success duplicate.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Technical Details
log-aggregator- lightweight Go HTTP service with 60min TTL in-memory storageinternal/logagg- shared library for posting logs to aggregator (best-effort, 2s timeout)/logs/{testId}endpoint and attach to test reportsChanges by Service
Infrastructure
LOG_AGGREGATOR_URL,LOG_AGGREGATOR_ENABLEDAS builder)🤖 Generated with Claude Code