This project is a Proof of Concept (POC) for a robust video processing pipeline. It allows users to authenticate, upload videos directly to AWS S3 using pre-signed URLs, and triggers asynchronous backend workers to transcode the video, generate HLS (HTTP Live Streaming) streams, extract thumbnails, transcribe audio, and generate AI summaries.
This README gives AI assistants and human developers the context they need to understand the architecture quickly and to maintain and expand the project.
- Frontend: Next.js 16 (App Router), React 19
- Styling: Tailwind CSS 4
- Backend API: Node.js, Express.js (v5)
- Database: PostgreSQL (provided locally via Neon Docker image)
- ORM: Drizzle ORM
- Task Queue: Redis + BullMQ
- Video Processing: FFmpeg (via `fluent-ffmpeg` wrapper)
- Speech-to-Text: `faster-whisper` (CTranslate2 Whisper running on CPU)
- AI Summarization: Gemini 2.5 Flash via `deepagents` SDK
- Storage: AWS S3
- Containerization: Docker & Docker Compose
- Authentication:
  - Users register/login via `/api/auth` endpoints.
  - The backend issues a JWT, stored client-side (cookies/localStorage) for subsequent requests.
- S3 Direct Upload:
  - Frontend calls `/api/s3/upload` to request a secure pre-signed URL.
  - A direct `PUT` request is made from the browser to S3, eliminating the need to proxy massive video files through the Node.js server.
- Trigger Processing:
  - Once the S3 upload finishes, the frontend signals the backend via `/api/s3/process` (passing the `fileId`).
- Background Workers (BullMQ):
  - The `/api/s3/process` endpoint enqueues jobs onto Redis.
  - Separate worker processes pick up these jobs to execute heavy commands asynchronously.
    - Transcode Worker: Processes/compresses the original video.
    - HLS Worker: Converts the video into partitioned HLS streams for adaptive bitrate streaming.
    - Thumbnail Worker: Extracts static thumbnails.
    - Transcribe Worker: (AI Worker) Downloads video, extracts audio, performs speech-to-text via `faster-whisper`, uploads the transcript file to S3, and enqueues the transcript's plain text onto the `summary` queue.
    - Summary Worker: (AI Worker) Takes the plain text transcript, invokes the Gemini API via `deepagents` (`gemini-2.5-flash`), saves the generated summary directly in the database (`summary` column), and updates the summary status.
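
The client side of the upload steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual client code: the `POST` method, the request and response shapes (`fileName`, `contentType`, `uploadUrl`), the `Authorization: Bearer` header, and every type and function name here are assumptions. Only the two endpoint paths, the direct `PUT` to S3, and the `fileId` come from the description above. The HTTP client is injected so the flow can be exercised without a browser.

```typescript
// Minimal HTTP abstraction (hypothetical) so the flow does not depend on a browser.
interface HttpRequest {
  method: string;
  headers?: Record<string, string>;
  body?: unknown;
}
interface HttpResponse {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}
type HttpClient = (url: string, init: HttpRequest) => Promise<HttpResponse>;

// ASSUMPTION: hypothetical response shape of /api/s3/upload.
interface PresignResponse {
  uploadUrl: string;
  fileId: string;
}

// ASSUMPTION: simplified stand-in for a browser File object.
interface VideoFile {
  name: string;
  type: string;
  data: unknown;
}

async function uploadAndProcess(
  http: HttpClient,
  apiBase: string,
  token: string,
  file: VideoFile
): Promise<string> {
  // ASSUMPTION: the JWT is sent as a Bearer token.
  const apiHeaders = {
    Authorization: `Bearer ${token}`,
    "Content-Type": "application/json",
  };

  // Step 1: ask the backend for a pre-signed URL.
  const presignRes = await http(`${apiBase}/api/s3/upload`, {
    method: "POST",
    headers: apiHeaders,
    body: JSON.stringify({ fileName: file.name, contentType: file.type }),
  });
  if (!presignRes.ok) {
    throw new Error(`Pre-sign request failed: ${presignRes.status}`);
  }
  const { uploadUrl, fileId } = (await presignRes.json()) as PresignResponse;

  // Step 2: send the bytes straight to S3. The pre-signed URL carries the
  // authorization, so the JWT is not attached to this request.
  const putRes = await http(uploadUrl, {
    method: "PUT",
    headers: { "Content-Type": file.type },
    body: file.data,
  });
  if (!putRes.ok) {
    throw new Error(`S3 upload failed: ${putRes.status}`);
  }

  // Step 3: tell the backend the upload finished so it can enqueue jobs.
  const processRes = await http(`${apiBase}/api/s3/process`, {
    method: "POST",
    headers: apiHeaders,
    body: JSON.stringify({ fileId }),
  });
  if (!processRes.ok) {
    throw new Error(`Process request failed: ${processRes.status}`);
  }
  return fileId;
}
```

In the browser, a thin wrapper around `fetch` could be passed as `http`. Keeping the three steps in one function makes it easy to see that the video bytes only ever travel to S3, never through the Express server.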
videoThumbnailProcessingPOC/
├── web/ # Next.js 16 frontend app
│ ├── src/
│ │ ├── app/ # Next.js App Router components and layouts
│ │ │ └── components/ # Reusable UI components
│ │ └── lib/ # Central API client and utility library
│ ├── package.json # Frontend npm package manifest
│ └── tsconfig.json # Frontend TypeScript settings
├── server/ # Express 5 backend server & workers
│ ├── src/
│ │ ├── app.ts # Express app configuration & middleware
│ │ ├── server.ts # REST API server entry point
│ │ ├── worker.ts # BullMQ transcoder, HLS, and thumbnail worker entry point
│ │ ├── ai-worker.ts # BullMQ faster-whisper transcription & Gemini summary worker entry point
│ │ ├── routes/ # Express route controllers
│ │ ├── controllers/ # API action handlers
│ │ ├── middleware/ # Authentication and validation middlewares
│ │ ├── models/ # Drizzle schema definitions
│ │ ├── services/ # S3, Transcode, transcription, and summarization services
│ │ └── workers/ # BullMQ queue & worker specifications
│ ├── drizzle/ # Migration SQL files generated by Drizzle
│ ├── Dockerfile # Multi-stage Docker container build instructions
│ ├── package.json # Server dependencies & launch scripts
│ └── .env.example # Template for server environment configurations
└── docker-compose.yml # Orchestrates Redis, local DB connection proxy, API, Worker, and AI Worker
- Environment Variables: Navigate to the `server` directory, copy the example environment file, and fill in your AWS credentials and PostgreSQL URI. Run `cd server`, then `cp .env.example .env`, and edit `.env` with your `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_REGION`, `S3_BUCKET_NAME`, etc.
- Docker Compose: From the root of the project, spin up the entire infrastructure with `docker-compose up --build`. This command starts:
  - `redis`: Used by BullMQ.
  - `neon-local`: Local Neon PostgreSQL instance.
  - `api`: Node server running on port 8080.
  - `worker`: Node process dedicated strictly to processing the video/HLS/thumbnail transcoding queues.
  - `ai-worker`: Node process dedicated to running the audio transcription and transcript summarization workers (requires a `GEMINI_API_KEY`).
- Database Push (if running the server natively rather than in Docker): Run `cd server`, then `npm install`, then `npm run db:push`.
- Accessing the UI: Run the Next.js frontend development server with `cd web`, then `npm install`, then `npm run dev`. Now open http://localhost:3000 in your web browser to view the application interface.
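
Since the setup steps depend on several environment variables being filled in, a small startup check can catch a missing value early. The sketch below is an illustration under assumptions, not code from this repository: the function names are hypothetical, and the list only covers the variables named in this README (the real `.env.example` may define more, such as the PostgreSQL URI, whose variable name is not stated here).

```typescript
// Variables named in this README; the real .env.example may list more.
const REQUIRED_ENV_VARS = [
  "AWS_ACCESS_KEY_ID",
  "AWS_SECRET_ACCESS_KEY",
  "AWS_REGION",
  "S3_BUCKET_NAME",
] as const;

// Shape compatible with Node's process.env.
type EnvSource = Record<string, string | undefined>;

// Returns the names of required variables that are unset or blank.
function findMissingEnvVars(
  env: EnvSource,
  required: readonly string[] = REQUIRED_ENV_VARS
): string[] {
  return required.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// Throws one error listing every missing variable, so a misconfigured
// container fails at startup instead of on the first S3 call.
function assertEnv(
  env: EnvSource,
  required: readonly string[] = REQUIRED_ENV_VARS
): void {
  const missing = findMissingEnvVars(env, required);
  if (missing.length > 0) {
    throw new Error(
      `Missing required environment variables: ${missing.join(", ")}`
    );
  }
}
```

A server entry point could call `assertEnv(process.env)`, and the AI worker could call `assertEnv(process.env, [...REQUIRED_ENV_VARS, "GEMINI_API_KEY"])`, since only that container needs the Gemini key.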
If you are an AI attempting to update or debug this repository, keep the following principles in mind:
- Adding New Features to Upload Flow:
  - The upload bypasses the server body parsers for efficiency. Any metadata should be saved in the database before generating the pre-signed URL or during the `/api/s3/process` callback.
- Expanding the Worker System:
  - To add a new background task (e.g., video transcription or watermark injection), create a new queue and worker pair in `server/src/workers/`.
  - Be sure to instantiate and attach the new worker in `server/src/worker.ts` so the `worker` Docker container actually listens to the queue.
  - Workers use `fluent-ffmpeg`. Ensure the worker environment always has FFmpeg installed (handled automatically in the current `Dockerfile`).
- Database Tweaks:
  - Modify `server/src/models/*.ts`.
  - Apply changes with `npm run db:generate` followed by `npm run db:migrate`, or with `npm run db:push`.
- Frontend Development:
  - The frontend is built on Next.js 16 (App Router) and React 19. All user components reside in `web/src/app/components/` and route pages are in `web/src/app/`. Keep style definitions standardized under Tailwind CSS 4 utility classes.
- Monitoring & Scale Constraints:
  - `docker-compose.yml` limits the worker container to 2 GB of memory. High-resolution transcoding may require increasing this allocation.
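
As an illustration of the "Expanding the Worker System" guideline, the processor for a hypothetical watermark task might look something like the sketch below. Everything here is an assumption made for the example: the payload fields, the function names, and the overlay position. It also builds raw FFmpeg command-line arguments and takes the FFmpeg runner as a parameter, rather than using `fluent-ffmpeg` as the existing workers do, so the logic stays independent of any library.

```typescript
// ASSUMPTION: hypothetical payload for a "watermark" queue.
interface WatermarkJobData {
  fileId: string;
  inputPath: string;
  watermarkPath: string;
  outputPath: string;
}

// Structural stand-in for a BullMQ job; only `data` is used here.
interface JobLike<T> {
  data: T;
}

// Function that executes FFmpeg with the given arguments (injected).
type FfmpegRunner = (args: string[]) => Promise<void>;

// Builds FFmpeg CLI arguments that overlay the watermark 10 px from the
// bottom-right corner and copy the audio stream unchanged.
function buildWatermarkArgs(data: WatermarkJobData): string[] {
  return [
    "-y",
    "-i", data.inputPath,
    "-i", data.watermarkPath,
    "-filter_complex", "overlay=W-w-10:H-h-10",
    "-c:a", "copy",
    data.outputPath,
  ];
}

// Job processor: runs FFmpeg, then returns a small result object.
async function processWatermarkJob(
  job: JobLike<WatermarkJobData>,
  runFfmpeg: FfmpegRunner
): Promise<{ fileId: string; outputPath: string }> {
  await runFfmpeg(buildWatermarkArgs(job.data));
  return { fileId: job.data.fileId, outputPath: job.data.outputPath };
}

// Wiring (not executed here): in server/src/worker.ts the processor would be
// attached with BullMQ along the lines of
//   new Worker("watermark", (job) => processWatermarkJob(job, runFfmpeg), { connection });
// where "watermark", runFfmpeg, and connection are placeholders.
```

Separating argument construction from execution keeps the FFmpeg command easy to inspect without Redis or FFmpeg installed. Downloading the source from S3 and uploading the result are omitted from this sketch.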