Video Thumbnail Processing POC

Overview

This project is a Proof of Concept (POC) for a video processing pipeline. Users authenticate and upload videos directly to AWS S3 using pre-signed URLs. Once an upload completes, asynchronous backend workers transcode the video, generate HLS (HTTP Live Streaming) streams, extract thumbnails, transcribe audio, and generate AI summaries.

This README gives AI assistants and human developers the context they need to quickly understand the architecture and to maintain and expand the project.

Technology Stack

Frontend: Next.js 16 (App Router), React 19
Styling: Tailwind CSS 4
Backend API: Node.js, Express.js (v5)
Database: PostgreSQL (provided locally via Neon Docker image)
ORM: Drizzle ORM
Task Queue: Redis + BullMQ
Video Processing: FFmpeg (via fluent-ffmpeg wrapper)
Speech-to-Text: faster-whisper (CTranslate2 Whisper running on CPU)
AI Summarization: Gemini 2.5 Flash via deepagents SDK
Storage: AWS S3
Containerization: Docker & Docker Compose

Architecture & Data Flow

Authentication:
- Users register/login via /api/auth endpoints.
- The backend issues a JWT, stored client-side (cookies/localStorage) for subsequent requests.
S3 Direct Upload:
- Frontend calls /api/s3/upload to request a secure pre-signed URL.
- A direct PUT request is made from the browser to S3, eliminating the need to proxy massive video files through the Node.js server.
Trigger Processing:
- Once the S3 upload finishes, the frontend signals the backend via /api/s3/process (passing the fileId).
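The three client-side steps above (request a pre-signed URL, PUT to S3, signal the backend) can be sketched as a single function. This is a minimal sketch, not code from the repository: the request and response shapes (`fileName`, `contentType`, `uploadUrl`, `fileId`), the use of POST, and sending the JWT as a Bearer header are all assumptions, and `HttpClient` is a hypothetical minimal interface that a thin wrapper around `fetch` could satisfy.

```typescript
// Minimal HTTP surface so the sketch does not depend on DOM typings.
type HttpInit = { method: string; headers?: Record<string, string>; body?: unknown };
type HttpResponse = { ok: boolean; status: number; json(): Promise<unknown> };
type HttpClient = (url: string, init: HttpInit) => Promise<HttpResponse>;

// Hypothetical response shape for /api/s3/upload (not confirmed by the project).
interface UploadTicket {
  uploadUrl: string;
  fileId: string;
}

async function uploadAndProcess(
  http: HttpClient,
  apiBase: string,
  jwt: string,
  file: { name: string; type: string; data: unknown },
): Promise<string> {
  // Assumption: the API accepts the JWT as a Bearer token.
  const auth = { Authorization: `Bearer ${jwt}`, "Content-Type": "application/json" };

  // 1. Ask the backend for a pre-signed URL.
  const ticketRes = await http(`${apiBase}/api/s3/upload`, {
    method: "POST",
    headers: auth,
    body: JSON.stringify({ fileName: file.name, contentType: file.type }),
  });
  if (!ticketRes.ok) throw new Error(`upload URL request failed: ${ticketRes.status}`);
  const ticket = (await ticketRes.json()) as UploadTicket;

  // 2. PUT the bytes straight to S3; the JWT is not sent to S3.
  const putRes = await http(ticket.uploadUrl, {
    method: "PUT",
    headers: { "Content-Type": file.type },
    body: file.data,
  });
  if (!putRes.ok) throw new Error(`S3 upload failed: ${putRes.status}`);

  // 3. Tell the backend to enqueue processing for this file.
  const processRes = await http(`${apiBase}/api/s3/process`, {
    method: "POST",
    headers: auth,
    body: JSON.stringify({ fileId: ticket.fileId }),
  });
  if (!processRes.ok) throw new Error(`process request failed: ${processRes.status}`);
  return ticket.fileId;
}
```

Injecting the HTTP client keeps the flow easy to exercise with a fake in tests. Stopping at the first failed step means /api/s3/process is never called for a file that did not reach S3.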
Background Workers (BullMQ):
- The /api/s3/process endpoint enqueues jobs onto Redis.
- Separate worker processes pick up these jobs and run the heavy commands asynchronously:
  - Transcode Worker: Processes and compresses the original video.
  - HLS Worker: Converts the video into segmented HLS streams for adaptive bitrate streaming.
  - Thumbnail Worker: Extracts static thumbnails.
  - Transcribe Worker (AI Worker): Downloads the video, extracts the audio, performs speech-to-text via faster-whisper, uploads the transcript file to S3, and enqueues the transcript's plain text onto the summary queue.
  - Summary Worker (AI Worker): Takes the plain-text transcript, invokes the Gemini API via deepagents (gemini-2.5-flash), saves the generated summary directly in the database (summary column), and updates the summary status.
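The queue hand-offs described above can be sketched as follows. This is an illustration under stated assumptions rather than the repository's code: `QueueLike` is a hypothetical structural subset of BullMQ's `Queue` (only `add(name, data)`), the job names and payload shapes are invented, and enqueuing all four jobs at once from /api/s3/process is a guess at the fan-out, which the real endpoint may sequence differently.

```typescript
// Structural subset of BullMQ's Queue: add(name, data) returns a promise.
interface QueueLike<T> {
  add(name: string, data: T): Promise<unknown>;
}

// Hypothetical payloads; the real job schemas are not documented here.
interface VideoJob {
  fileId: string;
}
interface SummaryJob {
  fileId: string;
  transcript: string;
}

interface MediaQueues {
  transcode: QueueLike<VideoJob>;
  hls: QueueLike<VideoJob>;
  thumbnail: QueueLike<VideoJob>;
  transcribe: QueueLike<VideoJob>;
}

// What /api/s3/process might do: enqueue one job per queue for the uploaded file.
async function enqueueProcessing(queues: MediaQueues, fileId: string): Promise<void> {
  await Promise.all([
    queues.transcode.add("transcode", { fileId }),
    queues.hls.add("hls", { fileId }),
    queues.thumbnail.add("thumbnail", { fileId }),
    queues.transcribe.add("transcribe", { fileId }),
  ]);
}

// Transcribe hand-off: run speech-to-text, then enqueue the plain text for summarization.
// The transcribe function is injected; in the project it would wrap faster-whisper.
async function handleTranscribeJob(
  job: { data: VideoJob },
  transcribe: (fileId: string) => Promise<string>,
  summaryQueue: QueueLike<SummaryJob>,
): Promise<void> {
  const transcript = await transcribe(job.data.fileId);
  await summaryQueue.add("summarize", { fileId: job.data.fileId, transcript });
}
```

Because the transcript is only enqueued after transcription resolves, a failed speech-to-text step never produces a summary job.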

Directory Structure

videoThumbnailProcessingPOC/
├── web/                  # Next.js 16 frontend app
│   ├── src/
│   │   ├── app/          # Next.js App Router components and layouts
│   │   │   └── components/ # Reusable UI components
│   │   └── lib/          # Central API client and utility library
│   ├── package.json      # Frontend npm package manifest
│   └── tsconfig.json     # Frontend TypeScript settings
├── server/               # Express 5 backend server & workers
│   ├── src/
│   │   ├── app.ts        # Express app configuration & middleware
│   │   ├── server.ts     # REST API server entry point
│   │   ├── worker.ts     # BullMQ transcoder, HLS, and thumbnail worker entry point
│   │   ├── ai-worker.ts  # BullMQ faster-whisper transcription & Gemini summary worker entry point
│   │   ├── routes/       # Express route definitions
│   │   ├── controllers/  # API action handlers
│   │   ├── middleware/   # Authentication and validation middlewares
│   │   ├── models/       # Drizzle schema definitions
│   │   ├── services/     # S3, Transcode, transcription, and summarization services
│   │   └── workers/      # BullMQ queue & worker specifications
│   ├── drizzle/          # Migration SQL files generated by Drizzle
│   ├── Dockerfile        # Multi-stage Docker container build instructions
│   ├── package.json      # Server dependencies & launch scripts
│   └── .env.example      # Template for server environment configurations
└── docker-compose.yml    # Orchestrates Redis, local DB connection proxy, API, Worker, and AI Worker

Local Development Setup

Environment Variables: Navigate to the server directory, copy the example environment file, and fill in your AWS credentials and PostgreSQL URI:
```
cd server
cp .env.example .env
# Edit .env with your AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_REGION, S3_BUCKET_NAME etc.
```
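A small startup check can surface a missing variable before the API or a worker starts. The sketch below is an assumption about how such a check might look, not code from the repository. The variable names come from the comment above, which ends in "etc.", so the real .env.example likely lists more; `missingEnvVars` is a hypothetical helper.

```typescript
// Variables named in the setup comment; the project's full list may be longer.
const REQUIRED_SERVER_VARS: readonly string[] = [
  "AWS_ACCESS_KEY_ID",
  "AWS_SECRET_ACCESS_KEY",
  "AWS_REGION",
  "S3_BUCKET_NAME",
];

// Return the names of required variables that are unset or blank.
function missingEnvVars(
  env: Record<string, string | undefined>,
  required: readonly string[] = REQUIRED_SERVER_VARS,
): string[] {
  return required.filter((name) => (env[name] ?? "").trim() === "");
}

// Example with an in-memory environment; a server would pass process.env instead.
const missing = missingEnvVars({
  AWS_ACCESS_KEY_ID: "example-key-id",
  AWS_SECRET_ACCESS_KEY: "",
  AWS_REGION: "us-east-1",
});
if (missing.length > 0) {
  console.log(`Missing environment variables: ${missing.join(", ")}`);
}
```

The AI worker could reuse the same helper with `GEMINI_API_KEY` added to the required list.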
Docker Compose: From the root of the project, spin up the entire infrastructure:
```
docker-compose up --build
```
This command starts:
- redis: Used by BullMQ.
- neon-local: Local Neon PostgreSQL instance.
- api: Node server running on port 8080.
- worker: Node process dedicated to the transcode, HLS, and thumbnail queues.
- ai-worker: Node process dedicated to running the audio transcription and transcript summarization workers (requires a GEMINI_API_KEY).
Database Push (only needed when running the server natively, outside Docker):
```
cd server
npm install
npm run db:push
```
Accessing the UI: Run the Next.js frontend development server:
```
cd web
npm install
npm run dev
```
Now open http://localhost:3000 in your web browser to view the application interface.

AI Assistant Guide for Maintenance & Expansion

If you are an AI attempting to update or debug this repository, keep the following principles in mind:

Adding New Features to Upload Flow:
- The upload bypasses the server body parsers for efficiency. Any metadata should be saved in the database before generating the pre-signed URL or during the /api/s3/process callback.
Expanding the Worker System:
- To add a new background task (e.g., watermark injection), create a new queue and worker pair in server/src/workers/.
- Be sure to instantiate and attach the new worker in server/src/worker.ts (or in server/src/ai-worker.ts if it belongs with the AI workers) so the corresponding Docker container actually listens to the queue.
- Workers use fluent-ffmpeg. Ensure the worker environment always has FFmpeg installed (handled automatically in the current Dockerfile).
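As an illustration of the steps above, here is a minimal sketch of what a watermark worker's processor could look like. Everything in it is hypothetical: the payload shape, the function names, and the injected `runFfmpeg` runner are assumptions, and it builds raw FFmpeg CLI arguments for clarity where the project itself goes through fluent-ffmpeg.

```typescript
// Hypothetical job payload for an illustrative watermark queue.
interface WatermarkJobData {
  inputPath: string;
  watermarkPath: string;
  outputPath: string;
}

// Structural subset of a BullMQ Job: only the field this processor reads.
type JobLike<T> = { data: T };

// Build FFmpeg CLI arguments that overlay the watermark in the bottom-right corner.
// In the overlay filter, W/H are the main video size and w/h the overlay size.
function buildWatermarkArgs(data: WatermarkJobData, marginPx = 10): string[] {
  return [
    "-y",
    "-i", data.inputPath,
    "-i", data.watermarkPath,
    "-filter_complex", `overlay=W-w-${marginPx}:H-h-${marginPx}`,
    "-c:a", "copy",
    data.outputPath,
  ];
}

// Processor in the shape BullMQ expects: an async function of the job.
// runFfmpeg is injected so the sketch does not depend on how FFmpeg is invoked.
async function processWatermarkJob(
  job: JobLike<WatermarkJobData>,
  runFfmpeg: (args: string[]) => Promise<void>,
): Promise<{ outputPath: string }> {
  await runFfmpeg(buildWatermarkArgs(job.data));
  return { outputPath: job.data.outputPath };
}
```

In server/src/worker.ts, a processor like this would be passed to BullMQ's `new Worker(queueName, processor, { connection })`. The queue name and connection settings depend on the project's existing conventions.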
Database Tweaks:
- Modify server/src/models/*.ts.
- Run npm run db:generate followed by npm run db:migrate to apply changes through generated migration files, or run npm run db:push to push the schema directly.
Frontend Development:
- The frontend is built on Next.js 16 (App Router) and React 19. All user components reside in web/src/app/components/ and route pages are in web/src/app/. Keep style definitions standardized under Tailwind CSS 4 utility classes.
Monitoring & Scale Constraints:
- docker-compose.yml limits the worker container to 2 GB of memory. High-resolution transcoding may require increasing this allocation.
