This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
JustOCR is a web app for OCR processing using multiple providers. Users can upload documents (images or PDFs), select an OCR provider, and extract text.
bun dev # Start development server at localhost:3000
bun run build # Production build (only run when asked)
bun run lint # Run ESLint
bun run test # Run test suite (loads .env.local)

Note: Do not run bun dev - the dev server is already running in a separate terminal.
This is a Next.js 16 project using the App Router with React 19 and TypeScript.
UI Framework: shadcn/ui with the "base-vega" style variant, using:
- Tailwind CSS 4 for styling
- @base-ui/react as the component primitive library
- lucide-react for icons
- Stone color palette with CSS variables defined in app/globals.css
Path Aliases (configured in tsconfig.json):
- @/* maps to root (e.g., @/components, @/lib)
Key Directories:
- app/ - Next.js App Router pages and layouts
- app/api/ocr/ - OCR processing API endpoint
- components/ - App components (upload-zone, provider-selector, ocr-results)
- components/ui/ - shadcn/ui components
- lib/ocr/ - OCR provider abstraction layer
- lib/ocr/providers/ - Individual OCR provider implementations
- lib/pdf.ts - Server-side PDF to image conversion (uses pdftoppm)
- lib/pdf-browser.ts - Client-side PDF to image conversion (uses PDF.js)
- tests/ - Test suite (unit tests + integration tests for live API calls)
Provider Abstraction (lib/ocr/):
- types.ts - OCRProvider, OCRResult, and benchmark interfaces
- index.ts - Provider registry and processOCR function
- benchmark.ts - Benchmark comparison utilities and export functions
- providers/tesseract-browser.ts - Client-side Tesseract for Privacy Mode
- providers/mistral.ts - Mistral OCR API provider (server-side)
- providers/google.ts - Google Cloud Vision provider (server-side)
- client/ - Browser-side BYOK providers (direct API calls, keys never touch server)
  - mistral.ts - Direct Mistral API calls
  - google.ts - Direct Google Vision API calls
  - index.ts - BYOK provider registry
Adding a new OCR provider:
- Create lib/ocr/providers/<name>.ts implementing the OCRProvider interface
- Register it in the providers object in lib/ocr/index.ts
- Add it to the PROVIDERS array in components/provider-selector.tsx
- For BYOK support: add a client in lib/ocr/client/<name>.ts
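
The first two steps above might look roughly like the sketch below. The shapes of OCRProvider and OCRResult, the processOCR signature, and the "echo" provider are all assumptions made for illustration; the real definitions live in lib/ocr/types.ts and lib/ocr/index.ts and may differ.

```typescript
// Assumed shapes; the real interfaces live in lib/ocr/types.ts and may differ.
interface OCRResult {
  text: string;
  provider: string;
}

interface OCRProvider {
  name: string;
  process(image: Uint8Array): Promise<OCRResult>;
}

// Step 1: a hypothetical provider, e.g. lib/ocr/providers/echo.ts.
// A real provider would call its OCR API here instead of reporting the size.
const echoProvider: OCRProvider = {
  name: "echo",
  async process(image) {
    return { text: `received ${image.length} bytes`, provider: "echo" };
  },
};

// Step 2: register it in the providers object (lib/ocr/index.ts).
const providers: Record<string, OCRProvider> = {
  echo: echoProvider,
};

// Dispatch by provider name, failing loudly for unregistered names.
async function processOCR(providerName: string, image: Uint8Array): Promise<OCRResult> {
  const provider = providers[providerName];
  if (!provider) {
    throw new Error(`Unknown OCR provider: ${providerName}`);
  }
  return provider.process(image);
}
```

Keeping the registry a plain object keyed by provider name means the UI selector and the API route can share the same string identifiers.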
Environment Variables & Auth:
- MISTRAL_API_KEY - Mistral OCR API key (in .env.local, optional if using BYOK)
- Google Cloud Vision uses Application Default Credentials (ADC):
  - Run gcloud auth application-default login for local dev
  - Or set GOOGLE_APPLICATION_CREDENTIALS pointing to service account JSON
- Production without keys: Users can still use Tesseract (Privacy Mode) and BYOK for cloud providers
PDF Support:
- Server-side: Uses pdftoppm (poppler) for PDF to PNG conversion at 300 DPI
  - Requires poppler installed on the system (brew install poppler on macOS)
- Client-side: Uses PDF.js (lib/pdf-browser.ts) for in-browser PDF to image conversion
  - Privacy Mode now fully supports PDFs - data never leaves the browser
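
As a rough illustration of the server-side path, the sketch below builds an argument list for pdftoppm using its -png, -r, -f, and -l flags. The helper name buildPdftoppmArgs and its options type are hypothetical, and lib/pdf.ts may be structured quite differently.

```typescript
// Hypothetical helper; not necessarily how lib/pdf.ts is written.
interface PdfToPngOptions {
  dpi?: number;       // render resolution; defaults to 300 to match the setting above
  firstPage?: number; // optional 1-based first page to render
  lastPage?: number;  // optional 1-based last page to render
}

// Build the argument list for:
//   pdftoppm -png -r <dpi> [-f <first>] [-l <last>] <pdf> <output-prefix>
function buildPdftoppmArgs(
  pdfPath: string,
  outputPrefix: string,
  options: PdfToPngOptions = {},
): string[] {
  const { dpi = 300, firstPage, lastPage } = options;
  const args = ["-png", "-r", String(dpi)];
  if (firstPage !== undefined) args.push("-f", String(firstPage));
  if (lastPage !== undefined) args.push("-l", String(lastPage));
  args.push(pdfPath, outputPrefix);
  return args;
}
```

The resulting array could be passed to execFile("pdftoppm", args) from node:child_process; passing arguments as an array rather than a shell string avoids quoting problems with user-supplied file names. pdftoppm then writes one PNG per page, named from the output prefix plus a page number.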
Features:
- Privacy Mode: Tesseract processes images and PDFs entirely in browser - data never leaves device
- BYOK: Users provide their own API keys for Mistral/Google, stored in localStorage
- Benchmarking: Compare up to 4 providers side-by-side, export results as JSON/CSV
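
To make the benchmark export concrete, here is a minimal sketch of JSON and CSV serialization. The BenchmarkRow shape and the function names are assumptions for illustration only; the actual export functions live in lib/ocr/benchmark.ts and may use different fields.

```typescript
// Hypothetical row shape; the real benchmark types in lib/ocr/types.ts may differ.
interface BenchmarkRow {
  provider: string;
  durationMs: number;
  text: string;
}

// Quote a CSV field when it contains a comma, double quote, or line break,
// doubling any embedded double quotes (RFC 4180 style).
function csvField(value: string | number): string {
  const s = String(value);
  return /[",\r\n]/.test(s) ? `"${s.replace(/"/g, '""')}"` : s;
}

// Serialize benchmark rows to CSV, enforcing the 4-provider comparison limit.
function benchmarkToCsv(rows: BenchmarkRow[]): string {
  if (rows.length > 4) {
    throw new Error("Benchmarks compare at most 4 providers");
  }
  const header = "provider,durationMs,text";
  const lines = rows.map((r) =>
    [r.provider, r.durationMs, r.text].map(csvField).join(","),
  );
  return [header, ...lines].join("\n");
}

// JSON export is a direct, pretty-printed dump of the same rows.
function benchmarkToJson(rows: BenchmarkRow[]): string {
  return JSON.stringify(rows, null, 2);
}
```

Quoting matters here because OCR output routinely contains commas, quotes, and line breaks that would otherwise corrupt the CSV columns.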
Key Dependencies:
- tesseract.js - Browser-based OCR engine for Privacy Mode
- pdfjs-dist - Client-side PDF rendering (PDF.js)
- sharp - Image metadata
- pdftoppm - Server-side PDF rendering (system dependency via poppler, not npm)
Adding shadcn/ui components:
bunx --bun shadcn@latest add <component-name>

Roadmap:
- Cloud providers: AWS Textract
- User authentication and usage tracking
- Stripe integration for paid tiers
- See planning/roadmap.md for full roadmap