A containerized service that monitors a directory for new product images, processes them through OpenAI's vision API, and saves structured data to CSV. Now includes a web interface for interactive image processing.
- Directory Monitoring: Automatically processes images in the `d_warehouse` directory
- Web Interface: Upload and process images through a modern web UI
- AI Processing: Uses OpenAI GPT-4 Vision to extract product information
- Complete Text Extraction: Captures ALL visible text from product labels
- CSV Export: Download results as CSV files
- Retry Logic: Exponential backoff for API failures
- Containerized: Docker deployment with volume persistence
- Henkel Branding: Professional UI with company branding
- Copy environment variables: `cp .env.example .env`
- Add your OpenAI API key to `.env`:
  `OPENAI_API_KEY=your_actual_api_key_here`
  `OPENAI_TIMEOUT=60`
- Build and run with Docker Compose: `docker-compose up --build`
- Open your browser to http://localhost:8000
- Upload a product image using drag-and-drop or the file browser
- Click "Process Image" and wait for AI analysis
- View results and download CSV file
- Place product images in `data/d_warehouse/`
- The monitor service automatically processes new images
- Results are saved to `data/d_mart/processed_images.csv`
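
The monitoring flow above can be illustrated with a minimal sketch. It is not the service's actual implementation: the helper names `already_processed` and `find_pending_images` are hypothetical, the `.tif` extension is an assumed alias for TIFF, and the sketch assumes an image counts as processed once its filename appears in the `image_name` column of the results CSV.

```python
import csv
from pathlib import Path

# Extensions matching the supported formats listed in this README.
# ".tif" is an assumption; the README only names TIFF.
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".tif", ".tiff"}


def already_processed(csv_path: Path) -> set[str]:
    """Return the image names already recorded in the results CSV."""
    if not csv_path.exists():
        return set()
    with csv_path.open(newline="", encoding="utf-8") as handle:
        return {
            row["image_name"]
            for row in csv.DictReader(handle)
            if row.get("image_name")
        }


def find_pending_images(watch_dir: Path, csv_path: Path) -> list[Path]:
    """List supported images in watch_dir that have no row in the CSV yet."""
    done = already_processed(csv_path)
    return sorted(
        path
        for path in watch_dir.iterdir()
        if path.is_file()
        and path.suffix.lower() in SUPPORTED_EXTENSIONS
        and path.name not in done
    )
```

Calling `find_pending_images(Path("data/d_warehouse"), Path("data/d_mart/processed_images.csv"))` on a fixed interval would approximate the monitor's behaviour; the real service may use file-system events instead of polling.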
Run the web interface locally:

    python run_web.py

Run the monitor service locally:

    python run_local.py

Run the test suite to verify functionality:

    # Test without OpenAI (uses mock data)
    python test_monitor.py
    # Test with real OpenAI API (requires API key)
    python test_monitor.py
    # Choose 'y' when prompted

Environment variables:

| Variable | Default | Description |
|---|---|---|
| OPENAI_API_KEY | - | Your OpenAI API key |
| OPENAI_MODEL | gpt-4o | OpenAI model to use |
| OPENAI_TEMPERATURE | 0.1 | Sampling temperature (lower values give more deterministic output) |
| OPENAI_MAX_TOKENS | 100 | Maximum response length in tokens |
| OPENAI_TIMEOUT | 60 | API timeout in seconds |
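
As a rough illustration of how these variables and defaults might be read at startup, here is a minimal sketch. The `OpenAISettings` class and `load_settings` function are hypothetical names, not taken from the project; only the variable names and default values come from the table above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class OpenAISettings:
    """Hypothetical container for the settings in the table above."""
    api_key: str
    model: str
    temperature: float
    max_tokens: int
    timeout: float


def load_settings(env=None) -> OpenAISettings:
    """Read settings from a mapping (os.environ by default), applying defaults."""
    env = os.environ if env is None else env
    api_key = env.get("OPENAI_API_KEY", "")
    if not api_key:
        # The key has no default, so fail early with a clear message.
        raise ValueError("OPENAI_API_KEY is not set")
    return OpenAISettings(
        api_key=api_key,
        model=env.get("OPENAI_MODEL", "gpt-4o"),
        temperature=float(env.get("OPENAI_TEMPERATURE", "0.1")),
        max_tokens=int(env.get("OPENAI_MAX_TOKENS", "100")),
        timeout=float(env.get("OPENAI_TIMEOUT", "60")),
    )
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to exercise in tests without touching the real environment.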

CSV output columns:

| Column | Description |
|---|---|
| image_name | Original image filename |
| item | Complete product description with ALL label text |
| price | Product price |
| brand | Brand name |
| size | Product size |
| product_type | Product category |
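
A minimal sketch of appending one result row with these columns, using only the standard library. The function name `append_result` is hypothetical, and the column order is assumed to follow the table above; the project's own writer may differ.

```python
import csv
from pathlib import Path

# Column names from the table above; the order is an assumption.
FIELDNAMES = ["image_name", "item", "price", "brand", "size", "product_type"]


def append_result(csv_path: Path, record: dict) -> None:
    """Append one record to the CSV, writing the header for a new or empty file."""
    write_header = not csv_path.exists() or csv_path.stat().st_size == 0
    csv_path.parent.mkdir(parents=True, exist_ok=True)
    with csv_path.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDNAMES, extrasaction="ignore")
        if write_header:
            writer.writeheader()
        # Missing fields become empty cells so every row has the same shape.
        writer.writerow({name: record.get(name, "") for name in FIELDNAMES})
```

Opening the file with `newline=""` lets the `csv` module control line endings, which avoids blank rows on Windows.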

Supported image formats:
- JPG/JPEG
- PNG
- BMP
- TIFF
The application runs two services:
- Monitor Service (`label-parser-monitor`): Watches the directory for new images
- Web Service (`label-parser-web`): Provides the web interface on port 8000
API Timeout Errors:
- Increase the `OPENAI_TIMEOUT` value
- Check network connectivity
- Verify the API key is valid
Processing Failures:
- The service retries failed requests 3 times with exponential backoff
- Check logs in `logs/image_parser.log`
- Ensure images are valid and not corrupted
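
The retry behaviour described above can be sketched as follows. This is an illustration under assumptions, not the service's code: `call_with_retries` is a hypothetical helper, the 1-second base delay and the doubling factor are assumed, and "3 times" is read as one initial attempt plus three retries.

```python
import time


def call_with_retries(func, retries=3, base_delay=1.0, sleep=time.sleep):
    """Call func(); on failure, retry up to `retries` times with doubling delays.

    With the assumed defaults the waits are 1 s, 2 s, and 4 s.
    """
    for attempt in range(retries + 1):
        try:
            return func()
        except Exception:
            # A real client would likely catch only timeouts and other
            # transient errors rather than every exception.
            if attempt == retries:
                raise
            sleep(base_delay * (2 ** attempt))
```

The `sleep` parameter exists so that tests can substitute a recorder such as `delays.append` and avoid real waiting.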
Web Interface Issues:
- Ensure port 8000 is not in use
- Check browser console for JavaScript errors
- Verify file upload size limits
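
To check whether something is already listening on port 8000, a small standard-library sketch such as the one below may help. The function name `port_in_use` is hypothetical, and the check assumes the web service is reached through `127.0.0.1`.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0
```

Running `port_in_use(8000)` before starting the containers indicates whether another process already holds the port.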