A powerful web application that combines Azure Document Intelligence and Google Gemini AI to process financial documents with intelligent chat capabilities.
- Azure AI Integration: Advanced OCR and structured data extraction
- Real-time Processing: Instant document analysis with progress indicators
- Interactive Chat: Ask questions about your processed documents
- SQLite Database: Local storage for processed documents
- CSV Export: Export all processed data with structured formatting
- Document History: Track all processed documents with timestamps
- Streamlit Framework: Clean, responsive web interface
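The local storage, document history, and CSV export features listed above can be pictured with a small standard-library sketch. The table name `documents`, its columns, and the helper names `open_db`, `save_document`, and `export_csv` are assumptions made for illustration; the schema actually used in `app.py` may differ.

```python
import csv
import io
import sqlite3
from datetime import datetime, timezone

# Hypothetical schema: the real app's table and column names may differ.
SCHEMA = """
CREATE TABLE IF NOT EXISTS documents (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    filename TEXT NOT NULL,
    extracted_text TEXT NOT NULL,
    processed_at TEXT NOT NULL
)
"""


def open_db(path=":memory:"):
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def save_document(conn, filename, extracted_text):
    """Store one processed document with a UTC timestamp; return its row id."""
    processed_at = datetime.now(timezone.utc).isoformat()
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO documents (filename, extracted_text, processed_at) "
            "VALUES (?, ?, ?)",
            (filename, extracted_text, processed_at),
        )
    return cur.lastrowid


def export_csv(conn):
    """Return every stored document as CSV text, header row first."""
    cur = conn.execute(
        "SELECT id, filename, extracted_text, processed_at "
        "FROM documents ORDER BY id"
    )
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow([col[0] for col in cur.description])
    writer.writerows(cur)
    return buf.getvalue()
```

Passing `"financial_docs.db"` to `open_db` instead of the in-memory default would persist the rows to a file like the one shown in the project layout.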
- Python 3.8+
- Azure for Students account from https://azure.microsoft.com/en-us/free/students
- Google Gemini API key from https://aistudio.google.com/
- Clone the repository:
    git clone https://github.com/saishagoel27/ocr_app
    cd ocr_app
- Create a virtual environment:
    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
- Install dependencies:
    pip install -r requirements.txt
- Set up environment variables. Create a .env file in the project root:
    AZURE_DOCUMENT_INTELLIGENCE_ENDPOINT=your_azure_endpoint
    AZURE_DOCUMENT_INTELLIGENCE_KEY=your_azure_key
    GEMINI_API_KEY=your_gemini_api_key
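One way the application might read these three variables at startup is sketched below. The helper `load_settings` is hypothetical and not taken from `app.py`; it assumes `load_dotenv` from python-dotenv is used to copy the .env file into the process environment, and it falls back to the plain environment if that package is not installed.

```python
import os

try:
    from dotenv import load_dotenv  # provided by the python-dotenv package
except ImportError:  # keep the sketch usable without the dependency
    load_dotenv = None

# Variable names as given in the .env example above.
REQUIRED_VARS = (
    "AZURE_DOCUMENT_INTELLIGENCE_ENDPOINT",
    "AZURE_DOCUMENT_INTELLIGENCE_KEY",
    "GEMINI_API_KEY",
)


def load_settings(env=None):
    """Return the required settings as a dict, or raise if any are missing.

    `env` is any mapping; when omitted, the .env file (if python-dotenv is
    available) and the process environment are used.
    """
    if env is None:
        if load_dotenv is not None:
            load_dotenv()  # reads .env from the current directory, if present
        env = os.environ
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing environment variables: " + ", ".join(missing)
        )
    return {name: env[name] for name in REQUIRED_VARS}
```

Failing early with the names of the missing variables tends to be easier to debug than an authentication error from Azure or Gemini later on.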
Run the application:
    streamlit run app.py
The application will open in your browser at http://localhost:8501.
ocr_app/
├── app.py # Main application file
├── .env # Environment variables (create this)
├── financial_docs.db # SQLite database (auto-created)
├── requirements.txt # Python dependencies
- streamlit: Web framework
- azure-ai-documentintelligence: Azure OCR service
- google-generativeai: Gemini AI integration
- sqlite3: Database management (part of the Python standard library, no separate installation needed)
- pandas: Data manipulation
- python-dotenv: Environment variable management
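To confirm that the packages listed above are installed in the active virtual environment, a short check along these lines may help. The helper `missing_packages` is hypothetical; it relies on `importlib.metadata`, which is available from Python 3.8, and it leaves out sqlite3 because that module ships with Python.

```python
from importlib import metadata

# Distribution names as they appear in the dependency list above.
REQUIRED_PACKAGES = (
    "streamlit",
    "azure-ai-documentintelligence",
    "google-generativeai",
    "pandas",
    "python-dotenv",
)


def missing_packages(names=REQUIRED_PACKAGES):
    """Return the distribution names that are not installed."""
    missing = []
    for name in names:
        try:
            metadata.version(name)  # raises if the distribution is absent
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    absent = missing_packages()
    if absent:
        print("Missing packages:", ", ".join(absent))
    else:
        print("All listed packages are installed.")
```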
This project is licensed under the MIT License and is free to use and modify.
Built with ❤️ using Azure AI and Google Gemini