A professional, full-stack AI research platform designed to synthesize complex academic papers into actionable insights. PaperDigest AI leverages multi-stage LLM pipelines and Retrieval-Augmented Generation (RAG) to streamline literature reviews and technical deep-dives.
- Structural Extraction - Automatically segments PDFs into logical sections (Abstract, Introduction, Methodology, etc.).
- Multi-pass Processing - Asynchronous extraction of titles, authors, and high-level paper metadata.
- Smart Formatting - Cleans raw PDF artifacts and multi-column layouts using PyMuPDF and Regex heuristics.
- Hierarchical Summarization - Generates section-by-section summaries before synthesizing a final 'Global TL;DR'.
- Technical Deep-Dives - Specialized prompts for extracting technical methodologies, model architectures, and metrics.
- Metadata Tagging - Automatic identification of research datasets and software licenses used in the study.
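As a rough illustration of the regex-heuristic segmentation described above, here is a minimal sketch (the heading list and `segment_sections` helper are illustrative, not the actual `PDFProcessor` implementation):

```python
import re

# Common top-level section headings in academic papers (illustrative list).
SECTION_PATTERN = re.compile(
    r"^(Abstract|Introduction|Related Work|Methodology|Experiments|Results|Conclusion)\s*$",
    re.MULTILINE | re.IGNORECASE,
)

def segment_sections(text: str) -> dict[str, str]:
    """Split raw PDF text into {section name: body} using heading heuristics."""
    matches = list(SECTION_PATTERN.finditer(text))
    sections = {}
    for i, m in enumerate(matches):
        start = m.end()
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        sections[m.group(1).title()] = text[start:end].strip()
    return sections

raw = "Abstract\nWe study X.\nIntroduction\nDeep learning has...\nConclusion\nX works."
print(segment_sections(raw))
```

In practice the text would first be extracted and de-hyphenated with PyMuPDF before heuristics like this are applied.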
Advanced Search & RAG
- Semantic Search - Search across your paper library using natural language, powered by vector embeddings.
- Side-by-Side Review - Compare AI summaries directly alongside the original PDF text.
- Context-Aware RAG - Injects the most relevant snippets from papers into LLM prompts for high-accuracy extraction.
- Literature Review Matrix - A specialized view that aggregates summaries from all your papers into a comparison table.
- Researcher Notes - Add personal annotations alongside AI-generated insights.
- CSV Export - Export your entire literature review matrix for external analysis or reporting.
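The context-aware RAG step above boils down to ranking retrieved chunks by vector distance and injecting the best ones into the prompt. A minimal sketch (the `build_rag_prompt` helper and prompt wording are illustrative, not the project's actual code):

```python
def build_rag_prompt(question: str, chunks: list[tuple[float, str]], k: int = 3) -> str:
    """Inject the k most relevant chunks (lowest distance first) into an LLM prompt.

    `chunks` holds (distance, text) pairs, e.g. from a pgvector similarity query.
    """
    top = [text for _, text in sorted(chunks)[:k]]
    context = "\n---\n".join(top)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

chunks = [
    (0.45, "Training ran for 1M steps."),
    (0.12, "BERT uses 12 transformer layers."),
]
print(build_rag_prompt("How many layers does BERT have?", chunks, k=1))
```

Only the nearest chunk survives the cut, which keeps the prompt short and the extraction grounded.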
- Cloud Power - Native support for Google Gemini 1.5 Pro/Flash for high-accuracy processing.
- Local Privacy - Built-in integration with Ollama (Llama 3) for offline, privacy-first research analysis.
- Factory Pattern - Seamlessly toggle between LLM providers via environment configuration.
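The provider toggle can be pictured as a small factory keyed on an environment variable, roughly like this (class and method names here are illustrative, not the actual `LLMService` code):

```python
import os
from abc import ABC, abstractmethod

class LLMClient(ABC):
    @abstractmethod
    def summarize(self, text: str) -> str: ...

class GeminiClient(LLMClient):
    def summarize(self, text: str) -> str:
        # A real implementation would call the Gemini API here.
        return f"[gemini] summary of {len(text)} chars"

class OllamaClient(LLMClient):
    def summarize(self, text: str) -> str:
        # A real implementation would call a local Ollama server here.
        return f"[ollama] summary of {len(text)} chars"

def get_llm_client() -> LLMClient:
    """Pick a provider based on the LLM_PROVIDER environment variable."""
    provider = os.environ.get("LLM_PROVIDER", "gemini").lower()
    clients = {"gemini": GeminiClient, "ollama": OllamaClient}
    return clients[provider]()

os.environ["LLM_PROVIDER"] = "ollama"
print(type(get_llm_client()).__name__)  # OllamaClient
```

Because callers only see the abstract interface, switching between cloud and local processing requires no code changes, just a different `.env` value.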
- Docker & Docker Compose (Recommended)
- Git
- Google Gemini API Key (Optional, for cloud processing)
- Ollama (Optional, for local processing)
- Clone the repository

  ```bash
  git clone https://github.com/Rishabds7/ai-research-assistant.git
  cd ai-research-assistant
  ```

- Configure Environment Variables - Create a `.env` file in the root directory (refer to `.env.example` in `backend/`):

  ```bash
  # Backend Settings
  GEMINI_API_KEY=your_api_key_here
  LLM_PROVIDER=gemini  # or 'ollama'

  # Database Settings
  DB_PASSWORD=your_secure_password
  ```

- Launch the platform

  ```bash
  docker-compose up --build
  ```

- Access the application
  - Frontend: http://localhost:3000
  - Backend API: http://localhost:8000/api/
- Next.js 15 & React 19 - Modern, responsive frontend with Server Components.
- Django & DRF - Robust Python backend for API management and data persistence.
- PostgreSQL & pgvector - High-performance vector database for semantic search.
- Celery & Redis - Distributed task queue for asynchronous AI processing.
- Tailwind CSS 4 - Premium UI styling with glassmorphism and modern aesthetics.
PDF Upload → Celery Worker → PDFProcessor → LLM (Gemini/Ollama) → EmbeddingService → PostgreSQL
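The flow above can be sketched as a chain of stages. The function bodies below are stand-ins (in the real system these steps run inside Celery tasks and call PyMuPDF, the LLM provider, and PostgreSQL):

```python
def extract_text(pdf_path: str) -> str:
    # Stand-in for PDFProcessor: would parse the file with PyMuPDF.
    return f"raw text of {pdf_path}"

def summarize(text: str) -> str:
    # Stand-in for the LLM call (Gemini or Ollama).
    return text.upper()[:40]

def embed(summary: str) -> list[float]:
    # Stand-in for EmbeddingService: would return a 384-dim vector.
    return [float(len(summary))]

def store(vector: list[float], summary: str) -> dict:
    # Stand-in for persistence: would INSERT into PostgreSQL/pgvector.
    return {"summary": summary, "vector": vector}

def process_paper(pdf_path: str) -> dict:
    """Run the full pipeline for one uploaded PDF."""
    text = extract_text(pdf_path)
    summary = summarize(text)
    return store(embed(summary), summary)

print(process_paper("attention.pdf"))
```

Running the chain asynchronously in a Celery worker keeps uploads responsive while the heavy LLM and embedding work happens in the background.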
- Embedding - Convert paper chunks into 384-dimensional vectors via `all-MiniLM-L6-v2`.
- Storage - Vector data persists in PostgreSQL using the `pgvector` extension.
- Similarity - Natural language queries are matched using Cosine Distance (`<=>`).
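pgvector's `<=>` operator returns the cosine distance (1 minus cosine similarity), so the closest match has the smallest value. A pure-Python equivalent, for intuition:

```python
import math

def cosine_distance(a: list[float], b: list[float]) -> float:
    """Equivalent of pgvector's <=> operator: 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

query = [1.0, 0.0]
docs = {"paper_a": [1.0, 0.0], "paper_b": [0.0, 1.0]}
ranked = sorted(docs, key=lambda name: cosine_distance(query, docs[name]))
print(ranked)  # ['paper_a', 'paper_b']
```

In SQL, the same ranking is expressed by ordering on `embedding <=> query_vector` and taking the top-k rows.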
research-assistant-mvp/
├── backend/ # Django Application
│ ├── core/ # Project settings & Celery config
│ ├── papers/ # Models, Views, and Tasks
│ └── services/ # Core AI/PDF Logic (LLM, Embeddings, PDFProcessor)
├── frontend/ # Next.js Application
│ ├── src/app/ # Pages & Layouts
│ ├── src/components/ # Atomic UI components
│ └── src/lib/ # API client & utility functions
├── docker-compose.yml # Orchestration for DB, Redis, Worker, and Apps
└── README.md # Professional Documentation
- ✅ Chrome 110+
- ✅ Firefox 100+
- ✅ Safari 16+
- ✅ Edge 110+
- `services.pdf_processor.PDFProcessor` - Text segmentation logic.
- `services.llm_service.LLMService` - Multi-provider (Gemini/Ollama) factory.
- `services.embedding_service.EmbeddingService` - Vectorization and RAG logic.
API documentation is available via Swagger/ReDoc at `/api/docs/` while the backend is running.
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-ai-tool`)
- Commit your changes (`git commit -m 'Add amazing AI feature'`)
- Push to the branch (`git push origin feature/amazing-ai-tool`)
- Open a Pull Request
PaperDigest AI - Synthesizing deep research into actionable insights.