Large Language Models (LLMs) are powerful but have limitations such as hallucinations, lack of awareness of external or private data, and weak handling of conversational context. This project addresses these challenges by implementing a Context-Aware Retrieval-Augmented Generation (RAG) Chatbot that grounds responses in a trusted knowledge source while maintaining conversational memory.
The system is built using LangChain v1.0+, OpenAI, FAISS, and Streamlit, following modern best practices and avoiding deprecated APIs.
Traditional LLM-based chatbots:
- Cannot reliably answer questions from specific external documents
- Often hallucinate responses
- Do not understand follow-up questions without additional context handling
- Often come with poor UX and insecure API key handling
There is a need for a conversational system that retrieves answers from a verified knowledge base, maintains conversational context, and provides a secure, user-friendly interface.
The objectives of this project are to:
- Build a context-aware chatbot using Retrieval-Augmented Generation (RAG)
- Enable accurate question answering from an external knowledge source
- Maintain conversational history for multi-turn interactions
- Reduce hallucinations by grounding responses in retrieved documents
- Use modern LangChain v1.0+ (LCEL) APIs
- Provide a clean and secure frontend experience
- Frontend (Streamlit)
  - API key input
  - Chat interface
  - Clear chat history control
- Document Loader
  - WebBaseLoader (Wikipedia – Artificial Intelligence page)
- Text Processing
  - RecursiveCharacterTextSplitter
- Vector Store
  - FAISS for similarity search
- Embedding Model
  - OpenAI Embeddings (text-embedding-3-small)
- LLM
  - OpenAI Chat Model (gpt-4o-mini)
- RAG Pipeline (LCEL)
  - Contextual question reformulation
  - Document retrieval
  - Grounded answer generation
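The three pipeline stages compose into a single chain. Below is a conceptual, pure-Python sketch of that composition; the stub retriever and generator stand in for the FAISS index and the OpenAI model, and none of this is the project's actual LCEL code:

```python
# Conceptual sketch of the three RAG stages. The real project wires these
# together with LangChain LCEL, an OpenAI chat model, and a FAISS retriever.

def reformulate(question: str, history: list[tuple[str, str]]) -> str:
    """Stage 1: turn a follow-up into a standalone question (stubbed)."""
    if not history:
        return question
    last_user, _ = history[-1]
    # A real implementation asks the LLM to rewrite the question;
    # here we simply attach the previous user turn as context.
    return f"{question} (in the context of: {last_user})"

def retrieve(question: str, corpus: list[str], k: int = 2) -> list[str]:
    """Stage 2: toy keyword scoring standing in for FAISS similarity search."""
    return sorted(
        corpus,
        key=lambda doc: -sum(w in doc.lower() for w in question.lower().split()),
    )[:k]

def generate(question: str, docs: list[str]) -> str:
    """Stage 3: grounded answer generation (stubbed as a template)."""
    context = " ".join(docs)
    return f"Answer to '{question}' grounded in: {context}"

def rag_chain(question: str, history: list[tuple[str, str]], corpus: list[str]) -> str:
    """Reformulate -> retrieve -> generate, as in the LCEL pipeline."""
    standalone = reformulate(question, history)
    docs = retrieve(standalone, corpus)
    return generate(standalone, docs)
```

The same reformulate-then-retrieve shape is what LangChain's history-aware retriever pattern produces, with the LLM performing the rewrite.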
- API Key Connection
  - User enters OpenAI API key via frontend
  - Chat functionality enabled only after successful connection
- Data Ingestion
  - Wikipedia page is loaded and parsed
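WebBaseLoader essentially fetches a page and strips its HTML down to text. A minimal stdlib sketch of the parsing half (the HTTP fetch is omitted, and this is far simpler than the real loader):

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Minimal HTML-to-text extractor, a stand-in for the parsing step
    WebBaseLoader performs after fetching a page."""
    def __init__(self):
        super().__init__()
        self.parts: list[str] = []
        self._skip = 0  # depth inside <script>/<style> blocks

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        # Keep visible text only; drop script/style contents and whitespace runs.
        if not self._skip and data.strip():
            self.parts.append(data.strip())

def html_to_text(html: str) -> str:
    """Convert raw HTML into plain text suitable for chunking."""
    parser = TextExtractor()
    parser.feed(html)
    return " ".join(parser.parts)
```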
- Chunking
  - Text split into overlapping chunks for better retrieval
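The overlap ensures that a fact straddling a chunk boundary still appears whole in at least one chunk. A character-level sketch of the idea (the project's RecursiveCharacterTextSplitter additionally prefers splitting on paragraph and sentence boundaries):

```python
def chunk_text(text: str, chunk_size: int = 100, overlap: int = 20) -> list[str]:
    """Split text into fixed-size chunks where each chunk repeats the last
    `overlap` characters of the previous one."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks, step = [], chunk_size - overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks
```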
- Vectorization
  - Text chunks converted into embeddings
  - Embeddings stored in FAISS vector database
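Conceptually, the vector store ranks stored chunks by how close their embedding vectors are to the query's. A brute-force sketch with tiny hand-made vectors standing in for OpenAI embeddings (FAISS does the same operation with optimized, optionally approximate, indexes):

```python
import math

def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def similarity_search(query_vec: list[float],
                      index: list[tuple[str, list[float]]],
                      k: int = 1) -> list[str]:
    """Return the k stored chunks whose vectors are closest to the query."""
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]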
- Query Processing
  - User query reformulated into a standalone question if chat history exists
- Retrieval
  - Relevant document chunks retrieved using semantic similarity
- Answer Generation
  - LLM generates concise answers grounded in retrieved context
- Memory Handling
  - Chat history stored using Streamlit session state
- Context-aware multi-turn conversations
- Retrieval-Augmented Generation (RAG)
- Modern LangChain LCEL implementation
- Secure API key handling
- Clear chat history functionality
- Cached vector store for performance
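Streamlit's `st.session_state` persists values across script reruns, which is what backs both the chat memory and the clear-history control. The pattern can be sketched against a plain dict standing in for session state (function names are illustrative, not the project's actual code):

```python
# `state` stands in for st.session_state, which behaves like a dict
# that survives Streamlit script reruns.

def init_history(state: dict) -> None:
    """Mirror of the `if "messages" not in st.session_state:` guard."""
    state.setdefault("messages", [])

def add_turn(state: dict, role: str, content: str) -> None:
    """Append one chat turn so later reruns can render and reuse it."""
    state["messages"].append({"role": role, "content": content})

def clear_history(state: dict) -> None:
    """Backs the 'clear chat history' control in the UI."""
    state["messages"] = []
```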
- Programming Language: Python 3.10+
- Frontend: Streamlit
- LLM Framework: LangChain v1.0+
- Vector Database: FAISS
- LLM & Embeddings: OpenAI
```bash
# Clone repository
git clone https://github.com/your-username/context-aware-rag-chatbot.git
cd context-aware-rag-chatbot

# Create virtual environment
python -m venv venv
source venv/bin/activate   # Linux/Mac
venv\Scripts\activate      # Windows

# Install dependencies
pip install -r requirements.txt

# Run application
streamlit run app.py
```
- The chatbot accurately answers questions based on retrieved context
- Follow-up questions are handled effectively using conversational memory
- Retrieval grounding significantly reduces hallucinations
- Clean UI improves usability and security
- Knowledge limited to the ingested dataset
- Requires an active OpenAI API key
- Single-source document ingestion in the current version
- Upload and query custom documents (PDF, DOCX, TXT)
- Multi-document support
- Streaming responses
- Agentic RAG (Planner–Retriever–Verifier)
- Conversation export functionality
This project demonstrates a production-ready implementation of a Context-Aware RAG Chatbot using modern LLM tooling. It provides a scalable foundation for building enterprise-grade conversational AI systems that are accurate, secure, and context-aware.