rajshekharbind/langgraph-chatboat-ai

🚀 LangGraph Agentic AI - RAG Chatbot

A production-ready Agentic AI + RAG (Retrieval-Augmented Generation) chatbot built with LangGraph, LangChain, Google Gemini, and FAISS. This repository integrates document RAG (PDFs), web search tools, agentic workflows, and full observability via LangSmith.


🔎 Project Summary

This project demonstrates a full-stack LLM application architecture that:

  • Uses LangGraph for graph-based agent workflows and tool orchestration.
  • Uses LangChain utilities for document loading, text splitting, and vector retrieval.
  • Uses Google Gemini models for chat & embeddings via langchain-google-genai.
  • Stores embeddings locally using FAISS for fast retrieval.
  • Integrates web-search tools (DuckDuckGo) for up-to-date information.
  • Adds observability, tracing, and evaluation with LangSmith.

This README covers concepts, setup, architecture, usage, troubleshooting, and a ready-to-use requirements.txt.


🎯 Key Features

  • Agentic, tool-calling workflows (LangGraph StateGraph + ToolNode).
  • RAG pipeline: PyPDFLoader → RecursiveCharacterTextSplitter → embeddings → FAISS.
  • Web search tool integration (DuckDuckGoSearchRun) for external knowledge.
  • Stateful memory & checkpointing using MemorySaver.
  • Observability & tracing with LangSmith (traces, monitoring, evals).
  • Streamlit demo UI (optional) for quick prototyping.

🧭 Architecture Overview

User β†’ Streamlit / CLI / API
    ↓
LangGraph Agent (StateGraph)
    ↓
Decide: RAG retrieval / web search / direct LLM call
    β”œβ”€ RAG: PyPDFLoader β†’ TextSplitter β†’ Embeddings β†’ FAISS β†’ Retriever
    β”œβ”€ DuckDuckGoSearchRun β†’ Web results
    └─ ChatGoogleGenerativeAI β†’ LLM response
    ↓
MemorySaver checkpoint β†’ LangSmith trace β†’ Response
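
The flow above could be wired up roughly as follows. This is a minimal sketch, not the repository's actual code: the function names build_agent and thread_config and the model name "gemini-1.5-flash" are assumptions made for illustration. Library imports are deferred into the function so the module loads even when the dependencies are missing. The "Decide" step is left to the model: tools_condition routes to the tool node when the LLM requests a tool call and ends the run otherwise.

```python
def thread_config(thread_id: str) -> dict:
    """Config telling the checkpointer which conversation thread to resume."""
    return {"configurable": {"thread_id": thread_id}}


def build_agent(tools):
    """Compile a tool-calling agent graph (hypothetical helper).

    `tools` is a list of LangChain tools, e.g. a web search tool and a
    retriever wrapped as a tool.
    """
    # Deferred imports: only needed when the graph is actually built.
    from langchain_google_genai import ChatGoogleGenerativeAI
    from langgraph.checkpoint.memory import MemorySaver
    from langgraph.graph import START, MessagesState, StateGraph
    from langgraph.prebuilt import ToolNode, tools_condition

    # Assumed model name; replace with the Gemini model you have access to.
    llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash").bind_tools(tools)

    def call_model(state: MessagesState):
        # Append the model's reply (or tool request) to the message history.
        return {"messages": [llm.invoke(state["messages"])]}

    graph = StateGraph(MessagesState)
    graph.add_node("agent", call_model)
    graph.add_node("tools", ToolNode(tools))
    graph.add_edge(START, "agent")
    # Go to "tools" if the last message has tool calls, otherwise finish.
    graph.add_conditional_edges("agent", tools_condition)
    graph.add_edge("tools", "agent")
    # MemorySaver keeps per-thread state in memory between invocations.
    return graph.compile(checkpointer=MemorySaver())
```

With the dependencies installed and GOOGLE_API_KEY set, usage might look like agent = build_agent([DuckDuckGoSearchRun()]) followed by agent.invoke({"messages": [("user", "What is LangGraph?")]}, config=thread_config("demo")), where DuckDuckGoSearchRun is imported from langchain_community.tools.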

📚 Core Concepts Explained

LangGraph

  • Graph-based agent framework to model agent workflows as nodes and edges (StateGraph, ToolNode).
  • Ideal for building stateful multi-step agent logic, sub-agents, and memory checkpoints (MemorySaver).

LangChain

  • Utilities for document loading, prompt templates, text splitting, chains, and vectorstore adapters.
  • Community tools (e.g., langchain_community) add connectors such as PyPDFLoader, DuckDuckGoSearchRun, and FAISS wrappers.

RAG (Retrieval-Augmented Generation)

  • Load documents (PDF) → chunk text → embed chunks → store in vector DB (FAISS).
  • During query time: embed query → similarity search in FAISS → supply top-k contexts to the LLM.
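
To make the two steps concrete, here is a dependency-free toy version of chunking and top-k retrieval. It is only an illustration of the idea: the bag-of-words embed function stands in for a real embedding model, the linear scan stands in for a FAISS index, and chunk_text uses plain character windows rather than the separator-aware RecursiveCharacterTextSplitter. All function names are made up for this sketch.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into fixed-size character windows that overlap."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


def embed(text):
    """Toy embedding: word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query, chunks, k=2):
    """Return the k chunks most similar to the query, best first."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


chunks = [
    "FAISS stores embeddings in a local index",
    "Streamlit renders the chat UI",
    "Gemini generates the final answer",
]
# The retrieved chunks would be passed to the LLM as context.
print(top_k("where are embeddings stored", chunks, k=1))
```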

Google Gemini Integration

  • ChatGoogleGenerativeAI for conversational LLM responses.
  • GoogleGenerativeAIEmbeddings to create embeddings for document chunks.

FAISS

  • Local, high-performance vector index for storing and searching embeddings.
  • Good for prototype and single-node deployments; consider Milvus/Weaviate/Elasticsearch for scale.
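
A sketch of how the index might be built and reloaded with the LangChain FAISS wrapper. The helper names build_index and load_retriever, the default directory "faiss_index", the chunk sizes, and the embedding model name "models/embedding-001" are all assumptions for illustration, not values taken from this repository. Imports are deferred so the module can be imported without the libraries installed.

```python
def build_index(pdf_path, index_dir="faiss_index"):
    """Load a PDF, chunk it, embed the chunks, and save a FAISS index."""
    from langchain_community.document_loaders import PyPDFLoader
    from langchain_community.vectorstores import FAISS
    from langchain_google_genai import GoogleGenerativeAIEmbeddings
    from langchain_text_splitters import RecursiveCharacterTextSplitter

    pages = PyPDFLoader(pdf_path).load()  # one Document per page
    splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=150)
    chunks = splitter.split_documents(pages)

    # Assumed embedding model name; requires GOOGLE_API_KEY in the environment.
    embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")
    store = FAISS.from_documents(chunks, embeddings)
    store.save_local(index_dir)
    return store


def load_retriever(index_dir="faiss_index", k=4):
    """Reload a saved index and expose it as a top-k retriever."""
    from langchain_community.vectorstores import FAISS
    from langchain_google_genai import GoogleGenerativeAIEmbeddings

    embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")
    # The saved index includes a pickle file, so only load indexes you created.
    store = FAISS.load_local(
        index_dir, embeddings, allow_dangerous_deserialization=True
    )
    return store.as_retriever(search_kwargs={"k": k})
```

Saving the index to disk avoids re-embedding the documents on every start, which matters because each embedding call goes to the Gemini API.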

Web Search (DuckDuckGo)

  • DuckDuckGoSearchRun (requires the ddgs package) to fetch live web results as a tool for the agent.
  • Useful for queries requiring up-to-date information.

LangSmith (Observability)

  • Observability platform for LLMs and agents: tracing, monitoring, evaluation, and dashboards.
  • Each agent run produces a trace capturing the end-to-end execution (LLM calls, tool calls, intermediate steps).
  • Use LangSmith to debug prompt failures, tool failures, latency, cost, and non-deterministic behaviors.
  • Enable tracing by setting the appropriate environment variable (e.g., LANGSMITH_TRACING=true) if you integrate LangSmith tracing wrappers.

Tip: Instrument your agent to emit traces for every request during dev and staging; it significantly reduces debugging time.
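
One way to instrument your own helper functions is the traceable decorator from the langsmith package. The sketch below assumes that package may or may not be installed, so it falls back to a no-op decorator; format_context is a hypothetical helper, not part of this repository.

```python
try:
    from langsmith import traceable
except ImportError:
    def traceable(*args, **kwargs):
        """No-op stand-in used when langsmith is not installed."""
        if len(args) == 1 and callable(args[0]) and not kwargs:
            return args[0]          # used as a bare @traceable
        return lambda fn: fn        # used as @traceable(name=...)


@traceable(name="format_context")
def format_context(chunks):
    """Number retrieved chunks so the LLM prompt can cite them."""
    return "\n\n".join(f"[{i}] {c}" for i, c in enumerate(chunks, start=1))


print(format_context(["FAISS is a vector index.", "Gemini is the LLM."]))
```

When tracing is enabled in the environment, each call to a decorated function is recorded as a run; when it is not, the function behaves as if undecorated.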


⚙️ Prerequisites

  • Python 3.10+
  • Optional: a virtual environment (venv / conda)
  • Google API key for Gemini (set as env var)
  • (Optional) LangSmith account & API key for observability

🔧 Quickstart - Local Development

  1. Create & activate a virtual environment (recommended):
python -m venv .venv
# Windows
.\.venv\Scripts\activate
# macOS / Linux
source .venv/bin/activate
  2. Save the requirements.txt in the repo (see the bottom of this README). Then install:
pip install -r requirements.txt
  3. Create a .env file in the repo root and add required keys:
# Google Gemini
GOOGLE_API_KEY=your_google_api_key
# Optional: LangSmith (observability)
LANGSMITH_API_KEY=your_langsmith_api_key
LANGSMITH_TRACING=true
  4. Run the prototype Streamlit app (if included):
streamlit run app.py

Or run your agent entrypoint (e.g., python main.py or your custom CLI script).
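
A small startup check can catch a missing key before the first API call fails. This is a sketch under assumptions: missing_keys and load_settings are hypothetical helper names, and python-dotenv is imported lazily so the check also works when the variables are exported in the shell instead of stored in .env.

```python
import os

REQUIRED_KEYS = ("GOOGLE_API_KEY",)


def missing_keys(env, required=REQUIRED_KEYS):
    """Return the required keys that are absent or blank in `env`."""
    return [key for key in required if not str(env.get(key, "")).strip()]


def load_settings():
    """Load .env (if python-dotenv is available) and validate required keys."""
    try:
        from dotenv import load_dotenv
        load_dotenv()  # reads .env from the current working directory
    except ImportError:
        pass  # fall back to variables already present in the environment
    missing = missing_keys(os.environ)
    if missing:
        raise RuntimeError("Missing required settings: " + ", ".join(missing))
    return {key: os.environ[key] for key in REQUIRED_KEYS}
```

Calling load_settings() at the top of app.py or main.py turns a confusing authentication error later on into an immediate, readable message.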


📁 Suggested Repository Structure

├─ data/                     # Raw PDFs, sample documents
├─ src/
│  ├─ agents/                # LangGraph state graphs & nodes
│  ├─ loaders/               # Document loaders & preprocessors
│  ├─ retriever/             # Embeddings & FAISS wrapper
│  ├─ tools/                 # Tool adapters (DuckDuckGo, custom tools)
│  ├─ webapp/                # Streamlit / FastAPI / UI code
│  └─ main.py                # App entrypoint
├─ tests/                    # Unit & integration tests
├─ .env.example
├─ requirements.txt
└─ README.md

🧪 Testing & CI

  • Unit test chains, loader results, and FAISS indexing using pytest.

  • Use a small, deterministic model or mock the LLM for unit tests.

  • Add CI pipeline to:

    • Run linters (black, pylint).
    • Run tests and static analysis.
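
Mocking the LLM can be as simple as a stub object with an invoke method. The example below is a sketch with made-up names (FakeLLM, answer_with_context); it shows the pattern of injecting the model so pytest can run without network access or an API key.

```python
class FakeLLM:
    """Deterministic stand-in for a chat model; records the prompts it sees."""

    def __init__(self, reply):
        self.reply = reply
        self.prompts = []

    def invoke(self, prompt):
        self.prompts.append(prompt)
        return self.reply


def answer_with_context(llm, question, contexts):
    """Build a RAG-style prompt and return the model's reply."""
    prompt = "Context:\n" + "\n".join(contexts) + "\n\nQuestion: " + question
    return llm.invoke(prompt)


def test_reply_is_returned():
    llm = FakeLLM("FAISS")
    assert answer_with_context(llm, "Which index?", ["We use FAISS."]) == "FAISS"


def test_prompt_contains_context_and_question():
    llm = FakeLLM("ok")
    answer_with_context(llm, "Which index?", ["chunk one", "chunk two"])
    assert "chunk one\nchunk two" in llm.prompts[0]
    assert llm.prompts[0].endswith("Question: Which index?")
```

Because the model is passed in as an argument, the same function works with a real chat model in production and with the stub in tests.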

🛡️ Security & Privacy

  • Keep API keys in .env and out of version control. Add .env to .gitignore.
  • If using LangSmith, review data retention and privacy settings. Mask or scrub sensitive data before tracing if required by policy.
  • For production, consider self-hosting vector DB & observability if data residency is required.
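
Scrubbing can start with a few regular expressions applied to text before it reaches a traced function. The patterns below are illustrative assumptions, not a complete policy: they mask email addresses and values that follow names such as api_key, token, or secret. The scrub function is a hypothetical helper.

```python
import re

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Matches e.g. "API_KEY=abc", "token: abc", "secret = abc" (case-insensitive).
SECRET_RE = re.compile(
    r"((?:api[_-]?key|token|secret)\s*[=:]\s*)\S+", re.IGNORECASE
)


def scrub(text):
    """Mask emails and secret-looking values before logging or tracing."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    # Keep the key name and separator, replace only the value.
    return SECRET_RE.sub(r"\1[REDACTED]", text)


print(scrub("Contact jane@example.com, GOOGLE_API_KEY=abc123"))
```

Real deployments usually need patterns tailored to their own data (account numbers, names, internal IDs), so treat this as a starting point.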

🚀 Deployment Notes

  • Containerize with Docker (example Dockerfile below).
  • For production: use a managed vector DB (Milvus/Weaviate), model provider endpoints, and a secrets manager.
  • Serve API backends with a production server (for example, gunicorn managing uvicorn workers).

Example Dockerfile (skeleton)

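# Slim base image matching the Python 3.10+ prerequisite
FROM python:3.10-slim
WORKDIR /app
# Install dependencies first so this layer is cached between code changes
COPY requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt
# Copy the application source
COPY . /app
# Adjust the path if your entrypoint lives elsewhere (e.g., src/main.py)
CMD ["python", "main.py"]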

🔍 Troubleshooting

ImportError: Could not import ddgs (DuckDuckGoSearchRun)

If you see an error about ddgs when using DuckDuckGoSearchRun, run:

python -m pip install -U ddgs

Or inside your venv:

pip install -U ddgs

FAISS issues on Windows

faiss-cpu sometimes has wheel compatibility issues on Windows. If you run into errors, consider:

  • Using WSL (Linux) for development, or
  • Using an alternative vector store (sqlite + embeddings) for local testing.

LangSmith tracing not appearing

  • Ensure LANGSMITH_TRACING=true and LANGSMITH_API_KEY are set in your environment and that the LangSmith wrapper is enabled in your runtime.

✅ Contribution Guidelines

  • Follow git feature-branch workflow.
  • Write unit tests for new functionality.
  • Run black and pylint before submitting a PR.

📄 License

This project uses the MIT License (or choose your preferred license). Include LICENSE in your repo.


📦 requirements.txt

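Use the following dependencies for a reproducible environment. Core packages are pinned to exact versions; the rest specify minimum versions. The ddgs package required by DuckDuckGoSearchRun is not listed here; install it separately (see Troubleshooting).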

# ============================================
# LangGraph Chatbot Dependencies
# ============================================

# Core Framework
langgraph==0.2.28
langchain==0.2.1
langchain-core==0.2.38
langchain-community==0.2.0

# LLM Integration
langchain-google-genai==1.0.10
google-generativeai==0.7.2

# Vector Database & Document Processing
faiss-cpu==1.8.0
pypdf==4.0.1

# Tools & Utilities
python-dotenv==1.0.0
aiohttp==3.13.3
pydantic>=2.7.4
requests>=2.31.0

# Web Framework (Streamlit)
streamlit>=1.38.0
streamlit-chat>=0.1.1

# Optional: For better async support
greenlet>=3.3.1

# Development/Testing (optional)
pytest>=7.4.3
black>=24.1.0
pylint>=3.0.3

📚 Resources & References

  • LangChain docs & LangSmith guides (tracing, observability, quickstarts).
  • LangGraph documentation for agent graphs.

🙋 Need help?

I can:

  • Generate Dockerfile, docker-compose.yml, and deployment docs.
  • Create a Streamlit demo app (app.py) wired to this stack.
  • Add CI workflow (GitHub Actions) including tests and linting.
