A K-12 focused Subject Matter Expert (SME) agent for Climate and Weather education, built with advanced RAG, hierarchical chunking, and safety guardrails.
This project implements a Retrieval-Augmented Generation (RAG) system that acts as a tutor and curriculum designer. It goes beyond simple "chat with PDF" by using an "Index Small, Retrieve Big" strategy: search runs over small chunks for precision, and the LLM receives the larger surrounding chunk for context.
- Hierarchical Retrieval ("Index Small, Retrieve Big"):
- Documents are split into Parent (2048 tokens), Child (512 tokens), and Grandchild (128 tokens) chunks.
- Only Grandchild chunks are indexed for high-precision vector search.
- When a match is found, the system automatically retrieves the Parent chunk to provide full context to the LLM.
- Advanced RAG Pipeline:
- Hybrid Search: Combines dense vector search (FAISS + BGE large) with keyword search.
- Reranking: Uses a cross-encoder (BAAI/bge-reranker-large) to re-score top results for maximum relevance.
- K-12 Education Focused:
- Difficulty Adaptation: Automatically adjusts responses for K6-8, K9-10, or K11-12 levels.
- Curriculum Generation: Uses an LLM agent to design structured lesson plans and export them as PDF, DOCX, or PPTX.
- Safety Guardrails:
- Input sanitization to prevent prompt injections.
- Output moderation to ensure content is safe for students.
- Model Agnostic:
- Supports Local LLMs (Ollama) for privacy and zero cost.
- Supports Cloud LLMs (OpenAI, Anthropic, Gemini) for higher reasoning capabilities.
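The "Index Small, Retrieve Big" idea from the feature list can be sketched in a few lines. This is a minimal illustration, not the project's `src/chunking.py`: the `Chunk` class and the helper names are hypothetical, and whitespace-separated words stand in for model tokens.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Chunk:
    chunk_id: str
    text: str
    parent_id: Optional[str]  # None for top-level (Parent) chunks
    level: str                # "parent", "child", or "grandchild"


def split_words(text: str, size: int) -> List[str]:
    """Split text into pieces of at most `size` words (a stand-in for tokens)."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def build_hierarchy(text: str, sizes=(2048, 512, 128)) -> Dict[str, Chunk]:
    """Build Parent -> Child -> Grandchild chunks, each linked to its parent."""
    levels = ("parent", "child", "grandchild")
    store: Dict[str, Chunk] = {}

    def recurse(piece: str, depth: int, parent_id: Optional[str], prefix: str) -> None:
        for i, part in enumerate(split_words(piece, sizes[depth])):
            chunk_id = f"{prefix}{levels[depth][0]}{i}"
            store[chunk_id] = Chunk(chunk_id, part, parent_id, levels[depth])
            if depth + 1 < len(sizes):
                recurse(part, depth + 1, chunk_id, chunk_id + "/")

    recurse(text, 0, None, "")
    return store


def indexable(store: Dict[str, Chunk]) -> List[Chunk]:
    """Only the smallest chunks are embedded and searched."""
    return [c for c in store.values() if c.level == "grandchild"]


def top_level_parent(store: Dict[str, Chunk], chunk_id: str) -> Chunk:
    """Walk parent links upward from a matched chunk to its Parent chunk."""
    chunk = store[chunk_id]
    while chunk.parent_id is not None:
        chunk = store[chunk.parent_id]
    return chunk
```

In a pipeline like the one described, the vector search would return a grandchild ID, and `top_level_parent` would supply the text that is actually passed to the LLM.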
The system is designed around a decoupled indexing/retrieval strategy:
- Ingestion: Multi-format support (PDF, DOCX, PPTX, TXT) with src/preprocessing.py.
- Chunking: Recursive character splitting with metadata propagation in src/chunking.py.
- Indexing: src/indexing.py builds a FAISS index of grandchild chunks using BAAI/bge-large-en-v1.5.
- Routing: A hybrid intent classifier (Regex + LLM) routes queries to the appropriate handler (Chat, RAG, Curriculum, Tools).
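One plausible shape for the hybrid (Regex + LLM) routing step is to try cheap regex rules first and fall back to an LLM only when none match. The sketch below assumes that ordering; the patterns, the `classify_intent` name, and the `llm_classify` callable are all hypothetical and not taken from the project's code.

```python
import re
from typing import Callable, Optional

# Hypothetical rules; the project's real patterns are not shown in this README.
RULES = [
    ("curriculum", re.compile(r"\b(lesson plan|curriculum|syllabus)\b", re.I)),
    ("tools", re.compile(r"\b(calculate|convert)\b|\d\s*[-+*/]\s*\d", re.I)),
    ("chat", re.compile(r"^\s*(hi|hello|hey|thanks|thank you)\b", re.I)),
]
HANDLERS = {"chat", "rag", "curriculum", "tools"}


def classify_intent(query: str,
                    llm_classify: Optional[Callable[[str], str]] = None) -> str:
    """Return one of the handler names for a user query."""
    # 1. Fast path: the first matching regex rule wins.
    for intent, pattern in RULES:
        if pattern.search(query):
            return intent
    # 2. Slow path: ask an LLM-backed classifier, if one was provided,
    #    and accept its answer only when it names a known handler.
    if llm_classify is not None:
        label = llm_classify(query).strip().lower()
        if label in HANDLERS:
            return label
    # 3. Default: treat the query as a knowledge question for RAG.
    return "rag"
```

Defaulting to RAG keeps an unrecognised or malformed LLM label from sending a student question to the wrong handler.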
- Python 3.8+
- Ollama (optional, for local LLMs)
- Clone the repository and install dependencies:
pip install -r requirements.txt
- (Optional) If using a local LLM, pull the model:
ollama pull llama3.2
Process the documents in data/ and build the vector index:
python build_index.py
This handles hierarchical chunking, embedding generation, and FAISS indexing.
Start the chat interface:
python climate_sme_agent.py --interactive
Commands inside chat:
- /difficulty [basic|intermediate|advanced] - Set the target audience level.
- /tools - List available tools (Calculator, Source Lookup, etc.).
- /quit - Exit.
Single Query:
python climate_sme_agent.py --query "Explain the greenhouse effect" --difficulty basic
Curriculum Generation: Generate a 6-week lesson plan on hurricanes and export to PowerPoint:
python climate_sme_agent.py \
--curriculum-topics "hurricanes,severe storms" \
--curriculum-grade K6-8 \
--curriculum-format pptx \
--curriculum-weeks 6 \
--curriculum-output my_lesson_plan
Demo Mode: Run a pre-scripted demonstration of capabilities:
python climate_sme_agent.py --demo
Settings are managed in src/config.py.
You can switch providers via command line arguments or environment variables.
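As a rough sketch of how command-line flags and environment variables might be combined, the function below reads `--llm-provider` and `--llm-model`, falls back to environment variables, and checks that the matching API key is set. The `LLM_PROVIDER` and `LLM_MODEL` variable names, the `ollama`/`llama3.2` defaults, and the `resolve_llm` helper are assumptions for illustration; the project's actual logic lives in its own configuration code.

```python
import argparse
import os
from typing import List, Mapping, Optional, Tuple

# API-key variables named in this README; Ollama runs locally and needs none.
REQUIRED_KEY = {"openai": "OPENAI_API_KEY", "anthropic": "ANTHROPIC_API_KEY"}


def resolve_llm(argv: List[str],
                env: Optional[Mapping[str, str]] = None) -> Tuple[str, str]:
    """Pick (provider, model): CLI flags win, then environment, then defaults."""
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser()
    # LLM_PROVIDER and LLM_MODEL are hypothetical variable names.
    parser.add_argument("--llm-provider", default=env.get("LLM_PROVIDER", "ollama"))
    parser.add_argument("--llm-model", default=env.get("LLM_MODEL", "llama3.2"))
    args = parser.parse_args(argv)

    # Fail early if a cloud provider is selected without its API key.
    key_var = REQUIRED_KEY.get(args.llm_provider)
    if key_var and not env.get(key_var):
        raise ValueError(f"{key_var} must be set for provider {args.llm_provider!r}")
    return args.llm_provider, args.llm_model
```

Checking the key before any request is made turns a missing credential into a clear startup error instead of a failed API call mid-conversation.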
Local (Ollama):
python climate_sme_agent.py --llm-provider ollama --llm-model llama3.2
OpenAI (GPT-4, etc.):
export OPENAI_API_KEY="sk-..."
python climate_sme_agent.py --llm-provider openai --llm-model gpt-4
Anthropic (Claude):
export ANTHROPIC_API_KEY="sk-..."
python climate_sme_agent.py --llm-provider anthropic --llm-model claude-3-opus
.
├── src/
│ ├── preprocessing.py # content ingestion
│ ├── chunking.py # hierarchical splitter
│ ├── indexing.py # FAISS + Embedding logic
│ ├── sme.py # Difficulty adaptation & response logic
│ ├── agent.py # Orchestrator & Router
│ ├── llm.py # LLM Provider wrappers
│ └── tools.py # Calculator, Search tools
├── data/ # Source documents
├── build_index.py # Script: Docs -> Vector Store
├── query_index.py # Script: Test retrieval only
└── climate_sme_agent.py # Main Entry Point
The default knowledge base includes open educational resources:
- Open WA Weather and Climate Book
- NIOS Weather and Climate Chapters
- KS3 Weather and Climate Lessons