A full-stack TypeScript application demonstrating modern AI techniques including RAG (Retrieval Augmented Generation), fine-tuning, agents, and LLM observability with automated web scraping capabilities.
Before getting started, you'll need to set up the following services:
- **OpenAI API Key** (https://platform.openai.com/api-keys)
  - You'll need at least $5 in credits on your OpenAI account
  - Used for embeddings, chat completions, and fine-tuning
- **Pinecone API Key** (https://www.pinecone.io/)
  - Free tier available
  - Used for vector database storage and similarity search
- **Helicone API Key** (https://www.helicone.ai/)
  - Free tier available
  - Used for LLM observability and monitoring
Create a `.env` file in the root directory with these keys:

```env
OPENAI_API_KEY=your_openai_key_here
PINECONE_API_KEY=your_pinecone_key_here
HELICONE_API_KEY=your_helicone_key_here
PINECONE_INDEX=your_index_name
OPENAI_FINETUNED_MODEL=your_finetuned_model_id # optional
```
Before diving into the code, we highly recommend watching 3Blue1Brown's series on neural networks and embeddings to build intuition for how these systems work:
- Neural Networks Series - Visual introduction to neural networks
- But what is a GPT? - Understanding transformer architecture
- Visualizing Attention - How attention mechanisms work
- **Multi-Agent System**: two specialized agents for different content types:
  - **LinkedIn Agent**: uses a fine-tuned GPT-4 model to generate professional content for LinkedIn
  - **RAG Agent**: leverages the Pinecone vector database for RAG-based content analysis
- **Web Scraping**:
  - Extraction of articles from multiple sources
  - Bias detection and content structuring
  - Direct vectorization and storage in the Pinecone database
- **Training Pipeline**:
  - Scripts for fine-tuning data preparation
  - Cost estimation tools
  - Training job management
- **Observability**:
  - Integration with Helicone for LLM monitoring
  - Performance tracking
  - Usage analytics
- Frontend: Next.js, TypeScript, TailwindCSS
- Backend: Next.js API Routes
- AI/ML: OpenAI API, Pinecone Vector Database
- Web Scraping: Puppeteer
- Monitoring: Helicone
- Package Manager: Yarn
This repository serves as a practical guide for you to learn:
- **RAG Implementation**
  - Vector database integration with Pinecone
  - Semantic search capabilities
  - Automated web scraping
  - Context-aware responses using retrieved content
- **Fine-tuning**
  - Data preparation
  - Model training
  - Cost optimization
- **Agent Architecture**
  - Specialized agent design
  - Response handling
  - Agent response format
- **Web Scraping & Data Pipeline**
  - Intelligent content extraction
  - Automated bias detection
  - Content vectorization and storage
- **LLM Observability**
  - Performance monitoring
  - Usage tracking
  - Cost management
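The RAG flow listed above (retrieve relevant chunks, then generate a context-aware answer) can be sketched in a few lines. This is an illustrative sketch, not the repo's code: the `RetrievedChunk` shape and `buildRagPrompt` helper are assumptions, and the actual similarity search happens inside Pinecone.

```typescript
// A retrieved chunk as it might come back from a vector-database query.
// Field names are illustrative; the repo's metadata may differ.
interface RetrievedChunk {
  text: string;
  source: string;
  score: number; // cosine similarity, higher = more relevant
}

// Build a context-aware prompt from the top-scoring chunks.
function buildRagPrompt(
  question: string,
  chunks: RetrievedChunk[],
  maxChunks = 3
): string {
  const context = chunks
    .sort((a, b) => b.score - a.score)
    .slice(0, maxChunks)
    .map((c, i) => `[${i + 1}] (${c.source})\n${c.text}`)
    .join("\n\n");
  return `Answer using ONLY the context below. Cite sources as [n].\n\nContext:\n${context}\n\nQuestion: ${question}`;
}

const prompt = buildRagPrompt("What is RAG?", [
  { text: "RAG augments LLM prompts with retrieved documents.", source: "docs/rag.md", score: 0.91 },
  { text: "Pinecone stores embeddings for similarity search.", source: "docs/pinecone.md", score: 0.83 },
]);
console.log(prompt);
```

The resulting prompt is what gets sent to the chat completion call; the retrieval step only decides *what* goes into it.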
- **News Article Scraping & Vectorization**
  - The application uses Puppeteer to automatically scrape news articles from configured sources
  - Articles are processed to extract their content
  - Scraped content is automatically vectorized with OpenAI embeddings and stored in Pinecone
- **Manual Article Upload**
  - Navigate to `/scrape-content` to manually scrape URLs
  - Content is automatically vectorized and added to the Pinecone database
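The chunking step in this pipeline can be sketched as a pure function. Sizes and the `chunkText` name are illustrative; real pipelines often split on sentence or token boundaries rather than raw characters.

```typescript
// Split article text into overlapping chunks before embedding.
// Overlap preserves context that would otherwise be cut at chunk edges.
function chunkText(text: string, chunkSize = 200, overlap = 50): string[] {
  if (overlap >= chunkSize) throw new Error("overlap must be smaller than chunkSize");
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += chunkSize - overlap) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // last chunk reached the end
  }
  return chunks;
}

// 450 characters -> 3 chunks (200, 200, 150 chars) with 50-char overlap
const chunks = chunkText("a".repeat(450));
console.log(chunks.map((c) => c.length)); // [ 200, 200, 150 ]
```

Each chunk is then embedded and upserted to Pinecone, typically with metadata such as the source URL so answers can cite where content came from.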
Project structure:

```
mini-rag/
├── app/
│   ├── api/         # API routes
│   ├── libs/        # Shared utilities
│   ├── scripts/     # Training and data scripts
│   └── page.tsx     # Main application
```
This app doesn't work yet. Your job is to build it from scratch by completing exercises and TODOs. When you're done, you'll have a fully functional AI-powered chat app with:
- RAG Agent - Chat with your knowledge base (technical docs, articles, etc.)
- LinkedIn Agent - Fine-tuned on Brian's LinkedIn posts to generate professional content
Step 1: Understand Vector Math (30 mins)
Before writing any code, build intuition for how embeddings work:
```bash
# Run the word arithmetic exercise
yarn exercise:word-math
```

This demonstrates "word math" like `king - man + woman ≈ queen`. Building this intuition is crucial for understanding RAG.
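The idea behind such exercises is vector arithmetic plus cosine similarity. A toy sketch with made-up 3-dimensional vectors (real embeddings have hundreds of dimensions, and the analogy only holds approximately):

```typescript
type Vec = number[];

const add = (a: Vec, b: Vec): Vec => a.map((x, i) => x + b[i]);
const sub = (a: Vec, b: Vec): Vec => a.map((x, i) => x - b[i]);

// Cosine similarity: 1 = same direction, 0 = unrelated, -1 = opposite.
function cosineSimilarity(a: Vec, b: Vec): number {
  const dot = a.reduce((s, x, i) => s + x * b[i], 0);
  const norm = (v: Vec) => Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return dot / (norm(a) * norm(b));
}

// Tiny hypothetical embeddings, constructed so the analogy holds.
const king = [0.9, 0.8, 0.1];
const man = [0.1, 0.9, 0.1];
const woman = [0.1, 0.1, 0.9];
const queen = [0.9, 0.0, 0.9];

const result = add(sub(king, man), woman); // king - man + woman
console.log(cosineSimilarity(result, queen).toFixed(3)); // ≈ 1.000
```

The same similarity measure is what Pinecone computes at scale when you query the index with the `cosine` metric.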
Step 2: Set Up Your Vector Database
Create a Pinecone index and configure your environment:
- Sign up at [Pinecone](https://www.pinecone.io/) (free tier)
- Create an index:
  - Name: `rag-tutorial`
  - Dimensions: `512`
  - Metric: `cosine`
- Add to `.env`:

  ```env
  OPENAI_API_KEY=sk-proj-...
  PINECONE_API_KEY=...
  PINECONE_INDEX=rag-tutorial
  ```

- Learn: Pinecone Quickstart
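The index settings above can also be captured in code. The config object below mirrors Step 2; the commented-out SDK call assumes the `@pinecone-database/pinecone` package and a serverless index, which you'd adjust to your account:

```typescript
// Index settings from Step 2. The dimension must match your embedding size:
// OpenAI's text-embedding-3-small accepts a `dimensions` parameter, so
// 512-dim vectors keep the index compact while staying compatible.
const indexConfig = {
  name: "rag-tutorial",
  dimension: 512,
  metric: "cosine" as const,
};

// With the official SDK (assumed: @pinecone-database/pinecone), the same
// index could be created programmatically instead of via the dashboard:
//
//   import { Pinecone } from "@pinecone-database/pinecone";
//   const pc = new Pinecone({ apiKey: process.env.PINECONE_API_KEY! });
//   await pc.createIndex({
//     ...indexConfig,
//     spec: { serverless: { cloud: "aws", region: "us-east-1" } },
//   });

console.log(`${indexConfig.name}: ${indexConfig.dimension}-dim (${indexConfig.metric})`);
```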
Step 3: Upload Knowledge Base to Pinecone
Scrape documentation and upload embeddings:
```bash
# Edit app/scripts/scrapeAndVectorizeContent.ts to add your URLs
# Then run:
yarn tsx app/scripts/scrapeAndVectorizeContent.ts
```

This will scrape the URLs, chunk the content, generate embeddings, and upload them to Pinecone.
- Learn: Chunking Strategies, Text Embeddings
Step 4: Train Your LinkedIn Agent
Fine-tune a model on Brian's LinkedIn posts:
```bash
# Generate training data from posts
yarn tsx app/scripts/generate-training-data.ts

# (Optional) Estimate cost before training
yarn tsx app/scripts/estimate-training-cost.ts

# Upload to OpenAI and start fine-tuning job
yarn tsx app/scripts/upload-training-data.ts
```

Once training completes (~10-20 mins), add the model ID to `.env`:

```env
OPENAI_FINETUNED_MODEL=ft:gpt-4o-mini-2024-07-18:personal::YOUR_ID
```

Step 5: Fix All The TODOs
Search the codebase for TODO comments - you'll find them in:
- `app/api/upload-document/route.ts` - Implement the document upload pipeline
- `app/libs/openai/agents/linkedin-agent.ts` - Complete the LinkedIn agent
- `app/libs/openai/agents/rag-agent.ts` - Build RAG retrieval and generation
- `app/libs/openai/agents/selector-agent.ts` - Create the agent router
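To make the router's job concrete, here is a hedged sketch of the contract the agent router needs to fulfill. The real implementation should ask an LLM (e.g. via structured outputs); the keyword heuristic below is only a stand-in so the shape is runnable without an API key:

```typescript
// The selector's output contract: which agent handles the message, and why.
type AgentName = "rag" | "linkedin";

interface SelectorResult {
  agent: AgentName;
  reason: string;
}

// Stand-in for the LLM call: a keyword heuristic that exercises
// the same routing contract the real selector would return.
function selectAgent(message: string): SelectorResult {
  const wantsLinkedIn = /linkedin|post|professional/i.test(message);
  return wantsLinkedIn
    ? { agent: "linkedin", reason: "User asked for LinkedIn-style content" }
    : { agent: "rag", reason: "Default: answer from the knowledge base" };
}

console.log(selectAgent("Write me a LinkedIn post about RAG").agent); // linkedin
console.log(selectAgent("How does vector search work?").agent);       // rag
```

Whatever mechanism you use, keeping the result a small typed object makes the router easy to test (see `yarn test:selector` below).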
Key concepts you'll implement:
- RAG: What is RAG?, Vector Similarity Search
- Agents: OpenAI Structured Outputs, Multi-Agent Systems
- Streaming: Vercel AI SDK, Server-Sent Events
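Server-Sent Events, which the streaming links above cover, are just a framed text protocol. A minimal encoder sketch (the frame format follows the SSE spec; this is not the repo's code, and the Vercel AI SDK normally handles framing for you):

```typescript
// Encode one Server-Sent Events frame: "data: <payload>\n\n".
// Browsers' EventSource (and SSE parsers generally) split on blank lines.
function sseFrame(payload: unknown): string {
  return `data: ${JSON.stringify(payload)}\n\n`;
}

// Streaming a completion token-by-token would emit frames like:
const tokens = ["Retrieval", " Augmented", " Generation"];
const stream = tokens.map((t) => sseFrame({ token: t })).join("");
console.log(stream.split("\n\n").filter(Boolean).length); // 3 frames
```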
Step 6: Run The App
```bash
yarn install
yarn dev
```

Visit http://localhost:3000 and test:
- Upload new documents (URLs or raw text)
- Ask technical questions (should use RAG agent)
- Request LinkedIn posts (should use fine-tuned agent)
Step 7: Run Tests
```bash
# Test your agent selector
yarn test:selector

# Test all implementations
yarn test
```

Tips:

- The `working_version` branch has the complete solution if you get stuck
- Use `console.log()` liberally to understand data flow
- Check the Pinecone dashboard to verify vectors are uploaded
- Use the Helicone dashboard to debug LLM calls and see costs
- Read the inline comments in TODO sections - they guide you step-by-step
Good luck! Figure it out. 🚀
This repo is a preview of the full hands-on program where we build AI applications together as a group. You'll get live support, code reviews, and build production-ready AI systems alongside other developers.