🎙️ Podcast Summarizer Agent

🚀 Overview

Podcast Summarizer Agent is an AI-powered application that automatically retrieves, transcribes, and summarizes podcast episodes. It combines LangChain for agent orchestration, TogetherAI for the language models, and MongoDB for storage.

🏗️ Architecture

The application uses LangGraph to define the agent workflow. The components interact as follows:

Core Components

Input Handler
- Receives and processes user queries
- Initializes the conversation state
- Routes requests to appropriate components
Agent Node
- Powered by TogetherAI
- Processes input using advanced language models
- Makes intelligent decisions about next actions
- Maintains conversation context
Tool Node
- YouTube Search: Finds relevant podcast episodes
- Transcript Retrieval: Extracts captions from videos
- Summary Generation: Creates concise episode summaries
- Database Operations: Manages data persistence
Conditional Routing
- Determines workflow progression
- Handles error cases and retries
- Manages conversation flow
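
The loop described by these components can be sketched in plain Python. This is only an illustration of the control flow (input handler, agent node, tool node, conditional routing), not the project's LangGraph code. Every name here (`AgentState`, `agent_node`, `tool_node`, `should_continue`, the stub tools) is hypothetical, and the "agent" is a fixed plan standing in for a TogetherAI model call.

```python
from typing import Callable, Dict, List, TypedDict


class AgentState(TypedDict):
    """Conversation state passed between nodes (hypothetical shape)."""
    messages: List[str]
    pending_tool: str  # tool the agent wants to call next, "" if none


def input_handler(query: str) -> AgentState:
    # Initialize the conversation state from the user query.
    return {"messages": [f"user: {query}"], "pending_tool": ""}


def agent_node(state: AgentState) -> AgentState:
    # Stand-in for the LLM: request a search, then a summary, then stop.
    tool_calls = sum(m.startswith("tool:") for m in state["messages"])
    plan = ["youtube_search", "summarize"]
    next_tool = plan[tool_calls] if tool_calls < len(plan) else ""
    messages = state["messages"] + [f"agent: next={next_tool or 'done'}"]
    return {"messages": messages, "pending_tool": next_tool}


# Stub tools; the real ones would call SerpAPI, the transcript API, MongoDB.
TOOLS: Dict[str, Callable[[], str]] = {
    "youtube_search": lambda: "found 1 episode",
    "summarize": lambda: "summary stored",
}


def tool_node(state: AgentState) -> AgentState:
    # Run the requested tool and append its result to the conversation.
    result = TOOLS[state["pending_tool"]]()
    return {"messages": state["messages"] + [f"tool: {result}"], "pending_tool": ""}


def should_continue(state: AgentState) -> str:
    # Conditional routing: go to the tool node or end the run.
    return "tools" if state["pending_tool"] else "end"


def run(query: str, max_steps: int = 10) -> AgentState:
    state = agent_node(input_handler(query))
    for _ in range(max_steps):  # cap the loop so a bad plan cannot spin forever
        if should_continue(state) == "end":
            break
        state = agent_node(tool_node(state))
    return state
```

Calling `run("latest episode")` alternates between the agent and tool nodes until the agent requests no further tool. In a LangGraph version, the same routing decision would typically be attached to the agent node as a conditional edge.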

✨ Key Features

🔍 Smart Search: Automated podcast discovery using SerpAPI
📝 AI-Powered Processing:
- Automatic transcription via YouTube Transcript API
- Intelligent summarization using TogetherAI
🗄️ Data Management:
- MongoDB integration for persistent storage
- Efficient metadata management
🤖 Advanced Workflow:
- LangChain-powered agent system
- Dynamic tool selection
- Error recovery mechanisms
🌐 Dual Interface:
- Agent runs as a scheduled cron job on Railway (railway.com)
- User-friendly web interface (Streamlit)

🛠️ Technology Stack

| Component | Technology | Purpose |
|---|---|---|
| Backend | Python 3.8+ | Core processing |
| AI Framework | LangChain | Agent orchestration |
| Database | MongoDB | Data storage |
| AI Models | TogetherAI | Content summarization |
| Search | SerpAPI | YouTube search |
| Transcription | YouTube Transcript API | Caption extraction |
| UI | Streamlit | Web interface |

📂 Project Structure

summarizer_container/
│── build_agent.py            # LangChain agent configuration
│── get_transcripts_tools.py  # YouTube search and transcript retrieval
│── prompt.py                 # Prompt templates and configurations
│── streamlit_app.py          # Streamlit web interface
│── mongo_functions.py        # MongoDB operations
│── env_variables.py          # Environment variable management
│── requirements.txt          # Python dependencies
│── Dockerfile                # Container configuration

🚀 Getting Started

Prerequisites

Python 3.8 or higher
MongoDB instance
Required credentials:
- TogetherAI API key
- SerpAPI API key
- MongoDB connection string

Installation

Clone the repository
Install dependencies:
```
pip install -r requirements.txt
```

Configure environment variables in .env file:

TOGETHER_API_KEY=your_key_here
SERPAPI_API_KEY=your_key_here
MONGODB_URI=your_connection_string
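
A startup check along the following lines can catch a missing key before the agent runs. It is a minimal sketch that reads the three variables listed above from the process environment. The function name `load_settings` is hypothetical and this is not necessarily how `env_variables.py` works. Loading the `.env` file into the environment is assumed to happen elsewhere.

```python
import os
from typing import Dict, Mapping

# Variable names taken from the .env example above.
REQUIRED_VARS = ("TOGETHER_API_KEY", "SERPAPI_API_KEY", "MONGODB_URI")


def load_settings(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return the required settings, or fail listing every missing name."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Reporting all missing names at once avoids fixing the configuration one variable at a time.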

Running the Application

Option 1: Streamlit Interface

streamlit run streamlit_app.py

Option 2: Docker Deployment

docker build -t podcast-summarizer .
docker run -p 5001:5001 podcast-summarizer

🔗 API Documentation

`/podcast_agent`

Method: GET
Purpose: Process podcast-related queries
Parameters:
- message: User input message
Response: JSON with agent's response
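
As a rough illustration, the endpoint could be called from Python using only the standard library. The base URL `http://localhost:5001` is an assumption taken from the port in the Docker example, and the helper names `build_agent_url` and `ask_agent` are hypothetical. The shape of the returned JSON is not documented here, so the sketch returns it unparsed beyond JSON decoding.

```python
import json
from typing import Any
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:5001"  # assumption: port from the Docker example


def build_agent_url(message: str, base_url: str = BASE_URL) -> str:
    # URL-encode the user message into the `message` query parameter.
    return f"{base_url}/podcast_agent?{urlencode({'message': message})}"


def ask_agent(message: str, base_url: str = BASE_URL, timeout: float = 60.0) -> Any:
    # Send the GET request and decode the JSON body of the response.
    with urlopen(build_agent_url(message, base_url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the service running, `ask_agent("Summarize the latest episode")` would return the decoded JSON response. A generous timeout is used because an agent run may involve several tool calls.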

`/healthcheck`

Method: GET
Purpose: Verify application status
Response: JSON with system status

💰 Resource Costs

| Service | Cost Model | Usage |
|---|---|---|
| TogetherAI | Pay-per-use | AI model calls |
| MongoDB | Free tier available | Data storage |
| SerpAPI | Pay-per-use | Search operations |

🔮 Roadmap

🎯 Mission: Making podcast content more accessible through AI-powered summarization.
