Accurately detect sponsor mentions and keywords in YouTube videos using AI transcription
This tool helps you find and count how often specific brands, sponsors, or keywords appear in YouTube videos without having to watch the entire video. Unlike tools that rely on YouTube's built-in captions (which are often inaccurate), this analyzer uses Whisper AI to create high-quality transcripts, so far fewer keyword mentions slip through.
- Downloads audio from any YouTube video
- Creates highly accurate transcripts using Whisper AI (much better than YouTube's automatic captions)
- Finds all mentions of sponsors or keywords in the transcript with precision
- Shows you exactly when and where sponsor mentions appear with timestamps
- Saves results to view later
- Works even on videos without captions or with poor automatic captions
YouTube's automatic captions often miss words, misspell names, and struggle with technical terms or accents. This means you might miss important sponsor mentions when using YouTube's transcripts.
By using state-of-the-art Whisper AI (through the Groq API), this tool delivers:
- Higher accuracy for brand names and technical terms
- Better handling of different accents and speech patterns
- More reliable timestamp information
- Detection of mentions that YouTube's captions would miss completely
This accuracy is crucial when you need to find every instance of a keyword or analyze how frequently a sponsor is mentioned.
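To make the idea of counting timestamped mentions concrete, here is a minimal sketch. It is not the tool's actual implementation: the `find_mentions` and `format_timestamp` helpers and the segment keys `start` and `text` are assumptions made for illustration, and the sample segments are invented.

```python
import re
from collections import Counter


def find_mentions(segments, keywords):
    """Count keyword mentions in timestamped transcript segments.

    `segments` is a list of dicts with the hypothetical keys
    "start" (seconds) and "text". Returns (counts, hits).
    """
    counts = Counter()
    hits = []
    for kw in keywords:
        # Whole-word, case-insensitive match.
        pattern = re.compile(r"\b" + re.escape(kw) + r"\b", re.IGNORECASE)
        for seg in segments:
            n = len(pattern.findall(seg["text"]))
            if n:
                counts[kw] += n
                hits.append((kw, seg["start"], seg["text"]))
    return counts, hits


def format_timestamp(seconds):
    """Format a number of seconds as HH:MM:SS."""
    seconds = int(seconds)
    return f"{seconds // 3600:02d}:{seconds % 3600 // 60:02d}:{seconds % 60:02d}"


# Invented sample data, for illustration only.
segments = [
    {"start": 12.0, "text": "Today's video is sponsored by Bybit."},
    {"start": 754.5, "text": "Sign up on the Bybit exchange, the exchange I use."},
]
counts, hits = find_mentions(segments, ["Bybit", "exchange"])
for kw, start, text in hits:
    print(f"[{format_timestamp(start)}] {kw}: {text}")
print(dict(counts))
```

Whole-word matching is one possible design choice; it avoids counting a keyword that only appears inside a longer word, at the cost of missing plural or compound forms.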
- Install Python (if you don't already have it)
  - Download from python.org (version 3.9 or newer)
  - During installation, check "Add Python to PATH"
- Download this tool
  - Download and unzip this repository to a folder on your computer
- Install required packages
  - Open Command Prompt (Windows) or Terminal (Mac)
  - Navigate to the folder where you unzipped the files
  - Run this command:
    pip install -r requirements.txt
- Set up your API key (needed for transcription)
  - Create a file named .env in the main folder
  - Add this line to the file, replacing YOUR_KEY with your Groq API key:
    GROQ_API_KEY=YOUR_KEY
  - Save the file
- Install FFmpeg (for audio processing)
  - Download from ffmpeg.org
  - Extract the files and place ffmpeg.exe and ffprobe.exe in the same folder as this tool
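Once the steps above are done, a quick preflight check can confirm that the API key and FFmpeg are where they should be. The script below is a hypothetical helper, not part of the tool: `find_tool`, `has_api_key`, and `check_setup` are names invented for this sketch, and it only looks in the tool folder, the process environment, and PATH.

```python
import os
import shutil


def find_tool(name, folder="."):
    """Look for an executable in `folder` first, then on PATH."""
    for candidate in (name, name + ".exe"):
        local = os.path.join(folder, candidate)
        if os.path.isfile(local):
            return local
    return shutil.which(name)


def has_api_key(env, folder="."):
    """True if GROQ_API_KEY is set in the environment or in folder/.env."""
    if env.get("GROQ_API_KEY"):
        return True
    env_file = os.path.join(folder, ".env")
    if not os.path.isfile(env_file):
        return False
    prefix = "GROQ_API_KEY="
    with open(env_file, encoding="utf-8") as fh:
        return any(
            line.strip().startswith(prefix) and len(line.strip()) > len(prefix)
            for line in fh
        )


def check_setup(env=None, folder="."):
    """Return a list of problems; an empty list means the setup looks ready."""
    env = os.environ if env is None else env
    problems = []
    if not has_api_key(env, folder):
        problems.append("GROQ_API_KEY not found in the environment or in .env")
    for tool in ("ffmpeg", "ffprobe"):
        if find_tool(tool, folder) is None:
            problems.append(f"{tool} not found in {folder!r} or on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_setup() or ["Setup looks complete"]:
        print(problem)
```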
Simply run:
python run_analysis.py --interactive
This will guide you through each step with simple questions:
- Enter a YouTube URL or video ID
- Enter keywords to search for (separated by commas)
- Choose transcript options
- Select save options (results are saved by default)
- Follow the prompts for remaining options
If you prefer typing a single command:
python run_analysis.py --url https://www.youtube.com/watch?v=VIDEO_ID --keywords "Bybit, exchange" --save_results
Replace:
- VIDEO_ID with the YouTube video ID (or use the full URL)
- "Bybit, exchange" with your sponsor names or keywords
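Because the tool accepts either a full URL or a bare video ID, along with a comma-separated keyword string, the input handling might look roughly like the sketch below. This is an illustration only: `extract_video_id` and `parse_keywords` are hypothetical names, and the rule that an ID is 11 characters of letters, digits, `-`, and `_` is an assumption based on how YouTube IDs commonly look.

```python
import re
from urllib.parse import urlparse, parse_qs

# Assumption: video IDs are 11 characters from this alphabet.
VIDEO_ID_RE = re.compile(r"^[A-Za-z0-9_-]{11}$")
YOUTUBE_HOSTS = ("youtube.com", "www.youtube.com", "m.youtube.com")


def extract_video_id(value):
    """Return the video ID from a URL or bare ID, or None if not recognized."""
    value = value.strip()
    if VIDEO_ID_RE.match(value):
        return value
    parsed = urlparse(value)
    host = (parsed.hostname or "").lower()
    candidate = None
    if host == "youtu.be":
        # Short links carry the ID as the first path component.
        candidate = parsed.path.lstrip("/").split("/")[0]
    elif host in YOUTUBE_HOSTS:
        # Watch links carry the ID in the "v" query parameter.
        candidate = parse_qs(parsed.query).get("v", [None])[0]
    if candidate and VIDEO_ID_RE.match(candidate):
        return candidate
    return None


def parse_keywords(raw):
    """Split a comma-separated keyword string and drop empty entries."""
    return [kw.strip() for kw in raw.split(",") if kw.strip()]


print(extract_video_id("https://www.youtube.com/watch?v=WFndPuge5yE"))
print(parse_keywords("Bybit, exchange"))
```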
When you run an analysis, two files are automatically saved:
- The formatted output you see on screen
- A raw data file that can be viewed again later
To view your saved results:
Run:
python view_analysis.py --interactive
This will:
- Show you a list of all your saved analyses
- Display each file with:
- The video ID
- Date of analysis
- File size
- Keywords that were analyzed
- Use arrow keys to select the file you want to view
- Press Enter to view the full results
You'll see a selection menu like:
? Select an analysis file to view:
» WFndPuge5yE (2025-03-15 10:37:06) - 0.31MB - Keywords: Bybit, exchange
dQw4w9WgXcQ (2023-04-15 12:34:56) - 0.25MB - Keywords: Binance, crypto, Bitcoin
[Specify a different file path]
If you know which file you want to view:
python view_analysis.py output/analysis_VIDEO_ID_TIMESTAMP_raw.json
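If you want to locate saved files yourself, the naming pattern above can be scanned with a few lines of Python. This is a sketch under assumptions, not the viewer's real code: `list_saved_analyses` is a hypothetical helper, the video ID is assumed to be the first 11 characters after `analysis_`, and everything between the ID and `_raw.json` is treated as the timestamp, whose exact format is not documented here.

```python
import glob
import os


def list_saved_analyses(output_dir="output"):
    """List raw analysis files in `output_dir` with basic metadata."""
    entries = []
    pattern = os.path.join(output_dir, "analysis_*_raw.json")
    for path in sorted(glob.glob(pattern)):
        name = os.path.basename(path)
        stem = name[len("analysis_"):-len("_raw.json")]
        # Assumption: an 11-character video ID, an underscore, then the timestamp.
        video_id, timestamp = stem[:11], stem[12:]
        entries.append({
            "path": path,
            "video_id": video_id,
            "timestamp": timestamp,
            "size_mb": round(os.path.getsize(path) / (1024 * 1024), 2),
        })
    return entries


if __name__ == "__main__":
    for entry in list_saved_analyses():
        print(f"{entry['video_id']} ({entry['timestamp']}) - {entry['size_mb']}MB")
```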
python run_analysis.py --interactive
Then follow the prompts:
- Enter URL: https://www.youtube.com/watch?v=WFndPuge5yE
- Enter keywords: Bybit, exchange
- Select options as prompted
To run the same analysis as a single command:
python run_analysis.py --url https://www.youtube.com/watch?v=WFndPuge5yE --keywords "Bybit, exchange" --save_results
To view the saved results afterwards:
python view_analysis.py --interactive
Then select the analysis you want to view from the list.
To get the results as JSON instead of text:
python run_analysis.py --url https://www.youtube.com/watch?v=VIDEO_ID --keywords "Bitcoin,Solana" --output json
To also save the transcript to a file:
python run_analysis.py --url https://www.youtube.com/watch?v=VIDEO_ID --keywords "Bybit,NordVPN" --save_transcript
"Missing API key" error
- Make sure you created the .env file with your API key
- Check that the file is in the main folder and has no other file extension
"FFmpeg not found" error
- Make sure ffmpeg.exe and ffprobe.exe are in the same folder as the tool
- If you're on Mac/Linux, install FFmpeg using your package manager
"No results found" for keywords
- Try different variations of your keywords (singular/plural forms)
- Check for typos in your keywords
- Try using simpler keywords
"'python' is not recognized" error
- Make sure Python is installed and added to your PATH
- Try using py instead of python on Windows
- Open a new Command Prompt window after installing Python
YouTube Download Issues
- The tool uses yt-dlp, which copes with YouTube's restrictions better than many other download libraries
- Some videos may have restrictions that prevent downloading
- If you see "403 Forbidden" errors, the tool will automatically retry with different methods
Audio Processing Issues
- For "Error splitting audio file" errors, try manually reducing the audio quality before processing
Need more help?
- Run the tool with --help for a list of all available options:
  python run_analysis.py --help
- Python 3.6 or higher
- A Groq API key (for Whisper API access)
- Internet connection
- ffmpeg and ffprobe (required for handling large files)
- Clone this repository:
git clone https://github.com/vanzan01/yt-whisper-analyzer.git
cd yt-whisper-analyzer
- Install dependencies:
pip install -r requirements.txt
- Set up your Groq API key using one of these methods:
  Option 1: Using .env file (recommended)
  Copy the template file and add your API key:
    cp .env.template .env
  Then edit the .env file and replace your_api_key_here with your actual Groq API key.
  Option 2: Environment variable
    # On Windows (Command Prompt)
    set GROQ_API_KEY=your_api_key_here
    # On Windows (PowerShell)
    $env:GROQ_API_KEY="your_api_key_here"
    # On macOS/Linux
    export GROQ_API_KEY=your_api_key_here
  Option 3: Command-line parameter
  Pass your API key directly using the --api_key parameter.
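With three ways to supply the key, it helps to picture how they might be combined. The sketch below is an assumption about precedence (command-line value first, then the environment variable, then the .env file), not a description of the tool's actual behavior. `read_env_file` and `resolve_api_key` are hypothetical helpers, and the small hand-written parser stands in for python-dotenv only to keep the example free of dependencies.

```python
import os


def read_env_file(path=".env"):
    """Parse simple KEY=VALUE lines from a .env file, ignoring comments."""
    values = {}
    if not os.path.isfile(path):
        return values
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve_api_key(cli_value=None, environ=None, env_path=".env"):
    """Pick the API key from the CLI, then the environment, then .env.

    The order is an assumption made for this sketch.
    """
    environ = os.environ if environ is None else environ
    return (
        cli_value
        or environ.get("GROQ_API_KEY")
        or read_env_file(env_path).get("GROQ_API_KEY")
        or None
    )


if __name__ == "__main__":
    key = resolve_api_key()
    print("API key found" if key else "Missing API key")
```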
Options for run_analysis.py:
- --url URL: YouTube video URL
- --video_id VIDEO_ID: YouTube video ID (alternative to URL)
- --keywords "word1, word2": Comma-separated list of keywords to analyze
- --output: Output format (text, json) [default: text]
- --model: Whisper model to use [default: from .env or "whisper-large-v3"]
- --api_key: Groq API key (alternatively, set in .env file or as environment variable)
- --save_transcript: Save transcript to file
- --save_results: Save analysis results to file
- --output_dir: Directory for output files [default: "output"]
- --transcript_file: Path to a local transcript file to use instead of downloading and transcribing
- --use_existing_transcript: Look for an existing transcript in the output directory before downloading and transcribing
- --interactive, -i: Run in interactive mode with guided setup and visual UI
- --no-banner: Hide the CryptoBanter ASCII art banner
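For readers who want to script around these flags, the option list above maps naturally onto an `argparse` parser. The sketch below mirrors the documented names and defaults, but it is only a reconstruction: whether the tool itself uses `argparse`, and how it validates the values, is an assumption.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the documented run_analysis.py options."""
    parser = argparse.ArgumentParser(prog="run_analysis.py")
    parser.add_argument("--url", help="YouTube video URL")
    parser.add_argument("--video_id", help="YouTube video ID (alternative to URL)")
    parser.add_argument("--keywords", help="Comma-separated list of keywords")
    parser.add_argument("--output", choices=["text", "json"], default="text")
    # None here means "fall back to .env or whisper-large-v3" later on.
    parser.add_argument("--model", default=None)
    parser.add_argument("--api_key", default=None)
    parser.add_argument("--save_transcript", action="store_true")
    parser.add_argument("--save_results", action="store_true")
    parser.add_argument("--output_dir", default="output")
    parser.add_argument("--transcript_file", default=None)
    parser.add_argument("--use_existing_transcript", action="store_true")
    parser.add_argument("--interactive", "-i", action="store_true")
    parser.add_argument("--no-banner", action="store_true")
    return parser


args = build_parser().parse_args(
    ["--url", "https://www.youtube.com/watch?v=VIDEO_ID",
     "--keywords", "Bybit, exchange", "--save_results"]
)
print(args.output, args.output_dir, args.save_results)
```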
Options for view_analysis.py:
- --interactive: Select from saved files
- file: Direct path to a specific file
You can configure default settings in the .env file:
# Groq API Key
GROQ_API_KEY=your_api_key_here
# Default Whisper Model
WHISPER_MODEL=whisper-large-v3
# Optional: Custom paths to ffmpeg and ffprobe
FFMPEG_PATH=/path/to/ffmpeg
FFPROBE_PATH=/path/to/ffprobe
Available models:
- whisper-large-v3: Best accuracy, multilingual support, most detailed
- whisper-large-v3-turbo: Good balance of speed and accuracy
- distil-whisper-large-v3-en: Fastest option, English-only
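The settings above could be read with their fallbacks as in the following sketch. `load_settings` is a hypothetical helper, and rejecting model names outside the three listed is an assumption made here for illustration; the tool may simply pass the value through to the API.

```python
import os
import shutil

AVAILABLE_MODELS = (
    "whisper-large-v3",
    "whisper-large-v3-turbo",
    "distil-whisper-large-v3-en",
)
DEFAULT_MODEL = "whisper-large-v3"


def load_settings(environ=None):
    """Read model and ffmpeg settings, falling back to defaults and PATH."""
    environ = os.environ if environ is None else environ
    model = environ.get("WHISPER_MODEL") or DEFAULT_MODEL
    if model not in AVAILABLE_MODELS:
        raise ValueError(f"Unknown model {model!r}; expected one of {AVAILABLE_MODELS}")
    return {
        "model": model,
        # Custom paths are optional; otherwise search PATH.
        "ffmpeg": environ.get("FFMPEG_PATH") or shutil.which("ffmpeg"),
        "ffprobe": environ.get("FFPROBE_PATH") or shutil.which("ffprobe"),
    }


print(load_settings({"WHISPER_MODEL": "whisper-large-v3-turbo"})["model"])
```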
The tool uses yt-dlp to reliably download audio from YouTube videos, even when facing restrictions that would block other libraries.
For large audio files that exceed the Groq API's 25MB size limit:
- The file is automatically split into smaller chunks using ffmpeg
- Each chunk is transcribed separately
- The transcripts are combined into a single result
- If chunking fails, quality reduction is attempted
This process happens automatically and is transparent to the user.
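The chunk-and-merge idea can be illustrated with a small planning sketch. It is not the tool's code: `plan_chunks` and `merge_transcripts` are hypothetical helpers, the constant-bitrate assumption and the 10% safety margin are choices made for this example, and the actual splitting with ffmpeg is left out.

```python
import math

MAX_BYTES = 25 * 1024 * 1024  # the 25MB limit mentioned above


def plan_chunks(duration_s, size_bytes, max_bytes=MAX_BYTES, safety=0.9):
    """Return (start, end) ranges in seconds so each chunk fits the limit.

    Assumes a roughly constant bitrate, so size scales with duration.
    """
    if size_bytes <= max_bytes:
        return [(0.0, float(duration_s))]
    n = math.ceil(size_bytes / (max_bytes * safety))
    length = duration_s / n
    return [(i * length, min((i + 1) * length, duration_s)) for i in range(n)]


def merge_transcripts(chunks):
    """Combine per-chunk segments, shifting timestamps by each chunk's offset.

    `chunks` is a list of (offset_seconds, segments) pairs; each segment is a
    dict with the hypothetical keys "start", "end", and "text".
    """
    merged = []
    for offset, segments in chunks:
        for seg in segments:
            merged.append({
                "start": seg["start"] + offset,
                "end": seg["end"] + offset,
                "text": seg["text"],
            })
    return merged


# A one-hour, 60MB file needs three chunks with the defaults above.
print(plan_chunks(3600, 60 * 1024 * 1024))
```

Shifting each segment by its chunk's start offset is what keeps the timestamps in the combined transcript aligned with the original video.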
When transcribing large files, the tool shows:
- Visual progress bars for audio chunking
- Transcription progress indicators with estimated time remaining
- Colorful status messages for better visibility
- Clearly marked timestamps in results
- Python 3.6+
- groq
- python-dotenv
- yt-dlp
- pydub
- ffmpeg and ffprobe
- Internet connection
This project builds upon existing work from the banter-get-transcripts repository.
This project is licensed under the MIT License - see below for details.
Copyright (c) 2023 YouTube Whisper Analyzer Contributors
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
The MIT License is a permissive license that allows you to:
- Use this software for commercial purposes
- Modify the software as needed
- Distribute modified versions
- Use it privately or publicly
- Include it in your own projects (open source or commercial)
The only requirement is that you must include the same MIT License and copyright notice in any copies or substantial portions of the software that you distribute. You do not need to mention or credit this project in your documentation, presentations, videos, or other non-code contexts.
While the MIT license doesn't legally require it, we kindly ask that you consider giving credit to the YouTube Whisper Analyzer project when you use it in your work. A simple acknowledgment like "Powered by YouTube Whisper Analyzer" or a link to this repository in your documentation, description, or credits would be greatly appreciated. This helps increase awareness of the tool and contributes to building a community of users.
Example credit line:
This analysis was performed using YouTube Whisper Analyzer (https://github.com/vanzan01/yt-whisper-analyzer)
Thank you for your support!