A Chrome extension that captures audio from browser tabs and provides real-time transcription using Deepgram's Live Streaming API.
- Real-time transcription of tab audio using Deepgram's Nova-2 model
- Live streaming with interim results for instant feedback
- Session timer with pause/resume functionality for meeting duration tracking
- Recording controls - Start, pause/resume, and stop with clear visual states
- Export functionality - Copy to clipboard or download as Text/JSON/CSV
- Clean sidepanel interface with recording controls and status indicators
- Relative timestamps for easy navigation
- User-friendly error handling with helpful messages
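To illustrate how interim results can give instant feedback, here is a minimal sketch of merging live transcription messages into a transcript. The message fields (`type`, `is_final`, `start`, `channel.alternatives[0].transcript`) follow Deepgram's documented live response shape; the function names `createTranscript`, `applyResult`, and `render` are hypothetical and not taken from this project's source.

```javascript
// Sketch only: helper names are hypothetical, not this project's actual code.
function createTranscript() {
  return { finals: [], interim: '' };
}

// Fold one Deepgram live message into the transcript state.
function applyResult(state, message) {
  if (!message || message.type !== 'Results') return state; // ignore metadata and other events
  const alt = message.channel && message.channel.alternatives && message.channel.alternatives[0];
  const text = ((alt && alt.transcript) || '').trim();
  if (message.is_final) {
    // A final result replaces whatever interim text was on screen.
    if (text) state.finals.push({ start: message.start, text });
    state.interim = '';
  } else {
    // Interim results overwrite each other until a final one arrives.
    state.interim = text;
  }
  return state;
}

// Render finalized lines, followed by the pending interim line (if any).
function render(state) {
  const lines = state.finals.map((entry) => entry.text);
  if (state.interim) lines.push(state.interim + ' ...');
  return lines.join('\n');
}
```

Keeping interim text separate from finalized lines means a later, corrected result never leaves stale words behind in the transcript.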
This project uses the Deepgram Live Streaming API with the official Node.js SDK:
- API Documentation: developers.deepgram.com/reference/listen-live
- SDK Package: @deepgram/sdk v3.4.0
- Model: Nova-2 (latest Deepgram model for accuracy)
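For reference, the SDK ultimately talks to Deepgram's live endpoint over a WebSocket. The sketch below shows roughly what the equivalent raw request URL could look like with the Nova-2 model and interim results enabled. The helper name `buildListenUrl` is hypothetical, and the project itself may pass these options through the SDK rather than building a URL by hand.

```javascript
// Sketch only: `buildListenUrl` is a hypothetical helper, not part of @deepgram/sdk.
// It builds the query string for Deepgram's live streaming endpoint.
function buildListenUrl(options = {}) {
  const url = new URL('wss://api.deepgram.com/v1/listen');
  // Defaults mirror the features described above; callers can override or extend them.
  const params = { model: 'nova-2', interim_results: true, ...options };
  for (const [key, value] of Object.entries(params)) {
    url.searchParams.set(key, String(value));
  }
  return url.toString();
}

console.log(buildListenUrl({ punctuate: true }));
```

The API key is not part of the URL; it is sent in an `Authorization` header, which is one reason the key stays on the Node.js server instead of in the extension.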
Before running this application, ensure you have:
- Node.js (v16 or higher), available from nodejs.org
- npm (comes with Node.js)
- Google Chrome (v88 or higher)
- Deepgram API Key - Get free credits at console.deepgram.com
git clone https://github.com/debadyuti23/RealTime-Audio-Transcriptor.git
cd RealTime-Audio-Transcriptor
npm install
Create a .env file in the root directory:
PORT=3004
DEEPGRAM_API_KEY=your_actual_deepgram_api_key_here
Copy and paste your Deepgram API key from the Deepgram Console.
Important:
- The API key is shown only once during creation - copy it immediately
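A server can fail fast when the .env values are missing or still contain the placeholder. The following is a minimal sketch of such a check, assuming the two variables shown above; the function name `loadConfig` and the fallback to port 3004 are assumptions for illustration and may differ from what server.js actually does.

```javascript
// Sketch only: `loadConfig` is a hypothetical helper, not taken from server.js.
// Validates the PORT and DEEPGRAM_API_KEY values described above.
function loadConfig(env = process.env) {
  const apiKey = (env.DEEPGRAM_API_KEY || '').trim();
  if (!apiKey || apiKey === 'your_actual_deepgram_api_key_here') {
    throw new Error('DEEPGRAM_API_KEY is missing or still set to the placeholder in .env');
  }
  // Assumption: fall back to 3004 when PORT is not set.
  const port = Number(env.PORT ?? 3004);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`PORT must be an integer between 1 and 65535, got "${env.PORT}"`);
  }
  return { port, apiKey };
}
```

Calling this once at startup turns a confusing runtime failure into a clear message before the WebSocket server starts listening.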
npm start
You should see:
WebSocket server running on port 3004
Keep this terminal window open - the server must be running for transcription to work.
- Open Chrome and navigate to chrome://extensions/
- Enable "Developer mode" (toggle in top-right corner)
- Click "Load unpacked"
- Select the RealTime-Audio-Transcriptor folder
- The extension should appear in your extensions list
- Navigate to any webpage with audio content (YouTube, news sites, etc.)
- Click the extension icon in the Chrome toolbar (or pin it for easy access)
- The sidepanel will open on the right side showing connection status and session timer
- Click "Start Recording" to begin transcription - the timer will start automatically
- Audio from the current tab will be transcribed in real-time
- Use "Pause" to temporarily stop recording (timer pauses) and "Resume" to continue
- Click "Stop Recording" when done - timer resets to 00:00:00
- Use export options to save your transcript with session duration metadata
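The export step can be pictured with a small sketch that turns transcript entries into Text, JSON, or CSV with relative timestamps. The entry shape (`offsetMs`, `text`), the function names, and the exact output layout are assumptions for illustration; the extension's real export format may differ.

```javascript
// Sketch only: entry shape and helper names are hypothetical.
// Format a millisecond offset as a relative HH:MM:SS timestamp.
function formatOffset(ms) {
  const total = Math.floor(ms / 1000);
  const h = String(Math.floor(total / 3600)).padStart(2, '0');
  const m = String(Math.floor((total % 3600) / 60)).padStart(2, '0');
  const s = String(total % 60).padStart(2, '0');
  return `${h}:${m}:${s}`;
}

// Quote a CSV field only when it contains a quote, comma, or line break.
function csvField(value) {
  const text = String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// entries: [{ offsetMs, text }], format: 'text' | 'json' | 'csv'
function exportTranscript(entries, format, durationMs = 0) {
  switch (format) {
    case 'text':
      return entries.map((e) => `[${formatOffset(e.offsetMs)}] ${e.text}`).join('\n');
    case 'json':
      // Session duration is included as metadata alongside the entries.
      return JSON.stringify({ duration: formatOffset(durationMs), entries }, null, 2);
    case 'csv':
      return ['timestamp,text', ...entries.map((e) => `${formatOffset(e.offsetMs)},${csvField(e.text)}`)].join('\n');
    default:
      throw new Error(`Unsupported export format: ${format}`);
  }
}
```

Escaping quotes and commas in the CSV branch matters because spoken text frequently contains both.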
- Audio Source: Only captures audio from the active tab
- Session Timer: Displays in HH:MM:SS format, automatically starts with recording
- Pause/Resume: Use to take breaks during long sessions - timer pauses accordingly
- Permissions: Grant microphone/tab capture permissions when prompted
- Connection: Ensure stable internet for best transcription quality
- Export: Use Ctrl+C (or Cmd+C) to quickly copy transcript
- Status: Check connection status and session duration in the sidepanel header
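The timer behavior described in these tips (HH:MM:SS display, pausing with the recording, resetting on stop) could be implemented along the following lines. This is a sketch under assumptions: the class name `SessionTimer`, the helper `formatDuration`, and the injectable clock are illustrative choices, not the extension's actual code.

```javascript
// Sketch only: `SessionTimer` and `formatDuration` are hypothetical names.
function formatDuration(ms) {
  const total = Math.floor(ms / 1000);
  const pad = (n) => String(n).padStart(2, '0');
  return `${pad(Math.floor(total / 3600))}:${pad(Math.floor((total % 3600) / 60))}:${pad(total % 60)}`;
}

class SessionTimer {
  // `now` is injectable so the timer can be driven by a fake clock.
  constructor(now = () => Date.now()) {
    this.now = now;
    this.elapsedMs = 0;    // time accumulated before the current run
    this.startedAt = null; // start of the current run; null while paused or stopped
  }
  start() {
    if (this.startedAt === null) this.startedAt = this.now();
  }
  pause() {
    if (this.startedAt !== null) {
      this.elapsedMs += this.now() - this.startedAt;
      this.startedAt = null;
    }
  }
  resume() {
    this.start();
  }
  stop() {
    // Stopping resets the display to 00:00:00.
    this.elapsedMs = 0;
    this.startedAt = null;
  }
  get totalMs() {
    const running = this.startedAt === null ? 0 : this.now() - this.startedAt;
    return this.elapsedMs + running;
  }
  toString() {
    return formatDuration(this.totalMs);
  }
}
```

Accumulating elapsed time from timestamps, rather than counting interval ticks, keeps the displayed duration accurate even if the panel's UI updates are throttled.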
"Not connected to server"
- Ensure npm start is running in terminal
- Check if port 3004 is available
- Verify .env file exists with correct PORT
"Unable to start transcription service"
- Check your DEEPGRAM_API_KEY in .env
- Verify you have remaining API credits
- Ensure stable internet connection
"Extension has not been invoked for current page"
- Navigate to a regular webpage (not chrome:// pages)
- Click the extension icon to open sidepanel
- Grant necessary permissions when prompted
No audio being captured
- Ensure the tab has audio playing
- Check Chrome's tab audio indicator (speaker icon)
- Try refreshing the page and restarting recording
If you encounter persistent issues:
- Stop the server (Ctrl+C in terminal)
- Reload the extension in chrome://extensions/
- Restart the server with npm start
- Try again on a fresh webpage
- Frontend: Chrome Extension (Manifest V3)
- Backend: Node.js WebSocket server
- API: Deepgram Live Streaming API (Nova-2 model, free tier)
- Audio: WebM format, real-time streaming with pause/resume support
- Timer: JavaScript-based session duration tracking with pause functionality
- Styling: Modular CSS architecture with external stylesheets
- Permissions: tabCapture, activeTab, sidePanel, storage
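The audio path listed above (WebM chunks streamed in real time, with pause/resume) can be sketched with the standard MediaRecorder API. This is an illustration under assumptions: the function name `streamAudio`, the 250 ms chunk interval, and the injectable recorder constructor are hypothetical, and obtaining the tab's MediaStream (for example via the tabCapture permission) is left out.

```javascript
// Sketch only: `streamAudio` is a hypothetical helper, not this project's code.
// `stream` is a MediaStream of tab audio; `socket` is an open WebSocket to the local server.
function streamAudio(stream, socket, RecorderCtor = globalThis.MediaRecorder, timesliceMs = 250) {
  const recorder = new RecorderCtor(stream, { mimeType: 'audio/webm' });

  recorder.ondataavailable = (event) => {
    // Skip empty chunks and avoid sending on a socket that is not open (readyState 1 === OPEN).
    if (event.data && event.data.size > 0 && socket.readyState === 1) {
      socket.send(event.data);
    }
  };

  // Emit a chunk every `timesliceMs` milliseconds (assumed value).
  recorder.start(timesliceMs);

  return {
    pause() {
      if (recorder.state === 'recording') recorder.pause();
    },
    resume() {
      if (recorder.state === 'paused') recorder.resume();
    },
    stop() {
      if (recorder.state !== 'inactive') recorder.stop();
    },
  };
}
```

Checking `recorder.state` before each call avoids the `InvalidStateError` that MediaRecorder throws when, for example, `pause()` is called on an inactive recorder.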
RealTime-Audio-Transcriptor/
├── .env # Environment variables
├── package.json # Dependencies and scripts
├── server.js # WebSocket server
├── manifest.json # Chrome extension manifest
├── service-worker.js # Background service worker
├── sidepanel.html # Extension UI structure
├── sidepanel.css # Extension UI styles
├── sidepanel.js # Extension UI logic
└── icons/ # Extension icons
This project is for educational/demonstration purposes. Please comply with Deepgram's terms of service and applicable laws regarding audio recording.