A TypeScript application that transcribes audio files using OpenAI's Whisper API. Large audio files are split into chunks so that each upload stays within the API's 25 MB limit.

Features:
- Transcribes WAV and MP3 audio files using OpenAI's Whisper API
- Automatically splits large audio files into chunks
- Adds timestamps to the transcription
- Supports transcription options (language, prompt, temperature)
- Easy command-line interface

Prerequisites:
- Node.js (v14 or higher)
- FFmpeg installed on your system
- OpenAI API key
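
The timestamp feature listed above can be illustrated with a short sketch. The actual timestamp format used by this project is not documented here, so the `[HH:MM:SS]` prefix and the names `formatTimestamp`, `TranscribedChunk`, and `joinWithTimestamps` are assumptions chosen for illustration only.

```typescript
// Hypothetical shape of one transcribed chunk; not taken from the project source.
interface TranscribedChunk {
  startSeconds: number; // offset of the chunk within the original audio
  text: string;         // transcription text returned for that chunk
}

// Format a number of seconds as HH:MM:SS (negative values are clamped to zero).
function formatTimestamp(totalSeconds: number): string {
  const whole = Math.max(0, Math.floor(totalSeconds));
  const hours = Math.floor(whole / 3600);
  const minutes = Math.floor((whole % 3600) / 60);
  const seconds = whole % 60;
  return [hours, minutes, seconds]
    .map((n) => String(n).padStart(2, "0"))
    .join(":");
}

// Prefix each chunk's text with the timestamp at which the chunk starts.
function joinWithTimestamps(chunks: TranscribedChunk[]): string {
  return chunks
    .map((c) => `[${formatTimestamp(c.startSeconds)}] ${c.text.trim()}`)
    .join("\n");
}
```

With this sketch, a chunk starting 1200 seconds into the recording would be prefixed with `[00:20:00]`.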

Installation:
- Clone this repository:
  git clone https://github.com/gtofig/whisper-transcribe.git
  cd whisper-transcribe
- Install dependencies:
  npm install
- Create a .env file in the root directory with your OpenAI API key:
  OPENAI_API_KEY=your_openai_api_key_here
- Build the project:
  npm run build
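
As a rough illustration of how the API key from the `.env` file might be checked at startup, the sketch below fails early when the key is missing or still set to the placeholder value. The helper name `requireApiKey` is hypothetical, and the project's real startup code may handle this differently.

```typescript
// Hypothetical helper: validates that an API key is present in the given environment.
// The environment is passed in as a plain object so the function is easy to test.
function requireApiKey(env: Record<string, string | undefined>): string {
  const key = env["OPENAI_API_KEY"]?.trim();
  // Reject a missing or empty key, and the placeholder from the setup instructions.
  if (!key || key === "your_openai_api_key_here") {
    throw new Error(
      "OPENAI_API_KEY is not set. Add it to the .env file in the project root."
    );
  }
  return key;
}
```

In a Node.js entry point this would typically be called as `requireApiKey(process.env)` once the `.env` file has been loaded, for example by the `dotenv` package listed among the dependencies.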

Basic usage:
npm start -- -i /path/to/your/audio/file.mp3

The transcription will be saved to ./transcriptions/file_transcription.txt.
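
The output naming shown above (input `file.mp3` becomes `file_transcription.txt` in the output directory) can be sketched as follows. The function name `transcriptionPath` is hypothetical, and the sketch uses plain string handling rather than whatever path utilities the project itself relies on.

```typescript
// Hypothetical helper: derives the transcription file path from the input audio path.
function transcriptionPath(
  inputPath: string,
  outputDir: string = "./transcriptions"
): string {
  // Take the last path segment, accepting both "/" and "\" separators.
  const fileName = inputPath.split(/[\\/]/).pop() ?? "";
  // Drop the final extension, if any (for example ".mp3" or ".wav").
  const dot = fileName.lastIndexOf(".");
  const baseName = dot > 0 ? fileName.slice(0, dot) : fileName;
  if (baseName === "") {
    throw new Error(`Cannot derive a file name from "${inputPath}"`);
  }
  // Remove trailing separators so the result has exactly one between the parts.
  const dir = outputDir.replace(/[\\/]+$/, "");
  return `${dir}/${baseName}_transcription.txt`;
}
```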

Usage with all options:
npm start -- \
  -i /path/to/your/audio/file.mp3 \
  -o /path/to/output/directory \
  -m 20 \
  -l en \
  -p "This is a discussion about technology." \
  -t 0.2

Options:
- -i, --input: Path to the input audio file (required)
- -o, --output: Output directory for transcriptions (default: ./transcriptions)
- -m, --maxChunkSize: Maximum chunk size in MB (default: 25)
- -l, --language: Language of the audio (ISO-639-1 code)
- -p, --prompt: Prompt to guide the transcription
- -t, --temperature: Temperature for the OpenAI API (default: 0)

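
To give a sense of how `--maxChunkSize` could translate into a splitting plan, here is a minimal sketch. It is not the project's actual splitting logic: the names `ChunkPlan` and `planChunks` are hypothetical, and the sketch assumes a roughly constant bitrate (so that equal durations give roughly equal file sizes) and treats 1 MB as 1024 × 1024 bytes.

```typescript
// Hypothetical description of one chunk to cut from the source audio.
interface ChunkPlan {
  index: number;
  startSeconds: number;
  durationSeconds: number;
}

// Split the audio into the smallest number of equal-duration chunks such that
// each chunk's estimated size stays at or below the limit.
// Assumption: size scales linearly with duration (constant bitrate).
function planChunks(
  fileSizeBytes: number,
  durationSeconds: number,
  maxChunkSizeMB: number = 25
): ChunkPlan[] {
  if (fileSizeBytes <= 0 || durationSeconds <= 0 || maxChunkSizeMB <= 0) {
    throw new RangeError("File size, duration, and chunk size must be positive");
  }
  const maxChunkBytes = maxChunkSizeMB * 1024 * 1024;
  const chunkCount = Math.ceil(fileSizeBytes / maxChunkBytes);
  const chunkDuration = durationSeconds / chunkCount;
  return Array.from({ length: chunkCount }, (_, index) => ({
    index,
    startSeconds: index * chunkDuration,
    durationSeconds: chunkDuration,
  }));
}
```

Under these assumptions, a one-hour, 60 MB file with the default limit would be planned as three 20-minute chunks. For variable-bitrate audio the estimate is only approximate, which is one reason to pass a value below the limit, such as `-m 20`.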
For development with hot-reloading:
npm run dev -- -i /path/to/your/audio/file.mp3

Dependencies:
- openai: Official OpenAI API client
- fluent-ffmpeg: Node.js wrapper for FFmpeg
- fs-extra: Enhanced file system operations
- dotenv: Environment variable management
- yargs: Command-line argument parsing

License: MIT