OpenAI's general-purpose speech recognition model. Supports 99 languages, transcription, translation to English, and language identification. Six model sizes from tiny (39M params) to large (1550M params). Use for speech-to-text, podcast transcription, or multilingual audio processing. Best for robust, multilingual ASR.
# Whisper - Robust Speech Recognition

OpenAI's multilingual speech recognition model.

## When to use Whisper

**Use when:**
- Speech-to-text transcription (99 languages)
- Podcast/video transcription
- Meeting notes automation
- Translation to English
- Noisy audio transcription
- Multilingual audio processing

**Metrics:**
- 72,900+ GitHub stars
- 99 languages supported
- Trained on 680,000 hours of audio
- MIT License

**Use alternatives instead:**
- **AssemblyAI**: managed API, speaker diarization
- **Deepgram**: real-time streaming ASR
- **Google Speech-to-Text**: cloud-based managed service