

Voxtral Transcribe 2

Mistral's STT family — sub-200ms realtime + best-in-class batch at $0.003/min

Paid · DR 86

Voxtral Mini Transcribe V2: $0.003/min via Mistral API. Voxtral Realtime: $0.006/min via API, plus open weights on HuggingFace (Apache 2.0) for self-hosting. Test free in the Mistral Studio audio playground.

Visit Voxtral Transcribe 2

https://mistral.ai/news/voxtral-transcribe-2

About Voxtral Transcribe 2

Voxtral Transcribe 2 is Mistral AI's next-generation speech-to-text family, released February 4, 2026. It includes two models: Voxtral Mini Transcribe V2 (batch transcription with speaker diarization, $0.003/min) and Voxtral Realtime (live transcription with latency configurable to under 200 ms, $0.006/min). Voxtral Realtime is released as open weights under Apache 2.0. The family is reported to outperform GPT-4o mini Transcribe, Gemini 2.5 Flash, AssemblyAI Universal, and Deepgram Nova on accuracy. It supports 13 languages.
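As a rough illustration of the per-minute pricing quoted above, the sketch below estimates the cost of a recording. The two rates come from this listing. The model labels, the function name, and the assumption that billing is strictly linear in duration (no rounding up, no minimum charge) are hypothetical and may not match Mistral's actual billing.

```python
from decimal import Decimal

# Per-minute rates in USD, as quoted in this listing.
# The dictionary keys are illustrative labels, not official model IDs.
RATES_PER_MIN = {
    "mini-transcribe-v2": Decimal("0.003"),  # batch
    "realtime": Decimal("0.006"),            # live streaming
}


def estimate_cost(model: str, audio_seconds: float) -> Decimal:
    """Estimate cost in USD, assuming billing is linear in audio duration."""
    if audio_seconds < 0:
        raise ValueError("audio_seconds must be non-negative")
    minutes = Decimal(str(audio_seconds)) / Decimal(60)
    return (RATES_PER_MIN[model] * minutes).quantize(Decimal("0.0001"))


if __name__ == "__main__":
    # A 3-hour recording transcribed in batch mode.
    print(estimate_cost("mini-transcribe-v2", 3 * 60 * 60))  # 0.5400
    # One hour of live streaming.
    print(estimate_cost("realtime", 60 * 60))  # 0.3600
```

Under these assumptions, a 3-hour batch job costs about $0.54 and an hour of realtime streaming about $0.36.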

Key Features

Two-model family: Mini Transcribe V2 (batch) + Realtime (live streaming)
Voxtral Realtime: latency configurable down to sub-200ms for voice agents
Voxtral Realtime: open weights under Apache 2.0 (4B parameters, edge-deployable)
Speaker diarization with precise start/end timestamps and speaker labels
Context biasing: up to 100 words/phrases for names, technical terms, domain vocabulary
Word-level timestamps for subtitle generation, audio search, content alignment
13 languages: English, Chinese, Hindi, Spanish, Arabic, French, Portuguese, Russian, German, Japanese, Korean, Italian, Dutch
Processes audio up to 3 hours in a single request
Outperforms GPT-4o mini Transcribe, Gemini 2.5 Flash, AssemblyAI Universal, Deepgram Nova on accuracy
3× faster than ElevenLabs Scribe v2 at one-fifth the cost
GDPR and HIPAA-compliant deployments via secure on-prem or private cloud
Audio playground in Mistral Studio — no code required to test
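
To make the subtitle-generation use case for word-level timestamps concrete, here is a minimal sketch that groups timed words into SRT cues. The input shape (a list of dicts with `text`, `start`, and `end` in seconds) is an assumption for illustration, not the documented Voxtral response format, and the grouping thresholds are arbitrary.

```python
def srt_time(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def words_to_srt(words: list[dict], max_words: int = 7, max_gap: float = 0.8) -> str:
    """Group words into cues; start a new cue after max_words or a long pause.

    `words` is a hypothetical shape: [{"text": str, "start": float, "end": float}].
    """
    cues: list[list[dict]] = []
    for w in words:
        if (
            cues
            and len(cues[-1]) < max_words
            and w["start"] - cues[-1][-1]["end"] <= max_gap
        ):
            cues[-1].append(w)  # continue the current cue
        else:
            cues.append([w])  # pause too long or cue full: open a new cue
    blocks = []
    for i, cue in enumerate(cues, start=1):
        text = " ".join(w["text"] for w in cue)
        span = f"{srt_time(cue[0]['start'])} --> {srt_time(cue[-1]['end'])}"
        blocks.append(f"{i}\n{span}\n{text}\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    sample = [
        {"text": "Hello", "start": 0.0, "end": 0.4},
        {"text": "world", "start": 0.5, "end": 0.9},
        {"text": "Again", "start": 3.0, "end": 3.5},
    ]
    print(words_to_srt(sample))
```

Splitting on pauses rather than on a fixed word count alone tends to keep cues aligned with natural phrasing, though real subtitle pipelines usually also cap line length and cue duration.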

Voxtral Transcribe 2 Pros & Cons

Pros

  • Sub-200ms latency on Voxtral Realtime unlocks voice agent use cases that batch-only models can't serve
  • Best price-performance in batch transcription: lowest word error rate at $0.003/min
  • Apache 2.0 open weights for Voxtral Realtime: privacy-first edge deployment without licensing cost
  • Speaker diarization is first-class, not an add-on, which is critical for meeting and call center use cases
  • Context biasing handles domain-specific vocabulary that generic models consistently get wrong
  • Single Mistral API key covers LLMs, TTS, and STT, which reduces vendor sprawl

⚠️ Cons

  • Voxtral Mini Transcribe V2 (the higher-accuracy batch model) is API-only — no open weights
  • Context biasing is English-optimized; multilingual context biasing is experimental
  • 13-language support is narrower than Whisper's broader language coverage
  • Newer than AssemblyAI and Deepgram — less ecosystem tooling and community resources at launch

Who Is Voxtral Transcribe 2 Best For?

  • Teams building voice agents or real-time transcription products needing sub-200ms latency
  • Contact centers transcribing calls in real time with speaker attribution and CRM sync
  • Meeting intelligence products requiring accurate diarization across multilingual recordings
  • Organizations with HIPAA/GDPR requirements needing on-prem or private cloud STT deployment
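
For the call center and meeting use cases above, diarized output typically needs light post-processing before it is useful for speaker attribution. The sketch below merges consecutive segments from the same speaker into turns and totals talk time per speaker. The segment shape (`speaker`, `start`, `end`, `text`) is a hypothetical one chosen for illustration, not the documented Voxtral response format.

```python
from collections import defaultdict


def merge_turns(segments: list[dict]) -> list[dict]:
    """Merge consecutive segments from the same speaker into one turn.

    `segments` is a hypothetical shape:
    [{"speaker": str, "start": float, "end": float, "text": str}]
    """
    turns: list[dict] = []
    for seg in sorted(segments, key=lambda s: s["start"]):
        if turns and turns[-1]["speaker"] == seg["speaker"]:
            turns[-1]["end"] = max(turns[-1]["end"], seg["end"])
            turns[-1]["text"] += " " + seg["text"]
        else:
            turns.append(dict(seg))  # copy so the input is not mutated
    return turns


def talk_time(turns: list[dict]) -> dict[str, float]:
    """Total speaking time in seconds for each speaker label."""
    totals: dict[str, float] = defaultdict(float)
    for t in turns:
        totals[t["speaker"]] += t["end"] - t["start"]
    return dict(totals)


if __name__ == "__main__":
    segments = [
        {"speaker": "A", "start": 0.0, "end": 2.0, "text": "Hi,"},
        {"speaker": "A", "start": 2.0, "end": 5.0, "text": "thanks for calling."},
        {"speaker": "B", "start": 5.0, "end": 8.0, "text": "Hello."},
        {"speaker": "A", "start": 8.0, "end": 9.0, "text": "Sure."},
    ]
    turns = merge_turns(segments)
    for t in turns:
        print(f"[{t['start']:.1f}-{t['end']:.1f}] {t['speaker']}: {t['text']}")
    print(talk_time(turns))
```

Per-speaker talk time is one of the simpler metrics a contact center might sync to a CRM. Mapping anonymous speaker labels to agent and customer identities is a separate step not covered here.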

Tags

mistral, speech-to-text, transcription, stt, voice ai, speaker diarization, realtime transcription, open source, apache 2.0, audio api, multilingual, voice agents