Part of 785+ curated AI tools on AISO
Mistral Small 4 logo

Mistral Small 4

Mistral's unified open-source model — reasoning + vision + coding, Apache 2.0

0
freemiumDR 86Open weights under Apache 2.0 license — free to download, self-host, fine-tune, and use commercially. Available via Mistral API (Mistral Small tier pricing) and Le Chat (free + Pro plans).View full pricing →

Visit Mistral Small 4

https://mistral.ai/news/mistral-small-4/

About Mistral Small 4

Mistral's first unified open-source model, released March 16, 2026. A 119B MoE model (6B active parameters per token) that merges reasoning (Magistral), multimodal vision (Pixtral), and agentic coding (Devstral) into a single Apache 2.0 model. 256k context window. 40% faster and 3× higher throughput than Mistral Small 3. Beats GPT-OSS 120B on coding and reasoning benchmarks while generating shorter outputs.

Key Features

119B total parameters, 6B active per token (MoE: 128 experts, 4 active)
256k token context window
Unified reasoning, vision, and coding in a single model
Configurable reasoning effort: reasoning_effort='none' (fast) or 'high' (deep)
Native image input support (text + vision in one model)
Apache 2.0 license — permissive commercial use, no additional restrictions
40% reduction in end-to-end latency vs Mistral Small 3
3× higher throughput vs Mistral Small 3 (throughput-optimized setup)
Beats GPT-OSS 120B on AA LCR and LiveCodeBench with shorter outputs
Runs on vLLM, llama.cpp, SGLang, and Transformers

Mistral Small 4 Pros & Cons

Pros

  • +Apache 2.0 license — genuinely open source with no commercial restrictions, unlike most frontier-class open models
  • +MoE architecture: 6B active parameters per token means inference is far cheaper than a dense 119B model
  • +Unified reasoning + vision + coding eliminates model routing overhead for multi-task pipelines
  • +Configurable reasoning_effort per API request — tune quality vs. latency without switching models
  • +40% faster than Small 3 and 3× higher throughput — a meaningful improvement for high-volume workloads
  • +Beats GPT-OSS 120B on coding and reasoning benchmarks while producing shorter, more efficient outputs
  • +Available on vLLM, llama.cpp, SGLang — integrates with existing inference infrastructure

⚠️ Cons

  • Self-hosting still requires serious GPU resources: minimum 4× HGX H100, 2× HGX H200, or 1× DGX B200
  • Newer and less battle-tested in production than GPT-4o or Claude Sonnet at scale
  • Vision benchmarks not fully published — Mistral hasn't released detailed Pixtral-equivalent image task scores
  • For maximum reasoning quality on hard math and research tasks, frontier-only models (o3, Claude Opus 4.8) still lead

Who Is Mistral Small 4 Best For?

👤Developers who want Apache 2.0 open-source LLM with frontier-adjacent capability
👤Regulated industries (healthcare, finance, government) needing on-premise deployment
👤Teams building multi-task agentic pipelines that combine reasoning, coding, and vision
👤Cost-sensitive workloads where the MoE efficiency advantage compounds at volume

Tags

mistralllmapiopen-sourceopen-weightscodingreasoningmultimodalmoeself-hosted
🏷️

Is this your tool?

Claim your listing to get a Featured badge, edit your description, and stand out from competitors. All plans include a permanent dofollow backlink to your site.

Claim Now →

📬 Get the best new AI tools delivered weekly

One concise email with fresh launches, trending picks, and featured standouts.

Alternatives to Mistral Small 4

View all Mistral Small 4 alternatives →