Mathstral 7B

Open-weight 7B math specialist from Mistral AI — STEM reasoning, MATH benchmark SOTA at release

Pricing: Free. Open weights on Hugging Face (mistralai/mathstral-7B-v0.1), free to download and self-host. Compatible with mistral-inference and mistral-finetune. No commercial API endpoint offered at release; self-hosting required.

Visit Mathstral 7B

https://mistral.ai/news/mathstral

About Mathstral 7B

Mistral AI's open-weight, math-specialized LLM, released in July 2024. Built on Mistral 7B, Mathstral scores 56.6% on the MATH benchmark and 63.47% on MMLU. With more inference-time compute over 64 candidates, its MATH score rises to 68.37% with majority voting and 74.59% with a strong reward model. Developed in collaboration with Project Numina to advance academic mathematical reasoning. Weights are available on Hugging Face under the Apache 2.0 license.

Key Features

56.6% on MATH benchmark — state-of-the-art in the 7B class at release (July 2024)
63.47% on MMLU overall, with strong gains on STEM subjects vs. Mistral 7B baseline
68.37% on MATH with majority voting and 74.59% with a strong reward model, each among 64 candidates
Built on Mistral 7B architecture — compatible with mistral-inference and mistral-finetune tooling
Instructed model fine-tuned for multi-step mathematical and logical reasoning
Produced in collaboration with Project Numina — research-grade academic use focus
GRE Math Subject Test evaluation curated by Professor Paul Bourdon (UVA)
Open weights under the Apache 2.0 license: use or fine-tune for STEM applications
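
The inference-time results in the feature list rely on sampling many candidate solutions and then selecting one. The sketch below illustrates the two selection strategies in general terms: majority voting over final answers, and reranking with a reward model. It is not Mistral's evaluation code. The sample answers and the `toy_scores` lookup that stands in for a reward model are hypothetical.

```python
from collections import Counter
from typing import Callable, Sequence


def majority_vote(answers: Sequence[str]) -> str:
    """Return the most frequent final answer among the sampled candidates."""
    if not answers:
        raise ValueError("need at least one candidate answer")
    return Counter(answers).most_common(1)[0][0]


def best_of_n(candidates: Sequence[str], score: Callable[[str], float]) -> str:
    """Return the candidate that the scoring function rates highest."""
    if not candidates:
        raise ValueError("need at least one candidate")
    return max(candidates, key=score)


# Hypothetical final answers extracted from five sampled solutions.
sampled_answers = ["42", "41", "42", "7", "42"]
voted = majority_vote(sampled_answers)

# Hypothetical stand-in for a reward model: a fixed score per answer.
toy_scores = {"42": 0.4, "41": 0.9, "7": 0.1}
reranked = best_of_n(sampled_answers, lambda answer: toy_scores[answer])

print("majority vote:", voted)
print("reward-model pick:", reranked)
```

The two strategies can disagree: voting trusts agreement between samples, while reranking trusts the scorer, so the quality of the reward model determines how much the second approach helps.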

Mathstral 7B Pros & Cons

Pros

  • Best open-weight 7B math performance at release: 56.6% on MATH is a meaningful step above base Mistral 7B
  • Scales well with inference-time compute: 74.59% on MATH with a reward model and 64 candidates
  • Free open weights on Hugging Face: no API costs, and you can fine-tune for custom math domains
  • Drop-in with existing Mistral 7B tooling (mistral-inference, mistral-finetune)
  • Developed with Project Numina; the academic collaboration adds credibility for research use

Cons

  • Superseded by larger math-capable models (DeepSeek-Math, Qwen-Math, Gemini 2.5 Flash) that significantly outperform it
  • No commercial API endpoint — self-hosting required, which demands GPU resources
  • Specialized for math/STEM; not a general-purpose chat or coding model
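  • Released under Apache 2.0, which permits commercial use, but the model is positioned for research; confirm the license terms on the model card before shipping a commercial fine-tune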
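
To put the self-hosting requirement in rough numbers, the sketch below estimates the memory needed just to hold the weights at several precisions. It assumes a round 7 billion parameters, which is an approximation (the exact count is on the model card), and it ignores activations and the KV cache, which add to the total.

```python
# Bytes per parameter for common weight formats.
BYTES_PER_PARAM = {"float32": 4.0, "bfloat16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


# Assumption: a round 7e9 parameters, not the exact published count.
NUM_PARAMS = 7e9
estimates = {dtype: weight_memory_gb(NUM_PARAMS, dtype) for dtype in BYTES_PER_PARAM}

for dtype, gigabytes in estimates.items():
    print(f"{dtype}: about {gigabytes:.1f} GB for weights")
```

Under these assumptions, half-precision weights alone need roughly 14 GB, which is why a quantized copy is the usual route on smaller GPUs.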

Who Is Mathstral 7B Best For?

  • Academic researchers building on open-weight math models for competition math or theorem proving
  • Teams fine-tuning a small STEM-specialized LLM where 7B GPU cost matters
  • Educational applications requiring step-by-step mathematical reasoning on a budget
  • Benchmarking or ablation studies comparing math LLM progress from 2024 baselines

Tags

mistral, open-source, math, stem, 7b, llm, self-hosted, huggingface, reasoning, academic
