PH
Open SourceText Generationby Microsoft Research

Phi-4

Phi-4 is Microsoft's 14B small language model that delivers reasoning and math performance rivaling far larger models, achieved through heavy use of high-quality synthetic 'textbook' training data. Open weights under MIT.

edge-aillmmicrosoft-researchopen-source-aiphi-4small-language-model
Quick facts
LicenseMIT
Params14B
StrengthReasoning
Context16K
No ratings yet — be the first
Params
14B
small but strong
Strength
Reasoning / math
STEM
Context
16K
tokens
License
MIT
open weights

What is Phi-4?

Phi-4 is a 14-billion-parameter small language model from Microsoft Research, the latest in the influential Phi series that champions the idea that data quality, not just scale, drives capability. Despite its modest size, Phi-4 delivers reasoning and especially math performance that rivals models several times larger, by training heavily on carefully curated and synthetic 'textbook-quality' data designed to teach reasoning rather than just memorise facts. It is released openly under the MIT licence.

How it was built

The Phi approach centres on data curation. Microsoft generates and filters large amounts of high-quality synthetic data — structured to be educational and reasoning-rich — alongside carefully selected web and organic content, so the model spends its limited capacity learning how to reason rather than absorbing noise. Phi-4 also benefits from refined training and post-training techniques, with a 16K-token context. The headline result is exceptional performance per parameter, particularly on STEM, math and reasoning benchmarks.

What it is good at

Phi-4 punches far above its weight on reasoning, mathematics, logic and STEM question answering, where it competes with much larger models. Its compact size makes it ideal for cost- and latency-sensitive applications, on-premise and edge deployment, and reasoning tasks where you do not want to pay for a frontier model. It is a strong choice as an efficient assistant base and for building reasoning-heavy features affordably.

Licensing & access

Phi-4 is open under the MIT licence — permissive for research and commercial use — with weights on Hugging Face, availability through Azure AI, and easy local runs via Ollama and Transformers. At 14B it runs on a single GPU (and quantised on consumer hardware), making strong reasoning genuinely accessible to run yourself rather than only via a hosted frontier API.

Practical considerations

Phi-4's strengths are reasoning and STEM; it is less focused on broad world knowledge, multilinguality or very long-form factual recall than larger generalist models, and like all LLMs it can hallucinate, so verify factual claims. Because so much training is synthetic, its knowledge cut-off and coverage differ from web-scale models. Use it where reasoning quality per dollar matters most, and pair it with retrieval for up-to-date facts.

How it compares

Phi-4 is the leading example of the 'small but smart' philosophy it helped popularise, building on the lineage that includes Microsoft's own Orca reasoning work. Against a similarly sized reasoning model like Orca 2, Phi-4 is more capable and more recent; against Mistral Small it offers comparable efficiency with a strong reasoning tilt. When you want maximum reasoning ability from a model you can run cheaply, Phi-4 is genuinely a standout in its weight class.

Getting started

Pull Phi-4 from Hugging Face or Ollama and prompt it on reasoning, math or STEM tasks to see its strength; use Transformers for fine-tuning or Azure AI for a managed endpoint. Run a quantised build to fit consumer GPUs, lean on it for reasoning-heavy features, and and combine it with a retrieval step whenever you need the current or niche factual knowledge that a reasoning-focused model like this may simply not hold.

Model variants

MOST POPULAR

Phi-4 14B

14B
Reasoning

Main release

MOST POPULAR

Phi-4-mini

3.8B
Small

Smaller, more efficient

Capabilities

🧠
Strong reasoning
Competes with far larger models on logic and multi-step reasoning.
🔢
Math and STEM
Particularly capable on quantitative and scientific problems for its size.
🪶
Efficient
14B parameters deliver high quality per dollar and fit a single GPU.
📚
Data-quality trained
Heavy use of curated synthetic 'textbook' data teaches reasoning, not just recall.

Pros & Cons

Pros6
  • Reasoning and math rivaling much larger models
  • Strong performance per parameter
  • Compact 14B — single-GPU friendly
  • Open MIT licence (commercial use)
  • Available on Hugging Face, Azure and Ollama
  • Great reasoning quality per dollar
Cons4
  • Less broad world knowledge than big generalists
  • Weaker on multilingual and long-form recall
  • Can hallucinate — verify facts
  • Pair with retrieval for current knowledge

Inspiration

Phi-4 use cases & project ideas

Reasoning tasks

Logic and multi-step problems.

Math & STEM

Solve quantitative questions.

Efficient assistant

A capable local chatbot.

On-prem reasoning

Run strong reasoning privately.

FAQ

Frequently asked questions

It achieves reasoning and math performance rivaling much larger models at just 14B parameters, by training heavily on high-quality synthetic data.

More to explore

You might also like

01
O2
Orca 2 13B
7B / 13B · Research Only