MODEL · OPENAI · 120B TOTAL / 5.1B ACTIVE
gpt-oss-120b
OpenAI's 120B MoE. Achieves near-parity with o4-mini on core reasoning benchmarks. MXFP4-native means weights ship at the working precision — no quality lost to a separate quant pass. Fits a single H100 80 GB or any 128 GB unified machine. The frontier-tier Apache-2.0 reasoning pick.
License: Apache 2.0 · Context: 128K · Released: August 5, 2025
The decision in five lines
- The call
- Skip for local — for coding
- Best for
- coding · chat · agents
- Runs on
- 3 hardware picks fit (cheapest: Framework Desktop (Ryzen AI Max+ 395) · $1,999)
- Watch out
- Also: fine-tuning workflows that don't support MXFP4 — the format's tooling is improving fast but not all frameworks have parity yet.
- Evidence
- Estimated
- 120B total
- PARAMETERS
- MOE
- TYPE
- 128K
- CONTEXT
- ~63 GB (native MXFP4 — no separate quant needed)
- VRAM AT Q4
Where we recommend this
Every tier slot in the planner where this model is a top or alternate pick. Pulled live from planner.js — when the planner refreshes, this table stays current.
The call
OpenAI's 120B MoE. Achieves near-parity with o4-mini on core reasoning benchmarks. MXFP4-native means weights ship at the working precision — no quality lost to a separate quant pass. Fits a single H100 80 GB or any 128 GB unified machine. The frontier-tier Apache-2.0 reasoning pick.
When not to use: Hardware below 64 GB effective. Also: fine-tuning workflows that don't support MXFP4 — the format's tooling is improving fast but not all frameworks have parity yet.
Runner notes
Ollama tag `gpt-oss:120b`. Reasoning-effort parameter (low/medium/high) configurable in-prompt — see OpenAI docs. vLLM + TensorRT-LLM mature paths; llama.cpp added MXFP4 in 2025.
Hardware that fits
Every hardware pick whose memory fits this model at the quant we recommend. Sorted cheapest-first — the top row is your best-value fit. Click through for the full buyer’s guide.
Next step
Find-by-model — see what hardware runs this→