the AI bench
VERIFIED JUNE 2026
All models

MODEL · OPENAI · 120B TOTAL / 5.1B ACTIVE

gpt-oss-120b

OpenAI's 120B MoE. Achieves near-parity with o4-mini on core reasoning benchmarks. MXFP4-native means weights ship at the working precision — no quality lost to a separate quant pass. Fits a single H100 80 GB or any 128 GB unified machine. The frontier-tier Apache-2.0 reasoning pick.

License: Apache 2.0 · Context: 128K · Released: August 5, 2025

The decision in five lines

The call
Skip for local — for coding
Best for
coding · chat · agents
Runs on
3 hardware picks fit (cheapest: Framework Desktop (Ryzen AI Max+ 395) · $1,999)
Watch out
Also: fine-tuning workflows that don't support MXFP4 — the format's tooling is improving fast but not all frameworks have parity yet.
Evidence
Estimated · last verified April 2026

120B total
PARAMETERS
MOE
TYPE
128K
CONTEXT
~63 GB (native MXFP4 — no separate quant needed)
VRAM AT Q4

Where we recommend this

Every tier slot in the planner where this model is a top or alternate pick. Pulled live from planner.js — when the planner refreshes, this table stays current.

CODING ·
gpt-oss-120b (Apache 2.0, MXFP4 ~63 GB)Near o4-mini reasoning at 5.1B active. MXFP4-native (no separate quant). Community reports 200+ tok/s on consumer hardware. Single 80 GB GPU or 128 GB unified.
CHAT ·
gpt-oss-120b (Apache 2.0)Near o4-mini on reasoning + tool use; MXFP4 native ~63 GB. Apache 2.0 — the open-weight frontier reasoning pick.
AGENTS ·
gpt-oss-120b (Apache 2.0)OpenAI Apache 2.0 reasoning + tool use; near o4-mini. MXFP4 native ~63 GB. The open-weight agentic frontier on 80 GB+ hardware.

The call

OpenAI's 120B MoE. Achieves near-parity with o4-mini on core reasoning benchmarks. MXFP4-native means weights ship at the working precision — no quality lost to a separate quant pass. Fits a single H100 80 GB or any 128 GB unified machine. The frontier-tier Apache-2.0 reasoning pick.

When not to use: Hardware below 64 GB effective. Also: fine-tuning workflows that don't support MXFP4 — the format's tooling is improving fast but not all frameworks have parity yet.

Runner notes

Ollama tag `gpt-oss:120b`. Reasoning-effort parameter (low/medium/high) configurable in-prompt — see OpenAI docs. vLLM + TensorRT-LLM mature paths; llama.cpp added MXFP4 in 2025.

License
Apache 2.0
Released
August 5, 2025
Maker
OpenAI

Hardware that fits

Every hardware pick whose memory fits this model at the quant we recommend. Sorted cheapest-first — the top row is your best-value fit. Click through for the full buyer’s guide.

Next step

Find-by-model — see what hardware runs this