gpt-oss-120b

OpenAI's 120B MoE. Achieves near-parity with o4-mini on core reasoning benchmarks. MXFP4-native means weights ship at the working precision — no quality lost to a separate quant pass. Fits a single H100 80 GB or any 128 GB unified machine. The frontier-tier Apache-2.0 reasoning pick.

License: Apache 2.0 · Context: 128K · Released: August 5, 2025

The decision in five lines

The call: Skip for local — for coding
Best for: coding · chat · agents
Runs on: 3 hardware picks fit (cheapest: Framework Desktop (Ryzen AI Max+ 395) · $1,999)
Watch out: Also: fine-tuning workflows that don't support MXFP4 — the format's tooling is improving fast but not all frameworks have parity yet.
Evidence: Estimated · last verified April 2026

120B total: PARAMETERS
MOE: TYPE
128K: CONTEXT
~63 GB (native MXFP4 — no separate quant needed): VRAM AT Q4

Where we recommend this

Every tier slot in the planner where this model is a top or alternate pick. Pulled live from planner.js — when the planner refreshes, this table stays current.

CODING ·

gpt-oss-120b (Apache 2.0, MXFP4 ~63 GB)Near o4-mini reasoning at 5.1B active. MXFP4-native (no separate quant). Community reports 200+ tok/s on consumer hardware. Single 80 GB GPU or 128 GB unified.

CHAT ·

gpt-oss-120b (Apache 2.0)Near o4-mini on reasoning + tool use; MXFP4 native ~63 GB. Apache 2.0 — the open-weight frontier reasoning pick.

AGENTS ·

gpt-oss-120b (Apache 2.0)OpenAI Apache 2.0 reasoning + tool use; near o4-mini. MXFP4 native ~63 GB. The open-weight agentic frontier on 80 GB+ hardware.

The call

OpenAI's 120B MoE. Achieves near-parity with o4-mini on core reasoning benchmarks. MXFP4-native means weights ship at the working precision — no quality lost to a separate quant pass. Fits a single H100 80 GB or any 128 GB unified machine. The frontier-tier Apache-2.0 reasoning pick.
When not to use: Hardware below 64 GB effective. Also: fine-tuning workflows that don't support MXFP4 — the format's tooling is improving fast but not all frameworks have parity yet.

Runner notes

Ollama tag `gpt-oss:120b`. Reasoning-effort parameter (low/medium/high) configurable in-prompt — see OpenAI docs. vLLM + TensorRT-LLM mature paths; llama.cpp added MXFP4 in 2025.

License: Apache 2.0
Released: August 5, 2025
Maker: OpenAI
Model card: huggingface.co/openai/gpt-oss-120b →

Hardware that fits

Every hardware pick whose memory fits this model at the quant we recommend. Sorted cheapest-first — the top row is your best-value fit. Click through for the full buyer’s guide.

Next step

Find-by-model — see what hardware runs this→