AMD Radeon RX 7900 XTX

The AMD answer, with an honest asterisk on driver friction.

24 GB at roughly 85–90% of a 4090's throughput under ROCm. The hardware is fine; the software ecosystem is the tax. Plan 5–10 hours on first-time ROCm setup, plus the ongoing friction of Ollama being patchy on AMD. New-market pricing has eased off its DRAM-crunch peak — new is back to ~$1,340 (from the ~$1,500 spike) while the used floor has firmed to ~$810.

The decision in five lines

The call: Buy — The AMD answer, with an honest asterisk on driver friction.
Best for: Team red
Runs well: Qwen3-Coder-30B-A3B (MoE, fits 24GB) · Qwen 3.5 35B-A3B (MoE, fits 24GB) · Gemma 4 31B (256K context)
Watch out: AMD post-EOL'd the 7900 XTX, so it is a used-first buy. The April new-retail spike (Amazon $2,374 outliers) has eased — new now runs ~$1,300–$1,350 where stocked, used eBay ~$810. Plan to source used or step to RDNA4 (RX 9070 XT) for fresh inventory.
Evidence: Measured · last verified June 2026

24: GB GDDR6
960: GB/S BANDWIDTH
355: W TDP
~$810: USED

What fits at this tier

Runs 24 GB-tier models at Q4 cleanly — MoE 30B-A3B, 14B dense, gpt-oss-20b — via vLLM or llama.cpp (HIP). Some MoE picks had HIP kernel issues through 2025; check llama.cpp release notes before pulling the latest.

CODING

Qwen3-Coder-30B-A3B (MoE, fits 24GB) 3B-active MoE — benchmark champion for local coding at this tier.

CHAT / GENERAL

Qwen 3.5 35B-A3B (MoE, fits 24GB) 3B active MoE — 30B quality at 3B inference speed.

DOCS & RETRIEVAL

Gemma 4 31B (256K context) 31B dense with 256K context; Gemma commercial-permissive terms; Arena top 5.

IMAGE

HiDream-O1-Image (8B, MIT) May 8, 2026 release. Pixel-space (no VAE, no disjoint text encoder) — debuted top-10 on Artificial Analysis T2I Arena. MIT-licensed 8B; one model handles T2I + edit + subject-driven personalization at up to 2,048².

AGENTS

Qwen 3.5 35B-A3B (MoE, fits 24GB) MoE with native tool use; fits 24GB at Q4; Apache 2.0.

VOICE

VoxCPM2 (2B, Apache 2.0) 30 languages, 48 kHz, tokenizer-free diffusion AR; voice design from text. April 2026 release.

The call

Buy it if you already run Linux, want 24 GB without paying NVIDIA's supply-crunch tax, and treat driver tinkering as acceptable friction rather than pain.
Skip it if your time is worth more than ~$50/hr — you will spend a day on ROCm setup and another day later the first time a new model doesn't JIT cleanly. Also skip if scarcity premiums make the RX 9070 XT (RDNA4, 16 GB at $649–$779) the cleaner AMD entry — the 9070 XT trades 8 GB of headroom for fresh architecture support and ROCm 7+ first-class WMMA.

Watchouts

AMD post-EOL'd the 7900 XTX, so it is a used-first buy. The April new-retail spike (Amazon $2,374 outliers) has eased — new now runs ~$1,300–$1,350 where stocked, used eBay ~$810. Plan to source used or step to RDNA4 (RX 9070 XT) for fresh inventory.
Budget 5–10 hours for first-time ROCm setup. Official AMD packages are the reliable path; distro repos lag by 6–12 months.
Ollama on AMD is still patchy. Use vLLM or llama.cpp with HIP for the reliable runner story.
MoE on ROCm is patchy: llama.cpp #19880 is closed, but #20024 + #20545 remain open on Qwen 3.5 35B-A3B with ROCm 7.2. The Vulkan backend side-steps these and outperforms ROCm on the 7900 XTX for MoE anyway.

Local vs cloud at this tier

● LOCAL WINS

24 GB for under $1,000, fully offline, no per-token cost. Hardware longevity on AMD is better than NVIDIA historically — less planned obsolescence.

● CLOUD WINS

No driver maintenance, no kernel compatibility issues, immediate model access the day they're released. For anyone whose time is billable, cloud looks compelling compared to the AMD software story.

At ~$810 used with regular usage, break-even vs ChatGPT Plus is ~30–36 months. The real question is whether your hourly rate makes ROCm debugging a net positive.

Next step

Load this setup into the planner→