Canonical Rankings

Best Macs for this model

GLM-5 ranked across the Mac lineup at the best practical quantization, using the best available runtime evidence. Historical baseline selected; model picker is focused on current-market choices.

29 ranked Macs. Use the strongest current runtime evidence for each row.
27 other historical models hidden. Static paths cover only canonical model pages; sort and quantization stay as query state.

Historical baseline selected: GLM-5. Default model choices remain current-market; other historical models stay hidden.

Rank | Mac | Score | Quant | Tok/s | Runtime | Fits | Headroom | Context | Evidence | Price | Why it ranks here
1 | Mac Studio M3 Ultra 256GB | 120 | IQ2_XS | 13.2 tok/s | MLX | Fits | 43.3 GB | 9k | Estimated | $7,499 | IQ2_XS is the current best practical quantization. 13.2 tok/s is estimated from nearby benchmark coverage. 43.3 GB headroom remains at this quantization.
2 | Mac Mini M4 16GB | 0 | F32 |  | MLX | No | -2795.1 GB |  | Estimated | $499 | GLM-5 does not fit on Mac Mini M4 16GB at the current practical quantization.
3 | Mac Mini M4 24GB | 0 | F32 |  | MLX | No | -2787.1 GB |  | Estimated | $599 | GLM-5 does not fit on Mac Mini M4 24GB at the current practical quantization.
4 | Mac Mini M4 32GB | 0 | F32 |  | MLX | No | -2779.1 GB |  | Estimated | $799 | GLM-5 does not fit on Mac Mini M4 32GB at the current practical quantization.
5 | MacBook Air M4 16GB 13-inch | 0 | F32 |  | MLX | No | -2795.1 GB |  | Estimated | $1,099 | GLM-5 does not fit on MacBook Air M4 16GB 13-inch at the current practical quantization.
6 | MacBook Air M4 24GB 13-inch | 0 | F32 |  | MLX | No | -2787.1 GB |  | Estimated | $1,299 | GLM-5 does not fit on MacBook Air M4 24GB 13-inch at the current practical quantization.
7 | MacBook Air M4 16GB 15-inch | 0 | F32 |  | MLX | No | -2795.1 GB |  | Estimated | $1,299 | GLM-5 does not fit on MacBook Air M4 16GB 15-inch at the current practical quantization.
8 | Mac Mini M4 Pro 24GB | 0 | F32 |  | MLX | No | -2787.1 GB |  | Estimated | $1,399 | GLM-5 does not fit on Mac Mini M4 Pro 24GB at the current practical quantization.
9 | MacBook Air M4 32GB 13-inch | 0 | F32 |  | MLX | No | -2779.1 GB |  | Estimated | $1,499 | GLM-5 does not fit on MacBook Air M4 32GB 13-inch at the current practical quantization.
10 | MacBook Air M4 24GB 15-inch | 0 | F32 |  | MLX | No | -2787.1 GB |  | Estimated | $1,499 | GLM-5 does not fit on MacBook Air M4 24GB 15-inch at the current practical quantization.
11 | Mac Mini M4 Pro 48GB | 0 | F32 |  | MLX | No | -2763.1 GB |  | Estimated | $1,599 | GLM-5 does not fit on Mac Mini M4 Pro 48GB at the current practical quantization.
12 | MacBook Air M4 32GB 15-inch | 0 | F32 |  | MLX | No | -2779.1 GB |  | Estimated | $1,699 | GLM-5 does not fit on MacBook Air M4 32GB 15-inch at the current practical quantization.
13 | MacBook Pro M4 Pro 24GB 14-inch | 0 | F32 |  | MLX | No | -2787.1 GB |  | Estimated | $1,999 | GLM-5 does not fit on MacBook Pro M4 Pro 24GB 14-inch at the current practical quantization.
14 | Mac Studio M4 Max 36GB | 0 | F32 |  | MLX | No | -2775.1 GB |  | Estimated | $1,999 | GLM-5 does not fit on Mac Studio M4 Max 36GB at the current practical quantization.
15 | MacBook Pro M4 Pro 48GB 14-inch | 0 | F32 |  | MLX | No | -2763.1 GB |  | Estimated | $2,499 | GLM-5 does not fit on MacBook Pro M4 Pro 48GB 14-inch at the current practical quantization.
16 | MacBook Pro M4 Pro 24GB 16-inch | 0 | F32 |  | MLX | No | -2787.1 GB |  | Estimated | $2,499 | GLM-5 does not fit on MacBook Pro M4 Pro 24GB 16-inch at the current practical quantization.
17 | Mac Studio M4 Max 48GB | 0 | F32 |  | MLX | No | -2763.1 GB |  | Estimated | $2,499 | GLM-5 does not fit on Mac Studio M4 Max 48GB at the current practical quantization.
18 | MacBook Pro M4 Max 36GB 14-inch | 0 | F32 |  | MLX | No | -2775.1 GB |  | Estimated | $2,999 | GLM-5 does not fit on MacBook Pro M4 Max 36GB 14-inch at the current practical quantization.
19 | MacBook Pro M4 Pro 48GB 16-inch | 0 | F32 |  | MLX | No | -2763.1 GB |  | Estimated | $2,999 | GLM-5 does not fit on MacBook Pro M4 Pro 48GB 16-inch at the current practical quantization.
20 | Mac Studio M4 Max 64GB | 0 | F32 |  | MLX | No | -2747.1 GB |  | Estimated | $2,999 | GLM-5 does not fit on Mac Studio M4 Max 64GB at the current practical quantization.
21 | MacBook Pro M4 Max 48GB 14-inch | 0 | F32 |  | MLX | No | -2763.1 GB |  | Estimated | $3,499 | GLM-5 does not fit on MacBook Pro M4 Max 48GB 14-inch at the current practical quantization.
22 | MacBook Pro M4 Max 36GB 16-inch | 0 | F32 |  | MLX | No | -2775.1 GB |  | Estimated | $3,499 | GLM-5 does not fit on MacBook Pro M4 Max 36GB 16-inch at the current practical quantization.
23 | MacBook Pro M4 Max 48GB 16-inch | 0 | F32 |  | MLX | No | -2763.1 GB |  | Estimated | $3,999 | GLM-5 does not fit on MacBook Pro M4 Max 48GB 16-inch at the current practical quantization.
24 | Mac Studio M3 Ultra 96GB | 0 | F32 |  | MLX | No | -2715.1 GB |  | Estimated | $3,999 | GLM-5 does not fit on Mac Studio M3 Ultra 96GB at the current practical quantization.
25 | MacBook Pro M4 Max 64GB 16-inch | 0 | F32 |  | MLX | No | -2747.1 GB |  | Estimated | $4,499 | GLM-5 does not fit on MacBook Pro M4 Max 64GB 16-inch at the current practical quantization.
26 | Mac Studio M4 Max 128GB | 0 | F32 |  | MLX | No | -2683.1 GB |  | Estimated | $4,499 | GLM-5 does not fit on Mac Studio M4 Max 128GB at the current practical quantization.
27 | MacBook Pro M5 Max 128GB 16-inch | 0 | F32 |  | MLX | No | -2683.1 GB |  | Estimated | $5,399 | GLM-5 does not fit on MacBook Pro M5 Max 128GB 16-inch at the current practical quantization.
28 | MacBook Pro M4 Max 128GB 16-inch | 0 | F32 |  | MLX | No | -2683.1 GB |  | Estimated | $5,999 | GLM-5 does not fit on MacBook Pro M4 Max 128GB 16-inch at the current practical quantization.
29 | Mac Pro M2 Ultra 192GB | 0 | F32 |  | MLX | No | -2619.1 GB |  | Estimated | $6,999 | GLM-5 does not fit on Mac Pro M2 Ultra 192GB at the current practical quantization.

Fastest evidence path for rank 1: IQ2_XS · 13.2 tok/s · MLX · Estimated
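
The Headroom column in the ranked table appears to be the machine's unified memory minus the model footprint at the listed quantization. The sketch below reproduces that arithmetic. The footprint values (212.7 GB for IQ2_XS, 2811.1 GB for F32) are not stated on this page; they are inferred by working backwards from the table (for example, 256 - 43.3 and 16 + 2795.1), and the function names are hypothetical.

```python
# Footprints inferred from the ranked table, NOT published figures.
# Assumption: headroom = nominal unified memory - model footprint.
FOOTPRINT_GB = {
    "IQ2_XS": 212.7,   # inferred from 256 GB - 43.3 GB headroom
    "F32": 2811.1,     # inferred from 16 GB + 2795.1 GB shortfall
}


def headroom_gb(ram_gb: float, quant: str) -> float:
    """Memory left after loading the model; negative means it does not fit."""
    return round(ram_gb - FOOTPRINT_GB[quant], 1)


def fits(ram_gb: float, quant: str) -> bool:
    """A Mac 'fits' the model when headroom is not negative."""
    return headroom_gb(ram_gb, quant) >= 0


if __name__ == "__main__":
    examples = [
        ("Mac Studio M3 Ultra 256GB", 256, "IQ2_XS"),
        ("Mac Pro M2 Ultra 192GB", 192, "F32"),
        ("Mac Mini M4 16GB", 16, "F32"),
    ]
    for name, ram, quant in examples:
        status = "Fits" if fits(ram, quant) else "No"
        print(f"{name}: {quant} headroom {headroom_gb(ram, quant)} GB ({status})")
```

Under the same inferred footprints, a 192 GB machine would also fall short at IQ2_XS (192 - 212.7 is about -20.7 GB), which is consistent with only the 256 GB Mac Studio being marked as fitting.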

GLM-5 — ranking first, raw rows below

Start with the ranked Mac table above. Use the rest of this page to inspect raw Apple Silicon coverage and model metadata.

Quantizations observed: 4bit

Benchmark rows: 5
Chip tiers covered: 1
Fastest avg tok/s: 16.7 (M3 Ultra (512 GB))
Minimum RAM observed: 391.82 GB

Fastest published result is 16.7 tok/s on M3 Ultra (512 GB) at 4bit. Smallest published fit is 391.8 GB on M3 Ultra (512 GB). Longest published context on this page is 33k. Published runtimes include MLX. Start with Rankings for the decision, then use the raw rows below to audit the evidence.

Based on 5 external benchmarks; no lab runs yet.

Published runtimes: MLX.

Total params: 744B
Active params: 40B
Context window: 202,752
Release date: 2026-02-11

This is a reference-only model record. It remains useful for historical benchmarks, migration checks, and audit context, but it is excluded from current frontier packs.

What this model is, and what Apple Silicon users are actually seeing

Official model cards tell you what the model is for and which software stacks it targets. Field reality below shows how much Apple Silicon evidence we have so far.

We are launching GLM-5, targeting complex systems engineering and long-horizon agentic tasks. Scaling is still one of the most important ways to improve the intelligence efficiency of Artificial General Intelligence (AGI).

Official source  ·  Raw model card

agents  ·  coding  ·  reasoning

Runtime support mentioned

vLLM  ·  SGLang  ·  Transformers  ·  KTransformers  ·  OpenHands  ·  Claude Code  ·  xLLM

Official specs

  • Total parameters: 744B.
  • Active parameters: 40B.
  • Attention: DeepSeek Sparse Attention.

Official takeaways

  • Humanity’s Last Exam (HLE) & other reasoning tasks: We evaluate with a maximum generation length of 131,072 tokens (temperature=1.0, top_p=0.95, max_new_tokens=131072).
  • BrowseComp: Without context management, we retain details from the most recent 5 turns. With context management, we use the same discard-all strategy as DeepSeek-v3.2 and Kimi K2.5.
  • Terminal-Bench 2.0 (Terminus 2): We evaluate with the Terminus framework using timeout=2h, temperature=0.7, top_p=1.0, max_new_tokens=8192, with a 128K context window. Resource limits are capped at 16 CPUs and 32 GB RAM.
  • Terminal-Bench 2.0 (Claude Code): We evaluate in Claude Code 2.1.14 (think mode, default effort) with temperature=1.0, top_p=0.95, max_new_tokens=65536.

Official model cards describe intent, capabilities, and supported stacks. They do not prove Apple Silicon speed by themselves.

GLM-5: 6 Apple Silicon field reports. Best reported generation is ~20 tok/s; best reported prompt processing is ~187 tok/s; reported RAM use is ~391.82 to 415.41 GB. Seen on M3 Ultra and Mac Studio M3 Ultra 512GB, via oMLX.

Benchmark rows: 5
Field reports: 6
Practitioner signals: 4
Evidence status: Sparse Benchmarks

What practitioners keep saying

  • Use the official launch post to ground GLM-5 currentness and local-serving planning.
  • Do not treat the launch post as Apple Silicon performance evidence without hardware, runtime build, quantization, context, and measured throughput.
  • The oMLX single-request table reports GLM-5-4bit on an M3 Ultra 512GB Mac at pp1024/tg128 with 187.0 tok/s prompt processing, 16.7 tok/s generation, 5.477s TTFT, 13.156s end-to-end time, and peak memory at 391.82GB.
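
The oMLX figures in the last bullet can be sanity-checked against each other. This is a minimal sketch, assuming that pp1024/tg128 means 1,024 prompt tokens and 128 generated tokens, that time to first token is roughly prompt tokens divided by prompt throughput, and that end-to-end time is roughly TTFT plus generation time. The function name is hypothetical, and real runs include overheads this model ignores.

```python
# Reported values from the oMLX single-request table quoted above.
PROMPT_TOKENS = 1024     # assumed meaning of "pp1024"
GEN_TOKENS = 128         # assumed meaning of "tg128"
PROMPT_TOK_S = 187.0     # reported prompt-processing throughput
GEN_TOK_S = 16.7         # reported generation throughput
REPORTED_TTFT_S = 5.477
REPORTED_TOTAL_S = 13.156


def estimate_latency(prompt_tokens, gen_tokens, prompt_tok_s, gen_tok_s):
    """Return (ttft_seconds, end_to_end_seconds) under a simple two-phase model."""
    ttft = prompt_tokens / prompt_tok_s      # time to process the prompt
    generation = gen_tokens / gen_tok_s      # time to emit the output tokens
    return ttft, ttft + generation


ttft_s, total_s = estimate_latency(PROMPT_TOKENS, GEN_TOKENS, PROMPT_TOK_S, GEN_TOK_S)

if __name__ == "__main__":
    print(f"Estimated TTFT: {ttft_s:.3f} s (reported {REPORTED_TTFT_S} s)")
    print(f"Estimated end-to-end: {total_s:.3f} s (reported {REPORTED_TOTAL_S} s)")
```

The estimates come out at about 5.48 s and about 13.14 s, close to the reported 5.477 s and 13.156 s, so the reported throughput and latency numbers look internally consistent.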

Apple Silicon field sources

  • r/LocalLLaMA

    2026-02-24 · Mac Studio M3 Ultra 512GB · oMLX

    An accessible M3 Ultra 512GB oMLX report shows GLM-5 running on Apple Silicon with slow single-request latency but materially better throughput under continuous batching and persistent KV cache.

  • r/LocalLLaMA

    2026-02-16 · M3 Ultra

    A LocalLLaMA operator reports Unsloth GLM-5 low-bit quants running on M3 Ultra, including Q2 around 20 tok/s, while the source leaves key setup details unspecified.

  • r/LocalLLaMA

    Mac Studio M3 Ultra 512GB

    GLM-5 is no longer theoretical on Apple Silicon; operators are already running it on M3 Ultra-class desktops and comparing the experience against frontier hosted models.

Runtime/source notes to verify

  • Z.ai

    2026-04-30 · vLLM / SGLang

    The official GLM-5 launch post says the model weights are available for local deployment and names vLLM and SGLang as supported serving frameworks, but it does not establish Mac throughput or fit.

Runtime mentions in the field

oMLX

Hardware mentioned in reports

M3 Ultra  ·  Mac  ·  Mac Studio

What would improve confidence

  • Reproduce Field Performance Signal
  • Resolve Blocked Source Capture

Published chip coverage includes M3 Ultra (512 GB). Fastest published row is 16.7 tok/s on M3 Ultra (512 GB) at 4bit. Lowest published RAM requirement is 391.8 GB on M3 Ultra (512 GB). Longest published context in the benchmark rows is 33k.

Raw benchmark rows for GLM-5

Rows stay below the ranking because this page is answer-first. Use them to inspect exact chips, quantizations, runtimes, and sources.

Chip | Quant | RAM req. | Context | Avg tok/s | Prompt tok/s | Runtime | Source
M3 Ultra (512 GB) | 4bit | 391.8 GB | 1k | 16.7 tok/s | 187.0 tok/s | MLX | ref
M3 Ultra (512 GB) | 4bit | 394.1 GB | 4k | 13.7 tok/s | 180.1 tok/s | MLX | ref
M3 Ultra (512 GB) | 4bit | 396.7 GB | 8k | 13.2 tok/s | 154.1 tok/s | MLX | ref
M3 Ultra (512 GB) | 4bit | 402.7 GB | 16k | 12.0 tok/s | 117.4 tok/s | MLX | ref
M3 Ultra (512 GB) | 4bit | 415.4 GB | 33k | 10.7 tok/s | 77.7 tok/s | MLX | ref
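
The raw rows above show throughput falling and memory rising as context grows. The sketch below quantifies that trend from the five published rows. It assumes the context labels (1k to 33k) are thousands of tokens and treats memory growth as linear between the first and last rows, which is a simplification; the variable and function names are hypothetical.

```python
# (context_k, ram_gb, gen_tok_s, prompt_tok_s) copied from the raw rows above.
ROWS = [
    (1, 391.8, 16.7, 187.0),
    (4, 394.1, 13.7, 180.1),
    (8, 396.7, 13.2, 154.1),
    (16, 402.7, 12.0, 117.4),
    (33, 415.4, 10.7, 77.7),
]


def pct_drop(first: float, last: float) -> float:
    """Percentage decrease from first to last."""
    return (first - last) / first * 100


first, last = ROWS[0], ROWS[-1]

gen_drop = pct_drop(first[2], last[2])        # generation slowdown, 1k -> 33k
prompt_drop = pct_drop(first[3], last[3])     # prompt-processing slowdown
# Average extra RAM per additional 1k tokens of context (linear assumption).
ram_per_k = (last[1] - first[1]) / (last[0] - first[0])

if __name__ == "__main__":
    print(f"Generation throughput drop: {gen_drop:.1f}%")
    print(f"Prompt throughput drop: {prompt_drop:.1f}%")
    print(f"RAM growth: {ram_per_k:.2f} GB per 1k context tokens")
```

On these rows, generation slows by roughly 36% and prompt processing by roughly 58% between 1k and 33k context, while memory grows by about 0.74 GB per additional 1k tokens. All five rows come from a single machine and runtime, so these figures should not be extrapolated to other configurations.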

Ordered by fastest published tok/s on the chip family in each Mac. Click through for the full machine page.

benchmarks.json — full dataset  ·  models.json — model summaries  ·  benchmarks.csv — CSV export
