Canonical Rankings

Best Macs for this model

GLM-5 ranked across the Mac lineup at the best practical quantization, using the best available runtime evidence.

28 ranked Macs

Use the strongest current runtime evidence for each row. Static paths cover only canonical model pages; sort and quantization stay as query state.

Rank	Mac	Score	Quant	Tok/s	Runtime	Fits	Evidence	Price	Why it ranks here
1	Mac Studio M3 Ultra 256GB	120	IQ2_XS	13.2 tok/s	MLX	Fits	Estimated	$7,499	IQ2_XS is the current best practical quantization. 13.2 tok/s is estimated from nearby benchmark coverage. 43.3 GB headroom remains at this quantization.
2	Mac Mini M4 16GB	0	F32	—	MLX	No	Estimated	$499	GLM-5 does not fit on Mac Mini M4 16GB at the current practical quantization.
3	Mac Mini M4 24GB	0	F32	—	MLX	No	Estimated	$599	GLM-5 does not fit on Mac Mini M4 24GB at the current practical quantization.
4	Mac Mini M4 32GB	0	F32	—	MLX	No	Estimated	$799	GLM-5 does not fit on Mac Mini M4 32GB at the current practical quantization.
5	MacBook Air M4 16GB 13-inch	0	F32	—	MLX	No	Estimated	$1,099	GLM-5 does not fit on MacBook Air M4 16GB 13-inch at the current practical quantization.
6	MacBook Air M4 24GB 13-inch	0	F32	—	MLX	No	Estimated	$1,299	GLM-5 does not fit on MacBook Air M4 24GB 13-inch at the current practical quantization.
7	MacBook Air M4 16GB 15-inch	0	F32	—	MLX	No	Estimated	$1,299	GLM-5 does not fit on MacBook Air M4 16GB 15-inch at the current practical quantization.
8	Mac Mini M4 Pro 24GB	0	F32	—	MLX	No	Estimated	$1,399	GLM-5 does not fit on Mac Mini M4 Pro 24GB at the current practical quantization.
9	MacBook Air M4 32GB 13-inch	0	F32	—	MLX	No	Estimated	$1,499	GLM-5 does not fit on MacBook Air M4 32GB 13-inch at the current practical quantization.
10	MacBook Air M4 24GB 15-inch	0	F32	—	MLX	No	Estimated	$1,499	GLM-5 does not fit on MacBook Air M4 24GB 15-inch at the current practical quantization.
11	Mac Mini M4 Pro 48GB	0	F32	—	MLX	No	Estimated	$1,599	GLM-5 does not fit on Mac Mini M4 Pro 48GB at the current practical quantization.
12	MacBook Air M4 32GB 15-inch	0	F32	—	MLX	No	Estimated	$1,699	GLM-5 does not fit on MacBook Air M4 32GB 15-inch at the current practical quantization.
13	MacBook Pro M4 Pro 24GB 14-inch	0	F32	—	MLX	No	Estimated	$1,999	GLM-5 does not fit on MacBook Pro M4 Pro 24GB 14-inch at the current practical quantization.
14	Mac Studio M4 Max 36GB	0	F32	—	MLX	No	Estimated	$1,999	GLM-5 does not fit on Mac Studio M4 Max 36GB at the current practical quantization.
15	MacBook Pro M4 Pro 48GB 14-inch	0	F32	—	MLX	No	Estimated	$2,499	GLM-5 does not fit on MacBook Pro M4 Pro 48GB 14-inch at the current practical quantization.
16	MacBook Pro M4 Pro 24GB 16-inch	0	F32	—	MLX	No	Estimated	$2,499	GLM-5 does not fit on MacBook Pro M4 Pro 24GB 16-inch at the current practical quantization.
17	Mac Studio M4 Max 48GB	0	F32	—	MLX	No	Estimated	$2,499	GLM-5 does not fit on Mac Studio M4 Max 48GB at the current practical quantization.
18	MacBook Pro M4 Max 36GB 14-inch	0	F32	—	MLX	No	Estimated	$2,999	GLM-5 does not fit on MacBook Pro M4 Max 36GB 14-inch at the current practical quantization.
19	MacBook Pro M4 Pro 48GB 16-inch	0	F32	—	MLX	No	Estimated	$2,999	GLM-5 does not fit on MacBook Pro M4 Pro 48GB 16-inch at the current practical quantization.
20	Mac Studio M4 Max 64GB	0	F32	—	MLX	No	Estimated	$2,999	GLM-5 does not fit on Mac Studio M4 Max 64GB at the current practical quantization.
21	MacBook Pro M4 Max 48GB 14-inch	0	F32	—	MLX	No	Estimated	$3,499	GLM-5 does not fit on MacBook Pro M4 Max 48GB 14-inch at the current practical quantization.
22	MacBook Pro M4 Max 36GB 16-inch	0	F32	—	MLX	No	Estimated	$3,499	GLM-5 does not fit on MacBook Pro M4 Max 36GB 16-inch at the current practical quantization.
23	MacBook Pro M4 Max 48GB 16-inch	0	F32	—	MLX	No	Estimated	$3,999	GLM-5 does not fit on MacBook Pro M4 Max 48GB 16-inch at the current practical quantization.
24	Mac Studio M3 Ultra 96GB	0	F32	—	MLX	No	Estimated	$3,999	GLM-5 does not fit on Mac Studio M3 Ultra 96GB at the current practical quantization.
25	MacBook Pro M4 Max 64GB 16-inch	0	F32	—	MLX	No	Estimated	$4,499	GLM-5 does not fit on MacBook Pro M4 Max 64GB 16-inch at the current practical quantization.
26	Mac Studio M4 Max 128GB	0	F32	—	MLX	No	Estimated	$4,499	GLM-5 does not fit on Mac Studio M4 Max 128GB at the current practical quantization.
27	MacBook Pro M4 Max 128GB 16-inch	0	F32	—	MLX	No	Estimated	$5,999	GLM-5 does not fit on MacBook Pro M4 Max 128GB 16-inch at the current practical quantization.
28	Mac Pro M2 Ultra 192GB	0	F32	—	MLX	No	Estimated	$6,999	GLM-5 does not fit on Mac Pro M2 Ultra 192GB at the current practical quantization.
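The fit column above can be approximated with simple subtraction. The sketch below is an assumption about how such a check might work, not the site's actual method: it infers the IQ2_XS memory requirement from row 1 (256 GB minus the reported 43.3 GB headroom) and tests it against the memory sizes named in the table. The function names and the zero-headroom threshold are hypothetical.

```python
# Unified memory sizes (GB) taken from the Mac names in the ranking table.
MAC_MEMORY_GB = [16, 24, 32, 36, 48, 64, 96, 128, 192, 256]

# Assumption: implied IQ2_XS requirement, derived from row 1
# (256 GB machine with 43.3 GB headroom reported).
IQ2_XS_REQUIRED_GB = 256 - 43.3


def headroom_gb(memory_gb: float, required_gb: float) -> float:
    """Memory left after loading the model; negative means it does not fit."""
    return memory_gb - required_gb


def fits(memory_gb: float, required_gb: float) -> bool:
    """Hypothetical fit rule: any non-negative headroom counts as a fit."""
    return headroom_gb(memory_gb, required_gb) >= 0


fitting = [m for m in MAC_MEMORY_GB if fits(m, IQ2_XS_REQUIRED_GB)]
print("Memory sizes that fit at IQ2_XS:", fitting)
```

Under these assumptions only the 256 GB configuration passes, which matches the single "Fits" row in the table. A real check would likely also reserve memory for the OS and the KV cache.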

GLM-5 — ranking first, raw rows below

Start with the ranked Mac table above. Use the rest of this page to inspect raw Apple Silicon coverage and model metadata.

Quantizations observed: 4bit

Benchmark rows: 5

Chip tiers covered: 1

Fastest avg tok/s: 16.7 (M3 Ultra (512 GB))

Minimum RAM observed: 391.82 GB

What this page answers best

Fastest published result is 16.7 tok/s on M3 Ultra (512 GB) at 4bit. Smallest published fit is 391.8 GB on M3 Ultra (512 GB). Longest published context on this page is 33k. Published runtimes include MLX. Start with Rankings for the decision, then use the raw rows below to audit the evidence.

Evidence state: 5 linked reference rows and no Silicon Score Lab rows yet.

Published runtimes here: MLX.

Catalog record

Total params: 744B

Active params: 40B

Context window: 202,752

Release date: 2026-02-11

What this model is, and what Apple Silicon users are actually seeing

Official model cards tell you what the model is for and which software stacks it targets. Field reality below shows how much Apple Silicon evidence we have so far.

Official brief

We are launching GLM-5, targeting complex systems engineering and long-horizon agentic tasks. Scaling is still one of the most important ways to improve the intelligence efficiency of Artificial General Intelligence (AGI).

Official source · Raw model card

agents, coding, reasoning

Runtime support mentioned

vLLM, SGLang, Transformers, KTransformers, OpenHands, Claude Code, xLLM

Official takeaways

Humanity’s Last Exam (HLE) & other reasoning tasks: We evaluate with a maximum generation length of 131,072 tokens (temperature=1.0, top_p=0.95, max_new_tokens=131072).
SWE-bench & SWE-bench Multilingual: We run the SWE-bench suite with OpenHands using a tailored instruction prompt. Settings: temperature=0.7, top_p=0.95, max_new_tokens=16384, with a 200K context window.
BrowseComp: Without context management, we retain details from the most recent 5 turns. With context management, we use the same discard-all strategy as DeepSeek-v3.2 and Kimi K2.5.
Terminal-Bench 2.0 (Terminus 2): We evaluate with the Terminus framework using timeout=2h, temperature=0.7, top_p=1.0, max_new_tokens=8192, with a 128K context window. Resource limits are capped at 16 CPUs and 32 GB RAM.
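For anyone trying to reproduce these evaluation settings locally, it can help to collect them in one place. The sketch below records the sampling settings quoted above as a plain Python dictionary. The dictionary keys are hypothetical labels, and the parameter names are spelled `top_p` and `max_new_tokens`, the conventional spellings, which is an assumption about what the model card intends.

```python
# Sampling settings quoted in the official takeaways, one entry per benchmark.
# Keys such as "hle_and_reasoning" are illustrative labels, not official names.
EVAL_SETTINGS = {
    "hle_and_reasoning": {
        "temperature": 1.0,
        "top_p": 0.95,
        "max_new_tokens": 131072,
    },
    "swe_bench": {
        "temperature": 0.7,
        "top_p": 0.95,
        "max_new_tokens": 16384,
        "context_window": "200K",
    },
    "terminal_bench_2": {
        "temperature": 0.7,
        "top_p": 1.0,
        "max_new_tokens": 8192,
        "context_window": "128K",
        "timeout_hours": 2,
        "max_cpus": 16,
        "max_ram_gb": 32,
    },
}

for name, settings in EVAL_SETTINGS.items():
    print(name, settings)
```

BrowseComp is omitted because the takeaways describe its context-management strategy rather than sampling parameters.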

Official model cards describe intent, capabilities, and supported stacks. They do not prove Apple Silicon speed by themselves.

Field reality on Apple Silicon

GLM-5: 1 Apple Silicon field report; best reported generation ~11 tok/s; seen on Mac Studio M3 Ultra 512GB.

Benchmark rows: 5

Field reports: 1

Practitioner signals: 1

Evidence status: Sparse Benchmarks

What practitioners keep saying

One Mac Studio M3 Ultra 512GB owner reports roughly 10-11 tok/s generation and strong quality, with the main pain point being very slow prefill on huge contexts.
This is exactly the kind of top-tier Apple workstation evidence the site should surface for frontier-scale local models.

Hardware mentioned in reports

M3 Ultra, Mac, Mac Studio

What would improve confidence

Capture Practitioner Runtime Notes
Queue Lab Verification If Hardware Available
Reproduce Field Performance Signal
Resolve Blocked Source Capture

Current published coverage

Published chip coverage includes M3 Ultra (512 GB). Fastest published row is 16.7 tok/s on M3 Ultra (512 GB) at 4bit. Lowest published RAM requirement is 391.8 GB on M3 Ultra (512 GB). Longest published context is 33k; the catalog context window is 202,752 tokens.

M3 Ultra (512 GB)

Raw benchmark rows for GLM-5

Rows stay below the ranking because this page is answer-first. Use them to inspect exact chips, quantizations, runtimes, and sources.

Chip	Quant	RAM req.	Context	Avg tok/s	Prompt tok/s	Runtime	Source
M3 Ultra (512 GB)	4bit	391.8 GB	1k	16.7 tok/s	187.0 tok/s	MLX	ref
M3 Ultra (512 GB)	4bit	394.1 GB	4k	13.7 tok/s	180.1 tok/s	MLX	ref
M3 Ultra (512 GB)	4bit	396.7 GB	8k	13.2 tok/s	154.1 tok/s	MLX	ref
M3 Ultra (512 GB)	4bit	402.7 GB	16k	12.0 tok/s	117.4 tok/s	MLX	ref
M3 Ultra (512 GB)	4bit	415.4 GB	33k	10.7 tok/s	77.7 tok/s	MLX	ref
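Two figures fall out of these rows with simple arithmetic: how long prefill takes at each context length, and how fast the RAM requirement grows with context. The sketch below hard-codes the five rows and computes both. It assumes "1k" means 1,000 tokens and that prompt throughput is constant across a prompt, neither of which the page states, and the function names are hypothetical.

```python
# (context in thousands of tokens, RAM req. GB, avg tok/s, prompt tok/s),
# copied from the raw benchmark rows for M3 Ultra (512 GB), 4bit, MLX.
ROWS = [
    (1, 391.8, 16.7, 187.0),
    (4, 394.1, 13.7, 180.1),
    (8, 396.7, 13.2, 154.1),
    (16, 402.7, 12.0, 117.4),
    (33, 415.4, 10.7, 77.7),
]


def prefill_seconds(context_k: float, prompt_tps: float, tokens_per_k: int = 1000) -> float:
    """Estimated time to process a full-context prompt at the reported prompt speed."""
    return context_k * tokens_per_k / prompt_tps


def ram_growth_gb_per_k(rows) -> float:
    """Average RAM growth per 1k context tokens between the first and last rows."""
    first, last = rows[0], rows[-1]
    return (last[1] - first[1]) / (last[0] - first[0])


for context_k, ram_gb, avg_tps, prompt_tps in ROWS:
    secs = prefill_seconds(context_k, prompt_tps)
    print(f"{context_k:>2}k context: ~{secs:.0f} s prefill, {ram_gb} GB RAM")

print(f"RAM growth: ~{ram_growth_gb_per_k(ROWS):.2f} GB per 1k tokens")
```

Under these assumptions the 33k row implies roughly seven minutes of prefill, and RAM grows by about 0.74 GB per 1k tokens of context. The first figure is consistent with the practitioner complaint about slow prefill on large contexts.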

Chips with published results for GLM-5

M3 Ultra (512 GB)

Data

benchmarks.json — full dataset · models.json — model summaries · benchmarks.csv — CSV export