
M4 Max (128 GB) — LLM Benchmarks

Measured LLM inference benchmarks for the M4 Max (128 GB): tokens per second across 8 models and multiple quantizations. Real runs, not estimates.
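As a quick sanity check on what an avg tok/s figure means in practice, here is a minimal sketch converting a throughput number into an estimated generation time. It covers generation only and ignores prompt processing and time-to-first-token; the 1000-token reply length is an illustrative assumption, and the 184.4 tok/s figure comes from the Qwen 3 0.6B row below.

```python
def generation_seconds(num_tokens: int, avg_tok_per_s: float) -> float:
    """Rough wall-clock time to generate num_tokens at a steady avg tok/s.

    Generation only: prompt processing / time-to-first-token is not included.
    """
    return num_tokens / avg_tok_per_s

# e.g. a 1000-token reply at 184.4 tok/s (Qwen 3 0.6B):
print(round(generation_seconds(1000, 184.4), 1))  # ≈ 5.4 s
```

The same arithmetic makes the slow end of the table concrete: at 8.1 tok/s (Qwen 3 235B A22B), the same 1000-token reply takes roughly two minutes.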

8 benchmark rows · 8 models tested · fastest avg 184.4 tok/s (Qwen 3 0.6B) · 0 factory-lab verified rows

Benchmark results for M4 Max (128 GB)

Rows are sorted by avg tok/s, descending. Click a row's source badge to open the original measurement page.

| Model | Quant | RAM req. | Context | Avg tok/s | Prompt tok/s | Runtime | Source |
|---|---|---|---|---|---|---|---|
| Qwen 3 0.6B | Q8_0 | — | 10k | 184.4 | — | LM Studio | ref |
| Gemma 3 4B | Q4_0 | — | 4k | 100.5 | — | LM Studio | ref |
| Qwen 3 30B A3B | Q4_K_M | — | 10k | 70.2 | — | LM Studio | ref |
| Qwen 3 8B | Q4_K_M | — | 10k | 63.1 | — | LM Studio | ref |
| Qwen 2.5 7B Instruct | Q8_0 | — | 10k | 49.7 | — | LM Studio | ref |
| Gemma 3 27B | Q8_0 | — | 131k | 14.5 | — | LM Studio | ref |
| Qwen 3 32B | Q4_K_M | — | 10k | 11.7 | — | LM Studio | ref |
| Qwen 3 235B A22B | Q4_K_M | — | 10k | 8.1 | — | LM Studio | ref |

benchmarks.json — full dataset  ·  chips.json — chip summaries  ·  benchmarks.csv — CSV export
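For readers working with the CSV export, here is a minimal sketch of loading and sorting the data. The actual `benchmarks.csv` schema is not shown on this page, so the column names below are assumptions derived from the table headers above, and the data is inlined from three of the rows rather than read from the real file.

```python
import csv
import io

# Inline sample mirroring rows from the table above. NOTE: the real
# benchmarks.csv may use different column names; these are assumptions.
SAMPLE_CSV = """model,quant,context,avg_tok_s,runtime
Gemma 3 4B,Q4_0,4k,100.5,LM Studio
Qwen 3 235B A22B,Q4_K_M,10k,8.1,LM Studio
Qwen 3 0.6B,Q8_0,10k,184.4,LM Studio
"""

# Parse rows and sort by average throughput, fastest first.
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
rows.sort(key=lambda r: float(r["avg_tok_s"]), reverse=True)

fastest = rows[0]
print(f'{fastest["model"]}: {fastest["avg_tok_s"]} tok/s')
# → Qwen 3 0.6B: 184.4 tok/s
```

To use the real export, replace `io.StringIO(SAMPLE_CSV)` with `open("benchmarks.csv")` and adjust the column names to match the actual header row.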

Data sourced from factory lab measurements and community reference runs. See all chips →