
M3 Max (GPU count not published) — benchmark record

Local LLM benchmarks for the M3 Max (GPU count not published) at the 64 GB RAM tier on Apple Silicon Macs. 24 published rows across 2 models, with explicit evidence state and RAM-tier comparison. Peak published speed is 10.4 tok/s.

24 Benchmark rows
2 Models tested
1 RAM configurations
10.4 Fastest avg tok/s

Each configuration differs only in unified memory. More RAM lets larger models fit; it does not make a given model faster. Throughput is similar across RAM tiers at the same model size.
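As a rough illustration of why RAM decides which models fit, the weight footprint can be estimated from parameter count and quantization. This is a minimal sketch under stated assumptions: the bits-per-weight figures are approximations (8.5 for Q8_0, about 4.85 for Q4_K - Medium), the parameter counts are the nominal 32B and 70B from the model names, and `usable_fraction` is a hypothetical allowance for how much unified memory the runtime can use. None of these values come from the benchmark data, and the estimate ignores KV cache and runtime overhead.

```python
# Approximate bits per weight for each quant label. Assumed values, not taken from the table.
BITS_PER_WEIGHT = {
    "Q8_0": 8.5,            # 32 int8 weights plus one fp16 scale per block
    "Q4_K - Medium": 4.85,  # rough average; the real figure varies by model
}


def weight_footprint_gb(params_billion: float, quant: str) -> float:
    """Estimate the size of the model weights in GB (10^9 bytes)."""
    bits = BITS_PER_WEIGHT[quant]
    return params_billion * bits / 8  # billions of params * bytes per param


def fits(params_billion: float, quant: str, ram_gb: float,
         usable_fraction: float = 0.75) -> bool:
    """Check the weights against a hypothetical usable share of unified memory."""
    return weight_footprint_gb(params_billion, quant) <= ram_gb * usable_fraction


if __name__ == "__main__":
    for name, params, quant in [
        ("Qwen 3 32B", 32, "Q8_0"),
        ("Llama 3.3 70B", 70, "Q4_K - Medium"),
    ]:
        size = weight_footprint_gb(params, quant)
        print(f"{name} {quant}: ~{size:.1f} GB weights, fits in 64 GB: {fits(params, quant, 64)}")
```

Under these assumptions both benchmarked models fit in 64 GB, while the same check against a smaller tier (for example `fits(32, "Q8_0", 36)`) fails.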

All benchmark rows — M3 Max (GPU count not published)

Sorted by avg tok/s, descending. The source badge on each row links to the original measurement.

| Chip (RAM) | Model | Quant | RAM req. | Avg tok/s | Prompt tok/s | Runtime | Source |
|---|---|---|---|---|---|---|---|
| M3 Max (GPU count not published, 64 GB) | Qwen 3 32B | Q8_0 | | 10.4 | 153.6 | llama.cpp | ref |
| M3 Max (GPU count not published, 64 GB) | Qwen 3 32B | Q8_0 | | 10.3 | 152.1 | Ollama | ref |
| M3 Max (GPU count not published, 64 GB) | Qwen 3 32B | Q8_0 | | 10.3 | 169.5 | Ollama | ref |
| M3 Max (GPU count not published, 64 GB) | Qwen 3 32B | Q8_0 | | 10.3 | 164.8 | llama.cpp | ref |
| M3 Max (GPU count not published, 64 GB) | Qwen 3 32B | Q8_0 | | 10.3 | 171.4 | llama.cpp | ref |
| M3 Max (GPU count not published, 64 GB) | Qwen 3 32B | Q8_0 | | 10.3 | 163.8 | Ollama | ref |
| M3 Max (GPU count not published, 64 GB) | Qwen 3 32B | Q8_0 | | 10.2 | 169.2 | llama.cpp | ref |
| M3 Max (GPU count not published, 64 GB) | Qwen 3 32B | Q8_0 | | 10.1 | 168.3 | Ollama | ref |
| M3 Max (GPU count not published, 64 GB) | Qwen 3 32B | Q8_0 | | 10.1 | 167.0 | Ollama | ref |
| M3 Max (GPU count not published, 64 GB) | Qwen 3 32B | Q8_0 | | 10.1 | 166.8 | llama.cpp | ref |
| M3 Max (GPU count not published, 64 GB) | Qwen 3 32B | Q8_0 | | 9.9 | 162.2 | llama.cpp | ref |
| M3 Max (GPU count not published, 64 GB) | Qwen 3 32B | Q8_0 | | 9.9 | 161.5 | Ollama | ref |
| M3 Max (GPU count not published, 64 GB) | Qwen 3 32B | Q8_0 | | 9.7 | 154.2 | llama.cpp | ref |
| M3 Max (GPU count not published, 64 GB) | Qwen 3 32B | Q8_0 | | 9.7 | 153.0 | Ollama | ref |
| M3 Max (GPU count not published, 64 GB) | Qwen 3 32B | Q8_0 | | 9.2 | 140.1 | llama.cpp | ref |
| M3 Max (GPU count not published, 64 GB) | Qwen 3 32B | Q8_0 | | 9.2 | 139.0 | Ollama | ref |
| M3 Max (GPU count not published, 64 GB) | Qwen 3 32B | Q8_0 | | 8.6 | 128.0 | llama.cpp | ref |
| M3 Max (GPU count not published, 64 GB) | Qwen 3 32B | Q8_0 | | 8.6 | 127.1 | Ollama | ref |
| M3 Max (GPU count not published, 64 GB) | Llama 3.3 70B | Q4_K - Medium | | 8.2 | 67.9 | llama.cpp | ref |
| M3 Max (GPU count not published, 64 GB) | Qwen 3 32B | Q8_0 | | 7.6 | 111.2 | llama.cpp | ref |
| M3 Max (GPU count not published, 64 GB) | Qwen 3 32B | Q8_0 | | 7.5 | 111.8 | Ollama | ref |
| M3 Max (GPU count not published, 64 GB) | Llama 3.3 70B | Q4_K - Medium | | 7.5 | 65.2 | llama.cpp | ref |
| M3 Max (GPU count not published, 64 GB) | Llama 3.3 70B | Q4_K - Medium | | 7.0 | 59.5 | llama.cpp | ref |
| M3 Max (GPU count not published, 64 GB) | Llama 3.3 70B | Q4_K - Medium | | 6.1 | 50.3 | llama.cpp | ref |
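The rows above can be summarized per model and runtime with a few lines of Python. This is a minimal sketch: the values are copied by hand from the table, and the tuple layout and the `summarize` helper are illustrative choices, not the schema of benchmarks.json or benchmarks.csv.

```python
from collections import defaultdict
from statistics import mean

QWEN, LLAMA = "Qwen 3 32B", "Llama 3.3 70B"

# (model, runtime, avg tok/s, prompt tok/s), transcribed from the table above.
ROWS = [
    (QWEN, "llama.cpp", 10.4, 153.6), (QWEN, "Ollama", 10.3, 152.1),
    (QWEN, "Ollama", 10.3, 169.5), (QWEN, "llama.cpp", 10.3, 164.8),
    (QWEN, "llama.cpp", 10.3, 171.4), (QWEN, "Ollama", 10.3, 163.8),
    (QWEN, "llama.cpp", 10.2, 169.2), (QWEN, "Ollama", 10.1, 168.3),
    (QWEN, "Ollama", 10.1, 167.0), (QWEN, "llama.cpp", 10.1, 166.8),
    (QWEN, "llama.cpp", 9.9, 162.2), (QWEN, "Ollama", 9.9, 161.5),
    (QWEN, "llama.cpp", 9.7, 154.2), (QWEN, "Ollama", 9.7, 153.0),
    (QWEN, "llama.cpp", 9.2, 140.1), (QWEN, "Ollama", 9.2, 139.0),
    (QWEN, "llama.cpp", 8.6, 128.0), (QWEN, "Ollama", 8.6, 127.1),
    (LLAMA, "llama.cpp", 8.2, 67.9), (QWEN, "llama.cpp", 7.6, 111.2),
    (QWEN, "Ollama", 7.5, 111.8), (LLAMA, "llama.cpp", 7.5, 65.2),
    (LLAMA, "llama.cpp", 7.0, 59.5), (LLAMA, "llama.cpp", 6.1, 50.3),
]


def summarize(rows):
    """Group rows by (model, runtime) and report count, mean, min, and max avg tok/s."""
    groups = defaultdict(list)
    for model, runtime, avg_tps, _prompt_tps in rows:
        groups[(model, runtime)].append(avg_tps)
    return {
        key: {"n": len(v), "mean": mean(v), "min": min(v), "max": max(v)}
        for key, v in groups.items()
    }


if __name__ == "__main__":
    for (model, runtime), s in sorted(summarize(ROWS).items()):
        print(f"{model:14s} {runtime:10s} n={s['n']:2d} "
              f"mean={s['mean']:.2f} min={s['min']:.1f} max={s['max']:.1f}")
```

Grouping by runtime as well as model keeps llama.cpp and Ollama results separate, since the table lists them as distinct rows for the same model and quantization.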

benchmarks.json — full dataset  ·  chips.json — chip summaries  ·  benchmarks.csv — CSV export

Data from in-house lab measurements plus community-published benchmarks.