Canonical Rankings

Best models for this Mac

Mac Studio M4 Max 64GB, ranked for coding with a bias toward the most capable models.

26 ranked models. Catalog current through February 27, 2026.
M5 Max watch: 8B 61.6 tok/s · 14B 34.3 tok/s on the current 36GB public anchor.
| Rank | Model | Score | Quant | Tok/s | Evidence | Headroom | Why it ranks here |
|---|---|---|---|---|---|---|---|
| 1 | Qwen 3 32B | 274 | Q8_0 | 22.0 | Benchmark estimate | 31.0 GB | Frontier candidate in the current catalog. Near-lossless quantization quality. |
| 2 | Gemma 3 27B | 259 | Q8_0 | 14.5 | Benchmark estimate | 34.1 GB | Near-lossless quantization quality. Enough for most serious sessions. |
| 3 | DeepSeek R1 Distill Qwen 32B | 252 | Q8_0 | | Fit estimate | 31.0 GB | Near-lossless quantization quality. Enough for most serious sessions. |
| 4 | Qwen3.5-27B | 244 | Q8_0 | | Fit estimate | 36.4 GB | Frontier candidate in the current catalog. Near-lossless quantization quality. |
| 5 | Mistral Small 3.1 24B | 240 | Q8_0 | | Fit estimate | 39.9 GB | Near-lossless quantization quality. Comfortable for long-context work. |
| 6 | Devstral Small 1.1 | 240 | Q8_0 | | Fit estimate | 39.9 GB | Near-lossless quantization quality. Comfortable for long-context work. |
| 7 | Devstral Small 2 24B | 240 | Q8_0 | | Fit estimate | 39.9 GB | Frontier candidate in the current catalog. Near-lossless quantization quality. |
| 8 | Magistral Small | 240 | Q8_0 | | Fit estimate | 39.9 GB | Near-lossless quantization quality. Enough for most serious sessions. |
| 9 | Llama 3.3 70B | 233 | Q5_K_M | | Fit estimate | 14.2 GB | High-quality quantization. Fine for standard chats and coding. |
| 10 | Qwen 3 30B-A3B | 233 | Q8_0 | 84.9 | Benchmark estimate | 34.3 GB | Near-lossless quantization quality. Comfortable for long-context work. |
| 11 | Qwen 3 8B | 211 | Q8_0 | 63.1 | Benchmark estimate | 54.7 GB | Near-lossless quantization quality. Comfortable for long-context work. |
| 12 | Qwen3-Coder-30B-A3B | 211 | Q8_0 | | Fit estimate | 34.7 GB | Near-lossless quantization quality. Comfortable for long-context work. |
| 13 | Qwen3.5-35B-A3B | 210 | Q8_0 | | Fit estimate | 30.3 GB | Near-lossless quantization quality. Comfortable for long-context work. |
| 14 | GLM-4.7-Flash | 210 | Q8_0 | | Fit estimate | 28.2 GB | Near-lossless quantization quality. Fine for standard chats and coding. |
| 15 | Llama 2 7B | 209 | Q8_0 | 36.4 | Benchmark estimate | 53.2 GB | Near-lossless quantization quality. Context headroom is tight. |
| 16 | Qwen 2.5 14B | 199 | Q8_0 | | Fit estimate | 49.0 GB | Near-lossless quantization quality. Comfortable for long-context work. |
| 17 | Phi-4 14B | 198 | Q8_0 | | Fit estimate | 49.5 GB | Near-lossless quantization quality. Fine for standard chats and coding. |
| 18 | Qwen3.5-9B | 191 | Q8_0 | | Fit estimate | 54.1 GB | Frontier candidate in the current catalog. Near-lossless quantization quality. |
| 19 | Llama 3.1 8B | 189 | Q8_0 | | Fit estimate | 55.0 GB | Near-lossless quantization quality. Comfortable for long-context work. |
| 20 | Qwen 2.5 7B | 189 | Q8_0 | | Fit estimate | 56.0 GB | Near-lossless quantization quality. Comfortable for long-context work. |
| 21 | Mistral 7B v0.3 | 188 | Q8_0 | | Fit estimate | 55.7 GB | Near-lossless quantization quality. Enough for most serious sessions. |
| 22 | Gemma 3 4B | 150 | Q8_0 | 100.5 | Benchmark estimate | 58.4 GB | Near-lossless quantization quality. Comfortable for long-context work. |
| 23 | Qwen 3 4B | 150 | Q8_0 | 143.2 | Benchmark estimate | 58.6 GB | Near-lossless quantization quality. Enough for most serious sessions. |
| 24 | Qwen 3 0.6B | 145 | Q8_0 | 184.4 | Benchmark estimate | 62.8 GB | Near-lossless quantization quality. Enough for most serious sessions. |
| 25 | Qwen3-Coder-Next | 134 | Q5_K_M | | Fit estimate | 9.8 GB | High-quality quantization. Fine for standard chats and coding. |
| 26 | Llama 3.2 1B | 124 | Q8_0 | | Fit estimate | 62.1 GB | Near-lossless quantization quality. Comfortable for long-context work. |
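The "Headroom" column can be sanity-checked by hand. A minimal sketch, assuming headroom is simply total unified RAM minus the estimated weight size at a given quantization, using approximate llama.cpp bits-per-weight figures (Q8_0 is about 8.5 bpw, Q5_K_M about 5.7 bpw). The function name and constants below are illustrative, not the site's actual fit-estimation method, and real fit estimates would also account for KV cache and compute buffers.

```python
# Approximate bits-per-weight for common llama.cpp quant formats.
# These are rough figures, not the ranking site's exact values.
BITS_PER_WEIGHT = {"Q8_0": 8.5, "Q5_K_M": 5.7}

def est_headroom_gb(params_b: float, quant: str, ram_gb: float = 64.0) -> float:
    """Rough headroom: total RAM minus estimated model weight size.

    params_b: parameter count in billions.
    quant: quant format key into BITS_PER_WEIGHT.
    ram_gb: machine's unified memory (64 GB for this Mac Studio).
    """
    model_gb = params_b * BITS_PER_WEIGHT[quant] / 8  # billions of params * bytes/param
    return round(ram_gb - model_gb, 1)

# Example: a 32B model at Q8_0 on a 64 GB machine.
print(est_headroom_gb(32, "Q8_0"))  # → 30.0
```

This lands near the ~31 GB headroom the table shows for the 32B Q8_0 entries; the small gap is expected since exact parameter counts and runtime overheads differ from these rounded assumptions.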