Best models for this Mac
Models ranked for coding on a 14-inch MacBook Pro with M4 Max and 36 GB of unified memory, biased toward the most capable options that fit, using the best available runtime evidence.
| Rank | Model | Score | Quant | Tok/s | Runtime | Evidence | Headroom (GB) | Why it ranks here |
|---|---|---|---|---|---|---|---|---|
| 1 | Qwen 3 32B | 252 | Q6_K | 22.0 | Ollama | Estimated | 8.5 | Highest practical quant that fits. Speed is estimated from nearby benchmark coverage; Ollama (llama.cpp backend) is the strongest runtime hint. Headroom leaves a workable context margin. |
| 2 | Devstral Small 1.1 | 249 | 8bit | 43.0 | LM Studio | Estimated | 11.9 | Highest practical quant that fits. Speed is estimated from nearby benchmark coverage; LM Studio (mixed backends) is the strongest runtime hint. Headroom leaves a workable context margin. |
| 3 | Devstral Small 2 24B | 249 | 8bit | 25.3 | MLX | Estimated | 11.9 | Highest practical quant that fits. Speed is estimated from nearby benchmark coverage; MLX is the strongest runtime hint. Headroom leaves a workable context margin. |
| 4 | Qwen3.5-27B | 249 | 8bit | 20.6 | MLX | Estimated | 8.4 | Highest practical quant that fits. Speed is estimated from nearby benchmark coverage; MLX is the strongest runtime hint. Headroom leaves a workable context margin. |
| 5 | Gemma 3 27B | 241 | 8bit | 14.5 | LM Studio | Estimated | 6.1 | Highest practical quant that fits. Speed is estimated from nearby benchmark coverage; LM Studio (mixed backends) is the strongest runtime hint. Headroom leaves a workable context margin. |
| 6 | DeepSeek R1 Distill Qwen 32B | 230 | Q6_K | n/a | Best available | Fit-first | 8.5 | Highest practical quant that fits. Speed still needs direct measurement on this machine. Headroom leaves a workable context margin. |
| 7 | Mistral Small 3.1 24B | 227 | 8bit | n/a | Best available | Fit-first | 11.9 | Highest practical quant that fits. Speed still needs direct measurement on this machine. Headroom leaves a workable context margin. |
| 8 | Magistral Small | 227 | 8bit | n/a | Best available | Fit-first | 11.9 | Highest practical quant that fits. Speed still needs direct measurement on this machine. Headroom leaves a workable context margin. |
| 9 | Nemotron-3-Nano-30B-A3B | 216 | 8bit | 43.7 | llama.cpp | Estimated | 7.2 | Highest practical quant that fits. Speed is estimated from nearby benchmark coverage; llama.cpp is the strongest runtime hint. Headroom leaves a workable context margin. |
| 10 | Qwen3-Coder-30B-A3B | 215 | 8bit | 58.5 | llama.cpp | Estimated | 6.7 | Highest practical quant that fits. Speed is estimated from nearby benchmark coverage; llama.cpp is the strongest runtime hint. Headroom leaves a workable context margin. |
| 11 | Qwen 3 30B-A3B | 215 | 8bit | 76.7 | MLX | Estimated | 6.3 | Highest practical quant that fits. Speed is estimated from nearby benchmark coverage; MLX is the strongest runtime hint. Headroom leaves a workable context margin. |
| 12 | Qwen 3 8B | 211 | 8bit | 63.1 | LM Studio | Estimated | 26.7 | Highest practical quant that fits. Speed is estimated from nearby benchmark coverage; LM Studio (mixed backends) is the strongest runtime hint. Headroom leaves ample room for long contexts. |
| 13 | Qwen3.5-35B-A3B | 210 | Q6_K | 80.0 | MLX | Estimated | 8.1 | Highest practical quant that fits. Speed is estimated from nearby benchmark coverage; MLX is the strongest runtime hint. Headroom leaves a workable context margin. |
| 14 | Llama 2 7B | 209 | 8bit | 36.4 | llama.cpp | Estimated | 25.2 | Highest practical quant that fits. Speed is estimated from nearby benchmark coverage; llama.cpp is the strongest runtime hint. Headroom leaves ample room for long contexts. |
| 15 | GLM-4.7-Flash | 209 | 6bit | 36.8 | llama.cpp | Estimated | 7.2 | Highest practical quant that fits. Speed is estimated from nearby benchmark coverage; llama.cpp is the strongest runtime hint. Headroom leaves a workable context margin. |
| 16 | Qwen3.5-9B | 206 | 8bit | 15.0 | llama.cpp | Estimated | 26.1 | Highest practical quant that fits. Speed is estimated from nearby benchmark coverage; llama.cpp is the strongest runtime hint. Headroom leaves ample room for long contexts. |
| 17 | Qwen 2.5 14B | 196 | 8bit | n/a | Best available | Fit-first | 21.0 | Highest practical quant that fits. Speed still needs direct measurement on this machine. Headroom leaves ample room for long contexts. |
| 18 | Phi-4 14B | 195 | 8bit | n/a | Best available | Fit-first | 21.5 | Highest practical quant that fits. Speed still needs direct measurement on this machine. Headroom leaves ample room for long contexts. |
| 19 | Llama 3.1 8B | 189 | 8bit | n/a | Best available | Fit-first | 27.0 | Highest practical quant that fits. Speed still needs direct measurement on this machine. Headroom leaves ample room for long contexts. |
| 20 | Qwen 2.5 7B | 189 | 8bit | n/a | Best available | Fit-first | 28.0 | Highest practical quant that fits. Speed still needs direct measurement on this machine. Headroom leaves ample room for long contexts. |
| 21 | Mistral 7B v0.3 | 188 | 8bit | n/a | Best available | Fit-first | 27.7 | Highest practical quant that fits. Speed still needs direct measurement on this machine. Headroom leaves ample room for long contexts. |
| 22 | Gemma 3 4B | 150 | 8bit | 100.5 | LM Studio | Estimated | 30.4 | Highest practical quant that fits. Speed is estimated from nearby benchmark coverage; LM Studio (mixed backends) is the strongest runtime hint. Headroom leaves ample room for long contexts. |
| 23 | Qwen 3 4B | 150 | 8bit | 143.2 | MLX | Estimated | 30.6 | Highest practical quant that fits. Speed is estimated from nearby benchmark coverage; MLX is the strongest runtime hint. Headroom leaves ample room for long contexts. |
| 24 | Qwen 3 0.6B | 145 | 8bit | 184.4 | LM Studio | Estimated | 34.8 | Highest practical quant that fits. Speed is estimated from nearby benchmark coverage; LM Studio (mixed backends) is the strongest runtime hint. Headroom leaves ample room for long contexts. |
| 25 | Llama 3.2 1B | 124 | 8bit | n/a | Best available | Fit-first | 34.1 | Highest practical quant that fits. Speed still needs direct measurement on this machine. Headroom leaves ample room for long contexts. |
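The headroom figures are roughly unified memory minus the loaded model's weight footprint. A minimal back-of-envelope sketch of that arithmetic, weights-only: real footprints run somewhat higher once embeddings, quantization block overhead, and runtime buffers are counted, which is why the table's numbers come in a bit below this estimate.

```python
def footprint_gb(params_b: float, bits_per_weight: float) -> float:
    # Weights-only estimate: parameter count (in billions) times
    # bits per weight, converted to gigabytes. Ignores embeddings,
    # quantization block overhead, and runtime buffers.
    return params_b * bits_per_weight / 8

def headroom_gb(unified_memory_gb: float, params_b: float,
                bits_per_weight: float) -> float:
    # What remains of unified memory after loading the weights;
    # the KV cache for the context window has to fit in here.
    return unified_memory_gb - footprint_gb(params_b, bits_per_weight)

# A 27B model at 8-bit on this 36 GB machine:
print(round(headroom_gb(36, 27, 8), 1))  # 9.0
```

The table's 8.4 GB for Qwen3.5-27B lands in the same ballpark; the gap is the overhead this sketch deliberately ignores.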
Machine
- Unified memory: 36 GB
- MSRP: $2,999
- Form factor: MacBook Pro
- Chip: M4 Max
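Headroom matters because the KV cache competes with the weights for the 36 GB of unified memory and grows linearly with context length. A rough sketch, assuming a hypothetical 32B-class dense architecture (64 layers, 8 grouped-query KV heads, head dimension 128, fp16 cache); none of these architecture figures come from the table above.

```python
def kv_cache_gib(context_tokens: int, n_layers: int, n_kv_heads: int,
                 head_dim: int, bytes_per_elem: int = 2) -> float:
    # One K vector and one V vector per layer per token,
    # fp16 (2 bytes per element) by default.
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem
    return context_tokens * per_token / 1024**3

# 32k tokens of context on the assumed architecture:
print(kv_cache_gib(32_768, 64, 8, 128))  # 8.0
```

Under these assumed numbers the cache costs about 0.25 MiB per token, so the ~8 GB of headroom on the top-ranked 32B-class entries buys roughly a 32k-token context in fp16 — workable, not generous.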