Best models for this Mac

25 modelsCatalog current through February 27, 2026

MacBook Air M4 32GB 15-inch ranked for coding with a most capable bias, using the best available runtime evidence.

Mac

Capability

Bias

Runtime

Use the strongest current runtime evidence for each row.M5 Max watch: 8B 61.6 tok/s · 14B 34.3 tok/s on the current 36GB public anchor.

Rank	Model	Score	Quant	Tok/s	Runtime	Evidence	Headroom	Why it ranks here
1	Qwen 3 32B	250	6bit	22.0 tok/s	Ollama	Estimated	6.6 GB	6bit is the highest practical quality here. 22.0 tok/s estimated from nearby benchmark coverage, with Ollama wrapper on llama.cpp as the best runtime hint. 6.6 GB headroom leaves workable context margin.
2	Devstral Small 1.1	245	8bit	43.0 tok/s	LM Studio	Estimated	7.9 GB	8bit is the highest practical quality here. 43.0 tok/s estimated from nearby benchmark coverage, with LM Studio wrapper on mixed as the best runtime hint. 7.9 GB headroom leaves workable context margin.
3	Devstral Small 2 24B	245	8bit	25.3 tok/s	MLX	Estimated	7.9 GB	8bit is the highest practical quality here. 25.3 tok/s estimated from nearby benchmark coverage, with MLX backend as the best runtime hint. 7.9 GB headroom leaves workable context margin.
4	Qwen3.5-27B	243	Q6_K	20.6 tok/s	MLX	Estimated	8.9 GB	Q6_K is the highest practical quality here. 20.6 tok/s estimated from nearby benchmark coverage, with MLX backend as the best runtime hint. 8.9 GB headroom leaves workable context margin.
5	Gemma 3 27B	236	Q6_K	14.5 tok/s	LM Studio	Estimated	6.7 GB	Q6_K is the highest practical quality here. 14.5 tok/s estimated from nearby benchmark coverage, with LM Studio wrapper on mixed as the best runtime hint. 6.7 GB headroom leaves workable context margin.
6	DeepSeek R1 Distill Qwen 32B	228	6bit	Measure it	Best available	Fit-first	6.6 GB	6bit is the highest practical quality here. Speed still needs direct measurement. 6.6 GB headroom leaves workable context margin.
7	Mistral Small 3.1 24B	223	8bit	Measure it	Best available	Fit-first	7.9 GB	8bit is the highest practical quality here. Speed still needs direct measurement. 7.9 GB headroom leaves workable context margin.
8	Magistral Small	223	8bit	Measure it	Best available	Fit-first	7.9 GB	8bit is the highest practical quality here. Speed still needs direct measurement. 7.9 GB headroom leaves workable context margin.
9	Nemotron-3-Nano-30B-A3B	211	Q6_K	43.7 tok/s	llama.cpp	Estimated	8.2 GB	Q6_K is the highest practical quality here. 43.7 tok/s estimated from nearby benchmark coverage, with llama.cpp backend as the best runtime hint. 8.2 GB headroom leaves workable context margin.
10	Qwen3-Coder-30B-A3B	210	Q6_K	58.5 tok/s	llama.cpp	Estimated	7.8 GB	Q6_K is the highest practical quality here. 58.5 tok/s estimated from nearby benchmark coverage, with llama.cpp backend as the best runtime hint. 7.8 GB headroom leaves workable context margin.
11	Qwen 3 8B	210	8bit	63.1 tok/s	LM Studio	Estimated	22.7 GB	8bit is the highest practical quality here. 63.1 tok/s estimated from nearby benchmark coverage, with LM Studio wrapper on mixed as the best runtime hint. 22.7 GB headroom leaves workable context margin.
12	Qwen 3 30B-A3B	210	Q6_K	76.7 tok/s	MLX	Estimated	7.4 GB	Q6_K is the highest practical quality here. 76.7 tok/s estimated from nearby benchmark coverage, with MLX backend as the best runtime hint. 7.4 GB headroom leaves workable context margin.
13	Qwen3.5-35B-A3B	209	6bit	80.0 tok/s	MLX	Estimated	6.4 GB	6bit is the highest practical quality here. 80.0 tok/s estimated from nearby benchmark coverage, with MLX backend as the best runtime hint. 6.4 GB headroom leaves workable context margin.
14	Llama 2 7B	207	8bit	36.4 tok/s	llama.cpp	Estimated	21.2 GB	8bit is the highest practical quality here. 36.4 tok/s estimated from nearby benchmark coverage, with llama.cpp backend as the best runtime hint. 21.2 GB headroom leaves workable context margin.
15	Qwen3.5-9B	204	8bit	15.0 tok/s	llama.cpp	Estimated	22.1 GB	8bit is the highest practical quality here. 15.0 tok/s estimated from nearby benchmark coverage, with llama.cpp backend as the best runtime hint. 22.1 GB headroom leaves workable context margin.
16	GLM-4.7-Flash	203	Q5	36.8 tok/s	llama.cpp	Estimated	6.7 GB	Q5 is the highest practical quality here. 36.8 tok/s estimated from nearby benchmark coverage, with llama.cpp backend as the best runtime hint. 6.7 GB headroom leaves workable context margin.
17	Qwen 2.5 14B	192	8bit	Measure it	Best available	Fit-first	17.0 GB	8bit is the highest practical quality here. Speed still needs direct measurement. 17.0 GB headroom leaves workable context margin.
18	Phi-4 14B	191	8bit	Measure it	Best available	Fit-first	17.5 GB	8bit is the highest practical quality here. Speed still needs direct measurement. 17.5 GB headroom leaves workable context margin.
19	Qwen 2.5 7B	189	8bit	Measure it	Best available	Fit-first	24.0 GB	8bit is the highest practical quality here. Speed still needs direct measurement. 24.0 GB headroom leaves workable context margin.
20	Llama 3.1 8B	188	8bit	Measure it	Best available	Fit-first	23.0 GB	8bit is the highest practical quality here. Speed still needs direct measurement. 23.0 GB headroom leaves workable context margin.
21	Mistral 7B v0.3	188	8bit	Measure it	Best available	Fit-first	23.7 GB	8bit is the highest practical quality here. Speed still needs direct measurement. 23.7 GB headroom leaves workable context margin.
22	Gemma 3 4B	150	8bit	100.5 tok/s	LM Studio	Estimated	26.4 GB	8bit is the highest practical quality here. 100.5 tok/s estimated from nearby benchmark coverage, with LM Studio wrapper on mixed as the best runtime hint. 26.4 GB headroom leaves workable context margin.
23	Qwen 3 4B	150	8bit	143.2 tok/s	MLX	Estimated	26.6 GB	8bit is the highest practical quality here. 143.2 tok/s estimated from nearby benchmark coverage, with MLX backend as the best runtime hint. 26.6 GB headroom leaves workable context margin.
24	Qwen 3 0.6B	145	8bit	184.4 tok/s	LM Studio	Estimated	30.8 GB	8bit is the highest practical quality here. 184.4 tok/s estimated from nearby benchmark coverage, with LM Studio wrapper on mixed as the best runtime hint. 30.8 GB headroom leaves workable context margin.
25	Llama 3.2 1B	124	8bit	Measure it	Best available	Fit-first	30.1 GB	8bit is the highest practical quality here. Speed still needs direct measurement. 30.1 GB headroom leaves workable context margin.

Machine

32GBUnified memory

$1,699MSRP

macbook_airForm factor

m4Chip