Best models for this Mac

24 models · Catalog current through February 27, 2026

MacBook Air (M4, 24GB, 13-inch), ranked for coding with a most-capable bias, using the best available runtime evidence.

Each row uses the strongest current runtime evidence available. M5 Max watch: 8B at 61.6 tok/s and 14B at 34.3 tok/s on the current 36GB public anchor.
| Rank | Model | Score | Quant | Tok/s | Runtime | Evidence | Headroom |
|---|---|---|---|---|---|---|---|
| 1 | Qwen 3 32B | 242 | Q4_K_M | 22.0 | Ollama | Estimated | 4.0 GB |
| 2 | Qwen3.5-27B | 238 | Q5_K_M | 20.6 | MLX | Estimated | 3.6 GB |
| 3 | Devstral Small 1.1 | 236 | Q6_K | 43.0 | LM Studio | Estimated | 3.9 GB |
| 4 | Devstral Small 2 24B | 236 | Q6_K | 25.3 | MLX | Estimated | 3.9 GB |
| 5 | Gemma 3 27B | 227 | Q5 | 14.5 | LM Studio | Estimated | 3.7 GB |
| 6 | DeepSeek R1 Distill Qwen 32B | 220 | Q4_K_M | Measure it | Best available | Fit-first | 4.0 GB |
| 7 | Mistral Small 3.1 24B | 214 | Q6_K | Measure it | Best available | Fit-first | 3.9 GB |
| 8 | Magistral Small | 214 | Q6_K | Measure it | Best available | Fit-first | 3.9 GB |
| 9 | Nemotron-3-Nano-30B-A3B | 203 | Q5 | 43.7 | llama.cpp | Estimated | 5.6 GB |
| 10 | Qwen 3 8B | 202 | 8bit | 63.1 | LM Studio | Estimated | 14.7 GB |
| 11 | Qwen3-Coder-30B-A3B | 202 | Q5 | 58.5 | llama.cpp | Estimated | 5.4 GB |
| 12 | Qwen 3 30B-A3B | 202 | Q5 | 76.7 | MLX | Estimated | 5.0 GB |
| 13 | Qwen3.5-35B-A3B | 200 | Q4_K_M | 80.0 | MLX | Estimated | 4.2 GB |
| 14 | Llama 2 7B | 199 | 8bit | 36.4 | llama.cpp | Estimated | 13.2 GB |
| 15 | Qwen3.5-9B | 196 | 8bit | 15.0 | llama.cpp | Estimated | 14.1 GB |
| 16 | Qwen 2.5 14B | 184 | 8bit | Measure it | Best available | Fit-first | 9.0 GB |
| 17 | Phi-4 14B | 183 | 8bit | Measure it | Best available | Fit-first | 9.5 GB |
| 18 | Qwen 2.5 7B | 181 | 8bit | Measure it | Best available | Fit-first | 16.0 GB |
| 19 | Llama 3.1 8B | 180 | 8bit | Measure it | Best available | Fit-first | 15.0 GB |
| 20 | Mistral 7B v0.3 | 180 | 8bit | Measure it | Best available | Fit-first | 15.7 GB |
| 21 | Gemma 3 4B | 144 | 8bit | 100.5 | LM Studio | Estimated | 18.4 GB |
| 22 | Qwen 3 4B | 144 | 8bit | 143.2 | MLX | Estimated | 18.6 GB |
| 23 | Qwen 3 0.6B | 144 | 8bit | 184.4 | LM Studio | Estimated | 22.8 GB |
| 24 | Llama 3.2 1B | 122 | 8bit | Measure it | Best available | Fit-first | 22.1 GB |

Quant is the highest practical quality for each model within this machine's memory budget. Estimated speeds are inferred from nearby benchmark coverage, with the listed runtime as the best hint (Ollama rows run on a llama.cpp backend; LM Studio rows run on mixed backends). Fit-first rows fit in memory but still need a direct speed measurement. Headroom is the unified memory left for context after loading the model: under 4 GB it is tight, at 4 GB and above it leaves a workable context margin.
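For the rows marked "Measure it," one way to get a real number is to read the decode stats a local runtime reports rather than eyeballing output. As a sketch, Ollama's `/api/generate` response includes `eval_count` (tokens generated) and `eval_duration` (nanoseconds), from which decode speed follows directly; the sample payload below is made up for illustration.

```python
import json

def tokens_per_second(resp: dict) -> float:
    # eval_count = tokens generated, eval_duration = time in nanoseconds
    return resp["eval_count"] / (resp["eval_duration"] / 1e9)

# Illustrative payload only; a real one comes from POST /api/generate
sample = json.loads('{"eval_count": 128, "eval_duration": 6400000000}')
print(round(tokens_per_second(sample), 1))  # → 20.0
```

A few runs at the context length you actually use will give a steadier figure than a single short prompt, since decode speed drops as the KV cache grows.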
- Unified memory: 24GB
- MSRP: $1,299
- Form factor: MacBook Air
- Chip: M4
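The fit arithmetic behind the headroom column can be sketched from the machine's 24GB of unified memory. This is a first-order estimate only: the effective bits-per-weight figures and the fixed reserve below are assumptions, and it ignores KV cache and runtime overhead, so it reads somewhat higher than the table's measured-style figures.

```python
# First-order headroom sketch: unified memory minus quantized weights
# minus a fixed OS/runtime reserve. Bits-per-weight values and the
# reserve are rough assumptions, not measured numbers.
GIB = 1024**3

BITS_PER_WEIGHT = {  # approximate effective bits per weight
    "Q4_K_M": 4.85,
    "Q5_K_M": 5.7,
    "Q6_K": 6.6,
    "8bit": 8.5,
}

def headroom_gb(params_b: float, quant: str, total_gb: float = 24.0,
                reserve_gb: float = 0.5) -> float:
    """Free memory left for context after loading the weights."""
    weights_gb = params_b * 1e9 * BITS_PER_WEIGHT[quant] / 8 / GIB
    return total_gb - weights_gb - reserve_gb

# e.g. a 32B model at Q4_K_M on this 24GB machine
print(round(headroom_gb(32, "Q4_K_M"), 1))  # → 5.4
```

Counting KV cache and runtime overhead pushes this down toward the 4.0 GB the table lists for the 32B Q4_K_M row, which is why the table's figures are the ones to trust.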