Compatibility route: RunCompatibility route

Which models are practical on Mac Mini M4 24GB?

Current lens: Coding. Rank models by most capable, while keeping throughput, fit, local cost, and evidence on the same sheet.

This route now presets the rankings instead of acting like a separate product.

Snapshot

Top model: DeepSeek R1 Distill Qwen 32B

25 viable rows · 3 measured directly · — at Q4_K_M.

Rows
251
Models
33
Macs
28
Measured
3

Catalog current through February 27, 2026

Query setup

Audit evidence

Choose a Mac first, then narrow by lens, quant target, runtime, and ranking preference.

Results

Model matches for Mac Mini M4 24GB

Sorted by most capable · 25 viable rows · 3 measured directly

Viable rows
25
Direct benchmark-backed
3
#1Fit-firstInsufficient data

DeepSeek R1 Distill Qwen 32B

Mac Mini M4 24GB · 24GB · $599 · Desktop

Output
Prompt
QuantQ4_K_M
RuntimeBest available
Headroom3.2 GB
Context13k
Local costHold for speed
Detail Open

Capability

32.76B dense · tuned for coding

Evidence

Fit is computed; speed is still unmeasured Fit is computed from model size and KV-cache math, but speed still needs direct coverage.

Coverage

No direct benchmark rows yet

Speed is estimated, so this cost read is provisional.

Quant ladder and fit detail
QuantQualityHeadroomContextSpeedRuntimeEvidence
Q4_K_MBalanced quality3.2 GB13kBest availableFit-first
#2EstimatedLow confidence

Qwen 3 32B

Mac Mini M4 24GB · 24GB · $599 · Desktop

Output22.0 tok/s
Prompt
QuantQ4_K_M
RuntimeOllama
Headroom3.2 GB
Context128k
Local cost$2.91
Detail Open

Capability

32.76B dense · tuned for coding

Evidence

Estimated from nearby benchmark coverage, not a direct match Speed is estimated from nearby benchmark coverage rather than this exact machine-and-quant match. Best runtime hint: Ollama.

Coverage

No direct benchmark rows yet

Speed is estimated, so this cost read is provisional.

Quant ladder and fit detail
QuantQualityHeadroomContextSpeedRuntimeEvidence
Q4_K_MBalanced quality3.2 GB13k22.0 tok/sOllamaEstimated
#3EstimatedLow confidence

Qwen3.5-27B

Mac Mini M4 24GB · 24GB · $599 · Desktop

Output8.5 tok/s
Prompt
QuantQ5_K_M
Runtimellama.cpp
Headroom2.8 GB
Context11k
Local cost$7.53
Detail Open

Capability

27B dense · tuned for coding

Evidence

Estimated from nearby benchmark coverage, not a direct match Speed is estimated from nearby benchmark coverage rather than this exact machine-and-quant match. Best runtime hint: llama.cpp.

Coverage

No direct benchmark rows yet

Speed is estimated, so this cost read is provisional.

Quant ladder and fit detail
QuantQualityHeadroomContextSpeedRuntimeEvidence
Q5_K_MHigh quality2.8 GB11k8.5 tok/sllama.cppEstimated
Q4_K_MBalanced quality6.5 GB27k0.0 tok/sllama.cppEstimated
#4EstimatedLow confidence

Devstral Small 1.1

Mac Mini M4 24GB · 24GB · $599 · Desktop

Output33.0 tok/s
Prompt
QuantQ6_K
RuntimeLM Studio
Headroom2.3 GB
Context15k
Local cost$1.94
Detail Open

Capability

24B dense · tuned for coding

Evidence

Estimated from nearby benchmark coverage, not a direct match Speed is estimated from nearby benchmark coverage rather than this exact machine-and-quant match. Best runtime hint: LM Studio.

Coverage

No direct benchmark rows yet

Speed is estimated, so this cost read is provisional.

Quant ladder and fit detail
QuantQualityHeadroomContextSpeedRuntimeEvidence
Q6_KHigh quality2.3 GB15k33.0 tok/sLM StudioEstimated
Q5_K_MHigh quality4.9 GB32k33.0 tok/sLM StudioEstimated
Q4_K_MBalanced quality8.3 GB54k33.0 tok/sLM StudioEstimated
#5EstimatedLow confidence

Devstral Small 2 24B

Mac Mini M4 24GB · 24GB · $599 · Desktop

Output3.4 tok/s
Prompt
QuantQ6_K
Runtimellama.cpp
Headroom2.3 GB
Context15k
Local cost$19.0
Detail Open

Capability

24B dense · tuned for coding

Evidence

Estimated from nearby benchmark coverage, not a direct match Speed is estimated from nearby benchmark coverage rather than this exact machine-and-quant match. Best runtime hint: llama.cpp.

Coverage

No direct benchmark rows yet

Speed is estimated, so this cost read is provisional.

Quant ladder and fit detail
QuantQualityHeadroomContextSpeedRuntimeEvidence
Q6_KHigh quality2.3 GB15k3.4 tok/sllama.cppEstimated
Q5_K_MHigh quality4.9 GB32k3.4 tok/sllama.cppEstimated
Q4_K_MBalanced quality8.3 GB54k3.4 tok/sllama.cppEstimated
#6EstimatedLow confidence

Gemma 3 27B

Mac Mini M4 24GB · 24GB · $599 · Desktop

Output14.5 tok/s
Prompt
QuantQ5_K_M
RuntimeLM Studio
Headroom2.5 GB
Context131k
Local cost$4.42
Detail Open

Capability

27.4B dense · adjacent fit for coding

Evidence

Estimated from nearby benchmark coverage, not a direct match Speed is estimated from nearby benchmark coverage rather than this exact machine-and-quant match. Best runtime hint: LM Studio.

Coverage

No direct benchmark rows yet

Speed is estimated, so this cost read is provisional.

Quant ladder and fit detail
QuantQualityHeadroomContextSpeedRuntimeEvidence
Q5_K_MHigh quality2.5 GB5k14.5 tok/sLM StudioEstimated
Q4_K_MBalanced quality6.3 GB13k14.5 tok/sLM StudioEstimated
#7Fit-firstInsufficient data

Mistral Small 3.1 24B

Mac Mini M4 24GB · 24GB · $599 · Desktop

Output
Prompt
QuantQ6_K
RuntimeBest available
Headroom2.3 GB
Context15k
Local costHold for speed
Detail Open

Capability

24B dense · adjacent fit for coding

Evidence

Fit is computed; speed is still unmeasured Fit is computed from model size and KV-cache math, but speed still needs direct coverage.

Coverage

No direct benchmark rows yet

Speed is estimated, so this cost read is provisional.

Quant ladder and fit detail
QuantQualityHeadroomContextSpeedRuntimeEvidence
Q6_KHigh quality2.3 GB15kBest availableFit-first
Q5_K_MHigh quality4.9 GB32kBest availableFit-first
Q4_K_MBalanced quality8.3 GB54kBest availableFit-first
#8EstimatedLow confidence

Qwen3.5-9B

Mac Mini M4 24GB · 24GB · $599 · Desktop

Output3.1 tok/s
Prompt
QuantQ8_0
Runtimellama.cpp
Headroom13.0 GB
Context106k
Local cost$20.4
Detail Open

Capability

9B dense · tuned for coding

Evidence

Estimated from nearby benchmark coverage, not a direct match Speed is estimated from nearby benchmark coverage rather than this exact machine-and-quant match. Best runtime hint: llama.cpp.

Coverage

No direct benchmark rows yet

Speed is estimated, so this cost read is provisional.

Quant ladder and fit detail
QuantQualityHeadroomContextSpeedRuntimeEvidence
Q8_0Reference quality13.0 GB106k3.1 tok/sllama.cppEstimated
Q6_KHigh quality14.6 GB120k2.2 tok/sllama.cppEstimated
Q5_K_MHigh quality15.6 GB128k3.1 tok/sllama.cppEstimated
Q4_K_MBalanced quality16.8 GB138k3.1 tok/sllama.cppEstimated