Compatibility route

Which models are practical on Mac Mini M4 24GB?

Current lens: Coding. Rank models by most capable, while keeping throughput, fit, local cost, and evidence on the same sheet.

This route now presets the rankings instead of acting like a separate product.

Snapshot

Top model: DeepSeek R1 Distill Qwen 32B

25 viable rows · 3 measured directly · top model speed unmeasured at Q4_K_M.

Rows: 251
Models: 33
Macs: 28
Measured: 3

Catalog current through February 27, 2026

Query setup

Choose a Mac first, then narrow by lens, quant target, runtime, and ranking preference.

Filters: Mac · Capability · Quant target · Runtime · Sort
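The query flow described above (pick a Mac, then narrow by quant target and runtime, then choose a ranking) can be sketched as a filter followed by a sort. This is a minimal illustration only: the `Row` type, the `query` function, and the use of parameter count as a stand-in for "most capable" are all assumptions, since the page does not say how capability is actually scored. The sample values are copied from rows on this page.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Row:
    model: str
    params_b: float             # billions of parameters (assumed capability proxy)
    quant: str
    runtime: str
    headroom_gb: float
    tok_per_s: Optional[float]  # None when speed is unmeasured


# A few rows copied from the Mac Mini M4 24GB results on this page.
ROWS = [
    Row("DeepSeek R1 Distill Qwen 32B", 32.76, "Q4_K_M", "Best available", 3.2, None),
    Row("Qwen 3 32B", 32.76, "Q4_K_M", "Ollama", 3.2, 22.0),
    Row("Devstral Small 1.1", 24.0, "Q6_K", "LM Studio", 2.3, 33.0),
    Row("Qwen3.5-9B", 9.0, "Q8_0", "llama.cpp", 13.0, 3.1),
]


def query(rows: List[Row], quant: Optional[str] = None,
          runtime: Optional[str] = None, sort: str = "capable") -> List[Row]:
    """Filter by quant target and runtime, then rank (hypothetical helper)."""
    picked = [
        r for r in rows
        if (quant is None or r.quant == quant)
        and (runtime is None or r.runtime == runtime)
    ]
    if sort == "capable":
        key = lambda r: r.params_b
    else:
        # Any other value ranks by speed; unmeasured rows sort last.
        key = lambda r: r.tok_per_s or 0.0
    return sorted(picked, key=key, reverse=True)


for row in query(ROWS, quant="Q4_K_M"):
    print(row.model, row.tok_per_s)
```

Rows with no measured speed are kept rather than dropped, which mirrors how the page lists fit-first rows alongside estimated ones.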

Results

Model matches for Mac Mini M4 24GB

Sorted by most capable · 25 viable rows · 3 measured directly

Columns: Model · Output · Prompt · Quant · Runtime · Headroom · Context · Cost read · Detail

#1 · Fit-first · Insufficient data

DeepSeek R1 Distill Qwen 32B

Mac Mini M4 24GB · 24GB · $599 · Desktop

Output: n/a
Prompt: n/a
Quant: Q4_K_M
Runtime: Best available
Headroom: 3.2 GB
Context: 13k
Local cost: Hold for speed

Capability

32.76B dense · tuned for coding

Evidence

Fit is computed from model size and KV-cache math; speed is still unmeasured and needs direct coverage.

Coverage

No direct benchmark rows yet

Speed is unmeasured, so the cost read is on hold.

Quant ladder and fit detail

| Quant | Quality | Headroom | Context | Speed | Runtime | Evidence |
| --- | --- | --- | --- | --- | --- | --- |
| Q4_K_M | Balanced quality | 3.2 GB | 13k | n/a | Best available | Fit-first |
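The "KV-cache math" mentioned in the evidence note can be illustrated with the standard per-token cache size for a transformer: two tensors (keys and values) per layer, each of size KV heads times head dimension. The sketch below is one plausible reading, not the page's confirmed method. The architecture values (64 layers, 8 KV heads, head dimension 128), the 16-bit cache, and treating headroom as GiB are all assumptions; the function names are hypothetical.

```python
def kv_bytes_per_token(layers: int, kv_heads: int, head_dim: int,
                       bytes_per_value: int = 2) -> int:
    """Bytes of KV cache needed per token of context.

    The leading factor of 2 covers keys and values; bytes_per_value=2
    assumes a 16-bit cache.
    """
    return 2 * layers * kv_heads * head_dim * bytes_per_value


def max_context_tokens(headroom_gib: float, per_token_bytes: int) -> int:
    """Tokens of context that fit in the memory left after loading weights."""
    return int(headroom_gib * 1024**3 // per_token_bytes)


# Assumed architecture for a 32B Qwen-style dense model (not from this page).
per_token = kv_bytes_per_token(layers=64, kv_heads=8, head_dim=128)
print(per_token)                           # 262144 bytes, i.e. 256 KiB per token
print(max_context_tokens(3.2, per_token))  # 13107 tokens
```

Under these assumptions, 3.2 GB of headroom yields about 13k tokens, which lines up with the Context figure in the row above. A quantized KV cache or a different head layout would change the result.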

#2 · Estimated · Low confidence

Qwen 3 32B

Mac Mini M4 24GB · 24GB · $599 · Desktop

Output: 22.0 tok/s
Prompt: n/a
Quant: Q4_K_M
Runtime: Ollama
Headroom: 3.2 GB
Context: 128k
Local cost: $2.91

Capability

32.76B dense · tuned for coding

Evidence

Speed is estimated from nearby benchmark coverage rather than this exact machine-and-quant match. Best runtime hint: Ollama.

Coverage

No direct benchmark rows yet

Speed is estimated, so this cost read is provisional.

Quant ladder and fit detail

| Quant | Quality | Headroom | Context | Speed | Runtime | Evidence |
| --- | --- | --- | --- | --- | --- | --- |
| Q4_K_M | Balanced quality | 3.2 GB | 13k | 22.0 tok/s | Ollama | Estimated |

#3 · Estimated · Low confidence

Qwen3.5-27B

Mac Mini M4 24GB · 24GB · $599 · Desktop

Output: 8.5 tok/s
Prompt: n/a
Quant: Q5_K_M
Runtime: llama.cpp
Headroom: 2.8 GB
Context: 11k
Local cost: $7.53

Capability

27B dense · tuned for coding

Evidence

Speed is estimated from nearby benchmark coverage rather than this exact machine-and-quant match. Best runtime hint: llama.cpp.

Coverage

No direct benchmark rows yet

Speed is estimated, so this cost read is provisional.

Quant ladder and fit detail

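| Quant | Quality | Headroom | Context | Speed | Runtime | Evidence |
| --- | --- | --- | --- | --- | --- | --- |
| Q5_K_M | High quality | 2.8 GB | 11k | 8.5 tok/s | llama.cpp | Estimated |
| Q4_K_M | Balanced quality | 6.5 GB | 27k | 0.0 tok/s | llama.cpp | Estimated |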

#4 · Estimated · Low confidence

Devstral Small 1.1

Mac Mini M4 24GB · 24GB · $599 · Desktop

Output: 33.0 tok/s
Prompt: n/a
Quant: Q6_K
Runtime: LM Studio
Headroom: 2.3 GB
Context: 15k
Local cost: $1.94

Capability

24B dense · tuned for coding

Evidence

Speed is estimated from nearby benchmark coverage rather than this exact machine-and-quant match. Best runtime hint: LM Studio.

Coverage

No direct benchmark rows yet

Speed is estimated, so this cost read is provisional.

Quant ladder and fit detail

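| Quant | Quality | Headroom | Context | Speed | Runtime | Evidence |
| --- | --- | --- | --- | --- | --- | --- |
| Q6_K | High quality | 2.3 GB | 15k | 33.0 tok/s | LM Studio | Estimated |
| Q5_K_M | High quality | 4.9 GB | 32k | 33.0 tok/s | LM Studio | Estimated |
| Q4_K_M | Balanced quality | 8.3 GB | 54k | 33.0 tok/s | LM Studio | Estimated |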
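A quant ladder like the one above trades quality for context: each step down frees headroom, which allows a longer context window. One way to use it is to walk the ladder from the highest-quality quant and stop at the first one that meets a context target. The sketch below assumes that selection rule and uses the Devstral Small 1.1 figures from this page, with "k" read as thousands of tokens; `pick_quant` is a hypothetical helper, not part of the site.

```python
from typing import List, Optional, Tuple

# (quant, headroom in GB, context in tokens), highest quality first,
# copied from the Devstral Small 1.1 ladder on this page.
LADDER: List[Tuple[str, float, int]] = [
    ("Q6_K", 2.3, 15_000),
    ("Q5_K_M", 4.9, 32_000),
    ("Q4_K_M", 8.3, 54_000),
]


def pick_quant(ladder: List[Tuple[str, float, int]],
               min_context: int) -> Optional[str]:
    """Return the highest-quality quant whose context meets the target."""
    for quant, _headroom_gb, context in ladder:
        if context >= min_context:
            return quant
    return None  # no quant on this machine reaches the requested context


print(pick_quant(LADDER, 8_000))    # Q6_K
print(pick_quant(LADDER, 32_000))   # Q5_K_M
print(pick_quant(LADDER, 100_000))  # None
```

A stricter rule could also require a minimum headroom margin for the runtime itself; that threshold would be a separate assumption.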

#5 · Estimated · Low confidence

Devstral Small 2 24B

Mac Mini M4 24GB · 24GB · $599 · Desktop

Output: 3.4 tok/s
Prompt: n/a
Quant: Q6_K
Runtime: llama.cpp
Headroom: 2.3 GB
Context: 15k
Local cost: $19.0

Capability

24B dense · tuned for coding

Evidence

Speed is estimated from nearby benchmark coverage rather than this exact machine-and-quant match. Best runtime hint: llama.cpp.

Coverage

No direct benchmark rows yet

Speed is estimated, so this cost read is provisional.

Quant ladder and fit detail

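| Quant | Quality | Headroom | Context | Speed | Runtime | Evidence |
| --- | --- | --- | --- | --- | --- | --- |
| Q6_K | High quality | 2.3 GB | 15k | 3.4 tok/s | llama.cpp | Estimated |
| Q5_K_M | High quality | 4.9 GB | 32k | 3.4 tok/s | llama.cpp | Estimated |
| Q4_K_M | Balanced quality | 8.3 GB | 54k | 3.4 tok/s | llama.cpp | Estimated |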

#6 · Estimated · Low confidence

Gemma 3 27B

Mac Mini M4 24GB · 24GB · $599 · Desktop

Output: 14.5 tok/s
Prompt: n/a
Quant: Q5_K_M
Runtime: LM Studio
Headroom: 2.5 GB
Context: 131k
Local cost: $4.42

Capability

27.4B dense · adjacent fit for coding

Evidence

Speed is estimated from nearby benchmark coverage rather than this exact machine-and-quant match. Best runtime hint: LM Studio.

Coverage

No direct benchmark rows yet

Speed is estimated, so this cost read is provisional.

Quant ladder and fit detail

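| Quant | Quality | Headroom | Context | Speed | Runtime | Evidence |
| --- | --- | --- | --- | --- | --- | --- |
| Q5_K_M | High quality | 2.5 GB | 5k | 14.5 tok/s | LM Studio | Estimated |
| Q4_K_M | Balanced quality | 6.3 GB | 13k | 14.5 tok/s | LM Studio | Estimated |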

#7 · Fit-first · Insufficient data

Mistral Small 3.1 24B

Mac Mini M4 24GB · 24GB · $599 · Desktop

Output: n/a
Prompt: n/a
Quant: Q6_K
Runtime: Best available
Headroom: 2.3 GB
Context: 15k
Local cost: Hold for speed

Capability

24B dense · adjacent fit for coding

Evidence

Fit is computed from model size and KV-cache math; speed is still unmeasured and needs direct coverage.

Coverage

No direct benchmark rows yet

Speed is unmeasured, so the cost read is on hold.

Quant ladder and fit detail

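| Quant | Quality | Headroom | Context | Speed | Runtime | Evidence |
| --- | --- | --- | --- | --- | --- | --- |
| Q6_K | High quality | 2.3 GB | 15k | n/a | Best available | Fit-first |
| Q5_K_M | High quality | 4.9 GB | 32k | n/a | Best available | Fit-first |
| Q4_K_M | Balanced quality | 8.3 GB | 54k | n/a | Best available | Fit-first |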

#8 · Estimated · Low confidence

Qwen3.5-9B

Mac Mini M4 24GB · 24GB · $599 · Desktop

Output: 3.1 tok/s
Prompt: n/a
Quant: Q8_0
Runtime: llama.cpp
Headroom: 13.0 GB
Context: 106k
Local cost: $20.4

Capability

9B dense · tuned for coding

Evidence

Speed is estimated from nearby benchmark coverage rather than this exact machine-and-quant match. Best runtime hint: llama.cpp.

Coverage

No direct benchmark rows yet

Speed is estimated, so this cost read is provisional.

Quant ladder and fit detail

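| Quant | Quality | Headroom | Context | Speed | Runtime | Evidence |
| --- | --- | --- | --- | --- | --- | --- |
| Q8_0 | Reference quality | 13.0 GB | 106k | 3.1 tok/s | llama.cpp | Estimated |
| Q6_K | High quality | 14.6 GB | 120k | 2.2 tok/s | llama.cpp | Estimated |
| Q5_K_M | High quality | 15.6 GB | 128k | 3.1 tok/s | llama.cpp | Estimated |
| Q4_K_M | Balanced quality | 16.8 GB | 138k | 3.1 tok/s | llama.cpp | Estimated |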
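Across the estimated rows on this page, the local cost read moves inversely with output speed: multiplying the two figures gives roughly 64 for every row that lists both. The page does not state the cost unit or the formula behind it, so the sketch below only checks that pattern against the listed values. The constant `K` and the `implied_cost` helper are inferred for illustration and are not a documented method.

```python
# (output tok/s, local cost in dollars) copied from the rows on this page.
COST_ROWS = {
    "Qwen 3 32B": (22.0, 2.91),
    "Qwen3.5-27B": (8.5, 7.53),
    "Devstral Small 1.1": (33.0, 1.94),
    "Devstral Small 2 24B": (3.4, 19.0),
    "Gemma 3 27B": (14.5, 4.42),
    "Qwen3.5-9B": (3.1, 20.4),
}

# Empirical constant inferred from the rows above (an assumption).
K = 64.0


def cost_speed_products(rows: dict) -> dict:
    """Multiply speed by cost for each model to expose the shared constant."""
    return {model: speed * cost for model, (speed, cost) in rows.items()}


def implied_cost(tok_per_s: float, k: float = K) -> float:
    """Cost read implied by the inferred inverse relationship."""
    return k / tok_per_s


for model, product in cost_speed_products(COST_ROWS).items():
    print(f"{model}: {product:.1f}")
```

If the relationship holds, a speed estimate that is off by some factor shifts the cost read by the same factor, which is consistent with the page calling these cost reads provisional.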