Compatibility route: Run

Which models are practical on MacBook Pro M5 Max 128GB 16-inch?

Current lens: Coding. Models are ranked most-capable first, with throughput, fit, local cost, and evidence kept on the same sheet.

This route now presents the rankings instead of acting like a separate product.

Snapshot

Top model: MiniMax M2.7

50 viable rows · 21 benchmark-backed · no speed row yet at 3bit.

Rows: 432 · Models: 55 · Macs: 29 · Benchmark-backed: 21

Catalog current through April 22, 2026. Benchmark evidence through April 27, 2026.

Query setup


Choose a Mac first, then narrow by lens, quant target, runtime, and ranking preference.

Results

Model matches for MacBook Pro M5 Max 128GB 16-inch

Sorted by most capable · 50 viable rows · 21 benchmark-backed

#1 · Fit-first · Insufficient data

MiniMax M2.7

MacBook Pro M5 Max 128GB 16-inch · 128GB · $5,399 · Portable

Output: —
Prompt: —
Quant: 3bit · source-backed (MLX MiniMax-M2.7-3bit, 112 GB min)
Runtime: Best available
Headroom: 16.0 GB
Context: 116k
Local cost: Hold for speed

Capability

228.704B dense · tuned for coding

Evidence

Fit uses a source-backed memory profile; direct speed coverage is still missing. The MiniMax-M2.7-3bit source profile lists 112 GB minimum memory on MLX; throughput still needs direct benchmark coverage.

Coverage

No direct benchmark rows yet

No speed row yet, so this cost read is held.

Quant ladder and fit detail
Quant | Quality | Headroom | Context | Speed | Runtime | Evidence
3bit (source-backed: MLX MiniMax-M2.7-3bit, 112 GB min) | Compact quality | 16.0 GB | 116k | — | Best available | Fit-first
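The headroom figure in this card follows directly from the fit arithmetic: unified memory minus the source-backed minimum footprint. A minimal sketch (the function name and structure are illustrative, not this tool's code):

```python
def headroom_gb(unified_memory_gb: float, model_min_gb: float) -> float:
    """Unified memory left over after loading the model at this quant."""
    return round(unified_memory_gb - model_min_gb, 1)

# 128 GB of unified memory minus the 112 GB minimum listed
# in the MLX MiniMax-M2.7-3bit source profile.
print(headroom_gb(128.0, 112.0))  # 16.0
```

Headroom is what absorbs KV cache growth as context fills, which is why the context column shrinks as the quant footprint grows.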
#2 · Trusted reference · Low confidence

Qwen 2.5 72B

MacBook Pro M5 Max 128GB 16-inch · 128GB · $5,399 · Portable

Output: 10.0 tok/s
Prompt: —
Quant: Q8_0
Runtime: Ollama
Headroom: 53.3 GB
Context: 131k
Local cost: $57.3

Capability

72.7B dense · tuned for coding

Evidence

Direct trusted-reference benchmark coverage on this hardware class. Speed is backed by trusted-reference benchmark coverage; the most common runtime in the evidence is Ollama.

Coverage

1 direct benchmark row

Default 10% utilization napkin math.

Quant ladder and fit detail
Quant | Quality | Headroom | Context | Speed | Runtime | Evidence
Q8_0 | Reference quality | 53.3 GB | 131k | 10.0 tok/s | Ollama | Trusted reference
Q6_K | High quality | 66.4 GB | 131k | 10.0 tok/s | Ollama | Trusted reference
Q5_K_M | High quality | 74.3 GB | 131k | 10.0 tok/s | Ollama | Trusted reference
Q4_K_M | Balanced quality | 84.4 GB | 131k | 10.0 tok/s | Ollama | Trusted reference
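The "Default 10% utilization napkin math" behind the local cost column is not spelled out on this page, but a plausible reading is: amortize the machine price over some lifespan, assume the machine is generating tokens 10% of the time, and divide by throughput. A sketch under those assumptions (the three-year lifespan is a guess that lands near the listed figures, not a documented parameter):

```python
def local_cost_per_mtok(machine_price_usd: float, output_tok_s: float,
                        utilization: float = 0.10,
                        lifespan_years: float = 3.0) -> float:
    """Napkin cost per million output tokens: amortized machine price
    divided by lifetime token output at the assumed utilization."""
    lifespan_seconds = lifespan_years * 365 * 24 * 3600
    lifetime_tokens = output_tok_s * lifespan_seconds * utilization
    return machine_price_usd / lifetime_tokens * 1e6

# $5,399 machine at 10.0 tok/s comes out to roughly $57 per million
# tokens, in the neighborhood of the $57.3 shown for this card.
print(round(local_cost_per_mtok(5399, 10.0), 1))
```

Whatever the exact constants, the column is inversely proportional to output speed, which is why the slower 72B models read as several times more expensive than the 27B-32B rows.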
#3 · Trusted reference · Low confidence

DeepSeek R1 Distill Llama 70B

MacBook Pro M5 Max 128GB 16-inch · 128GB · $5,399 · Portable

Output: 11.0 tok/s
Prompt: —
Quant: Q8_0
Runtime: Ollama
Headroom: 55.4 GB
Context: 131k
Local cost: $52.1

Capability

70.6B dense · tuned for coding

Evidence

Direct trusted-reference benchmark coverage on this hardware class. Speed is backed by trusted-reference benchmark coverage; the most common runtime in the evidence is Ollama.

Coverage

1 direct benchmark row

Default 10% utilization napkin math.

Quant ladder and fit detail
Quant | Quality | Headroom | Context | Speed | Runtime | Evidence
Q8_0 | Reference quality | 55.4 GB | 131k | 11.0 tok/s | Ollama | Trusted reference
Q6_K | High quality | 68.1 GB | 131k | 11.0 tok/s | Ollama | Trusted reference
Q5_K_M | High quality | 75.8 GB | 131k | 11.0 tok/s | Ollama | Trusted reference
Q4_K_M | Balanced quality | 85.6 GB | 131k | 11.0 tok/s | Ollama | Trusted reference
#4 · Trusted reference · Medium confidence

Llama 3.3 70B

MacBook Pro M5 Max 128GB 16-inch · 128GB · $5,399 · Portable

Output: 15.0 tok/s
Prompt: —
Quant: Q8_0
Runtime: Ollama
Headroom: 55.4 GB
Context: 131k
Local cost: $38.2

Capability

70.6B dense · tuned for coding

Evidence

Direct trusted-reference benchmark coverage on this hardware class. Speed is backed by trusted-reference benchmark coverage; the most common runtime in the evidence is Ollama.

Coverage

2 direct benchmark rows

Default 10% utilization napkin math.

Quant ladder and fit detail
Quant | Quality | Headroom | Context | Speed | Runtime | Evidence
Q8_0 | Reference quality | 55.4 GB | 131k | 15.0 tok/s | Ollama | Trusted reference
Q6_K | High quality | 68.1 GB | 131k | 15.0 tok/s | Ollama | Trusted reference
Q5_K_M | High quality | 75.8 GB | 131k | 15.0 tok/s | Ollama | Trusted reference
Q4_K_M | Balanced quality | 85.6 GB | 131k | 15.0 tok/s | Ollama | Trusted reference
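The footprints implied by a ladder like this one scale with bits per weight. A rough back-of-envelope using commonly cited average bits-per-weight for llama.cpp GGUF K-quants (these bpw figures are general community numbers, not this tool's fit model, which also budgets for context and overhead, so expect a few GB of drift against the headroom column):

```python
# Commonly cited average bits-per-weight for llama.cpp GGUF quants.
BPW = {"Q8_0": 8.5, "Q6_K": 6.56, "Q5_K_M": 5.67, "Q4_K_M": 4.85}

def weight_footprint_gb(params_b: float, quant: str) -> float:
    """Approximate weight size in GB: params (billions) * bits/weight / 8.
    Ignores KV cache, activations, and runtime overhead."""
    return round(params_b * BPW[quant] / 8, 1)

# Llama 3.3 70B (70.6B params) across the ladder above.
for quant in BPW:
    print(quant, weight_footprint_gb(70.6, quant), "GB")
```

For example, the Q4_K_M estimate of about 42.8 GB is close to what 128 GB minus the 85.6 GB headroom row implies, which is a useful sanity check when a ladder entry looks suspicious.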
#5 · Trusted reference · Low confidence

Qwen 3 32B

MacBook Pro M5 Max 128GB 16-inch · 128GB · $5,399 · Portable

Output: 28.0 tok/s
Prompt: —
Quant: Q8_0
Runtime: Ollama
Headroom: 93.2 GB
Context: 131k
Local cost: $20.5

Capability

32.76B dense · tuned for coding

Evidence

Direct trusted-reference benchmark coverage on this hardware class. Speed is backed by trusted-reference benchmark coverage; the most common runtime in the evidence is Ollama.

Coverage

1 direct benchmark row

Default 10% utilization napkin math.

Quant ladder and fit detail
Quant | Quality | Headroom | Context | Speed | Runtime | Evidence
Q8_0 | Reference quality | 93.2 GB | 131k | 28.0 tok/s | Ollama | Trusted reference
Q6_K | High quality | 99.1 GB | 131k | 28.0 tok/s | Ollama | Trusted reference
Q5_K_M | High quality | 102.7 GB | 131k | 28.0 tok/s | Ollama | Trusted reference
Q4_K_M | Balanced quality | 107.2 GB | 131k | 28.0 tok/s | Ollama | Trusted reference
#6 · Trusted reference · Low confidence

Gemma 4 31B

MacBook Pro M5 Max 128GB 16-inch · 128GB · $5,399 · Portable

Output: 26.0 tok/s
Prompt: —
Quant: Q8_0
Runtime: MLX
Headroom: 95.3 GB
Context: 104k
Local cost: $22.0

Capability

30.7B dense · tuned for coding

Evidence

Direct trusted-reference benchmark coverage on this hardware class. Speed is backed by trusted-reference benchmark coverage; the most common runtime in the evidence is MLX.

Coverage

1 direct benchmark row

Default 10% utilization napkin math.

Quant ladder and fit detail
Quant | Quality | Headroom | Context | Speed | Runtime | Evidence
Q8_0 | Reference quality | 95.3 GB | 104k | 26.0 tok/s | MLX | Trusted reference
Q6_K | High quality | 100.8 GB | 110k | 26.0 tok/s | MLX | Trusted reference
Q5_K_M | High quality | 104.2 GB | 114k | 26.0 tok/s | MLX | Trusted reference
Q4_K_M | Balanced quality | 108.4 GB | 118k | 26.0 tok/s | MLX | Trusted reference
#7 · Estimated · Low confidence

DeepSeek R1 Distill Qwen 32B

MacBook Pro M5 Max 128GB 16-inch · 128GB · $5,399 · Portable

Output: 27.0 tok/s
Prompt: —
Quant: Q8_0
Runtime: Ollama
Headroom: 93.2 GB
Context: 131k
Local cost: $21.2

Capability

32.76B dense · tuned for coding

Evidence

Estimated from nearby benchmark coverage, not a direct match. Speed is estimated from nearby benchmark coverage rather than an exact machine-and-quant measurement; best runtime hint: Ollama.

Coverage

No direct benchmark rows yet

Speed is estimated, so this cost read is provisional.

Quant ladder and fit detail
Quant | Quality | Headroom | Context | Speed | Runtime | Evidence
Q8_0 | Reference quality | 93.2 GB | 131k | 27.0 tok/s | Ollama | Estimated
Q6_K | High quality | 99.1 GB | 131k | 27.0 tok/s | Ollama | Estimated
Q5_K_M | High quality | 102.7 GB | 131k | 27.0 tok/s | Ollama | Estimated
Q4_K_M | Balanced quality | 107.2 GB | 131k | 27.0 tok/s | Ollama | Estimated
#8 · Community row · Insufficient data

Qwen3.5-27B

MacBook Pro M5 Max 128GB 16-inch · 128GB · $5,399 · Portable

Output: 31.6 tok/s
Prompt: —
Quant: Q8_0
Runtime: MLX
Headroom: 99.0 GB
Context: 8k
Local cost: $18.1

Capability

27B dense · tuned for coding

Evidence

Direct community benchmark coverage on this hardware class. Speed is backed by community benchmark coverage; the most common runtime in the evidence is MLX.

Coverage

2 direct benchmark rows

Default 10% utilization napkin math.

Quant ladder and fit detail
Quant | Quality | Headroom | Context | Speed | Runtime | Evidence
Q8_0 | Reference quality | 99.0 GB | 262k | 31.6 tok/s | MLX | Community row
Q6_K | High quality | 103.9 GB | 262k | 16.5 tok/s | llama.cpp | Community row
Q5_K_M | High quality | 106.8 GB | 262k | 31.6 tok/s | MLX | Community row
Q4_K_M | Balanced quality | 110.5 GB | 262k | 31.6 tok/s | MLX | Community row

FAQ

Frequently asked questions about Run

These answers stay tied to the live workspace defaults for this compatibility route, so they describe the same sort order and query framing the table uses.

What does the Run route optimize for?
Run starts with a Mac and sorts toward the most capable models that remain practical on that machine. It is the fastest way to answer what your current Mac can run locally without guessing from raw memory numbers alone.
Why does Run show models instead of Macs?
Run flips the workspace into Mac-to-models mode. Instead of asking which Mac to buy for a model, it asks which models remain usable on the Mac you already own or plan to deploy.
How should I interpret the evidence labels on Run?
Benchmark-backed rows have direct speed evidence, with Lab, trusted-reference, and community labels showing provenance. Estimated rows are derived from adjacent evidence, and fit-first rows are memory-feasibility reads.
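The precedence the FAQ describes, direct benchmark evidence over estimates over fit-only reads, can be sketched as a simple classifier. The function and field names here are illustrative, not this tool's actual code:

```python
def evidence_label(direct_rows: int, has_nearby_estimate: bool,
                   has_memory_profile: bool) -> str:
    """Precedence described in the FAQ: direct benchmark evidence,
    then nearby-coverage estimates, then memory-feasibility only."""
    if direct_rows > 0:
        return "benchmark-backed"  # Lab / trusted-reference / community
    if has_nearby_estimate:
        return "estimated"
    if has_memory_profile:
        return "fit-first"
    return "insufficient data"

print(evidence_label(1, False, True))   # benchmark-backed
print(evidence_label(0, True, True))    # estimated
print(evidence_label(0, False, True))   # fit-first
```

Under this reading, a fit-first row like MiniMax M2.7 at #1 is a statement about memory feasibility only, and its speed and cost columns stay blank or held until a direct benchmark row lands.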