Model coverage

Use this page when you're picking a model first and need to see which Macs run it well.

Frontier-priority models: 18
Frontier families: 10
Catalog models: 60
Chip variants covered: 103
Benchmark rows: 432
Parameter range: 0.6B – 744B

Catalog size isn't the point. What matters is whether a model family actually runs on Apple Silicon, the smallest Mac that fits it, and the fastest Mac that's been benchmarked.

qwen-3.6

largeCatalog focus

2 models · 5 benchmark rows · 5 chips tested

55.0 tok/sfastest on M5 Max (128 GB)

ModelCoverageChipsFastestSmallest fitStatus

Qwen3.6-35B-A3BDetail

4 rows455.0 tok/s—benchmark rows

Qwen3.6-27BDetail

1 rows116.6 tok/s29 GBbenchmark rows

gemma-4

smallCatalog focus

4 models · 17 benchmark rows · 7 chips tested

158.0 tok/sfastest on M5 Max (128 GB)

ModelCoverageChipsFastestSmallest fitStatus

Gemma 4 31BDetail

4 rows426.0 tok/s—benchmark rows

Gemma 4 26B-A4BDetail

4 rows450.0 tok/s—benchmark rows

Gemma 4 E4BDetail

5 rows5128.0 tok/s—benchmark rows

Gemma 4 E2BDetail

4 rows4158.0 tok/s—benchmark rows

minimax-m2

frontierCatalog focus

1 model · 0 benchmark rows · 0 chips tested

ModelCoverageChipsFastestSmallest fitStatus

MiniMax M2.7Detail

catalog only———no Apple Silicon rows

mistral-small-4

xlCatalog focus

1 model · 3 benchmark rows · 2 chips tested

45.0 tok/sfastest on M4 Ultra (192 GB)

ModelCoverageChipsFastestSmallest fitStatus

Mistral Small 4 119BDetail

3 rows245.0 tok/s—benchmark rows

qwen-3.5

smallCatalog focus

6 models · 54 benchmark rows · 18 chips tested

148.0 tok/sfastest on M5 Max (64 GB)

ModelCoverageChipsFastestSmallest fitStatus

Qwen3.5-397B-A17BDetail

2 rows240.2 tok/s—benchmark rows

Qwen3.5-122B-A10BDetail

6 rows365.8 tok/s40.8 GBbenchmark rows

Qwen3.5-35B-A3BDetail

13 rows9128.0 tok/s19.6 GBbenchmark rows

Qwen3.5-27BDetail

18 rows1038.0 tok/s15.3 GBbenchmark rows

Qwen3.5-9BDetail

13 rows9106.0 tok/s5.1 GBbenchmark rows

Qwen3.5-4BDetail

2 rows2148.0 tok/s—benchmark rows

ministral-3

mediumCatalog focus

2 models · 6 benchmark rows · 4 chips tested

98.0 tok/sfastest on M5 Max (64 GB)

ModelCoverageChipsFastestSmallest fitStatus

Ministral 3 14BDetail

3 rows358.0 tok/s—benchmark rows

Ministral 3 8BDetail

3 rows398.0 tok/s—benchmark rows

devstral

largeCatalog focus

2 models · 12 benchmark rows · 7 chips tested

47.0 tok/sfastest on M3 Ultra (256 GB)

ModelCoverageChipsFastestSmallest fitStatus

Devstral Small 2 24BDetail

8 rows347.0 tok/s13.4 GBbenchmark rows

Devstral Small 1.1Reference-onlyDetail

4 rows443.0 tok/s18.51 GBbenchmark rows

glm-4.5

xlCatalog focus

1 model · 5 benchmark rows · 2 chips tested

54.0 tok/sfastest on M3 Ultra (256 GB)

ModelCoverageChipsFastestSmallest fitStatus

GLM-4.5-AirDetail

5 rows254.0 tok/s60.3 GBbenchmark rows

magistral

largeCatalog focus

1 model · 0 benchmark rows · 0 chips tested

ModelCoverageChipsFastestSmallest fitStatus

Magistral SmallDetail

catalog only———no Apple Silicon rows

llama-4

xlCatalog focus

1 model · 3 benchmark rows · 2 chips tested

30.0 tok/sfastest on M4 Ultra (192 GB)

ModelCoverageChipsFastestSmallest fitStatus

Llama 4 Scout 17B-16EDetail

3 rows230.0 tok/s—benchmark rows

glm-5.1

frontier

1 model · 0 benchmark rows · 0 chips tested

ModelCoverageChipsFastestSmallest fitStatus

GLM-5.1Detail

catalog only———no Apple Silicon rows

nemotron-cascade-2

large

1 model · 3 benchmark rows · 3 chips tested

35.0 tok/sfastest on M5 Max (64 GB)

ModelCoverageChipsFastestSmallest fitStatus

Nemotron Cascade 2 30B-A3BDetail

3 rows335.0 tok/s—benchmark rows

glm-5

frontier

1 model · 5 benchmark rows · 1 chip tested

16.7 tok/sfastest on M3 Ultra (512 GB)

ModelCoverageChipsFastestSmallest fitStatus

GLM-5Reference-onlyDetail

5 rows116.7 tok/s391.82 GBbenchmark rows

qwen-3-coder

2 models · 7 benchmark rows · 3 chips tested

79.3 tok/sfastest on M5 Max (128 GB)

ModelCoverageChipsFastestSmallest fitStatus

Qwen3-Coder-30B-A3BReference-onlyDetail

1 rows158.5 tok/s16.1 GBbenchmark rows

Qwen3-Coder-NextDetail

6 rows279.3 tok/s44.9 GBbenchmark rows

glm-4.7

large

1 model · 2 benchmark rows · 2 chips tested

58.0 tok/sfastest on M3 Ultra (256 GB)

ModelCoverageChipsFastestSmallest fitStatus

GLM-4.7-FlashDetail

2 rows258.0 tok/s17 GBbenchmark rows

nemotron-3

large

1 model · 1 benchmark rows · 1 chip tested

43.7 tok/sfastest on M1 Max (64 GB)

ModelCoverageChipsFastestSmallest fitStatus

Nemotron-3-Nano-30B-A3BDetail

1 rows143.7 tok/s22 GBbenchmark rows

gpt-oss

2 models · 2 benchmark rows · 2 chips tested

10.0 tok/sfastest on M4 Ultra (192 GB)

ModelCoverageChipsFastestSmallest fitStatus

gpt-oss 120BDetail

2 rows210.0 tok/s—benchmark rows

gpt-oss 20BDetail

catalog only———no Apple Silicon rows

qwen-3

small

8 models · 65 benchmark rows · 19 chips tested

370.0 tok/sfastest on M3 Ultra (256 GB)

ModelCoverageChipsFastestSmallest fitStatus

Qwen 3 235B-A22BReference-onlyDetail

8 rows630.0 tok/s100 GBbenchmark rows

Qwen 3 32BReference-onlyDetail

26 rows732.0 tok/s20 GB1 lab row

Qwen 3 30B-A3BReference-onlyDetail

10 rows792.1 tok/s16.12 GBbenchmark rows

Qwen 3 14BReference-onlyDetail

4 rows458.0 tok/s—benchmark rows

Qwen 3 8BReference-onlyDetail

6 rows598.0 tok/s—benchmark rows

qwen-3-0-6bReference-onlyDetail

1 rows1184.4 tok/s—benchmark rows

Qwen 3 4BReference-onlyDetail

9 rows4149.1 tok/s2.54 GBbenchmark rows

Qwen 3 0.6BReference-onlyDetail

1 rows1370.0 tok/s—benchmark rows

gemma-3

small

3 models · 18 benchmark rows · 12 chips tested

132.0 tok/sfastest on M5 Max (64 GB)

ModelCoverageChipsFastestSmallest fitStatus

Gemma 3 27BReference-onlyDetail

9 rows842.0 tok/s52.57 GBbenchmark rows

Gemma 3 12BReference-onlyDetail

4 rows468.0 tok/s—benchmark rows

Gemma 3 4BReference-onlyDetail

5 rows5132.0 tok/s—benchmark rows

mistral-small-3.1

large

1 model · 0 benchmark rows · 0 chips tested

ModelCoverageChipsFastestSmallest fitStatus

Mistral Small 3.1 24BReference-onlyDetail

catalog only———no Apple Silicon rows

phi-4

small

2 models · 10 benchmark rows · 8 chips tested

142.0 tok/sfastest on M5 Max (64 GB)

ModelCoverageChipsFastestSmallest fitStatus

Phi-4 14BReference-onlyDetail

3 rows362.0 tok/s—benchmark rows

Phi-4 Mini Instruct 3.8BReference-onlyDetail

7 rows6142.0 tok/s—benchmark rows

deepseek-r1

frontier

1 model · 0 benchmark rows · 0 chips tested

ModelCoverageChipsFastestSmallest fitStatus

DeepSeek R1Reference-onlyDetail

catalog only———no Apple Silicon rows

deepseek-r1-distill

medium

3 models · 10 benchmark rows · 8 chips tested

97.0 tok/sfastest on M5 Max (64 GB)

ModelCoverageChipsFastestSmallest fitStatus

DeepSeek R1 Distill Llama 70BReference-onlyDetail

2 rows216.0 tok/s—benchmark rows

DeepSeek R1 Distill Qwen 32BReference-onlyDetail

3 rows327.0 tok/s—benchmark rows

DeepSeek R1 Distill Llama 8BReference-onlyDetail

5 rows497.0 tok/s—benchmark rows

llama-3.3

1 model · 17 benchmark rows · 8 chips tested

18.0 tok/sfastest on M4 Ultra (192 GB)

ModelCoverageChipsFastestSmallest fitStatus

Llama 3.3 70BReference-onlyDetail

17 rows818.0 tok/s—benchmark rows

llama-3.2

small

1 model · 0 benchmark rows · 0 chips tested

ModelCoverageChipsFastestSmallest fitStatus

Llama 3.2 1BReference-onlyDetail

catalog only———no Apple Silicon rows

qwen-2.5

3 models · 2 benchmark rows · 2 chips tested

15.0 tok/sfastest on M4 Ultra (192 GB)

ModelCoverageChipsFastestSmallest fitStatus

Qwen 2.5 72BReference-onlyDetail

2 rows215.0 tok/s—benchmark rows

Qwen 2.5 7BReference-onlyDetail

catalog only———no Apple Silicon rows

Qwen 2.5 14BReference-onlyDetail

catalog only———no Apple Silicon rows

llama-3.1

medium

1 model · 6 benchmark rows · 6 chips tested

138.0 tok/sfastest on M5 Max (128 GB)

ModelCoverageChipsFastestSmallest fitStatus

Llama 3.1 8BReference-onlyDetail

6 rows6138.0 tok/s—benchmark rows

mistral

medium

1 model · 6 benchmark rows · 4 chips tested

122.0 tok/sfastest on M5 Max (64 GB)

ModelCoverageChipsFastestSmallest fitStatus

Mistral 7B v0.3Reference-onlyDetail

6 rows4122.0 tok/s—benchmark rows

llama-2

medium

1 model · 5 benchmark rows · 5 chips tested

94.3 tok/sfastest on M2 Ultra (76-core GPU, 192 GB)

ModelCoverageChipsFastestSmallest fitStatus

Llama 2 7BReference-onlyDetail

5 rows594.3 tok/s3.56 GBbenchmark rows

qwen-2-5-14b-instruct

14B

1 model · 52 benchmark rows · 52 chips tested

36.7 tok/sfastest on M3 Ultra (80-core GPU, 256 GB)

ModelCoverageChipsFastestSmallest fitStatus

qwen-2-5-14b-instructDetail

52 rows5236.7 tok/s—benchmark rows

llama-3-1-8b-instruct

7B-8B

1 model · 52 benchmark rows · 52 chips tested

63.3 tok/sfastest on M3 Ultra (80-core GPU, 256 GB)

ModelCoverageChipsFastestSmallest fitStatus

llama-3-1-8b-instructDetail

52 rows5263.3 tok/s—benchmark rows

qwen-2-5-7b-instruct

7B-8B

1 model · 1 benchmark rows · 1 chip tested

49.7 tok/sfastest on M4 Max (128 GB)

ModelCoverageChipsFastestSmallest fitStatus

qwen-2-5-7b-instructDetail

1 rows149.7 tok/s—benchmark rows

llama-3-2-1b-instruct

0.5B-4B

1 model · 63 benchmark rows · 63 chips tested

229.0 tok/sfastest on M5 Max (32-core GPU, 36 GB)

ModelCoverageChipsFastestSmallest fitStatus

llama-3-2-1b-instructDetail

63 rows63229.0 tok/s—benchmark rows

Data

benchmarks.json — full dataset · models.json — model summaries · benchmarks.csv — CSV export

Return to the canonical rankings →