Canonical Rankings

Best Macs for this model

Qwen3.5-397B-A17B is ranked across the Mac lineup at the best practical quantization for each machine, using the best available runtime evidence. The model picker focuses on current-market choices.

29 ranked Macs. Use the strongest current runtime evidence for each row. 28 historical models hidden. Static paths cover only canonical model pages; sort and quantization stay as query state.
| Rank | Mac | Score | Quant | Tok/s | Runtime | Fits | Headroom | Context | Evidence | Price | Why it ranks here |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Mac Studio M3 Ultra 256GB | 258 | Q4_K_M | 40.2 tok/s | Best available | Fits | 42.9 GB | 47k | Estimated | $7,499 | Q4_K_M is the current best practical quantization. 40.2 tok/s is estimated from nearby benchmark coverage. 42.9 GB headroom remains at this quantization. |
| 2 | Mac Pro M2 Ultra 192GB | 237 | 3bit | 40.2 tok/s | Best available | Fits | 51.9 GB | 210k | Estimated | $6,999 | 3bit is the current best practical quantization. 40.2 tok/s is estimated from nearby benchmark coverage. 51.9 GB headroom remains at this quantization. |
| 3 | Mac Studio M4 Max 128GB | 214 | IQ2_K_S | 40.2 tok/s | Best available | Fits | 29.5 GB | 98k | Estimated | $4,499 | IQ2_K_S is the current best practical quantization. 40.2 tok/s is estimated from nearby benchmark coverage. 29.5 GB headroom remains at this quantization. |
| 4 | MacBook Pro M4 Max 128GB 16-inch | 214 | IQ2_K_S | 40.2 tok/s | Best available | Fits | 29.5 GB | 98k | Estimated | $5,999 | IQ2_K_S is the current best practical quantization. 40.2 tok/s is estimated from nearby benchmark coverage. 29.5 GB headroom remains at this quantization. |
| 5 | MacBook Pro M5 Max 128GB 16-inch | 105 | IQ2_K_S | 13.0 tok/s | Best available | Fits | 29.5 GB | 98k | Estimated | $5,399 | IQ2_K_S is the current best practical quantization. 13.0 tok/s is estimated from nearby benchmark coverage. 29.5 GB headroom remains at this quantization. |
| 6 | Mac Mini M4 16GB | 0 | F32 | | Best available | No | -1464.4 GB | | Estimated | $499 | Qwen3.5-397B-A17B does not fit on Mac Mini M4 16GB at the current practical quantization. |
| 7 | Mac Mini M4 24GB | 0 | F32 | | Best available | No | -1456.4 GB | | Estimated | $599 | Qwen3.5-397B-A17B does not fit on Mac Mini M4 24GB at the current practical quantization. |
| 8 | Mac Mini M4 32GB | 0 | F32 | | Best available | No | -1448.4 GB | | Estimated | $799 | Qwen3.5-397B-A17B does not fit on Mac Mini M4 32GB at the current practical quantization. |
| 9 | MacBook Air M4 16GB 13-inch | 0 | F32 | | Best available | No | -1464.4 GB | | Estimated | $1,099 | Qwen3.5-397B-A17B does not fit on MacBook Air M4 16GB 13-inch at the current practical quantization. |
| 10 | MacBook Air M4 24GB 13-inch | 0 | F32 | | Best available | No | -1456.4 GB | | Estimated | $1,299 | Qwen3.5-397B-A17B does not fit on MacBook Air M4 24GB 13-inch at the current practical quantization. |
| 11 | MacBook Air M4 16GB 15-inch | 0 | F32 | | Best available | No | -1464.4 GB | | Estimated | $1,299 | Qwen3.5-397B-A17B does not fit on MacBook Air M4 16GB 15-inch at the current practical quantization. |
| 12 | Mac Mini M4 Pro 24GB | 0 | F32 | | Best available | No | -1456.4 GB | | Estimated | $1,399 | Qwen3.5-397B-A17B does not fit on Mac Mini M4 Pro 24GB at the current practical quantization. |
| 13 | MacBook Air M4 32GB 13-inch | 0 | F32 | | Best available | No | -1448.4 GB | | Estimated | $1,499 | Qwen3.5-397B-A17B does not fit on MacBook Air M4 32GB 13-inch at the current practical quantization. |
| 14 | MacBook Air M4 24GB 15-inch | 0 | F32 | | Best available | No | -1456.4 GB | | Estimated | $1,499 | Qwen3.5-397B-A17B does not fit on MacBook Air M4 24GB 15-inch at the current practical quantization. |
| 15 | Mac Mini M4 Pro 48GB | 0 | F32 | | Best available | No | -1432.4 GB | | Estimated | $1,599 | Qwen3.5-397B-A17B does not fit on Mac Mini M4 Pro 48GB at the current practical quantization. |
| 16 | MacBook Air M4 32GB 15-inch | 0 | F32 | | Best available | No | -1448.4 GB | | Estimated | $1,699 | Qwen3.5-397B-A17B does not fit on MacBook Air M4 32GB 15-inch at the current practical quantization. |
| 17 | MacBook Pro M4 Pro 24GB 14-inch | 0 | F32 | | Best available | No | -1456.4 GB | | Estimated | $1,999 | Qwen3.5-397B-A17B does not fit on MacBook Pro M4 Pro 24GB 14-inch at the current practical quantization. |
| 18 | Mac Studio M4 Max 36GB | 0 | F32 | | Best available | No | -1444.4 GB | | Estimated | $1,999 | Qwen3.5-397B-A17B does not fit on Mac Studio M4 Max 36GB at the current practical quantization. |
| 19 | MacBook Pro M4 Pro 48GB 14-inch | 0 | F32 | | Best available | No | -1432.4 GB | | Estimated | $2,499 | Qwen3.5-397B-A17B does not fit on MacBook Pro M4 Pro 48GB 14-inch at the current practical quantization. |
| 20 | MacBook Pro M4 Pro 24GB 16-inch | 0 | F32 | | Best available | No | -1456.4 GB | | Estimated | $2,499 | Qwen3.5-397B-A17B does not fit on MacBook Pro M4 Pro 24GB 16-inch at the current practical quantization. |
| 21 | Mac Studio M4 Max 48GB | 0 | F32 | | Best available | No | -1432.4 GB | | Estimated | $2,499 | Qwen3.5-397B-A17B does not fit on Mac Studio M4 Max 48GB at the current practical quantization. |
| 22 | MacBook Pro M4 Max 36GB 14-inch | 0 | F32 | | Best available | No | -1444.4 GB | | Estimated | $2,999 | Qwen3.5-397B-A17B does not fit on MacBook Pro M4 Max 36GB 14-inch at the current practical quantization. |
| 23 | MacBook Pro M4 Pro 48GB 16-inch | 0 | F32 | | Best available | No | -1432.4 GB | | Estimated | $2,999 | Qwen3.5-397B-A17B does not fit on MacBook Pro M4 Pro 48GB 16-inch at the current practical quantization. |
| 24 | Mac Studio M4 Max 64GB | 0 | F32 | | Best available | No | -1416.4 GB | | Estimated | $2,999 | Qwen3.5-397B-A17B does not fit on Mac Studio M4 Max 64GB at the current practical quantization. |
| 25 | MacBook Pro M4 Max 48GB 14-inch | 0 | F32 | | Best available | No | -1432.4 GB | | Estimated | $3,499 | Qwen3.5-397B-A17B does not fit on MacBook Pro M4 Max 48GB 14-inch at the current practical quantization. |
| 26 | MacBook Pro M4 Max 36GB 16-inch | 0 | F32 | | Best available | No | -1444.4 GB | | Estimated | $3,499 | Qwen3.5-397B-A17B does not fit on MacBook Pro M4 Max 36GB 16-inch at the current practical quantization. |
| 27 | MacBook Pro M4 Max 48GB 16-inch | 0 | F32 | | Best available | No | -1432.4 GB | | Estimated | $3,999 | Qwen3.5-397B-A17B does not fit on MacBook Pro M4 Max 48GB 16-inch at the current practical quantization. |
| 28 | Mac Studio M3 Ultra 96GB | 0 | F32 | | Best available | No | -1384.4 GB | | Estimated | $3,999 | Qwen3.5-397B-A17B does not fit on Mac Studio M3 Ultra 96GB at the current practical quantization. |
| 29 | MacBook Pro M4 Max 64GB 16-inch | 0 | F32 | | Best available | No | -1416.4 GB | | Estimated | $4,499 | Qwen3.5-397B-A17B does not fit on MacBook Pro M4 Max 64GB 16-inch at the current practical quantization. |
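
The Headroom column can be read back into an implied memory requirement per quantization. The sketch below assumes headroom is simply installed RAM minus the model's memory requirement; the page does not state its formula, so that relation and the helper name `implied_footprint_gb` are assumptions made for illustration.

```python
# Assumption (not stated on the page): headroom = installed RAM - memory requirement.
# implied_footprint_gb is a hypothetical helper, not part of any published tool.

def implied_footprint_gb(ram_gb: float, headroom_gb: float) -> float:
    """Back out the memory requirement implied by a row's RAM and headroom."""
    return round(ram_gb - headroom_gb, 1)

# (Mac, quantization, installed RAM in GB, headroom in GB), copied from the ranking table.
rows = [
    ("Mac Studio M3 Ultra 256GB", "Q4_K_M", 256, 42.9),
    ("Mac Pro M2 Ultra 192GB", "3bit", 192, 51.9),
    ("Mac Studio M4 Max 128GB", "IQ2_K_S", 128, 29.5),
    ("Mac Mini M4 16GB", "F32", 16, -1464.4),
    ("Mac Studio M3 Ultra 96GB", "F32", 96, -1384.4),
]

for mac, quant, ram_gb, headroom_gb in rows:
    footprint = implied_footprint_gb(ram_gb, headroom_gb)
    print(f"{mac}: {quant} implies about {footprint} GB required")
```

Under that assumption, both F32 rows imply the same requirement, which suggests the non-fitting rows are all scored against the full-precision footprint rather than a quantized one.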

Qwen3.5-397B-A17B — ranking first, raw rows below

Start with the ranked Mac table above. Use the rest of this page to inspect raw Apple Silicon coverage and model metadata.

Quantizations observed: q4.1bit, 4bit

Benchmark rows: 2
Chip tiers covered: 2
Fastest avg tok/s: 40.2 (M3 Ultra, 512 GB)
Minimum RAM observed: (no value given)

Fastest published result is 40.2 tok/s on M3 Ultra (512 GB) at q4.1bit. Published runtimes include flash-moe, MLX. Start with Rankings for the decision, then use the raw rows below to audit the evidence.

Based on 2 external benchmarks; no lab runs yet.

Total params: 397B
Active params: 17B
Context window: 262,144
Release date: 2026-02-16

What this model is, and what Apple Silicon users are actually seeing

Official model cards tell you what the model is for and which software stacks it targets. Field reality below shows how much Apple Silicon evidence we have so far.

Unified Vision-Language Foundation: Early fusion training on multimodal tokens achieves cross-generational parity with Qwen3 and outperforms Qwen3-VL models across reasoning, coding, agents, and visual understanding benchmarks.

Official source  ·  Raw model card

agents · coding · reasoning · visual-understanding

Runtime support mentioned

vLLM · SGLang · Transformers · KTransformers

Official specs

  • Type: Causal Language Model with Vision Encoder.
  • Scale: 397B in total and 17B activated.
  • Context: 262,144 natively and extensible up to 1,010,000 tokens.
  • Total parameters: 397B in total and 17B activated.
  • Max input: 262,144 natively and extensible up to 1,010,000 tokens.

Official takeaways

  • Unified Vision-Language Foundation: Early fusion training on multimodal tokens achieves cross-generational parity with Qwen3 and outperforms Qwen3-VL models across reasoning, coding, agents, and visual understanding ben…
  • Efficient Hybrid Architecture: Gated Delta Networks combined with sparse Mixture-of-Experts deliver high-throughput inference with minimal latency and cost overhead.
  • Scalable RL Generalization: Reinforcement learning scaled across million-agent environments with progressively complex task distributions for robust real-world adaptability.
  • Global Linguistic Coverage: Expanded support to 201 languages and dialects, enabling inclusive, worldwide deployment with nuanced cultural and regional understanding.

Official model cards describe intent, capabilities, and supported stacks. They do not prove Apple Silicon speed by themselves.

Qwen3.5-397B-A17B: 4 Apple Silicon field reports; best reported generation ~30.81 tok/s; best reported prompt processing ~122.46 tok/s; seen on MacBook Pro M5 Max 128GB; via llama.cpp, flash-moe, MLX.

Benchmark rows: 2
Field reports: 4
Practitioner signals: 5
Evidence status: Sparse Benchmarks

What practitioners keep saying

  • The post identifies Qwen3.5-397B-A17B-UD-IQ2_XXS on a MacBook Pro M5 Max 128GB, a roughly 106GB GGUF footprint, llama.cpp serving with --ctx-size 16384, and iogpu.wired_limit_mb=122880 to make the 16K context run fit.
  • The posted llama.cpp sample reports 122.46 tok/s prompt processing for 33 prompt tokens and 30.81 tok/s generation for 2458 generated tokens.
  • The author says prompt processing varies with batching, so that range should stay methodology context rather than a separate exact field row.
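
The figures in that report can be sanity-checked with simple arithmetic. This is a rough sketch: it treats 1024 MB as 1 GB and takes the "roughly 106GB" footprint at face value, both of which are assumptions, and the variable names are illustrative only.

```python
# Figures quoted from the field report above; unit handling is an assumption.
WIRED_LIMIT_MB = 122880        # iogpu.wired_limit_mb value from the post
GGUF_FOOTPRINT_GB = 106        # "roughly 106GB" per the post
PROMPT_TOKENS, PROMPT_TOK_S = 33, 122.46
GEN_TOKENS, GEN_TOK_S = 2458, 30.81

# Convert the wired limit to GB (assuming 1024 MB per GB).
wired_limit_gb = WIRED_LIMIT_MB / 1024

# Memory left under the wired limit once the weights are resident.
left_under_limit_gb = wired_limit_gb - GGUF_FOOTPRINT_GB

# Wall-clock time implied by the reported token counts and rates.
prompt_seconds = PROMPT_TOKENS / PROMPT_TOK_S
gen_seconds = GEN_TOKENS / GEN_TOK_S

print(f"wired limit: {wired_limit_gb:.0f} GB of 128 GB")
print(f"left after weights: about {left_under_limit_gb:.0f} GB")
print(f"prompt phase: {prompt_seconds:.2f} s, generation phase: {gen_seconds:.1f} s")
```

Read this way, the wired limit leaves only a narrow margin beyond the weights, which is consistent with the post describing the setting as necessary to make the 16K context run fit. The 33-token prompt is also short enough that its prompt-processing rate is a weak basis for comparison.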

Apple Silicon field sources

  • r/LocalLLaMA

    2026-04-13 · MacBook Pro M5 Max 128GB · llama.cpp

    A cupel follow-up reports Qwen3.5-397B-A17B-UD-IQ2_XXS running through llama.cpp on an M5 Max 128GB MacBook Pro with a measured decode sample and important memory and sustained-load caveats.

  • r/LocalLLaMA

    2026-03-26 · Mac Studio M3 Ultra 512GB · MLX

    A Mac Studio M3 Ultra 512GB owner reports Qwen3.5-397B-A17B running locally on MLX 6-bit, which pushes the Apple Silicon stretch frontier far past speculative fit.

  • r/LocalLLaMA

    2026-03-26 · MacBook Pro M5 Max 128GB · flash-moe

    A follow-up M5 Max 128GB flash-moe benchmark turns Qwen3.5-397B-A17B from a vague stretch-tier experiment into a measured laptop result, with the best 4-bit run reaching 12.99 tok/s locally.

  • r/LocalLLaMA

    2026-03-22 · Mac Studio M3 Ultra 512GB · MLX

    Operators report that even when Qwen3.5-397B-A17B technically fits on an M3 Ultra 512GB Mac Studio, practical coding use is still uncomfortable enough that it remains a stretch-tier watch model.

  • r/LocalLLaMA

    2026-03-21 · MacBook Pro M5 Max 128GB · flash-moe

    Practitioners report that Qwen3.5-397B-A17B is at least experimentally runnable on M5 Max 128GB Apple Silicon with single-digit generation speed.

Runtime mentions in the field

llama.cpp · MLX

Hardware mentioned in reports

128GB · M3 Ultra · Mac · Mac Studio · MacBook · MacBook Pro

What would improve confidence

  • Expand cross-chip benchmark coverage
  • Reproduce the field performance signal
  • Upgrade to first-party measurement

Published chip coverage includes M3 Ultra (512 GB), M5 Max (128 GB). Fastest published row is 40.2 tok/s on M3 Ultra (512 GB) at q4.1bit.

Related Qwen3.5-397B-A17B models with published pages: Qwen3.5-27B · Qwen3.5-35B-A3B · Qwen3.5-9B · Qwen3.5-122B-A10B · Qwen3.5-4B

Raw benchmark rows for Qwen3.5-397B-A17B

Rows stay below the ranking because this page is answer-first. Use them to inspect exact chips, quantizations, runtimes, and sources.

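| Chip | Quant | RAM req. | Context | Avg tok/s | Prompt tok/s | Runtime | Source |
|---|---|---|---|---|---|---|---|
| M3 Ultra (512 GB) | q4.1bit | | | 40.2 tok/s | | MLX | ref |
| M5 Max (128 GB) | 4bit | | | 13.0 tok/s | | flash-moe | ref |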

Ordered by fastest published tok/s on the chip family in each Mac. Click through for the full machine page.
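
The ordering described above can be reproduced from the two raw rows. This is a minimal sketch: the dictionary keys are hypothetical and are not taken from the schema of benchmarks.json, which this page does not document.

```python
# The two published rows, transcribed by hand. Field names are hypothetical.
raw_rows = [
    {"chip": "M5 Max (128 GB)", "quant": "4bit", "avg_tok_s": 13.0, "runtime": "flash-moe"},
    {"chip": "M3 Ultra (512 GB)", "quant": "q4.1bit", "avg_tok_s": 40.2, "runtime": "MLX"},
]

# Sort descending by average generation speed, fastest first.
ordered = sorted(raw_rows, key=lambda row: row["avg_tok_s"], reverse=True)
fastest = ordered[0]

for row in ordered:
    print(f'{row["chip"]}: {row["avg_tok_s"]} tok/s at {row["quant"]} via {row["runtime"]}')
```

The two rows differ in chip, quantization, and runtime at the same time, so the gap between them cannot be attributed to any single factor.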

benchmarks.json — full dataset  ·  models.json — model summaries  ·  benchmarks.csv — CSV export
