Gemma 3 4B — Apple Silicon Benchmarks

Measured inference speed for Gemma 3 4B on one Apple Silicon chip, in tokens per second, at one quantization level (Q4_0). Real runs, not estimates.

Quantizations measured: Q4_0

Benchmark rows: 1

Chip tiers covered: 1

Fastest avg tok/s: 100.5, M4 Max (128 GB)

Minimum RAM observed: —

Benchmark results for Gemma 3 4B

Rows are sorted by avg tok/s, descending. Click the source badge to see the original measurement page.

Chip	Quant	RAM req.	Context	Avg tok/s	Prompt tok/s	Runtime	Source
M4 Max (128 GB)	Q4_0	—	4k	100.5 tok/s	—	LM Studio	ref
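
For readers who want to work with the table programmatically, the following is a minimal sketch that parses the tab-separated rows exactly as shown above and reproduces the "avg tok/s descending" ordering. The helper names (`parse_rate`, `load_rows`, `sort_by_avg`) are hypothetical and introduced only for illustration. The sketch reads the table text from this page and makes no assumption about the schema of the downloadable `benchmarks.csv` or `benchmarks.json` files, which may differ.

```python
import csv
import io

# Table copied from this page (tab-separated). The dash character marks a
# missing value, as in the RAM req. and Prompt tok/s columns.
MISSING = "—"
TABLE = (
    "Chip\tQuant\tRAM req.\tContext\tAvg tok/s\tPrompt tok/s\tRuntime\tSource\n"
    "M4 Max (128 GB)\tQ4_0\t—\t4k\t100.5 tok/s\t—\tLM Studio\tref\n"
)


def parse_rate(cell):
    """Convert a cell such as '100.5 tok/s' to a float; None if missing."""
    cell = cell.strip()
    if cell in ("", MISSING):
        return None
    return float(cell.split()[0])


def load_rows(text):
    """Parse the tab-separated table into dicts with numeric tok/s fields."""
    rows = list(csv.DictReader(io.StringIO(text), delimiter="\t"))
    for row in rows:
        row["Avg tok/s"] = parse_rate(row["Avg tok/s"])
        row["Prompt tok/s"] = parse_rate(row["Prompt tok/s"])
    return rows


def sort_by_avg(rows):
    """Sort by average tok/s, descending; rows without a value go last."""
    return sorted(
        rows,
        key=lambda r: (r["Avg tok/s"] is None, -(r["Avg tok/s"] or 0.0)),
    )


rows = sort_by_avg(load_rows(TABLE))
fastest = rows[0]
print(fastest["Chip"], fastest["Avg tok/s"])
```

Missing cells are kept as `None` rather than coerced to zero, so a row with no measured prompt speed is not mistaken for a row that measured 0 tok/s.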

Chips with published results for Gemma 3 4B

M4 Max (128 GB)

Data

benchmarks.json — full dataset · models.json — model summaries · benchmarks.csv — CSV export