Qwen/Qwen3-4B-Instruct-2507

Quantizations

Quantized by    Size     Decode        Prefill        Score
MLX Community   2.1 GB   39.9 tok/s    290.2 tok/s    Runs ok
Unsloth         4.0 GB   30.4 tok/s    495.6 tok/s    Runs ok

Device Comparison

Results include only trials with 4,096 input tokens and 1,024 output tokens.
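
As a rough illustration of how the decode and prefill figures combine, the sketch below estimates the time for one trial of the stated shape. It assumes the speeds listed under Quantizations are steady-state rates measured at this same trial shape, and that total time is simply prefill time plus decode time. Neither assumption is confirmed by this page. Model load time and other overhead are ignored, and the function name `estimated_seconds` is hypothetical.

```python
# Rough estimate only: assumes total time = prefill time + decode time,
# with no model load time or per-request overhead.

INPUT_TOKENS = 4096   # trial shape stated on this page
OUTPUT_TOKENS = 1024


def estimated_seconds(input_tokens, output_tokens, prefill_tok_s, decode_tok_s):
    """Return an estimated wall-clock time in seconds for one request."""
    prefill_time = input_tokens / prefill_tok_s   # time to process the prompt
    decode_time = output_tokens / decode_tok_s    # time to generate the output
    return prefill_time + decode_time


# (prefill tok/s, decode tok/s) copied from the Quantizations listing.
QUANTIZATIONS = {
    "MLX Community": (290.2, 39.9),
    "Unsloth": (495.6, 30.4),
}

for name, (prefill_tok_s, decode_tok_s) in QUANTIZATIONS.items():
    total = estimated_seconds(INPUT_TOKENS, OUTPUT_TOKENS, prefill_tok_s, decode_tok_s)
    print(f"{name}: about {total:.1f} s")
```

Under these assumptions the two builds land within a few seconds of each other: the MLX Community build decodes faster, while the Unsloth build prefills faster, so which one finishes first depends on the ratio of input to output tokens.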

Decode / Prefill Speeds

3 devices