Qwen/Qwen3.5-4B

Quantizations

Quant | Quantized by  | Size   | Decode     | Prefill     | Score
      | Unsloth       | 2.4 GB | N/A        | N/A         | N/A
      | Unsloth       | 2.6 GB | 52.2 tok/s | 777.3 tok/s | Runs well
      | MLX Community | 2.8 GB | 82.7 tok/s | 575.2 tok/s | Runs well
q8_0  | Unknown       | 4.2 GB | N/A        | N/A         | N/A

Device Comparison

Results include only trials with 4,096 input tokens and 1,024 output tokens.
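
To make the throughput figures more concrete, the sketch below converts a prefill and decode rate into a rough wall-clock time for one trial of 4,096 input tokens and 1,024 output tokens. It assumes each reported rate is an average over its whole phase and that total time is simply tokens divided by rate for each phase; load time, sampling overhead, and any warm-up are ignored. The function name `estimate_trial_seconds` and the row labels are hypothetical, chosen for illustration only.

```python
# Rough wall-clock estimate for one benchmark trial.
# Assumption: reported tok/s values are phase-wide averages, so
# time = tokens / rate for each phase. Overheads are ignored.

INPUT_TOKENS = 4096   # prompt length used by the trials
OUTPUT_TOKENS = 1024  # generated length used by the trials


def estimate_trial_seconds(prefill_tok_s, decode_tok_s,
                           input_tokens=INPUT_TOKENS,
                           output_tokens=OUTPUT_TOKENS):
    """Return estimated seconds to process the prompt and generate the output."""
    if prefill_tok_s <= 0 or decode_tok_s <= 0:
        raise ValueError("throughput values must be positive")
    prefill_seconds = input_tokens / prefill_tok_s
    decode_seconds = output_tokens / decode_tok_s
    return prefill_seconds + decode_seconds


# (prefill tok/s, decode tok/s) for the two rows that report measurements.
# The labels are illustrative, not names used by the page.
measured = {
    "Unsloth, 2.6 GB": (777.3, 52.2),
    "MLX Community, 2.8 GB": (575.2, 82.7),
}

for label, (prefill, decode) in measured.items():
    seconds = estimate_trial_seconds(prefill, decode)
    print(f"{label}: about {seconds:.1f} s per trial")
```

Under these assumptions the 2.6 GB Unsloth build comes out near 25 seconds per trial and the 2.8 GB MLX Community build near 20 seconds. Decode dominates in both cases, so the build with the faster decode rate finishes first despite its slower prefill.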

Decode / Prefill Speeds

36 devices