whatcani.run
Qwen / Qwen3-4B-Instruct-2507
4B
August 5, 2025
Apache 2.0
Runs
View all benchmark runs for this model family.
| Device | Quant | Runtime | Decode (tok/s) | Prefill (tok/s) | Peak memory | Date |
|---|---|---|---|---|---|---|
| M5 Pro, 15 / 16, 48 GB | 4bit | mlx_lm 0.31.2 | 82.9 | 2,359.8 | 6.12 GB (13%) | 2 months ago |
| M3, 8 / 10, 24 GB | q8_0 | llama.cpp b8480 | 11.2 | 234.0 | 0.71 GB (3%) | 2 months ago |
| M1 Max, 10 / 32, 64 GB | q8_0 | llama.cpp b8240 | 30.4 | 495.6 | 5.02 GB (8%) | 2 months ago |
| M1 Max, 10 / 32, 64 GB | 4bit | mlx_lm 0.31.0 | 39.9 | 290.2 | 4.08 GB (6%) | 2 months ago |
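As a rough way to read the decode and prefill figures together, the sketch below estimates end-to-end response time as prompt tokens divided by prefill throughput plus output tokens divided by decode throughput. This is a simplifying assumption, not something the page itself states: it ignores model load time, sampling overhead, and any change in throughput with context length. The `Run` class, the `estimate_seconds` helper, and the 1,000-token prompt with 200-token reply are hypothetical choices for illustration; only the throughput numbers come from the runs listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    """One benchmark row; field names are illustrative, not the site's schema."""
    device: str
    quant: str
    runtime: str
    decode_tps: float   # decode throughput, tokens per second
    prefill_tps: float  # prefill throughput, tokens per second


# Throughput values copied from the runs listed above.
RUNS = [
    Run("M5 Pro", "4bit", "mlx_lm 0.31.2", 82.9, 2359.8),
    Run("M3", "q8_0", "llama.cpp b8480", 11.2, 234.0),
    Run("M1 Max", "q8_0", "llama.cpp b8240", 30.4, 495.6),
    Run("M1 Max", "4bit", "mlx_lm 0.31.0", 39.9, 290.2),
]


def estimate_seconds(run: Run, prompt_tokens: int, output_tokens: int) -> float:
    """Naive latency estimate: prefill time plus decode time."""
    return prompt_tokens / run.prefill_tps + output_tokens / run.decode_tps


# Hypothetical workload: 1,000-token prompt, 200-token reply.
for run in sorted(RUNS, key=lambda r: estimate_seconds(r, 1000, 200)):
    seconds = estimate_seconds(run, 1000, 200)
    print(f"{run.device:8} {run.quant:5} {run.runtime:16} ~{seconds:.1f} s")
```

Under this estimate, decode throughput dominates for short prompts with long replies, while prefill throughput matters more as the prompt grows.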