whatcani.run
Meta
Llama 3.1 8B Instruct
8B
July 23, 2024
Llama 3.1 Community License
Runs
View all benchmark runs for this model family.
Device    CPU cores  GPU cores  Memory  Quant  Runtime        Decode (tok/s)  Prefill (tok/s)  Peak memory    Date
M4        10         10         32 GB   4bit   mlx_lm 0.31.2  16.1            141.7            5.94 GB (19%)  10 hours ago
M5 Pro    15         16         48 GB   4bit   mlx_lm 0.31.2  54.5            1,351.2          6.24 GB (13%)  4 days ago
M3 Ultra  28         60         256 GB  4bit   mlx_lm 0.31.1  108.2           1,226.9          7.01 GB (3%)   2 weeks ago
M4 Max    16         40         128 GB  4bit   mlx_lm 0.31.2  79.5            727.0            6.92 GB (5%)   2 weeks ago
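The percentage shown next to each peak memory figure appears to be peak memory as a share of the device's total memory. The page does not state this, so treat it as an assumption. A minimal Python sketch that checks it against the four runs listed above (the `RUNS` table and the `memory_share_percent` helper are hypothetical names, not part of whatcani.run):

```python
# Peak memory (GB) and device memory (GB), copied from the runs listed above.
RUNS = {
    "M4": (5.94, 32),
    "M5 Pro": (6.24, 48),
    "M3 Ultra": (7.01, 256),
    "M4 Max": (6.92, 128),
}


def memory_share_percent(peak_gb: float, device_gb: float) -> int:
    """Return peak memory as a whole-number percentage of device memory.

    Assumption: the site rounds to the nearest whole percent.
    """
    return round(peak_gb / device_gb * 100)


for device, (peak_gb, device_gb) in RUNS.items():
    print(f"{device}: {memory_share_percent(peak_gb, device_gb)}%")
```

Under that assumption, the computed values agree with the percentages reported for all four devices.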