Meta/Llama 3.1 8B Instruct

Runs

View all benchmark runs for this model family.

	Quant
M4	4bit	mlx_lm 0.31.2	16.1 tok/s	141.7 tok/s	5.94 GB 19%
M5 Pro	4bit	mlx_lm 0.31.2	54.5 tok/s	1,351.2 tok/s	6.24 GB 13%
M3 Ultra	4bit	mlx_lm 0.31.1	108.2 tok/s	1,226.9 tok/s	7.01 GB 3%
M4 Max	4bit	mlx_lm 0.31.2	79.5 tok/s	727.0 tok/s	6.92 GB 5%
