Meta / Llama 3.1 8B Instruct

Quantizations

Quantized by    Size     Decode        Prefill          Score
MLX Community   4.2 GB   108.2 tok/s   1,226.9 tok/s    Runs great

Device Comparison

Results include trials with 4,096 input tokens and 1,024 output tokens only.
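
As a rough illustration of what these figures imply, the sketch below estimates wall-clock time for one trial of this shape (4,096 input tokens, 1,024 output tokens) from the decode and prefill speeds listed above. It assumes both rates stay constant over the whole run and ignores model load time and any other overhead. The function name `estimate_seconds` is hypothetical and not part of any tool mentioned on this page.

```python
def estimate_seconds(input_tokens, output_tokens, prefill_tps, decode_tps):
    """Estimate (prefill, decode, total) seconds from token counts and rates.

    Assumption: throughput is constant, so time = tokens / tokens-per-second.
    """
    prefill_s = input_tokens / prefill_tps   # time to process the prompt
    decode_s = output_tokens / decode_tps    # time to generate the output
    return prefill_s, decode_s, prefill_s + decode_s


# Trial shape and speeds taken from the listing above.
prefill_s, decode_s, total_s = estimate_seconds(
    input_tokens=4096,
    output_tokens=1024,
    prefill_tps=1226.9,
    decode_tps=108.2,
)

print(f"prefill: {prefill_s:.1f} s, decode: {decode_s:.1f} s, total: {total_s:.1f} s")
```

Under these assumptions, decode accounts for most of the time even though the prompt is four times longer than the output, because the decode rate is far lower than the prefill rate.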

Decode / Prefill Speeds

4 devices
Llama 3.1 8B Instruct by Meta | whatcani.run