OpenAI/gpt-oss-20b

Quantizations

Quantized by     Size      Decode       Prefill         Score
MLX Community    10.4 GB   N/A          N/A             N/A
MLX Community    11.2 GB   94.2 tok/s   1,267.9 tok/s   Runs well
Unsloth          11.3 GB   88.5 tok/s   1,153.2 tok/s   Runs well
OpenAI           12.8 GB   N/A          N/A             N/A
LM Studio        20.7 GB   69.8 tok/s   1,302.4 tok/s   Runs ok

Device Comparison

Results include trials with 4,096 input tokens and 1,024 output tokens only.
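
To make the trial shape concrete, the sketch below estimates how long one such trial would take from a prefill rate and a decode rate. It is a rough illustration, not how the site computes anything: the function name `estimate_trial_seconds` is hypothetical, it assumes prefill and decode run back to back at constant average rates, and it assumes the speeds listed under Quantizations were measured with this same 4,096 / 1,024 token shape.

```python
# Trial shape stated in the text: 4,096 input tokens, 1,024 output tokens.
INPUT_TOKENS = 4096
OUTPUT_TOKENS = 1024


def estimate_trial_seconds(prefill_tok_s, decode_tok_s,
                           input_tokens=INPUT_TOKENS,
                           output_tokens=OUTPUT_TOKENS):
    """Rough wall-clock estimate for one trial (hypothetical helper).

    Assumes the prompt is processed at a constant prefill rate and the
    output is then generated at a constant decode rate, with no overlap
    and no fixed overhead such as model loading.
    """
    if prefill_tok_s <= 0 or decode_tok_s <= 0:
        raise ValueError("speeds must be positive")
    return input_tokens / prefill_tok_s + output_tokens / decode_tok_s


# (prefill tok/s, decode tok/s) taken from the rows that report speeds.
# The labels are only descriptive keys chosen for this example.
rows = {
    "MLX Community, 11.2 GB": (1267.9, 94.2),
    "Unsloth, 11.3 GB": (1153.2, 88.5),
    "LM Studio, 20.7 GB": (1302.4, 69.8),
}

for name, (prefill, decode) in rows.items():
    print(f"{name}: about {estimate_trial_seconds(prefill, decode):.1f} s")
```

With this trial shape the decode term dominates, so under these assumptions the row with the highest prefill speed (LM Studio) still has the longest estimated trial.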

Decode / Prefill Speeds (18 devices)
gpt-oss-20b by OpenAI | whatcani.run