AMD Ryzen AI Max+ 395 Strix Halo
Quantized GGUF models benchmarked with llama-bench on Windows using the ROCm backend.
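The pp512 and tg128 figures are llama-bench's default throughput tests: prompt processing of a 512-token prompt and text generation of 128 tokens, reported in tokens per second as mean ± standard deviation. As a rough sketch, each row below should be reproducible with an invocation along these lines (the file name and GPU offload value are illustrative, and the exact flags used for these runs are not recorded in the notes):

```bash
# Benchmark one GGUF file with llama.cpp's llama-bench (ROCm/HIP build).
# -p 512 and -n 128 correspond to the pp512 / tg128 columns; -ngl 99 offloads all layers to the GPU.
llama-bench -m model-UD-Q6_K_XL.gguf -p 512 -n 128 -ngl 99 -r 5
```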
| Model | Task | Params | Quant | pp512 (t/s) | tg128 (t/s) |
|---|---|---|---|---|---|
| *(model name missing)* | Text Generation | 1B | UD-Q6_K_XL | 4168.69 ± 937.92 | 143.90 ± 7.15 |
| unsloth/OLMo-2-0425-1B-Instruct-GGUF | Text Generation | 1B | UD-Q6_K_XL | 4563.71 ± 55.46 | 135.19 ± 0.38 |
| unsloth/LFM2-8B-A1B-GGUF | Text Generation | 8B | UD-Q6_K_XL | 2352.90 ± 40.45 | 111.49 ± 2.38 |
| ggml-org/SmolVLM2-2.2B-Instruct-GGUF | | 2B | Q8_0 | 2062.47 ± 316.51 | 97.45 ± 4.51 |
| ggml-org/gpt-oss-20b-GGUF | | 21B | MXFP4 | 752.08 ± 13.02 | 66.05 ± 3.15 |
| bartowski/Nanbeige_Nanbeige4-3B-Thinking-2511-GGUF | Text Generation | 4B | IQ4_NL | 1442.34 ± 110.00 | 84.12 ± 8.04 |
| bartowski/XiaomiMiMo_MiMo-VL-7B-RL-2508-GGUF | Image-Text-to-Text | 8B | IQ4_NL | 704.39 ± 25.41 | 42.12 ± 5.50 |
| ggml-org/Kimi-VL-A3B-Thinking-2506-GGUF | | 16B | Q8_0 | 852.78 ± 40.09 | 45.57 ± 0.81 |
| unsloth/Nemotron-3-Nano-30B-A3B-GGUF | Text Generation | 32B | UD-Q6_K_XL | 493.86 ± 2.23 | 40.13 ± 0.47 |
| unsloth/Seed-Coder-8B-Reasoning-GGUF | Text Generation | 8B | UD-Q6_K_XL | 834.67 ± 4.36 | 28.63 ± 0.30 |
| unsloth/rnj-1-instruct-GGUF | | 8B | UD-Q6_K_XL | 677.64 ± 10.98 | 26.79 ± 0.71 |
| Intel/MiniMax-M2-REAP-172B-A10B-gguf-q2ks-mixed-AutoRound | | 173B | Q2_K_S | 182.88 ± 3.84 | 26.21 ± 0.44 |
| bartowski/ServiceNow-AI_Apriel-1.6-15b-Thinker-GGUF | Image-Text-to-Text | 14B | IQ4_NL | 338.01 ± 5.23 | 25.25 ± 0.39 |
| unsloth/GLM-4.6V-Flash-GGUF | Image-Text-to-Text | 9B | UD-Q6_K_XL | 639.56 ± 2.57 | 23.76 ± 0.04 |
| unsloth/Apertus-8B-Instruct-2509-GGUF | Text Generation | 8B | UD-Q6_K_XL | 546.45 ± 33.03 | 19.44 ± 0.24 |
| unsloth/Phi-4-reasoning-plus-GGUF | Text Generation | 15B | UD-Q6_K_XL | 417.69 ± 13.85 | 15.35 ± 0.32 |
| unsloth/Llama-4-Scout-17B-16E-Instruct-GGUF | Image-to-Text | 108B | UD-IQ3_XXS | 138.94 ± 2.66 | 15.34 ± 1.97 |
| bartowski/mistralai_Devstral-Small-2-24B-Instruct-2512-GGUF | Text Generation | 24B | IQ4_NL | 179.34 ± 0.99 | 14.57 ± 0.12 |
| unsloth/Devstral-Small-2-24B-Instruct-2512-GGUF | | 24B | UD-Q6_K_XL | 245.79 ± 5.41 | 9.80 ± 0.03 |
| unsloth/Olmo-3.1-32B-Think-GGUF | | 32B | UD-Q6_K_XL | 179.13 ± 5.12 | 7.04 ± 0.17 |
| unsloth/Seed-OSS-36B-Instruct-GGUF | Text Generation | 36B | UD-Q6_K_XL | 166.54 ± 4.16 | 6.33 ± 0.02 |
| unsloth/Kimi-Dev-72B-GGUF | | 73B | UD-IQ3_XXS | 68.09 ± 0.25 | 5.11 ± 0.32 |