nm-testing/Meta-Llama-3-8B-Instruct-NVFP4-GPTQ-ActOrder • 5B • Updated • 18
nm-testing/Meta-Llama-3-8B-Instruct-NVFP4-GPTQ • 5B • Updated • 22
nm-testing/Meta-Llama-3-8B-Instruct-NVFP4 • 5B • Updated • 9
nm-testing/Meta-Llama-3-8B-Instruct-MXFP4A16-GPTQ • 5B • Updated • 7
nm-testing/Speculator-Qwen3-30B-MOE-VL-Eagle3 • 0.4B • Updated • 160
nm-testing/Qwen3-0.6B-FP8_BLOCK • 0.6B • Updated • 65
nm-testing/Qwen3-0.6B-W4A16-G128 • 0.2B • Updated • 212
nm-testing/Llama-3.2-1B-Instruct-DEBUG-STRAWBERRY • 1B • Updated • 17
nm-testing/Llama-3.2-1B-Instruct-DEBUG-COUNTER • 1B • Updated • 46
nm-testing/TinyLlama-1.1B-compressed-tensors-kv-cache-scheme • Text Generation • 0.4B • Updated • 1.77k
nm-testing/TinyLlama-1.1B-Chat-v1.0-kvcache-fp8-attn_head • 1B • Updated • 44
nm-testing/TinyLlama-1.1B-Chat-v1.0-kvcache-fp8-tensor • 1B • Updated • 569
nm-testing/Qwen3-30B-A3B-MXFP4A16 • 17B • Updated • 3.69k
nm-testing/Qwen3-32B-MXFP4A16 • 18B • Updated • 10
nm-testing/Meta-Llama-3-8B-Instruct-awq-NVFP4 • 5B • Updated • 6
nm-testing/testing-llama3.1.8b-2layer-eagle3 • Updated • 217
nm-testing/Qwen3-30B-A3B-NVFP416 • 17B • Updated • 48
nm-testing/CDH-test-nvfp4-awq • 5B • Updated
nm-testing/granite-4.0-h-small-FP8-dynamic • Text Generation • 32B • Updated • 4
nm-testing/tinysmokeqwen3moe-W4A16-first-only-CTstable • 2.54M • Updated • 2.18k
nm-testing/Llama-3.3-70B-Instruct-FP8-dynamic-QKV-Cache-FP8-Per-Head • Updated
nm-testing/Llama-3.3-70B-Instruct-QKV-Cache-FP8-Per-Tensor • Updated
nm-testing/Llama-3.3-70B-Instruct-QKV-Cache-FP8-Per-Head • Updated
nm-testing/Llama-3.3-70B-Instruct-FP8-dynamic-QKV-Cache-FP8-Per-Tensor • Updated
nm-testing/Qwen3-32B-FP8-dynamic-QKV-Cache-FP8-Per-Tensor • Updated
nm-testing/Qwen3-32B-FP8-dynamic-QKV-Cache-FP8-Per-Head • Updated
nm-testing/Qwen3-32B-QKV-Cache-FP8-Per-Tensor • Updated
nm-testing/Qwen3-32B-QKV-Cache-FP8-Per-Head • Updated
nm-testing/Llama-3.1-8B-Instruct-FP8-dynamic-QKV-Cache-FP8-Per-Tensor • Updated
nm-testing/Llama-3.1-8B-Instruct-FP8-dynamic-QKV-Cache-FP8-Per-Head • Updated