Qwen3-Next-Thinking now updated with iMatrix! + better performance with llama.cpp

by danielhanchen - opened
Unsloth AI org

Now updated with imatrix. The quantized Qwen3-Next uploads should now be much improved, especially at lower bit widths! :)

Also, thanks to the llama.cpp team, model inference has been optimized even further.

Yes, you will need to redownload the files to get the updated quants.
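
For anyone scripting the redownload, here is a minimal sketch using `huggingface_hub`. The repository id and the quant name (`Q4_K_M`) are placeholders and not taken from this post, so substitute the ones you actually use. `force_download=True` is there so previously cached copies are not reused; it may be unnecessary if your cache already tracks the new revision.

```python
from fnmatch import fnmatch

# Hypothetical placeholder: replace with the actual repository id you use.
REPO_ID = "your-org/your-model-GGUF"


def quant_patterns(quant: str) -> list[str]:
    """Glob patterns selecting only the files for one quant type (e.g. "Q4_K_M")."""
    return [f"*{quant}*"]


def matches(filename: str, patterns: list[str]) -> bool:
    """Return True if the filename matches any of the glob patterns."""
    return any(fnmatch(filename, p) for p in patterns)


def redownload(repo_id: str, quant: str, local_dir: str) -> str:
    """Fetch fresh copies of the matching files and return the local folder path."""
    # Imported here so the helpers above work without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=repo_id,
        allow_patterns=quant_patterns(quant),
        local_dir=local_dir,
        force_download=True,  # do not reuse previously cached files
    )


if __name__ == "__main__":
    # "Q4_K_M" is only an example quant name; pick the one you downloaded before.
    print(redownload(REPO_ID, "Q4_K_M", "./models"))
```

Filtering with `allow_patterns` avoids pulling every quant in the repository when you only need one.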
