Hi, could you perform the same PRISM on the Minimax M2.1?
Yes please!
MiniMax-M2.1 is an incredible agentic model for sure. And a MiniMax-M2.1-PRISM would be amazing.
I've been donating the SOTA PRISM models out of pocket; unfortunately the compute, storage, and hosting costs are very expensive, and we've burned through the available budget. If enough of us want it, we can prioritize MiniMax-M2.1-PRISM as the next community release.
GLM-4.7-PRISM just crossed 2,000+ downloads; if even a small fraction of us sponsored the work to help cover the hard costs, we'd make it happen.
@clevnumb, @win10 MiniMax-M2.1-PRISM is out and available here: https://huggingface.co/Ex0bit/MiniMax-M2.1-PRISM. Please consider supporting the work!
First attempt using this and it just loops on thinking ("Oh wait, I need to...", etc.); I had to cancel it. :-(
I will try more later...
Good timing, @clevnumb! We found and fixed a target-layer selection bug in the initial IQ1_S quant that was likely causing your issue. A re-upload is in progress, along with higher-BPW quants IQ2_M and IQ4_NL (apologies, but you'll need to re-download the fixed quant).
Note: low-BPW quants can sometimes cause looping; raising the repeat penalty or using a higher-BPW quant should help. Testing at full BF16 was beautiful to see in action!
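For anyone unsure where that setting lives, here's a minimal sketch assuming you're running the GGUF through llama-cpp-python (the model path and exact values are placeholders; adjust them to your setup):

```python
from llama_cpp import Llama

# Placeholder path to the fixed low-BPW quant; point this at wherever you saved the GGUF.
llm = Llama(
    model_path="./MiniMax-M2.1-PRISM-IQ1_S.gguf",
    n_gpu_layers=-1,  # offload as many layers as fit in VRAM
    n_ctx=8192,
)

out = llm(
    "Explain what a repeat penalty does.",
    max_tokens=512,
    repeat_penalty=1.15,  # nudge this above the default to break "Oh wait, I need to..." loops
    temperature=0.7,
)
print(out["choices"][0]["text"])
```

The same knob exists in llama.cpp's CLI and most front-ends, usually labeled "repeat penalty" or "repetition penalty".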
I wish I could try those larger quants, but on my unified-memory Strix Halo system with 96GB total, running CachyOS, I can only allocate 90GB of VRAM to these models... (only, lol). I'll try the fixed one, thank you!
Sorry, I meant it was with GLM 4.7 that I got the loop, by the way...
Thanks for testing, @clevnumb. GLM-4.7-PRISM got an even-weight abliteration; its massive original size didn't allow for per-weight SNR analysis. We'll take a more finely tuned pass at updating the model with per-weight abliteration once funding allows, or when GLM-4.7-Flash comes out. For now, a higher repeat penalty should help.
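For anyone curious about the distinction: this is not the PRISM pipeline itself, just a toy sketch with made-up names. Even-weight abliteration removes a refusal direction with the same strength from every weight matrix, while a per-weight pass would scale that strength per matrix (e.g., from some per-matrix SNR estimate), which is what gets expensive on a model this large:

```python
import numpy as np

def ablate(W: np.ndarray, direction: np.ndarray, strength: float = 1.0) -> np.ndarray:
    """Remove the component of each row of W along `direction`, scaled by `strength`."""
    d = direction / np.linalg.norm(direction)
    return W - strength * np.outer(W @ d, d)

# Even-weight: one global strength for every matrix (hypothetical `weights` dict and `refusal_dir`).
# ablated = {name: ablate(W, refusal_dir, strength=1.0) for name, W in weights.items()}

# Per-weight: strength chosen per matrix from some signal estimate such as an SNR score,
# which means analyzing every matrix individually (costly on a very large model).
# ablated = {name: ablate(W, refusal_dir, strength=snr[name]) for name, W in weights.items()}
```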