Hi, could you perform the same PRISM on the Minimax M2.1?
Yes please!
MiniMax-M2.1 is an incredible agentic model for sure. And a MiniMax-M2.1-PRISM would be amazing.
I've been donating the SOTA PRISM models out of pocket; unfortunately the compute, storage, and hosting costs are very expensive, and we've burned through the available budget. If enough of us want it, we can prioritize MiniMax-M2.1-PRISM as the next community release.
GLM-4.7-PRISM just crossed 2,000+ downloads; if even a small fraction of us sponsored the work to help cover the hard costs, we'd make it happen.
@clevnumb, @win10 MiniMax-M2.1-PRISM is out and available here: https://huggingface.co/Ex0bit/MiniMax-M2.1-PRISM. Please consider supporting the work!
First attempt using this and it just loops on thinking ("Oh wait, I need to...", etc.); I had to cancel it. :-(
I will try more later...
Good timing, @clevnumb! We found and fixed a target-layer selection bug in the initial IQ1_S quant that was likely causing your issue. A re-upload is in progress, along with higher-BPW quants IQ2_M and IQ4_NL (apologies, but you'll need to re-download the fixed quant).
Note: low-BPW quants can sometimes cause looping; raising the repeat penalty or using a higher-BPW quant should help. Testing at full BF16 was beautiful to see in action!
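For anyone unsure where that setting lives, here's a minimal sketch assuming you're running the GGUF through llama-cpp-python (the model path and exact values are placeholders; adjust them to your setup):

```python
from llama_cpp import Llama

# Placeholder path to the fixed low-BPW quant; point this at wherever you saved the GGUF.
llm = Llama(
    model_path="./MiniMax-M2.1-PRISM-IQ1_S.gguf",
    n_gpu_layers=-1,  # offload as many layers as fit in VRAM
    n_ctx=8192,
)

out = llm(
    "Explain what a repeat penalty does.",
    max_tokens=512,
    repeat_penalty=1.15,  # nudge this above the default to break "Oh wait, I need to..." loops
    temperature=0.7,
)
print(out["choices"][0]["text"])
```

The same knob exists in llama.cpp's CLI and most front-ends, usually labeled "repeat penalty" or "repetition penalty".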
I wish I could try those larger quants, but on my unified-memory Strix Halo system with 96GB total, running CachyOS, I can only allocate 90GB of VRAM to these models... (only, lol). I'll try the fixed one, thank you!
Sorry, I meant it was with GLM 4.7 that I got the loop, by the way...
Thanks for testing, @clevnumb. GLM-4.7-PRISM got an even-weight abliteration; its massive original size didn't allow for per-weight SNR analysis. We'll take a more finely tuned pass at updating the model with per-weight abliteration once funding allows, or when GLM-4.7-Flash comes out. For now, a higher repeat penalty should help.
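For anyone curious about the distinction: this is not the PRISM pipeline itself, just a toy sketch with made-up names. Even-weight abliteration removes a refusal direction with the same strength from every weight matrix, while a per-weight pass would scale that strength per matrix (e.g., from some per-matrix SNR estimate), which is what gets expensive on a model this large:

```python
import numpy as np

def ablate(W: np.ndarray, direction: np.ndarray, strength: float = 1.0) -> np.ndarray:
    """Remove the component of each row of W along `direction`, scaled by `strength`."""
    d = direction / np.linalg.norm(direction)
    return W - strength * np.outer(W @ d, d)

# Even-weight: one global strength for every matrix (hypothetical `weights` dict and `refusal_dir`).
# ablated = {name: ablate(W, refusal_dir, strength=1.0) for name, W in weights.items()}

# Per-weight: strength chosen per matrix from some signal estimate such as an SNR score,
# which means analyzing every matrix individually (costly on a very large model).
# ablated = {name: ablate(W, refusal_dir, strength=snr[name]) for name, W in weights.items()}
```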