Smoothie-MiniMax-M2.1

Overview

This is a modified version of MiniMax-M2.1, using Smoothie-Qwen.

What is it?

The probabilities of Kanji, Hanja, and Chinese-character (including radical) tokens are reduced, which suppresses sudden language mixing.
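As a rough sketch of the idea (the actual Smoothie-Qwen tool rescales weights in the model itself; the Unicode ranges and scaling `factor` below are illustrative only):

```python
import re

# Illustrative ranges covering common Han-script code points.
CJK_PATTERN = re.compile(
    r"[\u4E00-\u9FFF"   # CJK Unified Ideographs
    r"\u3400-\u4DBF"    # CJK Extension A
    r"\uF900-\uFAFF"    # CJK Compatibility Ideographs
    r"\u2E80-\u2EFF]"   # CJK Radicals Supplement
)

def contains_cjk(token: str) -> bool:
    """Return True if the decoded token contains a Han character."""
    return bool(CJK_PATTERN.search(token))

def smooth(probs: dict[str, float], factor: float = 0.5) -> dict[str, float]:
    """Scale down CJK-containing tokens' probabilities, then renormalize."""
    scaled = {t: p * factor if contains_cjk(t) else p for t, p in probs.items()}
    total = sum(scaled.values())
    return {t: p / total for t, p in scaled.items()}
```

The effect is that probability mass shifts away from Han-script tokens toward everything else, so they are sampled less often during non-Chinese conversation.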

Who is it for?

If you see Chinese characters appear during non-Chinese conversations, this model should help.

It does not fully "solve" the underlying problem; it only reduces how often it occurs.

For Chinese and Japanese users: use the original model! This model will behave worse in those languages.

Result

From my testing:

  • No Chinese characters appeared in Korean conversations.
  • When asked about a Japanese topic, the model successfully answered with Kanji and Hiragana (although I could not verify the correctness of the response).

How did I do it?

I tried to replicate Unsloth's UD quant as closely as possible, because my system can only handle quants up to 3-bit.

  1. Download original model
  2. Apply Smoothie-Qwen (see configs/config.yaml for reference)
  3. Convert to GGUF (BF16)
  4. Run llama-quantize with Unsloth's imatrix and manual tensor-type overrides taken from their UD quants
  5. Run llama-gguf-split (max size 50GB)
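Steps 3–5 might look like the following (paths, quant type, and the tensor-type override are illustrative placeholders; the actual overrides were copied from Unsloth's UD quants):

```shell
# 3: convert the smoothed HF checkpoint to a BF16 GGUF
python convert_hf_to_gguf.py ./Smoothie-MiniMax-M2.1 --outtype bf16 --outfile model-bf16.gguf

# 4: quantize with Unsloth's imatrix; the --tensor-type entry is an example,
#    not the exact override list used
llama-quantize --imatrix imatrix.dat \
  --tensor-type attn_v=q5_k \
  model-bf16.gguf model-Q3_K_XL.gguf Q3_K_M

# 5: split into shards no larger than 50 GB
llama-gguf-split --split --split-max-size 50G model-Q3_K_XL.gguf model-Q3_K_XL
```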

Recommendation

At temperature 1.0, tool calling is a bit unstable. I recommend temperature=0.7.
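For example, assuming you serve the GGUF behind an OpenAI-compatible endpoint (the model name and message below are placeholders), the sampling setting goes in the request body:

```python
import json

# Request body for an OpenAI-compatible /v1/chat/completions endpoint.
payload = {
    "model": "Smoothie-MiniMax-M2.1",
    "messages": [{"role": "user", "content": "List today's tasks."}],
    "temperature": 0.7,  # recommended; 1.0 makes tool calling less stable
}
body = json.dumps(payload)
```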
