Mitko Vasilev

mitkox

AI & ML interests

Make sure you own your AI. AI in the cloud is not aligned with you; it's aligned with the company that owns it.

Recent Activity

posted an update 5 days ago

GLM-4.7-Flash is fast, good and cheap. 3,074 tokens/sec peak with a 200k-token context window on my desktop PC. Works with Claude Code and opencode for hours. No errors, a drop-in replacement for the Anthropic cloud AI. MIT licensed, open weights, free for commercial use and modifications. Supports speculative decoding using MTP, which is highly effective at reducing latency. Great for on-device AI coding as AWQ 4-bit at 18.5 GB. Hybrid inference on a single consumer GPU + CPU RAM.

posted an update 23 days ago

I just stress-tested the Beast: MiniMax-M2.1 on Z8 Fury G5. 2101 tokens/sec. FORTY concurrent clients. That's 609 t/s out, 1492 t/s in. The model's outputs fire faster than I can type, but it feeds on data like a black hole on cheat day. But wait, there's more! Threw it into Claude Code torture testing with 60+ tools, 8 agents (7 sub-agents because apparently one wasn't enough chaos). It didn't even flinch. Extremely fast, scary good at coding. The kind of performance that makes you wonder if the model's been secretly reading Stack Overflow in its spare time lol. 3 months ago, these numbers lived in my "maybe in 2030" dreams. Today it's running on my desk AND heats my home office during the winter!
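
The split quoted in the post above can be sanity-checked with a few lines of arithmetic. This is only a sketch: the three constants come from the post, while the function name `throughput_summary` and the reading of the figures as aggregates across all forty clients are assumptions made for illustration.

```python
# Figures quoted in the post; treating them as totals across all
# clients is an assumption, not something the post states outright.
CLIENTS = 40
OUTPUT_TPS = 609    # generated tokens/sec
INPUT_TPS = 1492    # prompt tokens/sec


def throughput_summary(input_tps, output_tps, clients):
    """Derive combined and per-client rates from the quoted figures."""
    total = input_tps + output_tps
    return {
        "total_tps": total,
        "output_tps_per_client": output_tps / clients,
        "input_share": input_tps / total,
    }


summary = throughput_summary(INPUT_TPS, OUTPUT_TPS, CLIENTS)
print(summary)
```

Under that reading, the input and output rates add up to the 2101 tokens/sec headline figure, and most of the traffic is prompt processing rather than generation.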

posted an update about 2 months ago

Got to 1199.8 tokens/sec with Devstral Small 2 on my desktop GPU workstation. vLLM nightly. Works out of the box with Mistral Vibe. Next, it's time to test the big one.

liked a model 4 months ago

Kwaipilot/KAT-Dev-72B-Exp

Text Generation • 73B • Updated Oct 13, 2025 • 103 • 159

liked a model 5 months ago

deepseek-ai/DeepSeek-V3.1-Base

Text Generation • 685B • Updated Aug 26, 2025 • 13k • 1.01k

liked a model 7 months ago

moonshotai/Kimi-K2-Instruct

Text Generation • 1T • Updated Nov 7, 2025 • 166k • 2.31k

liked a model 8 months ago

deepseek-ai/DeepSeek-R1-0528

Text Generation • 685B • Updated May 29, 2025 • 424k • 2.39k

liked 5 models 9 months ago

fdtn-ai/Foundation-Sec-8B

Text Generation • 8B • Updated Aug 26, 2025 • 6.58k • 284

tngtech/DeepSeek-R1T-Chimera

Text Generation • 685B • Updated Nov 4, 2025 • 563 • 268

NousResearch/Minos-v1

Text Classification • 0.4B • Updated Apr 28, 2025 • 1.25k • 172

facebook/blt

Updated Apr 30, 2025 • 11 • 74

facebook/blt-7b

Updated May 1, 2025 • 18 • 62

liked a model 10 months ago

nvidia/Llama-3_1-Nemotron-Ultra-253B-v1

Text Generation • 253B • Updated Oct 15, 2025 • 18.5k • 342

liked a dataset 10 months ago

nvidia/OpenCodeReasoning

Viewer • Updated May 4, 2025 • 753k • 2.93k • 524

liked a model 10 months ago

nomic-ai/colnomic-embed-multimodal-7b

Visual Document Retrieval • Updated Apr 15, 2025 • 3.15k • 99

liked a dataset 10 months ago

virtuoussy/Multi-subject-RLVR

Viewer • Updated Apr 16, 2025 • 579k • 85 • 67

liked 2 models 10 months ago

Qwen/Qwen2.5-Omni-7B

Any-to-Any • 11B • Updated Apr 30, 2025 • 152k • 1.85k

deepseek-ai/DeepSeek-V3-0324

Text Generation • 685B • Updated Mar 27, 2025 • 288k • 3.08k

liked 2 models 11 months ago

unsloth/QwQ-32B-GGUF

Text Generation • 33B • Updated Apr 27, 2025 • 1.09k • 86

Qwen/QwQ-32B

Text Generation • 33B • Updated Mar 11, 2025 • 71.6k • 2.88k

liked a dataset 11 months ago

PrimeIntellect/SYNTHETIC-1

Viewer • Updated Feb 21, 2025 • 1.99M • 346 • 61

liked a dataset 12 months ago

open-r1/OpenR1-Math-Raw

Viewer • Updated Feb 24, 2025 • 516k • 169 • 76

liked a model about 1 year ago

mobiuslabsgmbh/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.0

Text Generation • 2B • Updated Jan 29, 2025 • 93 • 44