48 23 166

Ivan Fioravanti PRO

ivanfioravanti

AI & ML interests

None yet

Recent Activity

upvoted an article 1 day ago

We Got Claude to Fine-Tune an Open Source LLM

liked a model 7 days ago

mistralai/Ministral-3-14B-Base-2512

liked a model 7 days ago

mistralai/Ministral-3-3B-Base-2512



upvoted an article 1 day ago

Article

We Got Claude to Fine-Tune an Open Source LLM

6 days ago • 408

upvoted an article 13 days ago

Article

Continuous batching from first principles

15 days ago • 261

upvoted a paper 30 days ago

Too Good to be Bad: On the Failure of LLMs to Role-Play Villains

Paper • 2511.04962 • Published Nov 7 • 52

upvoted an article about 1 month ago

Article

On the Shifting Global Compute Landscape

Oct 29

upvoted an article 3 months ago

Article

Introducing Marvis TTS: Real-Time Streaming Speech Synthesis

Aug 27

upvoted an article 4 months ago

Article

Uncensor any LLM with abliteration

Jun 13, 2024 • 733

upvoted a paper 5 months ago

Group Sequence Policy Optimization

Paper • 2507.18071 • Published Jul 24 • 315

upvoted 2 articles 5 months ago

Article

Kimina-Prover: Applying Test-time RL Search on Large Formal Reasoning Models

Jul 10

Article

SmolLM3: smol, multilingual, long-context reasoner

Jul 8 • 735

upvoted an article 7 months ago

Article

You could have designed state of the art positional encoding

Nov 25, 2024 • 404

upvoted a collection 8 months ago

Llama 4

Collection

Llama 4 release • 13 items • Updated Apr 29 • 668

upvoted a collection 11 months ago

DolphinLabeled Datasets

Collection

Eric Hartford has added labels to help you filter datasets, for your pleasure. • 5 items • Updated Jan 6 • 15

upvoted an article 11 months ago

Article

🐺🐦‍⬛ LLM Comparison/Test: DeepSeek-V3, QVQ-72B-Preview, Falcon3 10B, Llama 3.3 70B, Nemotron 70B in my updated MMLU-Pro CS benchmark

Jan 2

upvoted a paper 11 months ago

Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey

Paper • 2412.18619 • Published Dec 16, 2024 • 58

upvoted 3 papers 12 months ago

Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference

Paper • 2412.13663 • Published Dec 18, 2024 • 158

Qwen2.5 Technical Report

Paper • 2412.15115 • Published Dec 19, 2024 • 376

No More Adam: Learning Rate Scaling at Initialization is All You Need

Paper • 2412.11768 • Published Dec 16, 2024 • 43

upvoted 2 articles about 1 year ago

Article

🐺🐦‍⬛ LLM Comparison/Test: 25 SOTA LLMs (including QwQ) through 59 MMLU-Pro CS benchmark runs

Dec 4, 2024

Article

Releasing the largest multilingual open pretraining dataset

Nov 13, 2024 • 104

upvoted an article over 1 year ago

Article

⚗️ 🧑🏼‍🌾 Let's grow some Domain Specific Datasets together

Apr 29, 2024
