YatharthS (Yatharth Sharma)

Thanks! Glad you found it helpful. I guess right now, better and more compressive audio tokenizers would be great. Training data for tasks other than simple TTS and voice cloning is lacking as well.

liked a model 8 days ago

YatharthS/FlashSR

Audio-to-Audio • Updated 6 days ago • 38

New activity in YatharthS/MiraTTS 8 days ago

Dataset

4

#4 opened 9 days ago by

rahul7star

updated a model 9 days ago

YatharthS/MiraTTS

Text-to-Speech • 0.5B • Updated 9 days ago • 3.97k • 168

liked a model 9 days ago

jiaqili3/flexicodec

Text-to-Speech • Updated Nov 25, 2025 • 7

New activity in YatharthS/MiraTTS 10 days ago

Finetune when?

3

#2 opened 14 days ago by

ebybucuresteanu

commented on LLM based Audio models 11 days ago

This comment has been hidden

commented on LLM based Audio models 13 days ago

Speech tokens and text tokens are treated the same in LLMs; as I stated, the model just learns speech tokens as a different language. It learns to use speech tokens in much the same way it learns to use text tokens.

Unfortunately, reasoning capabilities do decrease, for a few reasons:

There is simply not much training data that forces the model to reason.
They are usually trained on a relatively small amount of data. For example, most models are trained on trillions of tokens of text but only billions of tokens of audio.
Small sizes: most models are under 3B params, so most just don’t have great reasoning capabilities.
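
A toy sketch of the "speech tokens as a different language" idea from the comment above: audio codec codes are shifted past the text vocabulary so that both kinds of token share one ID space and the LLM sees a single sequence. The vocabulary sizes and function names here are hypothetical, chosen only for illustration; they are not taken from MiraTTS or any specific model.

```python
# Toy illustration only: sizes and names below are assumptions, not from any real model.
TEXT_VOCAB_SIZE = 32_000      # hypothetical number of text tokens
AUDIO_CODEBOOK_SIZE = 4_096   # hypothetical number of audio codec codes


def speech_to_token_ids(codec_codes):
    """Shift codec codes past the text vocabulary so IDs never collide."""
    for code in codec_codes:
        if not 0 <= code < AUDIO_CODEBOOK_SIZE:
            raise ValueError(f"codec code out of range: {code}")
    return [TEXT_VOCAB_SIZE + code for code in codec_codes]


def build_sequence(text_token_ids, codec_codes):
    """Concatenate text and speech tokens into one sequence for the LLM."""
    return list(text_token_ids) + speech_to_token_ids(codec_codes)


def is_speech_token(token_id):
    """A token is speech if it falls in the range appended after the text vocabulary."""
    return TEXT_VOCAB_SIZE <= token_id < TEXT_VOCAB_SIZE + AUDIO_CODEBOOK_SIZE


sequence = build_sequence([17, 256, 9], [0, 5, 4095])
print(sequence)
print([is_speech_token(t) for t in sequence])
```

From the model's side there is nothing special about the second half of the sequence: it is just more token IDs, which is why the amount and kind of audio training data matters so much.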

liked a model 14 days ago

mradermacher/MiraTTS-GGUF

0.5B • Updated 9 days ago • 1.05k • 5

commented on LLM based Audio models 14 days ago

Thanks 🤗

reacted to their post with 👍 14 days ago

Post

3496

🤯 🤯 Released a high-quality fine-tuned LLM-based TTS model that can generate realistic and clear 48 kHz audio at over 100x realtime speed! 🤯 🤯

Github link: https://github.com/ysharma3501/MiraTTS

Model link: https://github.com/ysharma3501/MiraTTS

Blog explaining LLM TTS models: https://huggingface.co/blog/YatharthS/llm-tts-models
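
For a sense of what "over 100x realtime" means at 48 kHz, here is a small back-of-the-envelope sketch. The 100x figure is the lower bound quoted in the post; the 10-second clip length and the function names are assumptions for illustration, not measurements.

```python
SAMPLE_RATE_HZ = 48_000  # output sample rate quoted in the post


def generation_time_seconds(audio_seconds, speedup):
    """Wall-clock time needed to generate audio at `speedup` times realtime."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return audio_seconds / speedup


def samples_produced(audio_seconds, sample_rate_hz=SAMPLE_RATE_HZ):
    """Number of output samples in a clip of the given length."""
    return int(round(audio_seconds * sample_rate_hz))


clip_seconds = 10.0  # hypothetical clip length
print(generation_time_seconds(clip_seconds, 100))  # seconds of compute at 100x
print(samples_produced(clip_seconds))              # samples in the clip
```

Under these assumptions, a 10-second clip would take about a tenth of a second to generate and contain 480,000 samples.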

4 replies


reacted to ZennyKenny's post with 👍 14 days ago

Post

1950

🍓 One of the coolest parts about being an early Strawberry user has been the opportunity to build on the app at the ground floor.

The platform already has a ton of great integrations that let you interact with your external apps directly with tools, but I wanted to add the ability to do stuff in Slack as well.

💪 So I took the base Anthropic Slack MCP server, added a whole bunch of new tools, and generalized it as an HTTP-based SSE-server and deployed it in like 2 minutes with Railway so that Strawberry could make use of it (as can Claude or any other MCP client).

Now, you can Chat with your Strawberry Companion (or Claude, or whatever) and do things like:
➡️ Get caught up across all of your Slack channels after a long weekend or noisy incident without having to read 20 threads in 10 different channels
➡️ Create, read, and edit Canvases, Messages, and Channels
➡️ Take any resources or content that you're using in your Chat and inject it directly into Slack without copy / paste

😎 I'm pretty pleased with the results, and I made a short demo video showing the results of the work (link in comments). The best part is, it's available on GitHub for anyone else to use too (link in the comments, instructions in the README). The setup takes about 5-10 minutes.

2 replies


reacted to rajkumarrawal's post with 👍 14 days ago

Post

1865

"An open standardized protocol enabling communication for autonomous robots to exchange data, coordinate tasks, and collaborate in real-time environments in the age of AI." r2r-protocol (Robot2Robot Protocol) is now officially open source! 🔓

"pip install r2r-protocol"

Whether you're a developer, researcher, or tech enthusiast, we invite you to explore, use, and contribute to the project.

🔗 Check it out here: [ https://github.com/Tech-Parivartan/r2r-protocol?tab=readme-ov-file ]

Let’s build the future together! 💡

AiParivartanResearchLab

techparivartan

Documentation for the r2r-protocol: [ https://techparivartanai.notion.site/Robot-to-Robot-r2r-Protocol-1f008f0fb18780439d70e8b9bbbdb869 ]

The R2R Protocol enables seamless robot-to-robot interaction across industrial automation, swarm robotics, logistics, and multi-agent systems. It defines structured message formats, negotiation logic, discovery mechanisms, and extensible APIs.

#r2r_protocol #robot2robot_protocol #ai #aiparivartanresearchlab #techparivartan

https://huggingface.co/blog/rajkumarrawal/rawalraj

reacted to sergiopaniego's post with 🚀 14 days ago

Post

1812

Google DeepMind releases FunctionGemma, a 240M model specialized in 🔧 tool calling, built for fine-tuning

TRL has day-0 support. To celebrate, we’re sharing 2 new resources:

> Colab guide to fine-tune it for 🌐 browser control with BrowserGym OpenEnv
> Standalone training script

> Colab notebook: https://colab.research.google.com/github/huggingface/trl/blob/main/examples/notebooks/grpo_functiongemma_browsergym_openenv.ipynb
> Training script: https://github.com/huggingface/trl/blob/main/examples/scripts/openenv/browsergym_llm.py (command to run it inside the script)
> More notebooks in TRL: https://huggingface.co/docs/trl/example_overview#notebooks

Yatharth Sharma

AI & ML interests

Recent Activity

Organizations

Mira-TTS


Kartoffel - German TTS Arena

Two questions

YatharthS/FlashSR

Add ONNX model

YatharthS/FlashSR

Dataset

YatharthS/MiraTTS

jiaqili3/flexicodec

Finetune when?

mradermacher/MiraTTS-GGUF
