Mira-TTS
(Unofficial) Gradio demo for MiraTTS
Thanks! Glad you found it helpful. I'd say that right now, better and more compressive audio tokenizers would be great. Training data for tasks beyond simple TTS and voice cloning is lacking as well.
Speech tokens and text tokens are treated the same in LLMs; as I stated, the model just learns speech tokens as a different language. It learns to use speech tokens in much the same way it learns to use text tokens.
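To make that concrete, here is a minimal sketch of one common way speech tokens end up in the same ID space as text tokens: the audio codec's indices are shifted past the text vocabulary, so the LLM just sees one longer vocabulary. The sizes and function names below are hypothetical and chosen only for illustration; this is not taken from MiraTTS's code.

```python
# Hypothetical sizes, for illustration only (not MiraTTS's real values).
TEXT_VOCAB_SIZE = 32_000      # IDs 0..31_999 are ordinary text tokens
AUDIO_CODEBOOK_SIZE = 4_096   # the audio tokenizer emits indices 0..4_095


def speech_to_llm_ids(codec_ids):
    """Shift audio codec indices so they sit after the text vocabulary."""
    for c in codec_ids:
        if not 0 <= c < AUDIO_CODEBOOK_SIZE:
            raise ValueError(f"codec index out of range: {c}")
    return [TEXT_VOCAB_SIZE + c for c in codec_ids]


def llm_ids_to_speech(llm_ids):
    """Recover codec indices from a mixed sequence, dropping text tokens."""
    return [i - TEXT_VOCAB_SIZE for i in llm_ids if i >= TEXT_VOCAB_SIZE]


def build_sequence(text_ids, codec_ids):
    """Concatenate text and speech into one flat token sequence."""
    return list(text_ids) + speech_to_llm_ids(codec_ids)


seq = build_sequence([17, 942, 5], [0, 4095, 12])
print(seq)                     # [17, 942, 5, 32000, 36095, 32012]
print(llm_ids_to_speech(seq))  # [0, 4095, 12]
```

From the model's side there is nothing special about IDs above 32,000: they get embedding rows and next-token predictions like any other token, which is why it can pick them up like another language.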
Unfortunately, reasoning capabilities do decrease, for a few reasons:
Thanks!