Kani-TTS 400M v0.3 (Turkish Fine-tuned)

This model is a fine-tuned version of Kani-TTS 400M, adapted to Turkish using custom multi-speaker speech data collected from publicly available YouTube channels.

The primary goal of the fine-tuning is to improve Turkish pronunciation, prosody, and naturalness, so that the model produces fluent Turkish speech.

🔊 Training Data

The training data was prepared via YouTube scraping and manual preprocessing (audio segmentation, cleaning, and transcript alignment).

🎤 Speakers

| Speaker Tag | Source Channel | Gender | Duration |
|---|---|---|---|
| `sıla` | https://www.youtube.com/@BirDinle | Female | ~33 hours |
| `taha` | https://www.youtube.com/@seslifikiristasyonu | Male | ~28 hours |

Total duration: ~61 hours
Language: Turkish (tr-TR)
Domain: Narration / voice-over style speech

Speaker tags are explicitly used during training to enable multi-speaker inference.
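
As an illustration of how a speaker tag might be attached to input text, here is a minimal sketch. The `"tag: text"` prompt layout, the `build_prompt` helper, and the `KNOWN_SPEAKERS` set are assumptions made for this example; this card does not specify the exact prompt format the model expects, so check the base model's inference code before relying on it.

```python
# Speaker tags listed in the Speakers table of this card.
KNOWN_SPEAKERS = {"sıla", "taha"}


def build_prompt(speaker: str, text: str) -> str:
    """Prefix text with a speaker tag.

    The "tag: text" layout is an assumption for illustration only.
    """
    tag = speaker.strip()
    # Tags are compared exactly instead of lowercased: Python's str.lower()
    # maps "I" to "i", not to the Turkish dotless "ı" used in "sıla".
    if tag not in KNOWN_SPEAKERS:
        raise ValueError(f"unknown speaker tag: {speaker!r}")
    text = text.strip()
    if not text:
        raise ValueError("text must not be empty")
    return f"{tag}: {text}"


prompt = build_prompt("sıla", "Merhaba, bugün hava çok güzel.")
print(prompt)
```

Rejecting unknown tags up front is a deliberate choice in this sketch: a tag the model never saw during training would likely produce an unpredictable voice rather than an error.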

🧠 Training Details

Base model: Kani-TTS 400M
Training type: Fine-tuning
Objective:
- Improve Turkish language capability
- Adapt pronunciation and phoneme alignment
- Preserve speaker identity via speaker tags
Data source: Publicly available YouTube content
Segment length: ~3–20 s per audio segment
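
The segment-length constraint above can be sketched as a simple duration filter. This is not the author's preprocessing pipeline, which is not published in this card; the function names and the sample timestamps are hypothetical, and the bounds are taken from the approximate 3 to 20 second range stated above.

```python
# Approximate bounds from the "Audio length" entry in this card.
MIN_SECONDS = 3.0
MAX_SECONDS = 20.0


def keep_segment(start: float, end: float) -> bool:
    """Return True if the segment duration lies within the allowed range."""
    duration = end - start
    return MIN_SECONDS <= duration <= MAX_SECONDS


def filter_segments(segments):
    """Keep only (start, end) pairs whose duration is within bounds."""
    return [(start, end) for start, end in segments if keep_segment(start, end)]


# Hypothetical segment boundaries in seconds, for illustration only.
segments = [(0.0, 2.1), (2.1, 9.8), (9.8, 31.0), (31.0, 45.5)]
kept = filter_segments(segments)
print(kept)
```

In this example the first segment is too short and the third is too long, so only the second and fourth pairs remain.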

📜 License & Usage Notice (Academic / Non-Commercial)

This model is released strictly for academic research and experimental purposes.

The model is not intended for commercial use.
Commercial deployment, resale, or use in revenue-generating products or services is explicitly prohibited without prior permission.
The training data was derived from publicly accessible online sources and was used for research and educational objectives only.
Users are responsible for ensuring compliance with applicable laws, platform terms, and ethical guidelines when using this model.

By using this model, you agree that it is provided “as-is”, without warranties of any kind, and solely for non-commercial academic use.

Format: Safetensors
Model size: 0.4B params
Tensor type: BF16