Swa-CSM-1B

Swa-CSM-1B is an open-source Swahili text-to-speech model, fine-tuned from Sesame's CSM-1B for high-quality Swahili speech generation. It powers Soga, a Swahili AI voice assistant.

Developed by Nadhari AI Lab.

Model Details

  • Base Model: sesame/csm-1b
  • Language: Swahili (sw)
  • Task: Text-to-Speech
  • Architecture: Llama backbone with Mimi audio decoder (RVQ codes)

Usage

Basic Generation

import torch
from transformers import CsmForConditionalGeneration, AutoProcessor

model_id = "Nadhari/swa-csm-1b"
device = "cuda" if torch.cuda.is_available() else "cpu"

processor = AutoProcessor.from_pretrained(model_id)
model = CsmForConditionalGeneration.from_pretrained(model_id, device_map=device)

# Generate Swahili speech; the "[0]" prefix is the speaker id
text = "[0]Mambo vipi?"
inputs = processor(text, add_special_tokens=True).to(device)

audio = model.generate(**inputs, output_audio=True)
processor.save_audio(audio, "swahili_output.wav")
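
The `[0]` at the start of the prompt above is the speaker id prefix. A minimal sketch of a helper that builds such prompts consistently is shown here; `format_prompt` is a hypothetical name for illustration and is not part of transformers or this repository.

```python
def format_prompt(text: str, speaker_id: int = 0) -> str:
    """Prefix `text` with a speaker tag such as "[0]" (hypothetical helper)."""
    if speaker_id < 0:
        raise ValueError("speaker_id must be non-negative")
    cleaned = text.strip()
    if not cleaned:
        raise ValueError("text must not be empty")
    return f"[{speaker_id}]{cleaned}"


# Same prompt as in the basic example, built through the helper.
prompt = format_prompt("Mambo vipi?")
```

The resulting string can be passed to `processor(...)` in place of the hard-coded `text`.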

With Conversation Context

For best results, provide conversational context:

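# Reuses `processor`, `model`, and `device` from the basic example above.
# "role" is the speaker id; each turn holds a list of typed content items.
conversation = [
    {"role": "0", "content": [{"type": "text", "text": "Habari yako?"}]},
]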
inputs = processor.apply_chat_template(
    conversation,
    tokenize=True,
    return_dict=True,
).to(device)

audio = model.generate(**inputs, output_audio=True)
processor.save_audio(audio, "swahili_conversation.wav")
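
The conversation above contains only the turn to be spoken. In the upstream CSM example in transformers, context is supplied as earlier turns that pair text with reference audio, followed by a final text-only turn. The following is a minimal sketch of that layout, assuming the same message schema applies to this fine-tune. `build_conversation` is a hypothetical helper, the `"path"` key for the audio item follows the upstream example rather than anything stated in this card, and the audio value here is only a placeholder.

```python
from typing import Any, Dict, List, Sequence, Tuple

# (speaker_id, text, reference audio); the audio type is left open on purpose.
Turn = Tuple[int, str, Any]


def build_conversation(
    context: Sequence[Turn], speaker_id: int, text: str
) -> List[Dict[str, Any]]:
    """Build context turns (text + audio) followed by one text-only turn."""
    conversation: List[Dict[str, Any]] = []
    for ctx_speaker, ctx_text, ctx_audio in context:
        conversation.append({
            "role": str(ctx_speaker),
            "content": [
                {"type": "text", "text": ctx_text},
                # Assumption: "path" carries the reference audio, as upstream.
                {"type": "audio", "path": ctx_audio},
            ],
        })
    # Final turn: the text to synthesize, with no audio attached.
    conversation.append(
        {"role": str(speaker_id), "content": [{"type": "text", "text": text}]}
    )
    return conversation


# Placeholder standing in for a real recorded waveform of the first turn.
reference_audio = [0.0] * 16

conversation = build_conversation(
    context=[(1, "Habari yako?", reference_audio)],
    speaker_id=0,
    text="Nzuri sana, asante.",
)
```

The returned list has the same shape as `conversation` in the example above, so it can be passed to `processor.apply_chat_template(...)` in the same way once real audio replaces the placeholder.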

Intended Use

This model is designed for:

  • Swahili text-to-speech applications
  • Research in African language speech synthesis

Limitations

  • Optimized for standard Swahili; performance may vary with regional dialects
  • Inherits the limitations of the base CSM-1B model
  • Best results are achieved with conversational context

Ethical Considerations

This model should not be used for:

  • Impersonation or fraud
  • Generating misleading or deceptive content
  • Any illegal or harmful activities

By using this model, you agree to use it responsibly and ethically.

Citation

@misc{swa-csm-1b,
  author = {Nadhari AI Lab},
  title = {Swa-CSM-1B: Swahili Text-to-Speech Model},
  year = {2025},
  publisher = {Hugging Face},
  url = {https://huggingface.co/Nadhari/swa-csm-1b}
}

Acknowledgments

  • Sesame for the base CSM-1B model

Contact

