Arabic End-of-Utterance Detection Model

Fine-tuned CAMeLBERT model for detecting end-of-utterance (EOU) in Arabic conversations, with emphasis on Saudi dialect.

Model Description

This model detects when a speaker has finished their conversational turn in Arabic dialogue. It is tuned for Saudi dialect patterns and for low-latency, real-time applications.

Model Details

Base Model: CAMeLBERT-MSA (CAMeL-Lab/bert-base-arabic-camelbert-msa)
Task: Binary classification (EOU vs. non-EOU)
Language: Arabic (Modern Standard Arabic + Saudi dialect)
Parameters: ~110M (base encoder) + classification head
Training Data: ~2,000 Arabic conversation samples

Intended Use

Real-time turn detection in conversational AI agents
Voice assistants for Arabic speakers
Dialogue systems
LiveKit agent integration

How to Use

Installation

pip install torch transformers

Basic Usage

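from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer.
# AutoModelForSequenceClassification is needed so the output exposes `.logits`;
# plain AutoModel returns hidden states only. This assumes the checkpoint was
# saved with a standard sequence-classification head.
model_name = "mahmoudsaalama/arabic-eou-camelbert"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Prepare input
text = "السلام عليكم ورحمة الله"
inputs = tokenizer(text, return_tensors="pt", max_length=128, truncation=True)

# Get prediction.
# sigmoid + .item() assumes a single-logit head (one value per utterance).
with torch.no_grad():
    outputs = model(**inputs)
    probability = torch.sigmoid(outputs.logits).item()
    is_eou = probability > 0.5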

print(f"EOU Probability: {probability:.4f}")
print(f"Is EOU: {is_eou}")
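
The sigmoid call above assumes the classifier emits a single logit. If a checkpoint instead exposes two logits (non-EOU, EOU), softmax is the matching conversion. The following is a minimal sketch in plain Python; the helper names `eou_probability` and `is_end_of_utterance`, and the assumption that label index 1 means EOU, are illustrative and not confirmed by this model card.

```python
import math
from typing import Sequence


def eou_probability(logits: Sequence[float], eou_index: int = 1) -> float:
    """Map the raw logits for one utterance to P(EOU).

    Handles a single-logit head (sigmoid) and a multi-logit head (softmax).
    `eou_index` is an assumption about the label order.
    """
    if len(logits) == 1:
        x = logits[0]
        # Numerically stable sigmoid: never exponentiate a large positive value.
        if x >= 0:
            return 1.0 / (1.0 + math.exp(-x))
        z = math.exp(x)
        return z / (1.0 + z)
    # Numerically stable softmax: subtract the largest logit first.
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    return exps[eou_index] / sum(exps)


def is_end_of_utterance(probability: float, threshold: float = 0.5) -> bool:
    """Apply the decision threshold used in the snippet above (0.5 by default)."""
    return probability > threshold


# A single logit of 2.0 and the pair (0.0, 2.0) describe the same probability.
print(f"{eou_probability([2.0]):.4f}")
print(f"{eou_probability([0.0, 2.0]):.4f}")
```

Raising the threshold above 0.5 trades recall for precision, which can reduce premature interruptions in a voice agent at the cost of slower turn-taking.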

Using the SDK

For easier integration, use the Arabic EOU SDK:

pip install arabic-eou-sdk

from arabic_eou_sdk import ArabicEOUDetector

detector = ArabicEOUDetector(model_name="mahmoudsaalama/arabic-eou-camelbert")
result = detector.update_transcription("السلام عليكم", is_final=True)

print(f"Is EOU: {result['is_eou']}")
print(f"Probability: {result['probability']:.4f}")
print(f"Confidence: {result['confidence']:.4f}")

Training Details

Training Data

Size: ~2,000 samples (1,600 train, 200 val, 200 test)
Balance: 50% positive (EOU), 50% negative (non-EOU)
Sources: Synthetic Saudi Arabic conversations + public Arabic datasets
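
The 1,600 / 200 / 200 split with a 50/50 class balance corresponds to an 80/10/10 stratified split. The sketch below shows one way to produce such a split; it is not the author's preprocessing script, and `stratified_split`, the seed, and the placeholder samples are all hypothetical.

```python
import random
from typing import List, Tuple

Sample = Tuple[str, int]  # (text, label); label 1 = EOU, 0 = non-EOU


def stratified_split(
    samples: List[Sample],
    train_frac: float = 0.8,
    val_frac: float = 0.1,
    seed: int = 42,
) -> Tuple[List[Sample], List[Sample], List[Sample]]:
    """Split each class separately so every subset keeps the class balance."""
    rng = random.Random(seed)
    train: List[Sample] = []
    val: List[Sample] = []
    test: List[Sample] = []
    for label in (0, 1):
        group = [s for s in samples if s[1] == label]
        rng.shuffle(group)
        n_train = round(len(group) * train_frac)
        n_val = round(len(group) * val_frac)
        train += group[:n_train]
        val += group[n_train:n_train + n_val]
        test += group[n_train + n_val:]  # whatever remains goes to test
    return train, val, test


# Placeholder data standing in for the real corpus: 1,000 samples per class.
samples = [(f"utterance {i}", i % 2) for i in range(2000)]
train, val, test = stratified_split(samples)
print(len(train), len(val), len(test))  # prints: 1600 200 200
```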

Training Procedure

Optimizer: AdamW
Learning Rate: 2e-5
Batch Size: 16
Epochs: 10 (with early stopping)
Mixed Precision: FP16
Hardware: GPU (CUDA)
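
The "10 epochs with early stopping" setting can be pictured as a patience counter on the validation loss. The model card does not state the patience or the monitored metric, so the `EarlyStopping` class, `patience=2`, and the loss values below are illustrative assumptions rather than the actual training configuration.

```python
class EarlyStopping:
    """Stop training once validation loss fails to improve for `patience` epochs."""

    def __init__(self, patience: int = 2, min_delta: float = 0.0) -> None:
        self.patience = patience      # hypothetical value, not from the model card
        self.min_delta = min_delta    # minimum decrease that counts as improvement
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss: float) -> bool:
        """Record one epoch's validation loss; return True if training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


MAX_EPOCHS = 10  # upper bound taken from the training procedure above

# Made-up validation losses, used only to exercise the logic.
val_losses = [0.62, 0.48, 0.41, 0.43, 0.44, 0.40, 0.39, 0.39, 0.38, 0.38]

stopper = EarlyStopping(patience=2)
stopped_at = MAX_EPOCHS
for epoch, loss in enumerate(val_losses[:MAX_EPOCHS], start=1):
    if stopper.step(loss):
        stopped_at = epoch
        break

print(f"stopped after epoch {stopped_at}, best val loss {stopper.best}")
```

In a real run, the weights from the best epoch (epoch 3 in this toy trace) would be restored after stopping.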

Evaluation Metrics

Metric	Score
Accuracy	~90%
Precision	~88%
Recall	~92%
F1 Score	~90%
ROC AUC	~95%
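
As a consistency check on the table above, F1 is the harmonic mean of precision and recall, and the reported ~88% and ~92% do give roughly 90%. The helpers below are a generic sketch of these formulas, not the evaluation script used for this model.

```python
from typing import Tuple


def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Compute precision, recall, and F1 from confusion-matrix counts for the EOU class."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall, f1_from(precision, recall)


# Plug in the approximate values reported in the table.
print(f"{f1_from(0.88, 0.92):.4f}")  # prints: 0.8996
```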

Inference Speed

Configuration	Latency
GPU (FP32)	~15-20ms
GPU (INT8)	~8-12ms
CPU (FP32)	~60-80ms
CPU (INT8)	~25-35ms
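
Latency figures like these depend on hardware, sequence length, and measurement method, none of which are specified here. To reproduce them on your own setup, a small timing harness is one option. This is a generic sketch; `measure_latency_ms` is a hypothetical helper, and the dummy workload stands in for a real model call.

```python
import statistics
import time
from typing import Callable, Dict


def measure_latency_ms(
    predict: Callable[[], object],
    warmup: int = 5,
    runs: int = 50,
) -> Dict[str, float]:
    """Time repeated calls to `predict` and summarize latency in milliseconds."""
    # Warm-up calls absorb one-off costs (lazy initialization, caches).
    for _ in range(warmup):
        predict()

    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        predict()
        timings.append((time.perf_counter() - start) * 1000.0)

    timings.sort()
    p95_index = min(len(timings) - 1, int(0.95 * len(timings)))
    return {
        "mean": statistics.mean(timings),
        "median": statistics.median(timings),
        "p95": timings[p95_index],
    }


# Dummy workload; replace the lambda with a call that runs the tokenizer and model.
stats = measure_latency_ms(lambda: sum(range(10_000)))
print({name: round(value, 3) for name, value in stats.items()})
```

When timing on a GPU, call `torch.cuda.synchronize()` inside `predict` after the forward pass; otherwise the timer only measures kernel launch, not execution.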

Limitations

Dialectal Coverage: Optimized for Saudi dialect; may not generalize well to other Arabic dialects
Synthetic Data: Trained primarily on synthetic conversations
Domain: Limited to common conversational topics
Dataset Size: Relatively small training set

Bias and Fairness

Model may perform better on Saudi dialect than other Arabic dialects
Training data focuses on common conversational patterns
May not handle code-switching or mixed-language conversations well
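
Since code-switched input is a known weak spot, a caller may want to flag such utterances before trusting the prediction. The sketch below counts Arabic and Latin letters with the standard library. The `script_mix` and `looks_code_switched` helpers and the 20% threshold are hypothetical and not part of the model or the SDK.

```python
import unicodedata
from typing import Dict


def script_mix(text: str) -> Dict[str, int]:
    """Count letters by script, using Unicode character names."""
    counts = {"arabic": 0, "latin": 0, "other": 0}
    for ch in text:
        if not ch.isalpha():
            continue  # skip digits, spaces, punctuation, and diacritics
        name = unicodedata.name(ch, "")
        if name.startswith("ARABIC"):
            counts["arabic"] += 1
        elif name.startswith("LATIN"):
            counts["latin"] += 1
        else:
            counts["other"] += 1
    return counts


def looks_code_switched(text: str, min_latin_ratio: float = 0.2) -> bool:
    """Flag text that mixes Arabic with a notable share of Latin letters."""
    counts = script_mix(text)
    letters = counts["arabic"] + counts["latin"]
    if letters == 0 or counts["arabic"] == 0:
        return False
    return counts["latin"] / letters >= min_latin_ratio


print(looks_code_switched("أبغى أحجز meeting بكرة"))
print(looks_code_switched("السلام عليكم"))
```

This only catches code-switching written in Latin script; English words transliterated into Arabic script would pass unnoticed.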

Citation

@misc{arabic_eou_camelbert_2025,
  author = {Mahmoud Saalama},
  title = {Arabic End-of-Utterance Detection Model},
  year = {2025},
  publisher = {Hugging Face},
  url = {https://huggingface.co/mahmoudsaalama/arabic-eou-camelbert}
}

License

MIT License

Contact

For questions or feedback:

GitHub: arabic-eou-livekit
Hugging Face: @mahmoudsaalama

Acknowledgments

Base model: CAMeLBERT by CAMeL Lab
Framework: Transformers by Hugging Face
Integration: LiveKit for real-time applications
