Arabic End-of-Utterance Detection Model
Model Description
This model detects End-of-Utterance (EOU) in Arabic conversations and is optimized for Saudi dialects. Given the text transcript of an utterance, it predicts the probability that the speaker has finished their conversational turn.
Use Case: Real-time conversational AI agents (voice assistants, chatbots, customer service)
Performance
| Metric | Score |
|---|---|
| Test Accuracy | 99.6% |
| Precision | 100% |
| Recall | 99.45% |
| F1 Score | 99.73% |
| AUC-ROC | 99.96% |
| Inference Time | ~15-20ms |
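As a sanity check on the table, F1 is the harmonic mean of precision and recall. The minimal sketch below recomputes it from the rounded precision and recall listed above; the function name `f1_score` is illustrative and not part of the model's code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both expressed in [0, 1])."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded values copied from the Performance table.
f1 = f1_score(precision=1.0, recall=0.9945)
print(f"F1 from rounded inputs: {f1:.2%}")
```

Because the inputs are already rounded to two decimals, the recomputed value can differ from the reported 99.73% in the last digit.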
Training Data
- Total samples: 5,000
- SADA22 (Real Saudi audio): 104 samples (2.1%)
- Synthetic (Saudi patterns): 4,896 samples (97.9%)
- Splits: 80% train / 10% validation / 10% test
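An 80/10/10 split of 5,000 samples yields 4,000 training, 500 validation, and 500 test examples. The following is a minimal sketch of such a split using only the standard library. The helper name `split_indices` and the fixed seed are assumptions for illustration; the published dataset may have been partitioned differently.

```python
import random


def split_indices(n_samples, train_frac=0.8, val_frac=0.1, seed=42):
    """Shuffle indices 0..n_samples-1 and cut them into train/val/test lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # local RNG, global state untouched
    n_train = round(n_samples * train_frac)
    n_val = round(n_samples * val_frac)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # remainder goes to the test split
    return train, val, test


train_idx, val_idx, test_idx = split_indices(5000)
print(len(train_idx), len(val_idx), len(test_idx))  # 4000 500 500
```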
Quick Start
Installation
pip install transformers torch
Usage
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model
model = AutoModelForSequenceClassification.from_pretrained("HossamEL-Dein/arabic-eou-model")
tokenizer = AutoTokenizer.from_pretrained("HossamEL-Dein/arabic-eou-model")
model.eval()

# Predict EOU
text = "مرحبا كيف حالك اليوم"
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
probs = torch.softmax(outputs.logits, dim=-1)
eou_probability = probs[0][1].item()  # index 1 is the EOU class
print(f"EOU Probability: {eou_probability:.2%}")
# Output: EOU Probability: 98.56%
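The softmax step converts the model's two raw logits into probabilities that sum to 1, ordered [Non-EOU, EOU]. Here is a dependency-free sketch of that computation; the logit values are made up purely for illustration and are not model outputs.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    shift = max(logits)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for one input, ordered [Non-EOU, EOU].
example_logits = [-2.0, 2.0]
probs = softmax(example_logits)
eou_probability = probs[1]
print(f"EOU Probability: {eou_probability:.2%}")
```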
Integration with LiveKit
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch


class EOUDetector:
    def __init__(self, threshold=0.7):
        self.model = AutoModelForSequenceClassification.from_pretrained("HossamEL-Dein/arabic-eou-model")
        self.tokenizer = AutoTokenizer.from_pretrained("HossamEL-Dein/arabic-eou-model")
        self.model.eval()
        self.threshold = threshold

    def check_eou(self, transcript_text):
        inputs = self.tokenizer(transcript_text, return_tensors="pt")
        with torch.no_grad():
            outputs = self.model(**inputs)
        probs = torch.softmax(outputs.logits, dim=-1)
        eou_prob = probs[0][1].item()  # index 1 is the EOU class
        return {
            'probability': eou_prob,
            'is_eou': eou_prob > self.threshold
        }


# Use in LiveKit agent
detector = EOUDetector()
result = detector.check_eou("مرحبا كيف حالك")
if result['is_eou']:
    print("User finished speaking!")
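In a streaming setup the transcript grows as speech recognition emits partial results, so the detector may be queried repeatedly. The sketch below is one possible way to wire that up and is not taken from LiveKit's API. `first_eou_index` and `fake_score` are hypothetical names, and the scoring function stands in for the real model so that the example runs without downloading it.

```python
from typing import Callable, Optional, Sequence


def first_eou_index(
    partial_transcripts: Sequence[str],
    score_fn: Callable[[str], float],
    threshold: float = 0.7,
) -> Optional[int]:
    """Return the index of the first partial transcript whose EOU
    probability exceeds the threshold, or None if none does."""
    for i, text in enumerate(partial_transcripts):
        if score_fn(text) > threshold:
            return i
    return None


def fake_score(text: str) -> float:
    """Stand-in scorer (assumption): treats a trailing '?' as end of turn."""
    return 0.95 if text.rstrip().endswith("?") else 0.10


partials = ["how", "how are", "how are you today?"]
idx = first_eou_index(partials, fake_score)
print(idx)  # 2
```

With the real model, `score_fn` could be something like `lambda t: detector.check_eou(t)['probability']`.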
Model Architecture
- Base Model: aubmindlab/bert-base-arabertv02
- Task: Binary sequence classification
- Input: Arabic text (up to 128 tokens)
- Output: 2-class probability distribution [Non-EOU, EOU]
- Parameters: 136M
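Because the model expects at most 128 tokens, longer transcripts may need truncating at tokenization time. The minimal sketch below assumes the tokenizer follows the standard Hugging Face call signature (`truncation`, `max_length`, `return_tensors`). The helper `encode_for_eou` is a hypothetical name, and the model card does not state whether the training pipeline truncated inputs the same way.

```python
MAX_TOKENS = 128  # input limit listed in the architecture summary


def encode_for_eou(tokenizer, text, max_length=MAX_TOKENS):
    """Tokenize one transcript, cutting it to at most max_length tokens."""
    return tokenizer(
        text,
        truncation=True,
        max_length=max_length,
        return_tensors="pt",
    )


# Usage with the real tokenizer (requires transformers and network access):
#   from transformers import AutoTokenizer
#   tok = AutoTokenizer.from_pretrained("HossamEL-Dein/arabic-eou-model")
#   inputs = encode_for_eou(tok, "...")
```

Hugging Face tokenizers truncate from the right by default, which drops the end of the text. Since the final words are likely the most informative for end-of-utterance detection, setting `tokenizer.truncation_side = "left"` may be worth evaluating.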
Training Details
- Framework: PyTorch + Transformers
- Epochs: 3
- Batch Size: 16
- Learning Rate: 2e-5
- Optimizer: AdamW
- Training Time: ~3 hours on T4 GPU
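For a sense of scale, these hyperparameters imply about 250 optimizer steps per epoch and 750 in total on the 4,000-sample training split. The arithmetic below assumes a single device and no gradient accumulation, neither of which is stated in the model card.

```python
import math

total_samples = 5000   # from the Training Data section
train_fraction = 0.8   # 80% train split
batch_size = 16
epochs = 3

train_samples = round(total_samples * train_fraction)
# A final partial batch, if any, still counts as one step.
steps_per_epoch = math.ceil(train_samples / batch_size)
total_steps = steps_per_epoch * epochs

print(train_samples, steps_per_epoch, total_steps)  # 4000 250 750
```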
Intended Use
Primary Use Cases
- Real-time voice assistants
- Arabic conversational AI
- Turn-taking detection in dialogues
- LiveKit agent integration
Limitations
- Trained primarily on Saudi dialect patterns
- Requires text input (not raw audio)
- Best suited to short conversational context (about 5-10 seconds of transcribed speech)
- May need threshold tuning for specific use cases
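One way to tune the threshold is to sweep candidate values over a labeled validation set and inspect precision and recall at each one. The sketch below uses a tiny made-up set of (probability, label) pairs purely to show the mechanics. The numbers are not model outputs, and `precision_recall_at` is a hypothetical helper.

```python
def precision_recall_at(threshold, scored):
    """scored: list of (eou_probability, is_eou_label) pairs.
    A sample is predicted EOU when its probability exceeds the threshold."""
    tp = sum(1 for p, y in scored if p > threshold and y)
    fp = sum(1 for p, y in scored if p > threshold and not y)
    fn = sum(1 for p, y in scored if p <= threshold and y)
    precision = tp / (tp + fp) if (tp + fp) else 1.0  # no positive predictions
    recall = tp / (tp + fn) if (tp + fn) else 1.0     # no positive labels
    return precision, recall


# Toy validation data (illustrative only, not real model outputs).
toy = [(0.95, True), (0.80, True), (0.65, True), (0.60, False), (0.20, False)]

for t in (0.5, 0.7, 0.9):
    p, r = precision_recall_at(t, toy)
    print(f"threshold={t:.1f}  precision={p:.2f}  recall={r:.2f}")
```

A higher threshold makes an agent less likely to cut the user off mid-turn but slower to respond; a lower threshold does the opposite.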
Dataset
Training dataset available at: HossamEL-Dein/arabic-eou-dataset
Citation
@misc{arabic-eou-2024,
author = {HossamEL-Dein},
title = {Arabic End-of-Utterance Detection Model},
year = {2024},
publisher = {HuggingFace},
url = {https://huggingface.co/HossamEL-Dein/arabic-eou-model}
}
License
Apache 2.0
Contact
For questions or issues, please open an issue on the model repository.