🛡️ Arabic Text Detoxification Model

Ensemble Knowledge Distillation Approach


Transform toxic Arabic text into polite, neutral alternatives while preserving meaning

Model Demo | Architecture | Dataset | Results


📊 Architecture Overview

[Architecture diagram]

🎯 Model Description

This model performs text detoxification for the Arabic language, converting offensive, toxic, or aggressive text into neutral, polite alternatives while preserving the original semantic meaning.

Key Features

| Feature | Description |
|---------|-------------|
| 🏗️ Architecture | Bloom-1b7 (1.7B parameters) fine-tuned with ensemble distillation |
| 🌍 Language | Arabic (Modern Standard Arabic + dialects) |
| 📚 Training | Ensemble of 3 models → knowledge distillation → final model |
| ⚡ Hardware | Optimized for NVIDIA A100 40GB; works on consumer GPUs |
| 📏 Context | Up to 2048 tokens |

Ensemble Components

| Model | Parameters | Role | Source |
|-------|------------|------|--------|
| AraGPT2-Medium | 370M | Arabic language expert | AUB MIND Lab |
| Bloom-560m | 560M | Multilingual generalization | BigScience |
| Bloom-1b7 | 1.7B | High-capacity patterns | BigScience |

📈 Evaluation Results

| Metric | Score | Description |
|--------|-------|-------------|
| J-Score | 0.7129 | Joint metric (geometric mean) |
| STA | 0.9500 | Style Transfer Accuracy |
| SIM (ref) | 0.9995 | Similarity to reference |
| Fluency | 1.0000 | Grammatical correctness |

J-Score    ████████████████████████████░░░░░░░░░░  0.71
STA        ██████████████████████████████████████  0.95
SIM (ref)  ██████████████████████████████████████  1.00
Fluency    ██████████████████████████████████████  1.00

🚀 Quick Start

Installation

pip install transformers torch

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load model
model_name = "ispromashka/arab-detoxification-isp"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16)
model.to("cuda")  # or "cpu"

def detoxify(text: str) -> str:
    """Convert toxic Arabic text to a neutral form."""
    # Prompt format used in training: "سام" ("toxic") labels the input,
    # "مهذب" ("polite") labels the rewrite the model should produce.
    prompt = f"سام: {text}\nمهذب:"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    outputs = model.generate(
        **inputs,
        max_new_tokens=50,
        temperature=0.7,
        top_p=0.9,
        repetition_penalty=1.2,
        do_sample=True,
        pad_token_id=tokenizer.pad_token_id,
    )

    result = tokenizer.decode(outputs[0], skip_special_tokens=True)
    # Keep only the first line generated after the "مهذب:" label.
    return result.split("مهذب:")[-1].strip().split("\n")[0]

# Example
toxic_text = "أنت غبي جداً"  # "You are very stupid"
neutral_text = detoxify(toxic_text)
print(f"Input:  {toxic_text}")
print(f"Output: {neutral_text}")

💡 Examples

| Category | Toxic Input (سام) | Neutral Output (مهذب) |
|----------|-------------------|-----------------------|
| Insult | أنت غبي جداً ("You are very stupid") | ربما تحتاج إلى مزيد من الوقت للفهم ("Perhaps you need more time to understand") |
| Command | اخرس يا أحمق ("Shut up, you fool") | أرجو أن تكون أكثر هدوءاً ("Please be calmer") |
| Criticism | هذا العمل تافه وسخيف ("This work is trivial and silly") | العمل يمكن تطويره ("The work can be improved") |
| Threat | سأجعلك تندم ("I will make you regret this") | دعنا نحل هذا بسلام ("Let us settle this peacefully") |
| Contempt | أنت فاشل تماماً ("You are a complete failure") | النجاح يحتاج لمزيد من الجهد ("Success takes more effort") |
| Mockery | يا له من غبي ("What an idiot") | ربما لم يفهم جيداً ("Perhaps he did not understand well") |
| Blame | كل شيء خطؤك ("Everything is your fault") | نحتاج تحديد المسؤوليات ("We need to clarify responsibilities") |
| Appearance | منظرك سيء ("You look bad") | المظهر يمكن تحسينه ("The appearance can be improved") |

🔬 Methodology

Training Pipeline

┌──────────────────────────────────────────────────────────────┐
│                     STAGE 1: Base Models                     │
├──────────────────────────────────────────────────────────────┤
│  Train 3 specialized models independently on detox dataset   │
│  • AraGPT2-Medium (25 epochs)                                │
│  • Bloom-560m (25 epochs)                                    │
│  • Bloom-1b7 (20 epochs)                                     │
└──────────────────────────────────────────────────────────────┘
                               ↓
┌──────────────────────────────────────────────────────────────┐
│                 STAGE 2: Ensemble Selection                  │
├──────────────────────────────────────────────────────────────┤
│  For each input, select best prediction using:               │
│  Sentence-BERT (paraphrase-multilingual-mpnet-base-v2)       │
│  Selection: argmax(cosine_similarity(pred, reference))       │
└──────────────────────────────────────────────────────────────┘
                               ↓
┌──────────────────────────────────────────────────────────────┐
│               STAGE 3: Knowledge Distillation                │
├──────────────────────────────────────────────────────────────┤
│  Fine-tune fresh Bloom-1b7 on:                               │
│  • Original dataset (3000+ examples)                         │
│  • Ensemble best predictions (1500+ examples)                │
│  • Total: 4500+ training examples                            │
└──────────────────────────────────────────────────────────────┘
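
The Stage 2 selection rule above is simple enough to sketch in a few lines. This is a minimal illustration using sentence-transformers, not the authors' actual scoring script; the `select_best` helper and its signature are assumptions:

from sentence_transformers import SentenceTransformer, util

# Encoder named in Stage 2 of the pipeline.
encoder = SentenceTransformer("paraphrase-multilingual-mpnet-base-v2")

def select_best(predictions: list[str], reference: str) -> str:
    """Pick the ensemble prediction closest to the reference rewrite."""
    pred_emb = encoder.encode(predictions, convert_to_tensor=True)
    ref_emb = encoder.encode([reference], convert_to_tensor=True)
    # Cosine similarity of each candidate to the reference; keep the argmax.
    scores = util.cos_sim(pred_emb, ref_emb).squeeze(-1)
    return predictions[int(scores.argmax())]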

Evaluation Metrics

J-Score (Primary metric):

$J = \sqrt[3]{STA \times SIM \times FL}$

Where:

  • STA (Style Transfer Accuracy): Measures toxicity removal success
  • SIM (Semantic Similarity): Content preservation (Sentence-BERT cosine similarity)
  • FL (Fluency): Ratio of grammatically valid outputs
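
Note that the reported J-Score (0.7129) is lower than the geometric mean of the corpus-level averages above, which is consistent with J being computed per sample and then averaged. A minimal sketch under that assumption (the official scoring script is not included here, and the sample scores below are purely illustrative):

import numpy as np

def j_score(sta, sim, fl):
    """Per-sample joint score: geometric mean of STA, SIM, and FL."""
    return np.cbrt(np.asarray(sta) * np.asarray(sim) * np.asarray(fl))

# Hypothetical per-sample scores for three outputs (illustration only).
sta = [1.0, 0.0, 1.0]     # was the toxicity removed?
sim = [0.99, 0.97, 0.98]  # similarity to the reference rewrite
fl = [1.0, 1.0, 1.0]      # is the output fluent?
print(j_score(sta, sim, fl).mean())  # corpus J = mean of per-sample J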

📁 Dataset

Dataset used for training and evaluation:
ispromashka/arabic-detox-dataset

Composition

| Category | Examples | Description |
|----------|----------|-------------|
| Personal Insults | 30 | Direct personal attacks |
| Aggressive Commands | 20 | Hostile imperatives |
| Work Criticism | 25 | Professional negative feedback |
| Threats | 15 | Intimidation and warnings |
| Contempt | 15 | Expressions of superiority |
| Blame | 15 | Accusatory statements |
| Appearance Criticism | 15 | Physical/aesthetic insults |
| Mockery | 15 | Sarcastic belittling |
| Total Unique | 150 | — |
| Augmented (×20) | 3,000+ | Training examples |

Data Format

سام: {toxic_text}
مهذب: {neutral_text}<EOS>

Here سام ("toxic") labels the source text and مهذب ("polite") labels the target rewrite.
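
A minimal sketch of how one training string might be assembled in this format (the actual preprocessing script is not published; `format_example` is a hypothetical helper, and the EOS default assumes Bloom's `</s>` token):

def format_example(toxic: str, neutral: str, eos_token: str = "</s>") -> str:
    """Build one training string: toxic input, then the polite target."""
    # eos_token should match the tokenizer's EOS (e.g. tokenizer.eos_token).
    return f"سام: {toxic}\nمهذب: {neutral}{eos_token}"

print(format_example("أنت غبي جداً", "ربما تحتاج إلى مزيد من الوقت للفهم"))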

⚙️ Training Configuration

| Parameter | Base Models | Final Model |
|-----------|-------------|-------------|
| Hardware | NVIDIA A100 40GB | NVIDIA A100 40GB |
| Precision | BF16 | BF16 |
| Batch Size | 8–16 | 8 |
| Learning Rate | 2e-5 – 3e-5 | 1.5e-5 |
| Epochs | 20–25 | 15 |
| Optimizer | AdamW | AdamW |
| Scheduler | Cosine | Cosine |
| Warmup | 10% | 10% |
| Total Time | ~85 min | ~30 min |
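
For reproduction, the Final Model column maps roughly onto Hugging Face `TrainingArguments` as below. This is a hypothetical configuration inferred from the table, not the authors' published training script; `output_dir` is a placeholder:

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="bloom1b7-detox-final",  # placeholder path
    per_device_train_batch_size=8,
    learning_rate=1.5e-5,
    num_train_epochs=15,
    bf16=True,                          # BF16 precision (A100)
    optim="adamw_torch",                # AdamW
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,                   # 10% warmup
)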

⚠️ Limitations

  • Language Coverage: Optimized for Modern Standard Arabic; dialectal performance may vary
  • Text Length: Best suited to short and medium-length texts (< 100 tokens)
  • Domain: Trained on general toxicity; domain-specific content may need fine-tuning
  • Context: Does not consider conversation history

📖 Citation

@misc{arabicdetox2024,
  author = {ispromashka},
  title = {Arabic Text Detoxification: Ensemble Knowledge Distillation Approach},
  year = {2024},
  publisher = {HuggingFace},
  url = {https://huggingface.co/ispromashka/arab-detoxification-isp}
}

📄 License

This project is licensed under the MIT License; see the LICENSE file for details.


Made with ❤️ for the Arabic NLP community

GitHub
