## Usage

### Using LoRA Adapters
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# Load the base model and apply the LoRA adapters
model_name = "cogni-x/CogniXpert-DeepSeek-R1-Distill-Llama8B-English-LoRA"  # LoRA adapter repo
base_model_name = "unsloth/DeepSeek-R1-Distill-Llama-8B"

tokenizer = AutoTokenizer.from_pretrained(model_name)
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    device_map="auto",
    load_in_4bit=True,  # Optional: 4-bit loading for memory efficiency (requires bitsandbytes)
)
model = PeftModel.from_pretrained(base_model, model_name)

# Example: English mental health conversation
messages = [
    {"role": "user", "content": "I've been feeling really anxious about work lately."}
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(
    inputs,
    max_new_tokens=256,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
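
Recent `transformers` releases prefer an explicit `BitsAndBytesConfig` over the bare `load_in_4bit` flag. A minimal sketch of the equivalent 4-bit loading, assuming `bitsandbytes` is installed:

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
import torch

# Explicit 4-bit quantization config, equivalent to load_in_4bit=True above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "unsloth/DeepSeek-R1-Distill-Llama-8B",
    quantization_config=bnb_config,
    device_map="auto",
)
```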
### Example in Swahili

```python
messages = [
    # "I've been feeling very anxious about my job."
    {"role": "user", "content": "Nimekuwa na wasiwasi mwingi kuhusu kazi yangu."}
]
# The model will respond in Swahili
```
### Example in Sheng

```python
messages = [
    # Roughly: "I'm really stressed about my job, man."
    {"role": "user", "content": "Niko na stress mob ju ya job yangu bana."}
]
# The model will respond in an appropriate Sheng/Swahili mix
```
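
For standalone deployment, the adapter can optionally be folded into the base weights with PEFT's `merge_and_unload()`. A minimal sketch continuing from the loading example above (merge from a full- or half-precision base model, since merging into a 4-bit quantized base is lossy); the output path is hypothetical:

```python
# Assumes `model` (PeftModel) and `tokenizer` from the loading example above,
# with the base model loaded in full/half precision rather than 4-bit.
merged_model = model.merge_and_unload()            # fold LoRA weights into the base model
merged_model.save_pretrained("cognixpert-merged")  # hypothetical output directory
tokenizer.save_pretrained("cognixpert-merged")
```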
## Training Metrics
| Metric | Value |
|---|---|
| Training Loss | 0.8424 |
| Evaluation Loss | 0.8149 |
| Perplexity | 2.26 |
| Training Time | 3977.78 minutes (≈66.3 hours) |
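
For reference, the reported perplexity is the exponential of the evaluation loss:

```python
import math

# Perplexity = exp(evaluation loss)
print(round(math.exp(0.8149), 2))  # 2.26
```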
## Training Details

### Training Data
The model was fine-tuned on a combination of:
- **English Mental Health Counseling Dataset**: professional therapeutic conversations
- **Swahili Therapeutic Dataset**: culturally adapted mental health dialogues
- **Sheng Lexical Dataset**: urban Kenyan youth language patterns
### Training Configuration

- **Base Model:** DeepSeek-R1-Distill-Llama-8B (4-bit quantized)
- **Method:** LoRA (Low-Rank Adaptation)
- **LoRA Rank:** 32
- **LoRA Alpha:** 64
- **Target Modules:** `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`
- **Sequence Length:** 2048 tokens
- **Training Framework:** Unsloth + TRL
- **Optimizer:** AdamW (8-bit)
- **Learning Rate:** 2e-4
- **Batch Size:** effective batch size of 64 (multi-GPU)
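
For readers reproducing a comparable setup with plain PEFT (the run above used Unsloth's wrappers), the hyperparameters correspond roughly to the following `LoraConfig`; values not stated in this card, such as dropout and bias, are assumptions:

```python
from peft import LoraConfig

# Approximate PEFT equivalent of the configuration listed above (illustrative only)
lora_config = LoraConfig(
    r=32,                      # LoRA rank
    lora_alpha=64,             # LoRA alpha (scaling factor)
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    lora_dropout=0.0,          # assumed; not stated in this card
    bias="none",               # assumed
    task_type="CAUSAL_LM",
)
```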
### Multi-turn Conversation Handling

- **Training:** all conversation turns are included with context for coherence
- **Evaluation:** only first turns are used, to avoid bias from assuming perfect prior responses
- **Response Masking:** loss is computed only on assistant responses, not on prompts (see the sketch below)
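
To illustrate the response-masking point, here is a minimal, framework-agnostic sketch of excluding prompt tokens from the loss by setting their labels to -100 (PyTorch's cross-entropy ignore index); the actual training used TRL's data handling, so this is illustrative only:

```python
import torch

def mask_prompt_tokens(input_ids: torch.Tensor, prompt_len: int) -> torch.Tensor:
    """Return labels in which the first `prompt_len` (prompt) tokens are ignored."""
    labels = input_ids.clone()
    labels[:prompt_len] = -100  # -100 is ignored by PyTorch cross-entropy
    return labels

# Example: a 10-token sequence whose first 6 tokens belong to the user prompt
input_ids = torch.arange(10)
labels = mask_prompt_tokens(input_ids, prompt_len=6)
print(labels)  # tensor([-100, -100, -100, -100, -100, -100, 6, 7, 8, 9])
```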
## Language Support

### English
Professional mental health counseling with evidence-based therapeutic techniques.
### Swahili (Kiswahili)
Culturally-sensitive therapeutic conversations adapted for East African context.
### Sheng
Urban Kenyan youth slang for relatable, authentic support conversations.
**Language Detection:** automatic; the model responds in the same language as the input.
## Ethical Considerations

### Intended Users
- Individuals seeking emotional support and self-reflection tools
- Mental health organizations looking to provide preliminary support
- Researchers studying multilingual therapeutic AI
### Out-of-Scope Use
- Crisis intervention (use emergency services instead)
- Clinical diagnosis or treatment
- Replacement for licensed mental health professionals
- Legal or medical advice
### Bias and Limitations
- May reflect biases present in training data
- Cultural nuances may not be fully captured
- Sheng is informal and rapidly evolving, so outputs may not match all regional variations
- Should be used as a supplement, not replacement, for professional care
## Citation
If you use this model, please cite:
```bibtex
@misc{cognixpert-deepseek-mental-health,
  title={CogniXpert DeepSeek Multilingual Mental Health Model},
  author={CogniX Ltd},
  year={2025},
  publisher={HuggingFace},
  url={https://huggingface.co/cogni-x/CogniXpert-DeepSeek-R1-Distill-Llama8B-English-LoRA}
}
```
## Contact

- **Organization:** CogniX Ltd
- **Project:** CogniXpert AI
- **Repository:** GitHub
For questions, issues, or collaboration opportunities, please visit our GitHub repository.
## Acknowledgments
- Built on Unsloth for efficient training
- Base model: DeepSeek-R1-Distill-Llama-8B
- Training framework: Hugging Face TRL and Transformers
Last Updated: 2025-12-11