TharuLLaMA-3.2-3B-Instruct

TharuLLaMA-3.2-3B-Instruct is a Low-Rank Adaptation (LoRA) fine-tune of the Llama-3.2-3B-Instruct model, built to understand and generate text in the Tharu language.

The goal of this model is to support low-resource language NLP, especially for the Tharu community across eastern Uttar Pradesh, Nepal, Bihar, and surrounding regions. It is designed as a general-purpose instruction-following assistant capable of answering queries, conversing, translating, and performing various tasks in Tharu.

Model Details

  • Developer: Agniva Maiti
  • Base Model: meta-llama/Llama-3.2-3B-Instruct
  • Language: Tharu (thr)
  • Fine-tuning Method: LoRA (Low-Rank Adaptation)
  • Precision: fp16

Training Data

The model was trained on the TharuNLP Instruction Corpus, containing over 3,000 Tharu instruction–response pairs.

The dataset consists primarily of the Rana Tharu dialect, with additional examples from Dangaura Tharu, Kochila Tharu, and a small portion of mixed Tharu speech varieties. As a result, the model is strongest in Rana Tharu while still generalizing reasonably well across the broader Tharu language family.

Data Splitting

To ensure reliable evaluation:

  • Training: 80 percent
  • Validation: 10 percent
  • Test: 10 percent

The released model is the final checkpoint, trained on 100 percent of the training split as part of a data-scaling experiment.
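
For reference, an 80/10/10 split like this can be reproduced with the Hugging Face datasets library. The sketch below is illustrative only; the file name and seed are assumptions, not the actual preprocessing script:

from datasets import load_dataset

# Hypothetical local file; the actual location of the TharuNLP Instruction Corpus is not specified here
dataset = load_dataset("json", data_files="tharu_instructions.json")["train"]

# Carve off 20 percent, then halve it into validation and test (10 percent each)
split = dataset.train_test_split(test_size=0.2, seed=42)
held_out = split["test"].train_test_split(test_size=0.5, seed=42)

train_set = split["train"]    # 80 percent
val_set = held_out["train"]   # 10 percent
test_set = held_out["test"]   # 10 percent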

Training Hyperparameters

  • Epochs: 3
  • Batch Size: 2 per device (with 8 gradient accumulation steps)
  • Sequence Length: 512
  • Learning Rate: 2e-4
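
These values map directly onto transformers.TrainingArguments. The sketch below is a plausible reconstruction rather than the actual training script; output_dir and logging_steps are assumptions, and the 512-token sequence length is typically enforced at tokenization time (e.g. max_seq_length in TRL's SFTConfig) rather than here:

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="tharu-llama-3.2-3b-lora",  # assumed output path
    num_train_epochs=3,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,         # effective batch size of 16 per device
    learning_rate=2e-4,
    fp16=True,                             # matches the fp16 precision listed above
    logging_steps=10,                      # assumption
)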

LoRA Configuration

  • Rank (r): 16
  • Alpha: 32
  • Dropout: 0.05
  • Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
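
In peft terms, this corresponds to the following LoraConfig, reconstructed from the values above:

from peft import LoraConfig

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",  # attention projections
        "gate_proj", "up_proj", "down_proj",     # MLP projections
    ],
    task_type="CAUSAL_LM",
)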

Intended Use

This model is suitable for:

  • Chatbots and assistants for the Tharu language
  • Research in low-resource Indo-Aryan languages
  • Translation and educational tools
  • Cultural and linguistic preservation projects

How to Use

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load Base Model
base_model_id = "meta-llama/Llama-3.2-3B-Instruct"
model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype=torch.float16,
    device_map="auto"
)

# Load Tharu Adapter
adapter_id = "agnivamaiti/TharuLLaMA-3.2-3B-Instruct"
model = PeftModel.from_pretrained(model, adapter_id)
tokenizer = AutoTokenizer.from_pretrained(base_model_id)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # Llama tokenizers ship without a pad token

# Inference
prompt = "Machine Learning ko ho?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)  # follow the model's device placement

outputs = model.generate(
    **inputs,
    max_new_tokens=150,
    do_sample=True,
    temperature=0.3,
    top_k=15,
    top_p=0.3,
    repetition_penalty=1.2,
    eos_token_id=tokenizer.eos_token_id,
    pad_token_id=tokenizer.pad_token_id
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
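
For deployment, the LoRA weights can be folded into the base model so that peft is no longer needed at inference time. A minimal sketch, assuming you want a standalone checkpoint (the output path is hypothetical):

# Merge the adapter into the base weights and drop the PEFT wrapper
merged = model.merge_and_unload()
merged.save_pretrained("TharuLLaMA-merged")     # hypothetical output path
tokenizer.save_pretrained("TharuLLaMA-merged")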

Limitations and Safety

  • Hallucinations: May generate incorrect or fabricated facts
  • Bias: Inherits biases from the base LLaMA model and the dialect patterns present in the Tharu dataset
  • Critical Use: Not suitable for legal, medical, or financial decision-making
