Gujarati Question Answering Model (Finetuned)

This model is a fine-tuned version of Naman0807/gujarati-lm, trained on a Gujarati question-answering dataset (the Gujarati subset of L3Cube-Pune IndicSQuAD; see Training Data below). Given a context and a question, it generates an answer.

Model Description

  • Model Architecture: GPT-2 (Decoder-only)
  • Task: Generative Question Answering
  • Language: Gujarati
  • Base Model: Naman0807/gujarati-lm
  • Context Window: 256 tokens

Intended Use

This model is intended to answer questions in Gujarati based on a provided context. It uses a generative approach: the input is formatted as Context: <context>\nQuestion: <question>\nAnswer: and the model completes the prompt with the answer.

How to Use

You can use this model with the Hugging Face transformers text-generation pipeline.

from transformers import pipeline

# Load the text-generation pipeline with the fine-tuned model
generator = pipeline("text-generation", model="Naman0807/fine_tuned_gujarati_qa")  # Replace with your actual repo name if different

# Define context and question
context = "ગુજરાત ભારતનું એક રાજ્ય છે. તેનું પાટનગર ગાંધીનગર છે."  # "Gujarat is a state of India. Its capital is Gandhinagar."
question = "ગુજરાતનું પાટનગર કયું છે?"  # "What is the capital of Gujarat?"

# Format the prompt in the same Context/Question/Answer layout used during fine-tuning
prompt = f"Context: {context}\nQuestion: {question}\nAnswer:"

# Generate an answer
output = generator(prompt, max_length=256, num_return_sequences=1, do_sample=True, temperature=0.7)
generated_text = output[0]["generated_text"]

# Extract the answer part (everything after the final "Answer:")
answer = generated_text.split("Answer:")[-1].strip()
print(answer)
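For short factual answers, deterministic decoding is often more reliable than sampling. A variant such as the following may work better (the parameter values here are illustrative, not tuned):

# Greedy decoding with a small generation budget for short answers
output = generator(prompt, max_new_tokens=32, do_sample=False)
answer = output[0]["generated_text"].split("Answer:")[-1].strip()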

Training Data

The model was fine-tuned on the Gujarati subset of the L3Cube-Pune IndicSQuAD dataset.

Each example was flattened into a single generative training string:

  • Format: Context: <context>\nQuestion: <question>\nAnswer: <answer>
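As a minimal preprocessing sketch, assuming SQuAD-style records with context, question, and answers fields (the field names are an assumption; adjust to the actual IndicSQuAD schema if it differs):

def to_training_text(example):
    # Assumes SQuAD-style records; field names here are an assumption
    answer = example["answers"]["text"][0]  # first gold answer span
    return f"Context: {example['context']}\nQuestion: {example['question']}\nAnswer: {answer}"

record = {
    "context": "ગુજરાત ભારતનું એક રાજ્ય છે. તેનું પાટનગર ગાંધીનગર છે.",
    "question": "ગુજરાતનું પાટનગર કયું છે?",
    "answers": {"text": ["ગાંધીનગર"]},
}
print(to_training_text(record))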

Training Procedure

Hyperparameters

  • Epochs: 3
  • Batch Size: 4
  • Learning Rate: 5e-5
  • Max Sequence Length: 256
  • Optimizer: AdamW
  • Precision: Mixed Precision (FP16)

Training Code Snippet

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./fine_tuned_gujarati_qa",
    per_device_train_batch_size=4,
    num_train_epochs=3,
    learning_rate=5e-5,
    fp16=True,  # mixed-precision training
    # ... remaining arguments omitted
)
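For completeness, here is a minimal sketch of how such a run could be assembled end to end. The tokenization step and the use of DataCollatorForLanguageModeling are assumptions about the training setup, not a verbatim copy of the original script:

from datasets import Dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
)

tokenizer = AutoTokenizer.from_pretrained("Naman0807/gujarati-lm")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained("Naman0807/gujarati-lm")

# Toy one-example dataset in the flattened format; a real run would map the
# full IndicSQuAD Gujarati split through the same preprocessing.
train_dataset = Dataset.from_dict({
    "text": ["Context: ગુજરાત ભારતનું એક રાજ્ય છે.\nQuestion: ગુજરાતનું પાટનગર કયું છે?\nAnswer: ગાંધીનગર"]
})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=256)

train_dataset = train_dataset.map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=training_args,  # the TrainingArguments defined above
    train_dataset=train_dataset,
    # mlm=False selects standard causal (next-token) language modeling
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()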

Limitations

  • Context Length: Inputs are limited to 256 tokens, so long contexts are truncated (see the length-check sketch below).
  • Hallucination: As a generative model, it may produce fluent but incorrect answers, especially when the answer is not present in the provided context.
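
A quick way to check whether a prompt fits the 256-token window before generating (a minimal sketch using the model's own tokenizer):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Naman0807/fine_tuned_gujarati_qa")  # same repo as in the usage example

context = "ગુજરાત ભારતનું એક રાજ્ય છે. તેનું પાટનગર ગાંધીનગર છે."
question = "ગુજરાતનું પાટનગર કયું છે?"
prompt = f"Context: {context}\nQuestion: {question}\nAnswer:"

n_tokens = len(tokenizer(prompt)["input_ids"])
if n_tokens > 256:
    print(f"Prompt is {n_tokens} tokens; anything past 256 will be truncated.")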