# TeleQnA-Qwen2.5-7B-Instruct (Fine-Tuned)
This model is a fine-tuned version of Qwen/Qwen2.5-7B-Instruct on the TeleQnA dataset. It achieves State-of-the-Art (SOTA) performance on the TeleQnA benchmark, outperforming GPT-4.
**Paper replicated:** *TeleQnA: A Benchmark Dataset to Assess Large Language Models Telecommunications Knowledge* ([arXiv:2310.15051](https://arxiv.org/abs/2310.15051))
## Performance
Evaluated on the full TeleQnA validation set (N=1000):
| Model | Overall Accuracy | Lexicon | Standards Specs | Standards Overview | Research Pubs | Research Overview |
|---|---|---|---|---|---|---|
| This Model | 76.80% | 95.00% | 67.02% | 75.00% | 79.21% | 77.88% |
| GPT-4 | 74.00% | ~87% | 64% | ~70% | ~80% | ~78% |
| GPT-3.5 | 67.00% | - | - | - | - | - |
**Key result:** This specialized 7B model beats GPT-4 in overall accuracy (+2.8 percentage points) and outperforms it by a wider margin on domain-specific terminology (Lexicon, +8 points) and technical standards (Standards Specs, +3 points).
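Per-category scores like those in the table are simple tallies over graded predictions. The sketch below is illustrative only: it assumes a hypothetical `results` list of `(category, is_correct)` pairs and is not the evaluation script used to produce the numbers above.

```python
from collections import defaultdict

# Hypothetical graded predictions: (category, is_correct) pairs.
results = [
    ("Lexicon", True),
    ("Standards Specs", False),
    ("Research Pubs", True),
]

totals = defaultdict(int)
correct = defaultdict(int)
for category, is_correct in results:
    totals[category] += 1
    correct[category] += int(is_correct)

for category in totals:
    acc = 100.0 * correct[category] / totals[category]
    print(f"{category}: {acc:.2f}% ({correct[category]}/{totals[category]})")

overall = 100.0 * sum(correct.values()) / sum(totals.values())
print(f"Overall: {overall:.2f}%")
```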
## Usage
To use this model, load the base model and then apply the LoRA adapter on top of it:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base_model_name = "Qwen/Qwen2.5-7B-Instruct"
adapter_name = "nraptisss/teleqna-qwen2.5-7b-finetune"

# 1. Load the base model
model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True,
)

# 2. Load the LoRA adapter
model = PeftModel.from_pretrained(model, adapter_name)
model.eval()

# 3. Inference
tokenizer = AutoTokenizer.from_pretrained(base_model_name, trust_remote_code=True)

question = "What is the primary advantage of using coordinated beamforming in Wi-Fi 8?"
options = "option 1: Improved security\noption 2: Increased coverage\noption 3: Reduced interference\noption 4: Lower latency"
prompt = f"<|im_start|>user\n{question}\n\n{options}<|im_end|>\n<|im_start|>assistant\nAnswer:"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
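Continuing from the snippet above, the predicted option can be pulled out of the generated text. This is a minimal parsing sketch, assuming the model replies in the TeleQnA-style form `option N: ...` after the `Answer:` marker; it is not part of the model itself and may return `None` for other reply formats.

```python
import re

def extract_option(generated_text):
    """Return the predicted option number found after the last 'Answer:' marker,
    or None if the reply does not contain an 'option N' pattern."""
    answer_part = generated_text.rsplit("Answer:", 1)[-1]
    match = re.search(r"option\s*(\d+)", answer_part, flags=re.IGNORECASE)
    return int(match.group(1)) if match else None

decoded = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(extract_option(decoded))  # predicted option number, or None if parsing fails
```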
## Training Details
- Dataset: TeleQnA (~10,000 Telecom MCQs)
- Method: QLoRA (4-bit quantization); see the configuration sketch after this list
- Rank (r): 64
- Alpha: 16
- Epochs: 1
- Hardware: Single NVIDIA RTX 6000 Ada
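For orientation, the following is a minimal sketch of how a QLoRA setup with these hyperparameters is typically assembled using `transformers`, `bitsandbytes`, and `peft`. The target modules, dropout, and quantization details below are assumptions, not the exact training configuration of this adapter, and the training loop itself is omitted.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# 4-bit NF4 quantization for the frozen base model (assumed settings).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-7B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)

# LoRA adapter matching the listed rank and alpha; dropout and target
# modules are illustrative assumptions.
lora_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()
```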
## Citation
If you use this model, please cite the original TeleQnA paper:
```bibtex
@article{teleqna2023,
  title={TeleQnA: A Benchmark Dataset to Assess Large Language Models Telecommunications Knowledge},
  author={Maatouk, Ali and others},
  journal={arXiv preprint arXiv:2310.15051},
  year={2023}
}
```