TicketClassificationGPT

Model Summary

TicketClassificationGPT is a GPT-2–based text classification model that classifies IT support tickets into 8 predefined categories.
The GPT-2 architecture was implemented from scratch and loaded with the original OpenAI GPT-2 weights, with the language modeling head replaced by a custom classification head. Only the classification head and the final transformer layers were fine-tuned for the ticket classification task.

This model is fully compatible with the Hugging Face transformers ecosystem and can be loaded using AutoModel.from_pretrained.

How to Get Started with the Model

Inference Example (Transformers + tiktoken)

from transformers import AutoModel
import tiktoken

# Load tokenizer
tokenizer = tiktoken.get_encoding("gpt2")

# Load model
model_id = "FarhanAK128/TicketClassificationGPT"
model = AutoModel.from_pretrained(
    model_id,
    trust_remote_code=True
)

# Example prediction
text = "Need extra space on Google Drive."
prediction = model.predict(text, tokenizer)

print("Predicted class:", prediction) # Predicted class: Storage

Note: This model uses a custom .predict() method defined in the repository and requires trust_remote_code=True to function.

Model Details

📝 Model Description

Developed by: Farhan Ali Khan
Model type: GPT-2–based text classification model
Base architecture: GPT-2 (OpenAI)
Framework: PyTorch
Task: Text Classification
Number of classes: 8
Language: English
License: MIT
Finetuned from model: OpenAI GPT-2

📋 Classification Labels

Class ID	Category
0	Hardware
1	HR Support
2	Access
3	Miscellaneous
4	Storage
5	Purchase
6	Internal Project
7	Administrative Rights
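
If you work with raw class IDs rather than the label string returned by .predict(), the table above can be written as a lookup. This is a minimal sketch: the names ID_TO_LABEL, LABEL_TO_ID, and label_for are illustrative and are not taken from the repository code.

```python
# Class ID -> category, copied from the Classification Labels table.
ID_TO_LABEL = {
    0: "Hardware",
    1: "HR Support",
    2: "Access",
    3: "Miscellaneous",
    4: "Storage",
    5: "Purchase",
    6: "Internal Project",
    7: "Administrative Rights",
}

# Reverse lookup: category -> class ID.
LABEL_TO_ID = {label: class_id for class_id, label in ID_TO_LABEL.items()}


def label_for(class_id: int) -> str:
    """Return the category name for a class ID, or raise on an unknown ID."""
    if class_id not in ID_TO_LABEL:
        raise ValueError(f"Unknown class ID: {class_id}")
    return ID_TO_LABEL[class_id]


print(label_for(4))
```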

Model Sources

Repository: https://huggingface.co/FarhanAK128/TicketClassificationGPT
Base model: GPT-2 architecture implemented from scratch, initialized with OpenAI GPT-2 weights
Paper: https://cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf

Training Details

Training Data

The model was trained on the IT Service Ticket Classification Dataset available on Kaggle.

Dataset name: IT Service Ticket Classification Dataset
Source: Kaggle
Link: https://www.kaggle.com/datasets/adisongoh/it-service-ticket-classification-dataset
Content: Labeled IT support ticket text data
Language: English

The dataset was used for supervised multi-class classification after standard text preprocessing and tokenization.

Training Procedure

Base weights: OpenAI GPT-2
Fine-tuning strategy: Partial fine-tuning (classification head + final transformer layers)
Optimizer: AdamW
Learning rate: 1e-4
Weight decay: 0.1
Epochs: 5
Random seed: 123
Loss function: Cross-Entropy Loss
Training regime: FP32
Evaluation frequency: Every 30 steps
Total training time: ~140 minutes
Final training loss: ~0.61
Final validation loss: ~0.86
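
The partial fine-tuning setup listed above can be sketched in PyTorch as follows. This is an illustration under stated assumptions, not the training script used for this model: TinyBackbone is a hypothetical toy stand-in for the GPT-2 backbone, the attribute names (blocks, final_norm, out_head) are invented for the example, and using the last token's output for classification is also an assumption. Only the optimizer, learning rate, weight decay, seed, and loss function come from the list above.

```python
import torch
import torch.nn as nn

torch.manual_seed(123)  # random seed reported in the model card

NUM_CLASSES = 8
VOCAB_SIZE = 100  # toy value for illustration only
EMB_DIM = 32      # toy value for illustration only


class TinyBackbone(nn.Module):
    """Hypothetical stand-in for a GPT-2 style backbone (not the real model)."""

    def __init__(self, n_layers: int = 4):
        super().__init__()
        self.tok_emb = nn.Embedding(VOCAB_SIZE, EMB_DIM)
        self.blocks = nn.ModuleList([
            nn.TransformerEncoderLayer(
                EMB_DIM, nhead=4, dim_feedforward=64, batch_first=True
            )
            for _ in range(n_layers)
        ])
        self.final_norm = nn.LayerNorm(EMB_DIM)
        self.out_head = nn.Linear(EMB_DIM, VOCAB_SIZE)  # language modeling head

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        x = self.tok_emb(token_ids)
        for block in self.blocks:
            x = block(x)
        return self.out_head(self.final_norm(x))


model = TinyBackbone()

# 1) Freeze every pretrained parameter.
for param in model.parameters():
    param.requires_grad = False

# 2) Swap the language modeling head for a classification head.
#    Parameters of a newly created layer are trainable by default.
model.out_head = nn.Linear(EMB_DIM, NUM_CLASSES)

# 3) Unfreeze the last transformer block and the final layer norm.
for param in model.blocks[-1].parameters():
    param.requires_grad = True
for param in model.final_norm.parameters():
    param.requires_grad = True

trainable_params = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable_params, lr=1e-4, weight_decay=0.1)
loss_fn = nn.CrossEntropyLoss()

# One training step on random toy data.
token_ids = torch.randint(0, VOCAB_SIZE, (4, 16))
labels = torch.randint(0, NUM_CLASSES, (4,))
logits = model(token_ids)[:, -1, :]  # assumption: classify from the last token
loss = loss_fn(logits, labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()

# Top-level modules that still receive gradient updates.
trainable_modules = sorted(
    {name.split(".")[0] for name, p in model.named_parameters() if p.requires_grad}
)
print(trainable_modules)
```

Freezing most of the backbone keeps the number of updated parameters small, which is the usual motivation for this kind of partial fine-tuning.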

📈 Training Progress

Training and Validation Loss

Training and Validation Accuracy

📊 Model Performance

Dataset Split	Accuracy
🏋️ Training	76.54%
🧪 Validation	75.67%
🧠 Test	73.83%
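
To put these numbers in context, the short calculation below compares them with a uniform-guess baseline for 8 classes and converts the reported final losses (about 0.61 training, 0.86 validation) into the geometric-mean probability assigned to the correct class. The chance-accuracy figure assumes balanced classes, which the card does not state, and the loss values are the approximate ones reported above.

```python
import math

NUM_CLASSES = 8

# Approximate final losses reported in the training procedure.
reported_losses = {"training": 0.61, "validation": 0.86}

# Cross-entropy of a model that assigns equal probability to all classes.
uniform_loss = math.log(NUM_CLASSES)

# Accuracy of random guessing, ASSUMING balanced classes (not stated in the card).
chance_accuracy = 1 / NUM_CLASSES


def mean_correct_prob(mean_cross_entropy: float) -> float:
    """Geometric-mean probability given to the correct class: exp(-loss)."""
    return math.exp(-mean_cross_entropy)


print(f"Uniform-guess loss: {uniform_loss:.3f}")
print(f"Chance accuracy (balanced classes): {chance_accuracy:.1%}")
for split, loss in reported_losses.items():
    print(f"{split}: loss {loss:.2f} -> correct-class probability {mean_correct_prob(loss):.3f}")
```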

Uses

Direct Use

This model can be used directly to classify short IT support ticket texts into predefined categories.

Example use cases:

Automated ticket routing
Helpdesk prioritization
Internal IT workflow automation
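
As an illustration of automated ticket routing, the sketch below groups tickets into queues based on the predicted category. The queue names, the route_tickets helper, and the stub classifier are all hypothetical; the classifier is passed in as a function so the routing logic can be tried without downloading the model.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical mapping from predicted category to a helpdesk queue name.
CATEGORY_TO_QUEUE: Dict[str, str] = {
    "Hardware": "desktop-support",
    "HR Support": "hr-helpdesk",
    "Access": "identity-team",
    "Miscellaneous": "manual-triage",
    "Storage": "infrastructure",
    "Purchase": "procurement",
    "Internal Project": "project-office",
    "Administrative Rights": "identity-team",
}


def route_tickets(
    texts: Iterable[str],
    classify: Callable[[str], str],
    fallback_queue: str = "manual-triage",
) -> Dict[str, List[str]]:
    """Group ticket texts by queue; unknown categories go to the fallback queue."""
    queues: Dict[str, List[str]] = {}
    for text in texts:
        category = classify(text)
        queue = CATEGORY_TO_QUEUE.get(category, fallback_queue)
        queues.setdefault(queue, []).append(text)
    return queues


def stub_classify(text: str) -> str:
    """Keyword stub standing in for the real model (illustration only)."""
    return "Storage" if "space" in text.lower() else "Miscellaneous"


routed = route_tickets(
    ["Need extra space on Google Drive.", "Question about the office party."],
    stub_classify,
)
print(routed)
```

With the real model loaded as in the inference example, the classifier argument could be `lambda text: model.predict(text, tokenizer)`.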

Downstream Use

The model may be further fine-tuned on:

Organization-specific ticket data
Expanded label sets
Domain-specific terminology

Out-of-Scope Use

Multilingual text classification
Open-domain topic classification
Legal, medical, or safety-critical decision-making

Bias, Risks, and Limitations

Trained on a limited-domain dataset (IT support tickets)
Not evaluated for demographic or social bias
Predictions may be unreliable for unseen ticket categories
Performance depends on input text quality and length

Recommendations

Human validation is recommended before using predictions in production systems.
For best results, further fine-tuning on in-domain data is advised.

Model Card Authors

Farhan Ali Khan

Model Card Contact

For questions or feedback, please reach out via my Hugging Face profile: FarhanAK128


Safetensors

Model size: 0.1B params
Tensor type: F32