Bird Species Classification - ConvNeXt-Base
Model Description
This model classifies 200 bird species using ConvNeXt-Base architecture with transfer learning.
Performance
- Test Accuracy: 83.64%
- Average Per-Class Accuracy: 83.29%
- Architecture: ConvNeXt-Base (87M parameters)
- Dataset: CUB-200-2011 (200 bird species)
Training Details
Model Architecture
- Base Model: ConvNeXt-Base pretrained on ImageNet-1K
- Classifier: Custom 2-layer classifier with dropout
- Input Size: 224x224 RGB images
Training Strategy
Phase 1 (40 epochs): Frozen backbone, train classifier only
- Learning Rate: 0.001
- Batch Size: 32
Phase 2 (20 epochs): Full fine-tuning
- Learning Rate: 0.0001
- Batch Size: 32
Regularization
- Dropout: 0.6, 0.5
- Label Smoothing: 0.2
- Weight Decay: 0.005
- Data Augmentation: rotation, flip, color jitter, random erasing
Usage
import torch
import torch.nn as nn
from torchvision import models, transforms
from PIL import Image
# Load model
model = models.convnext_base(weights=None)
num_features = model.classifier[2].in_features
model.classifier[2] = nn.Sequential(
nn.Dropout(0.6),
nn.Linear(num_features, 1024),
nn.ReLU(),
nn.Dropout(0.5),
nn.Linear(1024, 200)
)
# Load weights
model.load_state_dict(torch.load('final_model.pth', map_location='cpu'))
model.eval()
# Preprocessing
transform = transforms.Compose([
transforms.Resize((224, 224)),
transforms.ToTensor(),
transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
])
# Predict
image = Image.open('bird.jpg').convert('RGB')
image_tensor = transform(image).unsqueeze(0)
with torch.no_grad():
outputs = model(image_tensor)
probabilities = torch.nn.functional.softmax(outputs[0], dim=0)
top5_prob, top5_indices = torch.topk(probabilities, 5)
print("Top 5 Predictions:")
for prob, idx in zip(top5_prob, top5_indices):
print(f"Class {idx}: {prob.item()*100:.2f}%")
Try it out!
Try the live demo: Bird Species Classifier
Model Files
final_model.pth(1.06 GB): Full model weights
Citation
Dataset: CUB-200-2011
@techreport{WahCUB_200_2011,
Title = {{The Caltech-UCSD Birds-200-2011 Dataset}},
Author = {Wah, C. and Branson, S. and Welinder, P. and Perona, P. and Belongie, S.},
Year = {2011},
Institution = {California Institute of Technology},
Number = {CNS-TR-2011-001}
}
Contact
For questions or issues, please open an issue on the Space repository.