DINOv2
Vision Transformer (ViT) model trained using the DINOv2 method.
Reference
DINOv2 offers a powerful, generalist visual backbone learned entirely from unlabeled images, as described in "DINOv2: Learning Robust Visual Features without Supervision".
Keras and KerasHub can be installed with:
pip install -U -q keras-hub
pip install -U -q keras
JAX, TensorFlow, and PyTorch come preinstalled in Kaggle Notebooks. For instructions on installing them in another environment, see the Keras Getting Started page.
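Because Keras can run on any of these backends, it may help to choose one explicitly. The sketch below assumes the standard Keras 3 behavior of reading the `KERAS_BACKEND` environment variable once, when `keras` is first imported. The helper name `select_backend` is hypothetical and exists only for illustration.

```python
import os

# Backend names that Keras 3 accepts for training and inference.
SUPPORTED_BACKENDS = ("jax", "tensorflow", "torch")


def select_backend(name):
    """Hypothetical helper: set KERAS_BACKEND after validating the name.

    Call this before the first `import keras`; changing the variable
    afterwards has no effect on an already-imported Keras.
    """
    if name not in SUPPORTED_BACKENDS:
        raise ValueError(
            f"Unknown backend {name!r}; expected one of {SUPPORTED_BACKENDS}"
        )
    os.environ["KERAS_BACKEND"] = name
    return name


select_backend("jax")

# Import Keras only after the variable is set, for example:
# import keras
# import keras_hub
```

If the variable is left unset, Keras falls back to its configured default backend, so this step is optional in environments where that default is already the one you want.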
The following model checkpoints are provided by the Keras team. Weights have been ported from: https://huggingface.co. Full code examples for each are available below.
| Preset name | Parameters | Description |
|---|---|---|
| dinov2_small | 22.58M | Vision Transformer (small-sized model) trained using DINOv2. |
| dinov2_base | 87.63M | Vision Transformer (base-sized model) trained using DINOv2. |
| dinov2_large | 305.77M | Vision Transformer (large-sized model) trained using DINOv2. |
| dinov2_giant | 1.13B | Vision Transformer (giant-sized model) trained using DINOv2. |
| dinov2_with_registers_small | 22.58M | Vision Transformer (small-sized model) trained using DINOv2, with registers. |
| dinov2_with_registers_base | 87.63M | Vision Transformer (base-sized model) trained using DINOv2, with registers. |
| dinov2_with_registers_large | 305.77M | Vision Transformer (large-sized model) trained using DINOv2, with registers. |
| dinov2_with_registers_giant | 1.13B | Vision Transformer (giant-sized model) trained using DINOv2, with registers. |
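As a rough illustration of how the table might be used, the sketch below picks the largest preset that fits a parameter budget and then loads it. The parameter counts are copied from the table above. The helpers `pick_preset` and `load_backbone` are hypothetical names, and the loading step assumes that the generic `keras_hub.models.Backbone.from_preset` entry point resolves these preset names; it downloads weights on first use, so it is left commented out.

```python
# Parameter counts in millions, copied from the preset table.
PRESET_PARAMS_M = {
    "dinov2_small": 22.58,
    "dinov2_base": 87.63,
    "dinov2_large": 305.77,
    "dinov2_giant": 1130.0,
    "dinov2_with_registers_small": 22.58,
    "dinov2_with_registers_base": 87.63,
    "dinov2_with_registers_large": 305.77,
    "dinov2_with_registers_giant": 1130.0,
}


def pick_preset(max_params_m, with_registers=False):
    """Hypothetical helper: largest preset within a parameter budget."""
    candidates = {
        name: params
        for name, params in PRESET_PARAMS_M.items()
        if ("with_registers" in name) == with_registers
        and params <= max_params_m
    }
    if not candidates:
        raise ValueError(f"No preset fits within {max_params_m}M parameters")
    return max(candidates, key=candidates.get)


def load_backbone(preset):
    """Hypothetical helper: load a preset through the generic Backbone API."""
    import keras_hub  # deferred so the selection logic runs without Keras

    return keras_hub.models.Backbone.from_preset(preset)


preset = pick_preset(100)  # budget of 100M parameters
print(preset)
# backbone = load_backbone(preset)  # downloads weights on first call
```

Deferring the `keras_hub` import keeps the selection logic usable in scripts that only need to decide which checkpoint to request.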