Beta testing resources in the NB-ASR project
AI & ML interests
None defined yet.
Recent Activity
View all activity
Models based on Whisper from OpenAI, and trained on data from Språkbanken and the digital collection at the National Library of Norway.
-
NbAiLab/nb-whisper-large-distil-turbo-beta
Automatic Speech Recognition • 0.8B • Updated • 4.7k • 8 -
NbAiLab/nb-whisper-large
Automatic Speech Recognition • 2B • Updated • 8.93k • 37 -
NbAiLab/nb-whisper-medium
Automatic Speech Recognition • 0.8B • Updated • 1.28k • 4 -
NbAiLab/nb-whisper-small
Automatic Speech Recognition • 0.2B • Updated • 807 • 1
Quantized version of the NB-Llama 3.x models. Due to hardware issues, we have still not been able to make quantized version of the 70B models.
-
NbAiLab/nb-llama-3.2-1B-Q4_K_M-GGUF
Text Generation • 1B • Updated • 9 -
NbAiLab/nb-llama-3.2-3B-Q4_K_M-GGUF
Text Generation • 3B • Updated • 9 -
NbAiLab/nb-llama-3.1-8B-Q4_K_M-GGUF
Text Generation • 8B • Updated • 52 • 1 -
NbAiLab/nb-llama-3.2-1B-Instruct-Q4_K_M-GGUF
Text Generation • 1B • Updated • 28
Models based on Wav2Vec from Meta, and trained on data from Språkbanken and the digital collection at the National Library of Norway.
-
Boosting Norwegian Automatic Speech Recognition
Paper • 2307.01672 • Published • 1 -
NbAiLab/nb-wav2vec2-300m-bokmaal-v2
Automatic Speech Recognition • 0.3B • Updated • 40.6k -
NbAiLab/nb-wav2vec2-300m-bokmaal
Automatic Speech Recognition • 0.3B • Updated • 937 -
NbAiLab/nb-wav2vec2-1b-bokmaal-v2
Automatic Speech Recognition • 1.0B • Updated • 956k
Models based on BERT from Google, and trained on data from various sources, including the digital collection at the National Library of Norway.
Tools for processing of newspapers: cropping (TBD) and front page detection.
Preview release of the Borealis family of instruction tuned models by the National Library of Norway.
-
NbAiLab/borealis-27b-instruct-preview
Image-Text-to-Text • 27B • Updated • 385 • 5 -
NbAiLab/borealis-12b-instruct-preview
Image-Text-to-Text • 12B • Updated • 566 • 1 -
NbAiLab/borealis-4b-instruct-preview
Image-Text-to-Text • 4B • Updated • 3.61k • 13 -
NbAiLab/borealis-1b-instruct-preview
Image-Text-to-Text • 1.0B • Updated • 1.23k • 1
Llama 3.x models in various sizes.
-
NbAiLab/nb-notram-llama-3.2-1b-instruct
Text Generation • 1B • Updated • 1.11k • 1 -
NbAiLab/nb-notram-llama-3.2-3b-instruct
Text Generation • 3B • Updated • 915 • 2 -
NbAiLab/nb-notram-llama-3.1-8b-instruct
Text Generation • 8B • Updated • 57 • 2 -
NbAiLab/nb-notram-llama-3.3-70b-instruct
Text Generation • 71B • Updated • 35
NB-Whisper models that are mostly suited for linguists and researchers. The output is lowercase and without punctation.
-
NbAiLab/nb-whisper-large-verbatim
Automatic Speech Recognition • 2B • Updated • 64 • 2 -
NbAiLab/nb-whisper-medium-verbatim
Automatic Speech Recognition • 0.8B • Updated • 30 -
NbAiLab/nb-whisper-small-verbatim
Automatic Speech Recognition • 0.2B • Updated • 16 -
NbAiLab/nb-whisper-base-verbatim
Automatic Speech Recognition • 72.6M • Updated • 11
Models based on GPT-J from EleutherAI, and trained on data from various sources, including the digital collection at the National Library of Norway.
Speech data for our speech to text models
Models based on Whisper from OpenAI, and trained on data from Språkbanken and the digital collection at the National Library of Norway.
-
NbAiLab/nb-whisper-tiny-beta
Automatic Speech Recognition • 37.8M • Updated • 14 • 1 -
NbAiLab/nb-whisper-base-beta
Automatic Speech Recognition • 72.6M • Updated • 16 • 1 -
NbAiLab/nb-whisper-small-beta
Automatic Speech Recognition • 0.2B • Updated • 27 • 15 -
NbAiLab/nb-whisper-medium-beta
Automatic Speech Recognition • 0.8B • Updated • 14 • 2
Beta testing resources in the NB-ASR project
Preview release of the Borealis family of instruction tuned models by the National Library of Norway.
-
NbAiLab/borealis-27b-instruct-preview
Image-Text-to-Text • 27B • Updated • 385 • 5 -
NbAiLab/borealis-12b-instruct-preview
Image-Text-to-Text • 12B • Updated • 566 • 1 -
NbAiLab/borealis-4b-instruct-preview
Image-Text-to-Text • 4B • Updated • 3.61k • 13 -
NbAiLab/borealis-1b-instruct-preview
Image-Text-to-Text • 1.0B • Updated • 1.23k • 1
Models based on Whisper from OpenAI, and trained on data from Språkbanken and the digital collection at the National Library of Norway.
-
NbAiLab/nb-whisper-large-distil-turbo-beta
Automatic Speech Recognition • 0.8B • Updated • 4.7k • 8 -
NbAiLab/nb-whisper-large
Automatic Speech Recognition • 2B • Updated • 8.93k • 37 -
NbAiLab/nb-whisper-medium
Automatic Speech Recognition • 0.8B • Updated • 1.28k • 4 -
NbAiLab/nb-whisper-small
Automatic Speech Recognition • 0.2B • Updated • 807 • 1
Llama 3.x models in various sizes.
-
NbAiLab/nb-notram-llama-3.2-1b-instruct
Text Generation • 1B • Updated • 1.11k • 1 -
NbAiLab/nb-notram-llama-3.2-3b-instruct
Text Generation • 3B • Updated • 915 • 2 -
NbAiLab/nb-notram-llama-3.1-8b-instruct
Text Generation • 8B • Updated • 57 • 2 -
NbAiLab/nb-notram-llama-3.3-70b-instruct
Text Generation • 71B • Updated • 35
Quantized version of the NB-Llama 3.x models. Due to hardware issues, we have still not been able to make quantized version of the 70B models.
-
NbAiLab/nb-llama-3.2-1B-Q4_K_M-GGUF
Text Generation • 1B • Updated • 9 -
NbAiLab/nb-llama-3.2-3B-Q4_K_M-GGUF
Text Generation • 3B • Updated • 9 -
NbAiLab/nb-llama-3.1-8B-Q4_K_M-GGUF
Text Generation • 8B • Updated • 52 • 1 -
NbAiLab/nb-llama-3.2-1B-Instruct-Q4_K_M-GGUF
Text Generation • 1B • Updated • 28
NB-Whisper models that are mostly suited for linguists and researchers. The output is lowercase and without punctation.
-
NbAiLab/nb-whisper-large-verbatim
Automatic Speech Recognition • 2B • Updated • 64 • 2 -
NbAiLab/nb-whisper-medium-verbatim
Automatic Speech Recognition • 0.8B • Updated • 30 -
NbAiLab/nb-whisper-small-verbatim
Automatic Speech Recognition • 0.2B • Updated • 16 -
NbAiLab/nb-whisper-base-verbatim
Automatic Speech Recognition • 72.6M • Updated • 11
Models based on Wav2Vec from Meta, and trained on data from Språkbanken and the digital collection at the National Library of Norway.
-
Boosting Norwegian Automatic Speech Recognition
Paper • 2307.01672 • Published • 1 -
NbAiLab/nb-wav2vec2-300m-bokmaal-v2
Automatic Speech Recognition • 0.3B • Updated • 40.6k -
NbAiLab/nb-wav2vec2-300m-bokmaal
Automatic Speech Recognition • 0.3B • Updated • 937 -
NbAiLab/nb-wav2vec2-1b-bokmaal-v2
Automatic Speech Recognition • 1.0B • Updated • 956k
Models based on GPT-J from EleutherAI, and trained on data from various sources, including the digital collection at the National Library of Norway.
Models based on BERT from Google, and trained on data from various sources, including the digital collection at the National Library of Norway.
Speech data for our speech to text models
Tools for processing of newspapers: cropping (TBD) and front page detection.
Models based on Whisper from OpenAI, and trained on data from Språkbanken and the digital collection at the National Library of Norway.
-
NbAiLab/nb-whisper-tiny-beta
Automatic Speech Recognition • 37.8M • Updated • 14 • 1 -
NbAiLab/nb-whisper-base-beta
Automatic Speech Recognition • 72.6M • Updated • 16 • 1 -
NbAiLab/nb-whisper-small-beta
Automatic Speech Recognition • 0.2B • Updated • 27 • 15 -
NbAiLab/nb-whisper-medium-beta
Automatic Speech Recognition • 0.8B • Updated • 14 • 2