Aletheia-ng/pidgin-corpus-synth • 57.1k • 72
Aletheia-ng/yoruba-corpus-synth • 20.2k • 23
Aletheia-ng/nigerian-pidgin-corpus-synth
Aletheia-ng/pretrain_data10 • 40.9M • 9
Aletheia-ng/low_resource_languages_pretrain_data4 • 469M • 5
Aletheia-ng/pretrain_data11
Aletheia-ng/pretrain_data9 • 79.1M • 1
Aletheia-ng/pretrain_data5 • 9.43M • 19
Aletheia-ng/pretrain_data4 • 124M • 34
Aletheia-ng/pretrain_data7 • 13M
Aletheia-ng/pretrain_data3 • 143M • 74
• 136
Aletheia-ng/pretrain_data • 109M • 31
Aletheia-ng/pretrain_data2 • 18.2M • 16
Aletheia-ng/low_resource_languages_pretrain • 202M • 1.3k • 1
Aletheia-ng/masakhaner_eval
Aletheia-ng/noisy_dataset • 84k • 2
• 84k • 2
Aletheia-ng/personal_finance_v0.2 • 56.6k • 8 • 1
Aletheia-ng/bloomberg-news-articles-pretraining-dataset • 437k • 33 • 5
Aletheia-ng/ChatML-aya_dataset • 202k • 2
Aletheia-ng/yo_wiki_processed • 43.5k
• 270k • 1
• 4.4k
• 43.5k • 1
• 288
• 1.01k • 3
• 3.67k • 1