Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies
Paper • 2407.13623 • Published • 56
Increase your vocabulary size when you scale up your language model
Predict optimal vocabulary size for models