LLaMA-zhtw
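This is an experiment in Chinese continued pretraining (CP) on Llama 3, trained on a total of 800M tokens.
Because the quality of the Chinese pretraining corpus still has room for improvement, the model after CP does not surpass the original Llama 3. Several Chinese Llama 3 models trained by the open-source community show a similar pattern.
On the English side, LLaMA 3 zhtw uses FineWeb, so its MMLU score is higher than that of the other Chinese CP models and on par with the original LLaMA 3.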
| Models | Size | ↑ TMMLU+ (ACC) | CMMLU (ACC) | MMLU (ACC) |
|---|---|---|---|---|
| | | TC, Knowledge | CN, Knowledge | EN, Knowledge |
| | | 5 shot | 5 shot | 5 shot |
| Yi-6B | 6B | 49.63 | 75.53 | 65.35 |
| Qwen-7B | 7B | 42.84 | 73.1 | 61.00 |
| Meta-Llama-3-8B | 8B | 41.97 | 50.8 | 65.17 |
| p208p2002/llama-3-zhtw-8B | 8B | 41.84 | 50.6 | 65.31 |
| Breeze-7B-Base-v0_1 | 7B | 40.35 | 44.05 | 61.63 |
| hfl/llama-3-chinese-8b | 8B | 39.64 | 50.9 | 61.1 |
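
To make the "on par with the original" comparison concrete, the following is a minimal sketch that computes per-benchmark differences against Meta-Llama-3-8B. The scores are copied from the table above; the helper name `deltas_vs_baseline` is hypothetical and not part of any released tooling.

```python
# 5-shot accuracy copied from the table above, in the order
# (TMMLU+, CMMLU, MMLU).
BENCHMARKS = ("TMMLU+", "CMMLU", "MMLU")
BASELINE = "Meta-Llama-3-8B"

scores = {
    "Meta-Llama-3-8B": (41.97, 50.8, 65.17),
    "p208p2002/llama-3-zhtw-8B": (41.84, 50.6, 65.31),
    "hfl/llama-3-chinese-8b": (39.64, 50.9, 61.1),
}


def deltas_vs_baseline(scores, baseline=BASELINE):
    """Return score differences (model minus baseline) per benchmark.

    Hypothetical helper for illustration only.
    """
    base = scores[baseline]
    return {
        name: tuple(round(s - b, 2) for s, b in zip(vals, base))
        for name, vals in scores.items()
        if name != baseline
    }


if __name__ == "__main__":
    for name, delta in deltas_vs_baseline(scores).items():
        cells = ", ".join(f"{b}: {d:+.2f}" for b, d in zip(BENCHMARKS, delta))
        print(f"{name}: {cells}")
```

Under these numbers, llama-3-zhtw-8B stays within about 0.2 points of the base model on all three benchmarks, while hfl/llama-3-chinese-8b drops by roughly 4 points on MMLU.
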

Continued-pretraining data mix:

| Dataset | Lang | Weight |
|---|---|---|
| FineWeb | en | 0.35 |
| Wudao | zh-cn | 0.1 |
| C4Tw | zh-tw | 0.1 |
| WikiZhTw | zh-tw | 0.15 |
| NdltdT10 | zh-tw | 0.1 |
| GitHubMarkDown | code | 0.1 |
| GitHubPython | code | 0.1 |
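
The weights above sum to 1.0. As a rough illustration, the sketch below converts them into approximate per-dataset token budgets for the 800M-token run and shows one way to draw a data source by weight. It assumes that `Weight` is a sampling proportion over tokens, which the card does not state explicitly; `token_budget` and `sample_source` are hypothetical helpers, not the author's training code.

```python
import random

TOTAL_TOKENS = 800_000_000  # "800M tokens" from the description

# Dataset weights copied from the table above.
weights = {
    "FineWeb": 0.35,
    "Wudao": 0.1,
    "C4Tw": 0.1,
    "WikiZhTw": 0.15,
    "NdltdT10": 0.1,
    "GitHubMarkDown": 0.1,
    "GitHubPython": 0.1,
}


def token_budget(weights, total):
    """Approximate token count per dataset.

    ASSUMPTION: each weight is that dataset's share of the token budget.
    """
    return {name: round(w * total) for name, w in weights.items()}


def sample_source(weights, rng):
    """Pick one dataset name with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


if __name__ == "__main__":
    for name, tokens in token_budget(weights, TOTAL_TOKENS).items():
        print(f"{name:>15}: {tokens / 1e6:.0f}M tokens")
    rng = random.Random(0)  # fixed seed so repeated runs match
    print([sample_source(weights, rng) for _ in range(5)])
```

Under this reading, FineWeb accounts for about 280M tokens, the three zh-tw sources together for another 280M, Wudao for 80M, and the two code sources for 160M.
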