CodeShell

CodeShellๆ˜ฏๅŒ—ไบฌๅคงๅญฆ็Ÿฅ่ฏ†่ฎก็ฎ—ๅฎž้ชŒๅฎค่”ๅˆๅ››ๅทๅคฉๅบœ้“ถ่กŒAIๅ›ข้˜Ÿ็ ”ๅ‘็š„ๅคš่ฏญ่จ€ไปฃ็ ๅคงๆจกๅž‹ๅŸบๅบงใ€‚CodeShellๅ…ทๆœ‰70ไบฟๅ‚ๆ•ฐ๏ผŒๅœจไบ”ๅƒไบฟTokens่ฟ›่กŒไบ†่ฎญ็ปƒ๏ผŒไธŠไธ‹ๆ–‡็ช—ๅฃ้•ฟๅบฆไธบ8194ใ€‚ๅœจๆƒๅจ็š„ไปฃ็ ่ฏ„ไผฐBenchmark๏ผˆHumanEvalไธŽMBPP๏ผ‰ไธŠ๏ผŒCodeShellๅ–ๅพ—ๅŒ็ญ‰่ง„ๆจกๆœ€ๅฅฝ็š„ๆ€ง่ƒฝใ€‚ไธŽๆญคๅŒๆ—ถ๏ผŒๆˆ‘ไปฌๆไพ›ไบ†ไธŽCodeShell้…ๅฅ—็š„้ƒจ็ฝฒๆ–นๆกˆไธŽIDEๆ’ไปถ๏ผŒ่ฏทๅ‚่€ƒไปฃ็ ๅบ“CodeShellใ€‚ๅŒๆ—ถ๏ผŒไธบไบ†ๆ–นไพฟไธญๅ›ฝ็”จๆˆทไธ‹่ฝฝ๏ผŒๆˆ‘ไปฌๅœจModelscopeๅ’ŒWisemodelไธญไนŸไธŠไผ ไบ†ๅฏนๅบ”็‰ˆๆœฌ๏ผŒๅ›ฝๅ†…็”จๆˆทๅฏไปฅ่ฎฟ้—ฎใ€‚ๆœฌไป“ๅบ“ไธบCodeShell-7B้ข„่ฎญ็ปƒๆจกๅž‹ไป“ๅบ“ใ€‚

CodeShell is a multi-language code LLM developed by the Knowledge Computing Lab of Peking University. CodeShell has 7 billion parameters and was trained on 500 billion tokens with a context window length of 8194. On authoritative code evaluation benchmarks (HumanEval and MBPP), CodeShell achieves the best performance of its scale. Meanwhile, we provide deployment solutions and IDE plugins that complement CodeShell. Please refer to the CodeShell code repository for more details. This repository is for the CodeShell-7B base model.

Main Characteristics of CodeShell

  • ๅผบๅคง็š„ๆ€ง่ƒฝ๏ผšCodelShellๅœจHumanEvalๅ’ŒMBPPไธŠ่พพๅˆฐไบ†7Bไปฃ็ ๅŸบๅบงๅคงๆจกๅž‹็š„ๆœ€ไผ˜ๆ€ง่ƒฝ

  • ๅฎŒๆ•ด็š„ไฝ“็ณป๏ผš้™คไบ†ไปฃ็ ๅคงๆจกๅž‹๏ผŒๅŒๆ—ถๅผ€ๆบIDE๏ผˆVS CodeไธŽJetBrains๏ผ‰ๆ’ไปถ๏ผŒๅฝขๆˆๅผ€ๆบ็š„ๅ…จๆ ˆๆŠ€ๆœฏไฝ“็ณป

  • ่ฝป้‡ๅŒ–้ƒจ็ฝฒ๏ผšๆ”ฏๆŒๆœฌๅœฐC++้ƒจ็ฝฒ๏ผŒๆไพ›่ฝป้‡ๅฟซ้€Ÿ็š„ๆœฌๅœฐๅŒ–่ฝฏไปถๅผ€ๅ‘ๅŠฉๆ‰‹่งฃๅ†ณๆ–นๆกˆ

  • ๅ…จ้ข็š„่ฏ„ๆต‹๏ผšๆไพ›ๆ”ฏๆŒๅฎŒๆ•ด้กน็›ฎไธŠไธ‹ๆ–‡ใ€่ฆ†็›–ไปฃ็ ็”Ÿๆˆใ€ไปฃ็ ็ผบ้™ทๆฃ€ๆต‹ไธŽไฟฎๅคใ€ๆต‹่ฏ•็”จไพ‹็”Ÿๆˆ็ญ‰ๅธธ่ง่ฝฏไปถๅผ€ๅ‘ๆดปๅŠจ็š„ๅคšไปปๅŠก่ฏ„ๆต‹ไฝ“็ณป๏ผˆๅณๅฐ†ๅผ€ๆบ๏ผ‰

  • ้ซ˜ๆ•ˆ็š„่ฎญ็ปƒ๏ผšๅŸบไบŽ้ซ˜ๆ•ˆ็š„ๆ•ฐๆฎๆฒป็†ไฝ“็ณป๏ผŒCodeShellๅœจๅฎŒๅ…จๅ†ทๅฏๅŠจๆƒ…ๅ†ตไธ‹๏ผŒๅช่ฎญ็ปƒไบ†ไบ”ๅƒไบฟTokenๅณ่Žทๅพ—ไบ†ไผ˜ๅผ‚็š„ๆ€ง่ƒฝ

  • Powerful Performance: CodeShell achieves optimal performance for a 7B code base model on HumanEval and MBPP.

  • Complete Ecosystem: In addition to the code LLM itself, open-source IDE plugins (for VS Code and JetBrains) are also available, forming a comprehensive open-source full-stack technology system.

  • Lightweight Deployment: Supports local C++ deployment, offering a lightweight and fast localized software development assistant solution.

  • Comprehensive Evaluation: Provides a multi-task evaluation system that supports full project context, covering code generation, code defect detection and repair, test case generation, and other common software development activities (to be open-sourced soon).

  • Efficient Training: Built on an efficient data governance system, CodeShell achieved outstanding performance from a complete cold start after training on just 500 billion tokens.

Quickstart

Code Generation

Codeshell ๆไพ›ไบ†Hugging Faceๆ ผๅผ็š„ๆจกๅž‹๏ผŒๅผ€ๅ‘่€…ๅฏไปฅ้€š่ฟ‡ไธ‹ๅˆ—ไปฃ็ ๅŠ ่ฝฝๅนถไฝฟ็”จใ€‚

Codeshell offers a model in the Hugging Face format. Developers can load and use it with the following code.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("WisdomShell/CodeShell-7B", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("WisdomShell/CodeShell-7B", trust_remote_code=True).cuda()
inputs = tokenizer('def print_hello_world():', return_tensors='pt').to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0]))

Fill in the Middle

CodeShell ๆ”ฏๆŒFill-in-the-Middleๆจกๅผ๏ผŒไปŽ่€Œๆ›ดๅฅฝ็š„ๆ”ฏๆŒ่ฝฏไปถๅผ€ๅ‘่ฟ‡็จ‹ใ€‚

CodeShell supports the Fill-in-the-Middle mode, thereby better facilitating the software development process.

input_text = "<fim_prefix>def print_hello_world():\n    <fim_suffix>\n    print('Hello world!')<fim_middle>"
inputs = tokenizer(input_text, return_tensors='pt').to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0]))
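
The FIM prompt above is plain string concatenation of three sentinel tokens. A minimal helper can assemble such prompts (the function `build_fim_prompt` is our own illustration, not part of the CodeShell API):

```python
def build_fim_prompt(prefix: str, suffix: str) -> str:
    # Wrap the known code (prefix and suffix) in CodeShell's FIM sentinel
    # tokens; the model then generates the missing middle section.
    return f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"

prompt = build_fim_prompt(
    "def print_hello_world():\n    ",
    "\n    print('Hello world!')",
)
```

The resulting string is identical to the hand-written `input_text` above and can be tokenized and passed to `model.generate` in the same way.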

Model Details

Code Shellไฝฟ็”จGPT-2ไฝœไธบๅŸบ็ก€ๆžถๆž„๏ผŒ้‡‡็”จGrouped-Query Attentionใ€RoPE็›ธๅฏนไฝ็ฝฎ็ผ–็ ็ญ‰ๆŠ€ๆœฏใ€‚

Code Shell uses GPT-2 as its foundational architecture and incorporates technologies such as Grouped-Query Attention and RoPE relative position encoding.

Hyper-parameter   Value
n_layer           42
n_embd            4096
n_inner           16384
n_head            32
num_query_groups  8
seq-length        8192
vocab_size        70144
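
As a sanity check, a rough parameter count can be derived from the hyper-parameters above. This is a back-of-the-envelope sketch that ignores biases and layer norms and assumes an untied LM head, so it only approximates the real checkpoint size:

```python
# Hyper-parameters from the table above.
n_layer, n_embd, n_inner = 42, 4096, 16384
n_head, num_query_groups, vocab_size = 32, 8, 70144

head_dim = n_embd // n_head            # 128
kv_dim = num_query_groups * head_dim   # with GQA, K/V projections are narrower

# Attention: full-width Q and output projections, narrow K/V projections.
attn = n_embd * n_embd + 2 * n_embd * kv_dim + n_embd * n_embd
# Feed-forward: up- and down-projections.
mlp = 2 * n_embd * n_inner
per_layer = attn + mlp

# Token embedding plus an (assumed untied) output head.
embeddings = 2 * vocab_size * n_embd

total = n_layer * per_layer + embeddings
print(f"{total / 1e9:.2f}B parameters")  # roughly 8B
```

The estimate lands near 8 billion, consistent with the advertised 7B-class model (the headline "7B" rounds down the embedding contribution).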

Evaluation

ๆˆ‘ไปฌ้€‰ๅ–ไบ†็›ฎๅ‰ๆœ€ๆต่กŒ็š„ไธคไธชไปฃ็ ่ฏ„ๆต‹ๆ•ฐๆฎ้›†๏ผˆHumanEvalไธŽMBPP๏ผ‰ๅฏนๆจกๅž‹่ฟ›่กŒ่ฏ„ไผฐ๏ผŒไธŽ็›ฎๅ‰ๆœ€ๅ…ˆ่ฟ›็š„ไธคไธช7bไปฃ็ ๅคงๆจกๅž‹CodeLllamaไธŽStarcoder็›ธๆฏ”๏ผŒCodeshell ๅ–ๅพ—ไบ†ๆœ€ไผ˜็š„ๆˆ็ปฉใ€‚ๅ…ทไฝ“่ฏ„ๆต‹็ป“ๆžœๅฆ‚ไธ‹ใ€‚

We selected the two most popular code evaluation datasets currently available (HumanEval and MBPP) to assess the model. Compared to the two most advanced 7b LLM for code, CodeLllama and Starcoder, Codeshell achieved the best results. The specific evaluation results are as follows.

Pass@1

ไปปๅŠก CodeShell-7b CodeLlama-7b Starcoder-7b
humaneval 34.32 29.44 27.80
mbpp 38.65 37.60 34.16
multiple-js 33.17 31.30 27.02
multiple-java 30.43 29.24 24.30
multiple-cpp 28.21 27.33 23.04
multiple-swift 24.30 25.32 15.70
multiple-php 30.87 25.96 22.11
multiple-d 8.85 11.60 8.08
multiple-jl 22.08 25.28 22.96
multiple-lua 22.39 30.50 22.92
multiple-r 20.52 18.57 14.29
multiple-rkt 17.20 12.55 10.43
multiple-rs 24.55 25.90 22.82
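
Pass@1 in the table is the standard pass@k metric at k=1: the probability that a sampled completion passes the benchmark's unit tests. Given n sampled completions per problem of which c pass, it is commonly computed with the unbiased estimator 1 - C(n-c, k)/C(n, k). A generic sketch (not CodeShell-specific code):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    # Unbiased pass@k estimator: one minus the probability that
    # k draws without replacement are all failing samples.
    if n - c < k:
        return 1.0  # fewer than k failures exist, so some draw must pass
    return 1.0 - comb(n - c, k) / comb(n, k)

print(pass_at_k(10, 3, 1))  # with 3 of 10 samples passing, pass@1 is about 0.3
```

At k=1 this reduces to the pass fraction c/n, averaged over all benchmark problems.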

Statement

ๆˆ‘ไปฌ้ƒ‘้‡ๅฃฐๆ˜Ž๏ผŒๆˆ‘ไปฌๅผ€ๅ‘ๅ›ข้˜ŸๅŸบไบŽCodeShellๆจกๅž‹ๅผ€ๅ‘ไบ†ๅŸบไบŽvscodeๅ’Œintellij็š„ๆ™บ่ƒฝ็ผ–็ ๅŠฉๆ‰‹ๆ’ไปถๅนถๅ‡ๅทฒๅผ€ๆบใ€‚้™คๆญคไปฅๅค–๏ผŒๆ— ่ฎบๆ˜ฏ้’ˆๅฏนiOSใ€Androidใ€HarmonyOSใ€Web๏ผŒ่ฟ˜ๆ˜ฏๅ…ถไป–ไปปไฝ•ๅนณๅฐ๏ผŒๆˆ‘ไปฌ็š„ๅผ€ๅ‘ๅ›ข้˜Ÿๅ‡ๆœชๅผ€ๅ‘ไปปไฝ•ๅŸบไบŽCodeShellๆจกๅž‹็š„ๅบ”็”จ็จ‹ๅบใ€‚ๆˆ‘ไปฌๅผบ็ƒˆๆ•ฆไฟƒๆ‰€ๆœ‰็”จๆˆทไธ่ฆๅˆฉ็”จCodeShellๆจกๅž‹ไปŽไบ‹ๅฑๅฎณๅ›ฝๅฎถๅ’Œ็คพไผšๅฎ‰ๅ…จๆˆ–่ฟๆณ•ๆดปๅŠจใ€‚ๅŒๆ—ถ๏ผŒๆˆ‘ไปฌ่ฆๆฑ‚็”จๆˆทไธ่ฆๅœจๆœช็ป้€‚ๅฝ“็š„ๅฎ‰ๅ…จๅฎกๆŸฅๅ’Œๅค‡ๆกˆ็š„ไบ’่”็ฝ‘ๆœๅŠกไธญไฝฟ็”จCodeShellๆจกๅž‹ใ€‚ๆˆ‘ไปฌๅธŒๆœ›ๆ‰€ๆœ‰็”จๆˆท้ƒฝ่ƒฝ้ตๅฎˆ่ฟ™ไธ€ๅŽŸๅˆ™๏ผŒไปฅ็กฎไฟๅœจๅˆ่ง„ๅ’Œๅˆๆณ•็š„็Žฏๅขƒไธ‹ๅ‘ๅฑ•็ง‘ๆŠ€ใ€‚

ๅฐฝ็ฎกๆˆ‘ไปฌๅœจ็กฎไฟๆจกๅž‹่ฎญ็ปƒ่ฟ‡็จ‹ไธญไฝฟ็”จๆ•ฐๆฎๅˆ่ง„ๆ€งๆ–น้ขๅทฒไป˜ๅ‡บๅทจๅคงๅŠชๅŠ›๏ผŒไฝ†็”ฑไบŽๆจกๅž‹ๅ’Œๆ•ฐๆฎ็š„ๅคๆ‚ๆ€ง๏ผŒๅฏ่ƒฝไผšๅ‡บ็Žฐ้šพไปฅ้ข„ๆ–™็š„้—ฎ้ข˜ใ€‚ๅ› ๆญค๏ผŒๅฏนไบŽไฝฟ็”จCodeShellๅผ€ๆบๆจกๅž‹ๅฏผ่‡ด็š„ไปปไฝ•้—ฎ้ข˜๏ผŒๅŒ…ๆ‹ฌไฝ†ไธ้™ไบŽๆ•ฐๆฎๅฎ‰ๅ…จ้—ฎ้ข˜ใ€ๅ…ฌๅ…ฑ่ˆ†่ฎบ้ฃŽ้™ฉ๏ผŒๆˆ–ๆจกๅž‹่ขซ่ฏฏ็”จใ€ๆปฅ็”จใ€ไผ ๆ’ญๆˆ–ไธๅฝ“ๅˆฉ็”จ็ญ‰้ฃŽ้™ฉๅ’Œ้—ฎ้ข˜๏ผŒๆˆ‘ไปฌๆฆ‚ไธ่ดŸ่ดฃใ€‚

We hereby declare that our development team has built intelligent coding assistant plugins for VS Code and IntelliJ based on the CodeShell model, both of which have been open-sourced. Beyond these, whether for iOS, Android, HarmonyOS, Web, or any other platform, our development team has not developed any applications based on the CodeShell model. We strongly urge all users not to use the CodeShell model for activities that endanger national or social security or that are illegal. We also ask users not to use the CodeShell model in internet services that have not undergone proper security review and registration. We hope all users will adhere to these principles to ensure that technology develops in a compliant and lawful environment.

Despite our significant efforts to ensure compliance in the data used during the model training process, unforeseen issues may arise due to the complexity of the models and data. Therefore, we are not responsible for any issues arising from the use of the open-sourced CodeShell model, including but not limited to data security issues, public opinion risks, or risks and problems related to the model being misused, abused, disseminated, or exploited improperly.

License

็คพๅŒบไฝฟ็”จCodeShellๆจกๅž‹้œ€่ฆ้ตๅพชCodeShellๆจกๅž‹่ฎธๅฏๅ่ฎฎๅŠApache 2.0 ่ฎธๅฏ่ฏใ€‚CodeShellๆจกๅž‹ๅ…่ฎธ็”จไบŽๅ•†ไธš็”จ้€”๏ผŒไฝ†ๅฆ‚ๆžœๆ‚จ่ฎกๅˆ’ๅฐ†CodeShellๆจกๅž‹ๆˆ–ๅ…ถๆดพ็”Ÿไบงๅ“็”จไบŽๅ•†ไธš็”จ้€”๏ผŒ้œ€่ฆๆ‚จ็กฎ่ฎคไธปไฝ“็ฌฆๅˆไปฅไธ‹ๆกไปถ๏ผš

  1. ๅ…ณ่”ๆ–น็š„ๆœๅŠกๆˆ–ไบงๅ“็š„ๆฏๆ—ฅๅนณๅ‡ๆดป่ทƒ็”จๆˆทๆ•ฐ๏ผˆDAU๏ผ‰ๅŽŸๅˆ™ไธŠไธ่ƒฝ่ถ…่ฟ‡100ไธ‡ใ€‚
  2. ๅ…ณ่”ๆ–นไธๅพ—ๆ˜ฏ้ขๅ‘ไธชไบบ็”จๆˆท็š„่ฝฏไปถๆœๅŠกๆไพ›ๅ•†ๆˆ–ไบ‘ๆœๅŠกๆไพ›ๅ•†ใ€‚
  3. ๅ…ณ่”ๆ–นไธๅญ˜ๅœจๅฐ†่Žทๅพ—ๆŽˆไบˆ็š„ๅ•†ไธš่ฎธๅฏ๏ผŒๅœจๆœช็ป่ฎธๅฏ็š„ๅ‰ๆไธ‹ๅฐ†ๅ…ถๅ†ๆŽˆๆƒ็ป™ๅ…ถไป–็ฌฌไธ‰ๆ–น็š„ๅฏ่ƒฝๆ€งใ€‚

ๅœจๆปก่ถณไธŠ่ฟฐๆกไปถ็š„ๅ‰ๆไธ‹๏ผŒๆ‚จ้œ€่ฆ้€š่ฟ‡ๅ‘codeshell.opensource@gmail.comๅ‘้€็”ตๅญ้‚ฎไปถ๏ผŒๆไบคใ€ŠCodeShellๆจกๅž‹่ฎธๅฏๅ่ฎฎใ€‹่ฆๆฑ‚็š„็”ณ่ฏทๆๆ–™ใ€‚็ปๅฎกๆ ธ้€š่ฟ‡ๅŽ๏ผŒๅฐ†ๆŽˆไบˆๆ‚จไธ€ไธชๅ…จ็ƒ็š„ใ€้žๆŽ’ไป–็š„ใ€ไธๅฏ่ฝฌ่ฎฉ็š„ใ€ไธๅฏๅ†ๆŽˆๆƒ็š„ๅ•†ไธš็‰ˆๆƒ่ฎธๅฏใ€‚

Community use of the CodeShell model requires adherence to the "CodeShell License Agreement" and the Apache 2.0 License. The CodeShell model is allowed for commercial use, but if you plan to use the CodeShell model or its derivatives for commercial purposes, you need to ensure that the entity meets the following conditions:

  1. The daily active users (DAU) of your or your affiliates' services or products must not, in principle, exceed 1 million.
  2. You and your affiliates must not be software service providers or cloud service providers targeting individual users.
  3. You and your affiliates must not sublicense the granted commercial license to any third party without permission.

Provided the above conditions are met, you need to submit the application materials required by the "CodeShell License Agreement" by sending an email to codeshell.opensource@gmail.com. After approval, you will be granted a worldwide, non-exclusive, non-transferable, non-sublicensable commercial copyright license.
