CodeLlama 13B - SecureCode Edition


Meta's trusted code model enhanced with security expertise - enterprise-ready

🤗 Model Card | 📊 Dataset | 💻 perfecXion.ai


🎯 What is This?

This is CodeLlama 13B Instruct fine-tuned on the SecureCode v2.0 dataset - Meta's established code model with strong brand recognition and enterprise adoption, now enhanced with production-grade security knowledge.

CodeLlama is built on Llama 2's foundation, trained on 500B tokens of code and code-adjacent data. Combined with SecureCode training, this model delivers:

✅ Enterprise-grade security awareness across multiple languages
✅ Trusted brand backed by Meta's reputation
✅ Robust code generation with security as a first-class concern
✅ Production-ready reliability from an extensively tested base model

The Result: A proven, enterprise-trusted code model with comprehensive security capabilities.

Why CodeLlama 13B? This model offers:

  • 🏢 Enterprise trust - Widely adopted in production environments
  • 🔒 Strong security baseline - 13B parameters for complex security reasoning
  • 📈 Proven track record - Millions of downloads, extensive real-world testing
  • 🎯 Balanced performance - Better than 7B models without 70B resource requirements
  • ⚖️ Commercially friendly - Permissive license from Meta

🚨 The Problem This Solves

AI coding assistants produce vulnerable code in 45% of security-relevant scenarios (Veracode 2025). Enterprises deploying code generation tools face significant risk without security awareness.

Real-world enterprise impact:

  • Equifax breach: $425 million settlement + reputation damage
  • Capital One: 100 million customer records, $80M fine
  • SolarWinds: 18,000 organizations compromised

CodeLlama SecureCode Edition brings enterprise-grade security to Meta's trusted code generation platform.


💡 Key Features

🏢 Enterprise-Grade Foundation

CodeLlama 13B delivers strong performance:

  • HumanEval: 50.0% pass@1 (13B)
  • MultiPL-E: 45.5% average across languages
  • Widely deployed in enterprise environments
  • Extensive real-world validation

Now enhanced with 1,209 security-focused examples covering OWASP Top 10:2025.

πŸ” Comprehensive Security Training

Trained on real-world security incidents:

  • 224 examples of Broken Access Control vulnerabilities
  • 199 examples of Authentication Failures
  • 125 examples of Injection attacks (SQL, Command, XSS)
  • 115 examples of Cryptographic Failures
  • Complete OWASP Top 10:2025 coverage
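The injection category above is built around vulnerable/secure implementation pairs. As an illustration of the pattern (a minimal sketch using Python's stdlib sqlite3, not an example drawn from the dataset itself):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, is_admin INTEGER)")
conn.execute("INSERT INTO users VALUES ('alice', 0)")

attacker_input = "nobody' OR '1'='1"

# Vulnerable: string interpolation lets the quote break out of the literal,
# so the WHERE clause becomes name = 'nobody' OR '1'='1'
vulnerable = conn.execute(
    f"SELECT name FROM users WHERE name = '{attacker_input}'"
).fetchall()
print(vulnerable)  # returns every row despite the bogus name

# Secure: a parameterized query treats the input as data, never as SQL
secure = conn.execute(
    "SELECT name FROM users WHERE name = ?", (attacker_input,)
).fetchall()
print(secure)  # returns [] - no user is literally named "nobody' OR '1'='1"
```

The same data-vs-code separation applies to command injection (argument lists instead of shell strings) and XSS (output encoding instead of raw interpolation).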

🌍 Multi-Language Security Expertise

Fine-tuned on security examples across:

  • Python (Django, Flask, FastAPI)
  • JavaScript/TypeScript (Express, NestJS, React)
  • Java (Spring Boot) - CodeLlama's strength
  • C++ (Memory safety patterns)
  • Go (Gin framework)
  • PHP (Laravel, Symfony)
  • C# (ASP.NET Core)
  • Ruby (Rails)
  • Rust (Actix, Rocket)

📋 Production Security Guidance

Every response includes:

  1. Vulnerable implementation demonstrating the flaw
  2. Secure implementation with enterprise best practices
  3. Attack demonstration with realistic exploit scenarios
  4. Operational guidance - SIEM integration, compliance, monitoring

📊 Training Details

  • Base Model: codellama/CodeLlama-13b-Instruct-hf
  • Fine-tuning Method: LoRA (Low-Rank Adaptation)
  • Training Dataset: SecureCode v2.0
  • Dataset Size: 841 training examples
  • Training Epochs: 3
  • LoRA Rank (r): 16
  • LoRA Alpha: 32
  • Learning Rate: 2e-4
  • Quantization: 4-bit (bitsandbytes)
  • Trainable Parameters: ~68M (0.52% of 13B total)
  • Total Parameters: 13B
  • Context Window: 16K tokens
  • GPU Used: NVIDIA A100 40GB
  • Training Time: ~110 minutes (estimated)

Training Methodology

LoRA fine-tuning preserves CodeLlama's enterprise reliability:

  • Trains only 0.52% of parameters
  • Maintains code generation quality
  • Adds comprehensive security understanding
  • Minimal deployment overhead

Enterprise deployment ready - Compatible with existing CodeLlama deployments.
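The ~0.52% trainable fraction follows directly from LoRA's construction: each adapted weight matrix W (d_out x d_in) is frozen, and two small factors B (d_out x r) and A (r x d_in) are trained instead, so only r*(d_in + d_out) parameters train per matrix. A back-of-the-envelope check (illustrative dimensions, not the exact list of modules adapted here):

```python
def lora_trainable_params(d_out: int, d_in: int, r: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix."""
    return r * (d_in + d_out)

# One 5120x5120 attention projection (CodeLlama 13B hidden size) at r=16:
per_matrix = lora_trainable_params(5120, 5120, 16)
full_matrix = 5120 * 5120
print(per_matrix, full_matrix, per_matrix / full_matrix)
# 163840 trainable vs 26214400 frozen -> ~0.6% per adapted matrix,
# the same order of magnitude as the 0.52% reported for the whole model
```

Because only the small A and B factors are saved, the adapter ships as a lightweight file on top of an unchanged base model checkpoint.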


🚀 Usage

Quick Start

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load base model
base_model = "codellama/CodeLlama-13b-Instruct-hf"
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    device_map="auto",
    torch_dtype="auto"
)
tokenizer = AutoTokenizer.from_pretrained(base_model)

# Load SecureCode adapter
model = PeftModel.from_pretrained(model, "scthornton/codellama-13b-securecode")

# Generate secure enterprise code
prompt = """### User:
Write a secure Spring Boot controller for user registration that handles all OWASP Top 10 concerns.

### Assistant:
"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=2048, do_sample=True, temperature=0.7)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)

Enterprise Deployment (4-bit Quantization)

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

# 4-bit quantization - runs on a 24GB GPU
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16
)

model = AutoModelForCausalLM.from_pretrained(
    "codellama/CodeLlama-13b-Instruct-hf",
    quantization_config=bnb_config,
    device_map="auto"
)

model = PeftModel.from_pretrained(model, "scthornton/codellama-13b-securecode")
tokenizer = AutoTokenizer.from_pretrained("codellama/CodeLlama-13b-Instruct-hf")

# Production-ready deployment

Integration with LangChain (Enterprise Use Case)

from langchain.llms import HuggingFacePipeline
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
from peft import PeftModel

base_model = AutoModelForCausalLM.from_pretrained("codellama/CodeLlama-13b-Instruct-hf", device_map="auto")
model = PeftModel.from_pretrained(base_model, "scthornton/codellama-13b-securecode")
tokenizer = AutoTokenizer.from_pretrained("codellama/CodeLlama-13b-Instruct-hf")

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, max_new_tokens=2048)
llm = HuggingFacePipeline(pipeline=pipe)

# Enterprise security workflow
security_prompt_template = PromptTemplate(
    input_variables=["code"],
    template="### User:\nPerform a security review of this code:\n{code}\n\n### Assistant:\n",
)
security_chain = LLMChain(llm=llm, prompt=security_prompt_template)
enterprise_codebase = "..."  # the source code under review
review_result = security_chain.run(code=enterprise_codebase)

🎯 Use Cases

1. Enterprise Security Code Review

Review mission-critical code for vulnerabilities:

Perform a comprehensive security audit of this payment processing module

2. Compliance-Focused Code Generation

Generate code meeting SOC 2, PCI-DSS, HIPAA requirements:

Write a HIPAA-compliant patient data access controller with audit logging
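The audit-logging requirement in prompts like this one typically reduces to recording who accessed which record, doing what, and when. A minimal framework-agnostic sketch of that pattern (hypothetical names, not output generated by the model):

```python
import functools
import logging

audit_log = logging.getLogger("audit")

def audited(action: str):
    """Decorator that records the acting user and target record on every call."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(user_id: str, record_id: str, *args, **kwargs):
            # Emit the audit entry before the access, so denied/failed
            # attempts are still recorded.
            audit_log.info("user=%s action=%s record=%s", user_id, action, record_id)
            return fn(user_id, record_id, *args, **kwargs)
        return wrapper
    return decorator

@audited("read_patient_record")
def get_patient_record(user_id: str, record_id: str) -> dict:
    # Placeholder lookup; a real controller would also enforce
    # authorization here before returning any data.
    return {"record_id": record_id}
```

In production the audit logger would ship to append-only storage (e.g. a SIEM) rather than stdout, since tamper-evidence is part of the compliance requirement.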

3. Legacy System Remediation

Modernize and secure legacy codebases:

Refactor this legacy Java authentication system to meet current security standards

4. Security Architecture Review

Analyze architectural security:

Review this microservices architecture for security vulnerabilities and attack vectors

5. Secure API Development

Generate production-ready secure APIs:

Create a RESTful API for financial transactions with comprehensive security controls

⚠️ Limitations

What This Model Does Well

✅ Enterprise-grade security code generation
✅ Trusted brand with proven track record
✅ Strong performance on security-critical code
✅ Comprehensive security explanations

What This Model Doesn't Do

❌ Not a replacement for security audits
❌ Cannot guarantee compliance certification
❌ Not legal/regulatory advice
❌ Not a replacement for security professionals


📈 Performance Benchmarks

Hardware Requirements

Minimum:

  • 28GB RAM
  • 20GB GPU VRAM (with 4-bit quantization)

Recommended:

  • 48GB RAM
  • 24GB+ GPU (RTX 3090, RTX 4090, A5000)

Inference Speed (on A100 40GB):

  • ~50 tokens/second (4-bit quantization)
  • ~70 tokens/second (bfloat16)
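The VRAM figures above can be sanity-checked with simple arithmetic: model weights dominate, at bytes-per-parameter times parameter count, with the KV cache and activations on top. A rough estimate (the weights-only simplification is an assumption; real overhead varies with context length and batch size):

```python
def weight_memory_gb(n_params: float, bits_per_param: float) -> float:
    """Approximate memory for model weights alone, in GB."""
    return n_params * bits_per_param / 8 / 1e9

params = 13e9
print(round(weight_memory_gb(params, 4), 1))   # 6.5 GB at 4-bit (nf4)
print(round(weight_memory_gb(params, 16), 1))  # 26.0 GB at bfloat16
# The gap between 6.5 GB of 4-bit weights and the 20 GB minimum above is
# headroom for the KV cache, activations, and quantization metadata.
```

This is also why the bfloat16 path needs a 40GB-class card while the 4-bit path fits on a 24GB consumer GPU.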

Code Generation (Base Model Scores)

  • HumanEval: 50.0% pass@1
  • MultiPL-E: 45.5% average across languages
  • Enterprise deployments: 100,000+

🔬 Dataset Information

Trained on SecureCode v2.0:

  • 1,209 examples with real CVE grounding
  • 100% incident validation
  • OWASP Top 10:2025 complete coverage
  • Expert security review

📄 License

Model: Apache 2.0 | Dataset: CC BY-NC-SA 4.0

Enterprise-friendly licensing from Meta + perfecXion.ai


📚 Citation

@misc{thornton2025securecode-codellama,
  title={CodeLlama 13B - SecureCode Edition},
  author={Thornton, Scott},
  year={2025},
  publisher={perfecXion.ai},
  url={https://huggingface.co/scthornton/codellama-13b-securecode}
}

πŸ™ Acknowledgments

  • Meta AI for CodeLlama's enterprise-grade foundation
  • OWASP Foundation for vulnerability taxonomy
  • MITRE for CVE database
  • Enterprise security teams for real-world validation

🔗 Related Models

View Collection


Built with ❤️ for secure enterprise software development

perfecXion.ai | Contact
