# webauthn-security-v1_20251014_151917
Fine-tuned OLMo-2-1B model for WebAuthn security vulnerability analysis and fix generation.
## Model Description
- Base Model: allenai/OLMo-2-1B (MLX-optimized, 4-bit quantized)
- Fine-tuning Method: MLX LoRA
- Training Format: MLX Chat Messages (system/user/assistant)
- Domain: WebAuthn/FIDO2 Security Analysis
- Use Case: Vulnerability analysis and security fix generation
- Training Date: 2025-10-14
## Training Details

### Dataset Statistics
- Total Examples: 1,505
- Training Set: 1,476 examples
- Validation Set: 29 examples
### Quality Distribution
- High Quality: 1,476 examples (98.1%)
  - Public CVEfixes dataset
  - Generated dependency fixes
- Low Quality: 0 examples (0.0%)
  - AI-generated narratives
### Data Sources
- Public CVEfixes Dataset - Real vulnerability fixes from open source projects
- Generated Fixes - Deterministic dependency upgrades (Trivy, OSV)
- AI Narratives - RAG-enhanced vulnerability analysis
### Hyperparameters
- Learning Rate: 1e-06
- Batch Size: 1
- Training Iterations: 1200
- Quality Weighting: 2.5x multiplier for high-quality examples (one possible implementation is sketched after this list)
- Optimizer: AdamW
- Fine-tune Type: LoRA
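
The quality-weighting mechanism is not documented beyond the multiplier itself; one plausible realization is oversampling high-quality examples when assembling the training file. A minimal sketch under that assumption, with hypothetical file paths:

```python
import json
import random

# Hypothetical illustration: approximate a 2.5x quality weighting by
# oversampling high-quality examples before writing the MLX training file.
WEIGHTS = {"high": 2.5, "low": 1.0}

def oversample(examples, weights, seed=42):
    rng = random.Random(seed)
    out = []
    for ex in examples:
        w = weights.get(ex["metadata"]["quality"], 1.0)
        out.extend([ex] * int(w))          # whole copies (2x for "high")
        if rng.random() < w - int(w):      # fractional part (the extra 0.5x)
            out.append(ex)
    rng.shuffle(out)
    return out

with open("dataset.jsonl") as f:           # hypothetical input path
    examples = [json.loads(line) for line in f]

with open("train.jsonl", "w") as f:        # hypothetical output path
    for ex in oversample(examples, WEIGHTS):
        f.write(json.dumps(ex) + "\n")
```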
### Training Environment
- Hardware: Apple Silicon (M1/M2/M3)
- Framework: MLX (Apple Silicon optimized)
- Training Time: N/A
- Date: 2025-10-14T15:18:37.971681
## Data Format
This model was trained on MLX Chat format with explicit role separation:
```json
{
  "messages": [
    {
      "role": "system",
      "content": "You are a cybersecurity analyst specializing in WebAuthn and FIDO2 security vulnerabilities..."
    },
    {
      "role": "user",
      "content": "Based on the following analysis, provide the fix..."
    },
    {
      "role": "assistant",
      "content": "Upgrade dependency 'log4j' from '2.14.1' to '2.17.1'..."
    }
  ],
  "metadata": {
    "quality": "high",
    "source": "generated",
    "chat_template": "chatml"
  }
}
```
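
Assuming the training examples are stored one JSON object per line (JSONL), a quick validation pass over a file before training could look like the following sketch (the `train.jsonl` path is hypothetical):

```python
import json

EXPECTED_ROLES = ["system", "user", "assistant"]

# Hypothetical path; point this at the MLX chat-format training file.
with open("train.jsonl") as f:
    for lineno, line in enumerate(f, start=1):
        ex = json.loads(line)
        roles = [m["role"] for m in ex["messages"]]
        assert roles == EXPECTED_ROLES, f"line {lineno}: unexpected roles {roles}"
        assert all(m["content"].strip() for m in ex["messages"]), \
            f"line {lineno}: empty message content"
        assert ex["metadata"]["quality"] in {"high", "low"}, \
            f"line {lineno}: unknown quality label"
```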
## Usage

### MLX (Apple Silicon)
```python
from mlx_lm import load, generate

# Load the fine-tuned model and its tokenizer from the Hub
model, tokenizer = load("hitoshura25/webauthn-security-v1_20251014_151917")

prompt = "Analyze this WebAuthn vulnerability: CVE-2024-XXXXX"
response = generate(model, tokenizer, prompt, max_tokens=500)
print(response)
```
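
Because training used explicit system/user/assistant roles, routing the prompt through the tokenizer's chat template may match the training distribution more closely than a bare string. A minimal sketch using the same `mlx_lm` API:

```python
from mlx_lm import load, generate

model, tokenizer = load("hitoshura25/webauthn-security-v1_20251014_151917")

messages = [
    {"role": "system", "content": "You are a cybersecurity analyst specializing in WebAuthn and FIDO2 security vulnerabilities."},
    {"role": "user", "content": "Analyze this WebAuthn vulnerability: CVE-2024-XXXXX"},
]

# Render the conversation with the model's chat template, ending at the
# point where the assistant's reply should begin.
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
response = generate(model, tokenizer, prompt=prompt, max_tokens=500)
print(response)
```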
### HuggingFace Transformers
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("hitoshura25/webauthn-security-v1_20251014_151917")
tokenizer = AutoTokenizer.from_pretrained("hitoshura25/webauthn-security-v1_20251014_151917")

messages = [
    {"role": "system", "content": "You are a cybersecurity analyst..."},
    {"role": "user", "content": "Analyze vulnerability: ..."}
]

# Render the chat template and cue the assistant's turn
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
outputs = model.generate(inputs, max_new_tokens=500)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## Performance Metrics

### Quality Assessment (Previous Evaluation)
- Syntax Validity: 98% (threshold: 95%)
- Security Improvement: 72% (threshold: 70%)
- Code Completeness: 65% (threshold: 60%)
Note: Metrics from previous model evaluation. Current model pending assessment.
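
For reference, a trivial sketch of the pass/fail gate these thresholds imply; the metric values shown are the previous evaluation's and would be replaced by a fresh run's results:

```python
# Quality gate implied by the thresholds above.
THRESHOLDS = {"syntax_validity": 0.95, "security_improvement": 0.70, "code_completeness": 0.60}
metrics = {"syntax_validity": 0.98, "security_improvement": 0.72, "code_completeness": 0.65}

for name, floor in THRESHOLDS.items():
    status = "PASS" if metrics[name] >= floor else "FAIL"
    print(f"{name}: {metrics[name]:.0%} (threshold {floor:.0%}) -> {status}")
```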
## Reproducibility
- Training Recipe: See `training-recipe.yaml` in the repository
- Training Datasets: Available on HuggingFace Datasets
- Configuration: Embedded in `training_metadata.json`
## Limitations
- Specialized for WebAuthn/FIDO2 security analysis
- Optimized for Apple Silicon (MLX framework)
- Requires context-aware prompting for best results
- May not generalize to other security domains without fine-tuning
## Ethical Considerations
This model is designed exclusively for defensive security purposes. It should:
- ✅ Be used to identify and fix security vulnerabilities
- ✅ Support security research and education
- ❌ NOT be used to create or exploit vulnerabilities
- ❌ NOT be used for malicious purposes
## Citation
```bibtex
@misc{webauthn-security-20251014,
  title={WebAuthn Security Analysis with MLX-Finetuned OLMo},
  author={WebAuthn Security Research},
  year={2025},
  url={https://huggingface.co/hitoshura25/webauthn-security-v1_20251014_151917}
}
```
## License
Apache 2.0
Generated by Security Analysis Pipeline Artifact Uploader v2.0