webauthn-security-v1_20251014_151917

Fine-tuned OLMo-2-1B model for WebAuthn security vulnerability analysis and fix generation.

Model Description

  • Base Model: allenai/OLMo-2-1B (MLX-optimized, 4-bit quantized)
  • Fine-tuning Method: MLX LoRA
  • Training Format: MLX Chat Messages (system/user/assistant)
  • Domain: WebAuthn/FIDO2 Security Analysis
  • Use Case: Vulnerability analysis and security fix generation
  • Training Date: 2025-10-14

Training Details

Dataset Statistics

  • Total Examples: 1,505
  • Training Set: 1,476 examples
  • Validation Set: 29 examples
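The split above holds out 29 of 1,505 examples (about 2%) for validation. A minimal sketch of reproducing such a split; the shuffle seed and helper name are illustrative assumptions, not the pipeline's actual code:

```python
import random

def train_val_split(examples, val_fraction=29 / 1505, seed=42):
    """Shuffle and hold out a small validation slice (illustrative only)."""
    rng = random.Random(seed)  # assumed seed; the real pipeline's seed is not documented
    shuffled = list(examples)
    rng.shuffle(shuffled)
    n_val = max(1, round(len(shuffled) * val_fraction))
    return shuffled[n_val:], shuffled[:n_val]
```

Applied to 1,505 examples, this yields the 1,476/29 split reported above.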

Quality Distribution

  • High Quality: 1,476 examples (98.1% of the full dataset)
    • Public CVEfixes dataset
    • Generated dependency fixes
  • Low Quality: 0 examples (0.0%)
    • AI-generated narratives (none included in this training run)

Data Sources

  1. Public CVEfixes Dataset - Real vulnerability fixes from open source projects
  2. Generated Fixes - Deterministic dependency upgrades (Trivy, OSV)
  3. AI Narratives - RAG-enhanced vulnerability analysis

Hyperparameters

  • Learning Rate: 1e-06
  • Batch Size: 1
  • Training Iterations: 1200
  • Quality Weighting: 2.5x multiplier for high-quality examples
  • Optimizer: AdamW
  • Fine-tune Type: LoRA
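The 2.5x quality weighting above can be thought of as a per-example loss multiplier keyed on the `quality` label in the dataset metadata. A hedged sketch of that idea, not the pipeline's actual training code:

```python
# Illustrative only: scale each example's loss by a quality multiplier,
# mirroring the "2.5x for high-quality examples" setting documented above.
def example_weight(quality: str, high_multiplier: float = 2.5) -> float:
    """Return the loss multiplier for one training example."""
    return high_multiplier if quality == "high" else 1.0

def weighted_mean_loss(losses, qualities):
    """Average per-example losses, weighted by quality."""
    weights = [example_weight(q) for q in qualities]
    return sum(l * w for l, w in zip(losses, weights)) / sum(weights)
```

With only high-quality examples present (as in this run), the weighting is a uniform rescaling and has no effect on relative gradients.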

Training Environment

  • Hardware: Apple Silicon (M1/M2/M3)
  • Framework: MLX (Apple Silicon optimized)
  • Training Time: not recorded
  • Date: 2025-10-14T15:18:37.971681

Data Format

This model was trained on the MLX Chat message format with explicit role separation:

{
  "messages": [
    {
      "role": "system",
      "content": "You are a cybersecurity analyst specializing in WebAuthn and FIDO2 security vulnerabilities..."
    },
    {
      "role": "user",
      "content": "Based on the following analysis, provide the fix..."
    },
    {
      "role": "assistant",
      "content": "Upgrade dependency 'log4j' from '2.14.1' to '2.17.1'..."
    }
  ],
  "metadata": {
    "quality": "high",
    "source": "generated",
    "chat_template": "chatml"
  }
}
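For reference, a messages list in this format is rendered to a single prompt string by the `chatml` template named in the metadata. A rough sketch of that rendering (in practice the tokenizer's own chat template does this; the helper below is illustrative):

```python
def to_chatml(messages):
    """Render a messages list as ChatML text and open an assistant turn.

    Illustrative approximation of the `chatml` template referenced in the
    metadata; real use should go through tokenizer.apply_chat_template.
    """
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>"
        for m in messages
    ]
    return "\n".join(parts) + "\n<|im_start|>assistant\n"
```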

Usage

MLX (Apple Silicon)

from mlx_lm import load, generate

model, tokenizer = load("hitoshura25/webauthn-security-v1_20251014_151917")
prompt = "Analyze this WebAuthn vulnerability: CVE-2024-XXXXX"
response = generate(model, tokenizer, prompt, max_tokens=500)
print(response)

HuggingFace Transformers

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("hitoshura25/webauthn-security-v1_20251014_151917")
tokenizer = AutoTokenizer.from_pretrained("hitoshura25/webauthn-security-v1_20251014_151917")

messages = [
    {"role": "system", "content": "You are a cybersecurity analyst..."},
    {"role": "user", "content": "Analyze vulnerability: ..."}
]

# Render the chat template and open the assistant turn before generating
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
outputs = model.generate(inputs, max_new_tokens=500)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Performance Metrics

Quality Assessment (Previous Evaluation)

  • Syntax Validity: 98% (threshold: 95%)
  • Security Improvement: 72% (threshold: 70%)
  • Code Completeness: 65% (threshold: 60%)

Note: Metrics from previous model evaluation. Current model pending assessment.
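The thresholds above function as a release quality gate: each metric must meet or exceed its minimum. A minimal sketch of such a check; the metric key names are assumptions for illustration:

```python
# Thresholds as documented in the Performance Metrics section above.
THRESHOLDS = {
    "syntax_validity": 0.95,
    "security_improvement": 0.70,
    "code_completeness": 0.60,
}

def passes_quality_gate(metrics: dict) -> bool:
    """True only if every metric meets or exceeds its threshold."""
    return all(
        metrics.get(name, 0.0) >= minimum
        for name, minimum in THRESHOLDS.items()
    )
```

The previously evaluated model (98% / 72% / 65%) clears all three thresholds.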

Reproducibility

  • Training Recipe: See training-recipe.yaml in repository
  • Training Datasets: Available on HuggingFace Datasets
  • Configuration: Embedded in training_metadata.json

Limitations

  • Specialized for WebAuthn/FIDO2 security analysis
  • Optimized for Apple Silicon (MLX framework)
  • Requires context-aware prompting for best results
  • May not generalize to other security domains without fine-tuning

Ethical Considerations

This model is designed exclusively for defensive security purposes. It should:

  • βœ… Be used to identify and fix security vulnerabilities
  • βœ… Support security research and education
  • ❌ NOT be used to create or exploit vulnerabilities
  • ❌ NOT be used for malicious purposes

Citation

@misc{webauthn-security-20251014,
  title={WebAuthn Security Analysis with MLX-Finetuned OLMo},
  author={WebAuthn Security Research},
  year={2025},
  url={https://huggingface.co/hitoshura25/webauthn-security-v1_20251014_151917}
}

License

Apache 2.0


Generated by Security Analysis Pipeline Artifact Uploader v2.0
