LLama-Deepseek-Sparse-Attention

This repo contains two variants of the model. The only intended difference between them is whether partial RoPE is applied on the indexer.

Variants

Both variants were continually trained on 1B tokens of FW-edu.

  • run-1089676

    • Partial RoPE on the indexer: no (standard/full RoPE; the “partial” behavior is disabled)
    • Path: run-1089676/
    • Files: model.safetensors, config.json (plus tokenizer files if included)
  • run-1089496

    • Partial RoPE on the indexer: yes
    • Path: run-1089496/
    • Files: model.safetensors, config.json (plus tokenizer files if included)
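
The layout above can be sketched as a small helper that maps each variant to its folder and fetches only that folder. This is a minimal sketch, not an official loader: `REPO_ID` is a placeholder (the repo id is not stated in this card), the keys `full_rope` and `partial_rope` and the helper names are hypothetical, and it assumes the `huggingface_hub` package is installed.

```python
import os
from dataclasses import dataclass

# Placeholder: replace with the actual repo id, which this card does not state.
REPO_ID = "<user>/LLama-Deepseek-Sparse-Attention"


@dataclass(frozen=True)
class Variant:
    run: str                       # folder name inside the repo
    partial_rope_on_indexer: bool  # the only intended difference


# Hypothetical keys; the run names and flags come from the list above.
VARIANTS = {
    "full_rope": Variant("run-1089676", False),
    "partial_rope": Variant("run-1089496", True),
}


def variant_files(variant, filenames=("model.safetensors", "config.json")):
    """Return the repo-relative paths of a variant's files."""
    return [f"{variant.run}/{name}" for name in filenames]


def download_variant(key, repo_id=REPO_ID):
    """Download only the chosen variant's folder and return its local path."""
    # Imported lazily so the helpers above work without huggingface_hub.
    from huggingface_hub import snapshot_download

    variant = VARIANTS[key]
    root = snapshot_download(repo_id=repo_id, allow_patterns=[f"{variant.run}/*"])
    return os.path.join(root, variant.run)
```

Calling `download_variant("partial_rope")` would fetch `run-1089496/` only. Loading the weights afterwards requires model code that matches this architecture, which this card does not describe.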

License: MIT
