LLama-Deepseek-Sparse-Attention
This repo contains two variants of the model. The only intended difference between them is whether partial RoPE is used on the indexer.
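To make the distinction concrete, here is a minimal sketch of partial versus full RoPE on a single vector. It is not taken from this repo's code: the function names, the choice to rotate the leading `rotary_dim` dimensions, the half-split pairing, and the base of 10000 are all assumptions made for illustration.

```python
import math


def rope_rotate(vec, position, base=10000.0):
    """Apply a rotary position embedding to an even-length vector.

    Assumption: dimension i is paired with dimension i + len(vec) // 2.
    """
    half = len(vec) // 2
    out = list(vec)
    for i in range(half):
        # Each pair gets its own frequency, decreasing with the pair index.
        theta = position * base ** (-2.0 * i / len(vec))
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + half]
        out[i] = x * c - y * s
        out[i + half] = x * s + y * c
    return out


def partial_rope(vec, position, rotary_dim):
    """Rotate only the first `rotary_dim` dimensions; leave the rest unchanged.

    Assumption: the rotated slice is the leading one. With
    rotary_dim == len(vec) this reduces to full RoPE.
    """
    if rotary_dim % 2 != 0 or not 0 <= rotary_dim <= len(vec):
        raise ValueError("rotary_dim must be even and within the vector length")
    return rope_rotate(vec[:rotary_dim], position) + list(vec[rotary_dim:])
```

Under these assumptions, the dimensions outside the rotated slice carry no positional signal, which is the behavior the two variants are meant to compare.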
Variants
Both models were continually trained on 1B tokens of FW-edu.
run-1089676
- Partial RoPE on the indexer: no (standard full RoPE; the “partial” behavior is disabled)
- Path: run-1089676/
- Files: model.safetensors, config.json (plus tokenizer files if included)
run-1089496
- Partial RoPE on the indexer: yes
- Path: run-1089496/
- Files: model.safetensors, config.json (plus tokenizer files if included)
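Because each variant lives in its own subfolder, fetching one means requesting files by their path inside the repo. The sketch below shows one way this might be done with `hf_hub_download` from `huggingface_hub`. The `REPO_ID` value is a placeholder, and the `full_rope` / `partial_rope` keys and helper names are hypothetical labels introduced here, not names defined by the repo.

```python
# Placeholder: replace with the actual "<namespace>/<repo-name>" of this repo.
REPO_ID = "your-namespace/your-repo"

# Hypothetical labels mapped to the run folders listed above.
VARIANTS = {
    "full_rope": "run-1089676",     # partial RoPE on the indexer: no
    "partial_rope": "run-1089496",  # partial RoPE on the indexer: yes
}


def variant_files(variant):
    """Return the in-repo paths of the files listed for a variant."""
    run = VARIANTS[variant]
    return [f"{run}/{name}" for name in ("model.safetensors", "config.json")]


def download_variant(variant, repo_id=REPO_ID):
    """Download a variant's files and return their local cache paths."""
    # Imported here so the path helper works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return [
        hf_hub_download(repo_id=repo_id, filename=path)
        for path in variant_files(variant)
    ]
```

Calling `download_variant("partial_rope")` with a real repo id would fetch only that run's weights and config. Tokenizer files are left out because the card does not confirm they are present.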
license: mit