# Qwen3 8M Model with Falcon-H1-0.5B-Instruct Tokenizer

## Model Description
This model pairs the Qwen3 architecture with the Falcon-H1-0.5B-Instruct tokenizer. Despite the "8M" in the name, the configuration below works out to roughly 2.2M parameters.
- Architecture: Qwen3 (transformer with Grouped Query Attention, RMS Normalization, Q/K Normalization, RoPE)
- Tokenizer: Falcon-H1-0.5B-Instruct
- Parameters: 2,183,552
- Precision: BF16
- Format: SafeTensors
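
Because the tokenizer and the embedding table come from different models, it is worth checking that every token id the tokenizer can emit fits inside the 32768-entry embedding table. A minimal check, using the same local path as in the Usage section below:

```python
# Sanity-check sketch: the tokenizer shipped with this model must not produce
# token ids at or above the embedding-table size (vocab_size = 32768).
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("./workspace/qwen3-8m-falcon-tokenizer")
print(len(tokenizer))                        # vocabulary entries, incl. added special tokens
print(max(tokenizer.get_vocab().values()))   # highest token id; must stay below 32768
```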
## Configuration
- vocab_size: 32768
- hidden_size: 64
- num_attention_heads: 4
- num_key_value_heads: 2
- num_hidden_layers: 2
- intermediate_size: 160
- head_dim: 16
- max_position_embeddings: 4096
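
These values map directly onto `Qwen3Config`. The sketch below, an illustration rather than the exact script used to build this checkpoint, reconstructs the configuration and verifies the parameter count; `tie_word_embeddings=True` is an assumption, chosen because it reproduces the 2,183,552 total listed above.

```python
# Sketch: rebuild the configuration above and check the parameter count.
# tie_word_embeddings=True is an assumption; it is what makes the total
# come out to 2,183,552 (embedding matrix shared with the LM head).
import torch
from transformers import Qwen3Config, Qwen3ForCausalLM

config = Qwen3Config(
    vocab_size=32768,
    hidden_size=64,
    num_attention_heads=4,
    num_key_value_heads=2,
    num_hidden_layers=2,
    intermediate_size=160,
    head_dim=16,
    max_position_embeddings=4096,
    tie_word_embeddings=True,
)

model = Qwen3ForCausalLM(config).to(torch.bfloat16)  # random init, BF16 as listed
print(model.num_parameters())                        # expected: 2,183,552
```

Saving the result with `model.save_pretrained(path, safe_serialization=True)` yields the SafeTensors format listed above.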
## Usage
```python
from transformers import Qwen3ForCausalLM, AutoTokenizer

model = Qwen3ForCausalLM.from_pretrained("./workspace/qwen3-8m-falcon-tokenizer")
tokenizer = AutoTokenizer.from_pretrained("./workspace/qwen3-8m-falcon-tokenizer")

# Generate text
inputs = tokenizer("Hello, world!", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## Notes
- This model uses the Qwen3 architecture but with the Falcon-H1-0.5B-Instruct tokenizer
- The model is initialized with random weights and should be fine-tuned for specific tasks (a minimal training sketch follows this list)
- Compatible with the Qwen3 model family APIs and interfaces
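
Because the weights are randomly initialized, the model needs training before it produces anything useful. The sketch below shows one way to run a quick causal-LM fine-tune with the Hugging Face `Trainer`; the wikitext dataset and all hyperparameters are illustrative placeholders, not values taken from this card.

```python
# Illustrative fine-tuning sketch (not the recipe used for this checkpoint).
# Assumptions: the `datasets` library is installed and wikitext-2 is an
# acceptable stand-in for your own training corpus.
from datasets import load_dataset
from transformers import (
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Qwen3ForCausalLM,
    Trainer,
    TrainingArguments,
)

model_path = "./workspace/qwen3-8m-falcon-tokenizer"
model = Qwen3ForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)
if tokenizer.pad_token is None:          # ensure a pad token for batching
    tokenizer.pad_token = tokenizer.eos_token

raw = load_dataset("wikitext", "wikitext-2-raw-v1", split="train[:1%]")
raw = raw.filter(lambda ex: ex["text"].strip() != "")   # drop empty lines

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = raw.map(tokenize, batched=True, remove_columns=raw.column_names)
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

args = TrainingArguments(
    output_dir="./qwen3-8m-finetuned",
    per_device_train_batch_size=8,
    num_train_epochs=1,
    learning_rate=5e-4,
    logging_steps=50,
)

Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=collator,
).train()
```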