## Issue with AngelSlim/Qwen3-4B_eagle3 Model Configuration

#1
by Billy1377 - opened

Dear AngelSlim Team,

I'm experiencing a compatibility issue when trying to use the AngelSlim/Qwen3-4B_eagle3 model with the official Qwen3-4B base model. I hope you can help clarify the correct usage.

### Problem Description

When loading the EAGLE model with Qwen3-4B, I encounter a tensor dimension mismatch error:

RuntimeError: The size of tensor a (80) must match the size of tensor b (128) at non-singleton dimension 3

### Configuration Mismatch

Qwen3-4B base model config:

  • hidden_size: 2560
  • num_attention_heads: 32
  • head_dim: 128

AngelSlim/Qwen3-4B_eagle3 config:

  • hidden_size: 2560
  • num_attention_heads: 32
  • head_dim: 80

Question: Why is the head_dim different? Which base model should I use with this EAGLE model?
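For reference, here is a minimal sketch of the arithmetic behind the two values, using only the numbers listed above. The helper name `derived_head_dim` is hypothetical. It shows that 80 equals `hidden_size / num_attention_heads`, while the base model's 128 does not, which suggests (this is an assumption, not something confirmed by the AngelSlim team) that the EAGLE config uses the derived value rather than the base model's explicit `head_dim`.

```python
# Values copied from the two configs listed in this post.
base_cfg = {"hidden_size": 2560, "num_attention_heads": 32, "head_dim": 128}
eagle_cfg = {"hidden_size": 2560, "num_attention_heads": 32, "head_dim": 80}


def derived_head_dim(cfg):
    """Hypothetical helper: head_dim implied by hidden_size / num_attention_heads."""
    return cfg["hidden_size"] // cfg["num_attention_heads"]


for name, cfg in (("Qwen3-4B", base_cfg), ("Qwen3-4B_eagle3", eagle_cfg)):
    implied = derived_head_dim(cfg)
    matches = implied == cfg["head_dim"]
    print(f"{name}: head_dim={cfg['head_dim']}, "
          f"hidden_size/num_heads={implied}, matches={matches}")
```

If that reading is right, the mismatch in the error message (80 vs 128) is exactly the gap between the derived and the explicit `head_dim`.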

Thanks!

Same question here... Not sure why it's this way

Why is the model type llama?

