Banner!

Model Card for Qwen-SEA-LION-v4.5-27B-IT-SpecDecoder

Last update: 2026-05-19

SEA-LION is a collection of Large Language Models (LLMs) which have been pretrained and instruct-tuned for the Southeast Asia (SEA) region.

Qwen-SEA-LION-v4.5-27B-IT-SpecDecoder is a draft model using speculative decoding method to employ a lightweight block diffusion model to draft multiple tokens in parallel trained from Qwen-SEA-LION-v4.5-27B-IT. This is the drafter model, which must be paired with aisingapore/Qwen-SEA-LION-v4.5-27B-IT.

Model Details

Model Description

SEA-LION stands for Southeast Asian Languages In One Network.

We performed training on top of z-lab/Qwen3.6-27B-DFlash using the DFlash diffusion algorithm with forward KL divergence loss using aisingapore/Qwen-SEA-LION-v4.5-27B-IT as the target model.

  • Developed by: AI Products Pillar, AI Singapore
  • Funded by: Singapore NRF
  • Shared by: AI Products Pillar, AI Singapore
  • Model type: drafter for speculative decoding
  • Training Stage: SFT
  • License: MIT
  • Target model: aisingapore/Qwen-SEA-LION-v4.5-27B-IT

Model Sources

Uses

Out-of-Scope Use

The model has not been aligned for safety. Developers and users should perform their own safety fine-tuning and related security measures. In no event shall the authors be held liable for any claims, damages, or other liabilities arising from the use of the released weights and codes.

Bias, Risks, and Limitations

The model was not tested for robustness against adversarial prompting. It is important for users to be aware that our model exhibits certain limitations that warrant consideration. Like many LLMs, the model can hallucinate and occasionally generates irrelevant content, introducing fictional elements that are not grounded in the provided context. Users should also exercise caution in interpreting and validating the model's responses due to the potential inconsistencies.

How to Get Started with the Model

Use the code below to get started with the model with vLLM.

CUDA_VISIBLE_DEVICES=0 vllm serve aisingapore/Qwen-SEA-LION-v4.5-27B-IT \
--speculative-config '{"method": "dflash", "model": "aisingapore/Qwen-SEA-LION-v4.5-27B-IT-SpecDecoder", "num_speculative_tokens": 16}' \
--attention-backend flash_attn \
--max-num-batched-tokens 32768 \
--gdn-prefill-backend triton

Training Details

Training Data

The instruction fine-tuning text dataset comprises of a collection of OSS & synthetic data.

Training Procedure

Training Hyperparameters

Training regime: Our workflow consists of instruction fine-tuning and model merging.

Evaluation

Testing Data, Factors & Metrics

We evaluated Qwen-SEA-LION-v4.5-27B-IT-SpecDecoder on various dataset using sampled 128 prompts from https://huggingface.co/datasets/aisingapore/SEA-Instruct-2602 .

Dataset Baseline
aisingapore/Qwen-SEA-LION-v4.5-27B-IT
w/ MTP w/ DFlash
z-lab/Qwen3.6-27B-Dflash
w/ SpecDecoder
aisingapore/Qwen-SEA-LION-v4.5-27B-IT-SpecDecoder
gsm8k 64.55 tok/s 149.19 tok/s
2.31x
283.67 tok/s
4.39x
324.32 tok/s
5.02x
math500 65.91 tok/s 153.19 tok/s
2.32x
306.96 tok/s
4.66x
335.22 tok/s
5.09x
humaneval 66.03 tok/s 155.52 tok/s
2.36x
374.13 tok/s
5.67x
397.11 tok/s
6.01x
mbpp 66.44 tok/s 148.69 tok/s
2.24x
235.37 tok/s
3.54x
260.46 tok/s
3.92x
mt-bench 66.40 tok/s 136.89 tok/s
2.06x
153.81 tok/s
2.32x
163.37 tok/s
2.46x
Tagalog 66.45 tok/s 117.06 tok/s
1.76x
79.91 tok/s
1.20x
164.53 tok/s
2.48x
Burmese 66.46 tok/s 134.61 tok/s
2.03x
85.47 tok/s
1.29x
266.73 tok/s
4.01x
Tamil 66.48 tok/s 111.99 tok/s
1.68x
76.47 tok/s
1.15x
175.82 tok/s
2.64x
Indonesian 66.47 tok/s 119.68 tok/s
1.80x
90.30 tok/s
1.36x
134.24 tok/s
2.02x
Vietnamese 66.48 tok/s 112.04 tok/s
1.69x
87.51 tok/s
1.32x
133.95 tok/s
2.01x
Thai 66.28 tok/s 104.81 tok/s
1.58x
75.37 tok/s
1.14x
118.49 tok/s
1.79x
Chinese 66.46 tok/s 130.32 tok/s
1.96x
113.88 tok/s
1.71x
133.43 tok/s
2.01x
Malay 66.44 tok/s 127.31 tok/s
1.92x
106.24 tok/s
1.60x
169.91 tok/s
2.56x

Note that: Concurrency could degrade throughput to all models. All the parameters setting follows default. num_speculative_tokens is 16

Concurrency (Scaling Performance)

Dataset 1 8 16 32 64
gsm8k 5.02x 3.84x 2.90x 2.01x 1.60x
math500 5.09x 3.77x 2.89x 1.75x 1.29x
humaneval 6.01x 3.78x 3.52x 2.32x 2.17x
mbpp 3.92x 3.61x 2.79x 1.77x 1.27x
mt-bench 2.46x 3.06x 2.51x 1.81x 1.59x
Tagalog 2.48x 2.32x 1.76x 1.09x 0.77x
Burmese 4.01x 2.50x 1.78x 1.12x 0.79x
Tamil 2.65x 2.54x 1.90x 1.13x 0.77x
Indonesian 2.02x 2.47x 1.92x 1.21x 0.86x
Vietnamese 2.02x 2.49x 1.89x 1.12x 0.89x
Thai 1.79x 2.36x 1.99x 1.32x 1.05x
Chinese 2.01x 2.65x 2.02x 1.40x 0.97x
Malay 2.56x 2.57x 2.08x 1.28x 0.92x

Column headers are concurrency values. Prompts are sampled from SEA-Instruct. Each language contains 128 prompts.

More Information

This is the repository for the commercial instruction-tuned model. The model has not been aligned for safety. Developers and users should perform their own safety fine-tuning and related security measures. In no event shall the authors be held liable for any claims, damages, or other liabilities arising from the use of the released weights and codes.

For more info, please contact us at sealion@aisingapore.org

Team

Ahmed Dabeer, Ahn Jeongmi, Antonyrex Sajeban, Chan Hok Teng Adwin, Cheng Zi Yi Nicholas, Choa Hsueh Mei Esther, Heng Jonathan, Huang Yuli, Jann Railey Estrada Montalan, Lee Chwan Ren, Leong Wai Yi, Leong Wei Qi, Liew Rachel, Limkonchotiwat Peerat, Muhammad Ridzuan Bin Mokhtar, Nagarajan Karthik, Ng Boon Cheong Raymond, Ngee Chia Tai, Ngui Jian Gang, Nguyen Thanh Ngan, Ong Tat-Wee David, Ong Zhi Hao, Pereira Mark, Poon Joseph, Rengarajan Hamsawardhini, Siow Wei Kang Bryan, Susanto Yosephine, Sutaveephamochanon Anocha, Tan Choon Meng, Tan Chor Phin Evelyn, Tan Siao Wei Jessica, Tan Yixian, Tee Jun Yun, Teng Kok Wai Walter, Teo Eng Sipp Leslie, Tjhi William, Wu Donghang, Yeo Yeow Tong, Yong Xianbin, Zhang Haoyang, Zhang Zhou

Acknowledgement

This project is supported by the National Research Foundation Singapore and Infocomm Media Development Authority (IMDA), Singapore under its National Large Language Model Funding Initiative.

Contact

sealion@aisingapore.org

Downloads last month
231
Safetensors
Model size
2B params
Tensor type
BF16
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for aisingapore/Qwen-SEA-LION-v4.5-27B-IT-SpecDecoder

Finetuned
(3)
this model

Collection including aisingapore/Qwen-SEA-LION-v4.5-27B-IT-SpecDecoder