Model Card for Qwen-SEA-LION-v4.5-27B-IT-SpecDecoder

Last update: 2026-05-19

SEA-LION is a collection of Large Language Models (LLMs) which have been pretrained and instruct-tuned for the Southeast Asia (SEA) region.

Qwen-SEA-LION-v4.5-27B-IT-SpecDecoder is a draft model using speculative decoding method to employ a lightweight block diffusion model to draft multiple tokens in parallel trained from Qwen-SEA-LION-v4.5-27B-IT. This is the drafter model, which must be paired with aisingapore/Qwen-SEA-LION-v4.5-27B-IT.

Model Details

Model Description

SEA-LION stands for Southeast Asian Languages In One Network.

We performed training on top of z-lab/Qwen3.6-27B-DFlash using the DFlash diffusion algorithm with forward KL divergence loss using aisingapore/Qwen-SEA-LION-v4.5-27B-IT as the target model.

Developed by: AI Products Pillar, AI Singapore
Funded by: Singapore NRF
Shared by: AI Products Pillar, AI Singapore
Model type: drafter for speculative decoding
Training Stage: SFT
License: MIT
Target model: aisingapore/Qwen-SEA-LION-v4.5-27B-IT

Model Sources

Repository: SEA-LION v4.5 - an aisingapore Collection

Uses

Out-of-Scope Use

The model has not been aligned for safety. Developers and users should perform their own safety fine-tuning and related security measures. In no event shall the authors be held liable for any claims, damages, or other liabilities arising from the use of the released weights and codes.

Bias, Risks, and Limitations

The model was not tested for robustness against adversarial prompting. It is important for users to be aware that our model exhibits certain limitations that warrant consideration. Like many LLMs, the model can hallucinate and occasionally generates irrelevant content, introducing fictional elements that are not grounded in the provided context. Users should also exercise caution in interpreting and validating the model's responses due to the potential inconsistencies.

How to Get Started with the Model

Use the code below to get started with the model with vLLM.

CUDA_VISIBLE_DEVICES=0 vllm serve aisingapore/Qwen-SEA-LION-v4.5-27B-IT \
--speculative-config '{"method": "dflash", "model": "aisingapore/Qwen-SEA-LION-v4.5-27B-IT-SpecDecoder", "num_speculative_tokens": 16}' \
--attention-backend flash_attn \
--max-num-batched-tokens 32768 \
--gdn-prefill-backend triton

Training Details

Training Data

The instruction fine-tuning text dataset comprises of a collection of OSS & synthetic data.

Training Procedure

Training Hyperparameters

Training regime: Our workflow consists of instruction fine-tuning and model merging.

Evaluation

Testing Data, Factors & Metrics

We evaluated Qwen-SEA-LION-v4.5-27B-IT-SpecDecoder on various dataset using sampled 128 prompts from https://huggingface.co/datasets/aisingapore/SEA-Instruct-2602 .

Dataset	Baseline `aisingapore/Qwen-SEA-LION-v4.5-27B-IT`	w/ MTP	w/ DFlash `z-lab/Qwen3.6-27B-Dflash`	w/ SpecDecoder `aisingapore/Qwen-SEA-LION-v4.5-27B-IT-SpecDecoder`
gsm8k	64.55 tok/s	149.19 tok/s 2.31x	283.67 tok/s 4.39x	324.32 tok/s 5.02x
math500	65.91 tok/s	153.19 tok/s 2.32x	306.96 tok/s 4.66x	335.22 tok/s 5.09x
humaneval	66.03 tok/s	155.52 tok/s 2.36x	374.13 tok/s 5.67x	397.11 tok/s 6.01x
mbpp	66.44 tok/s	148.69 tok/s 2.24x	235.37 tok/s 3.54x	260.46 tok/s 3.92x
mt-bench	66.40 tok/s	136.89 tok/s 2.06x	153.81 tok/s 2.32x	163.37 tok/s 2.46x
Tagalog	66.45 tok/s	117.06 tok/s 1.76x	79.91 tok/s 1.20x	164.53 tok/s 2.48x
Burmese	66.46 tok/s	134.61 tok/s 2.03x	85.47 tok/s 1.29x	266.73 tok/s 4.01x
Tamil	66.48 tok/s	111.99 tok/s 1.68x	76.47 tok/s 1.15x	175.82 tok/s 2.64x
Indonesian	66.47 tok/s	119.68 tok/s 1.80x	90.30 tok/s 1.36x	134.24 tok/s 2.02x
Vietnamese	66.48 tok/s	112.04 tok/s 1.69x	87.51 tok/s 1.32x	133.95 tok/s 2.01x
Thai	66.28 tok/s	104.81 tok/s 1.58x	75.37 tok/s 1.14x	118.49 tok/s 1.79x
Chinese	66.46 tok/s	130.32 tok/s 1.96x	113.88 tok/s 1.71x	133.43 tok/s 2.01x
Malay	66.44 tok/s	127.31 tok/s 1.92x	106.24 tok/s 1.60x	169.91 tok/s 2.56x

Note that: Concurrency could degrade throughput to all models. All the parameters setting follows default. num_speculative_tokens is 16

Concurrency (Scaling Performance)

Dataset	1	8	16	32	64
gsm8k	5.02x	3.84x	2.90x	2.01x	1.60x
math500	5.09x	3.77x	2.89x	1.75x	1.29x
humaneval	6.01x	3.78x	3.52x	2.32x	2.17x
mbpp	3.92x	3.61x	2.79x	1.77x	1.27x
mt-bench	2.46x	3.06x	2.51x	1.81x	1.59x
Tagalog	2.48x	2.32x	1.76x	1.09x	0.77x
Burmese	4.01x	2.50x	1.78x	1.12x	0.79x
Tamil	2.65x	2.54x	1.90x	1.13x	0.77x
Indonesian	2.02x	2.47x	1.92x	1.21x	0.86x
Vietnamese	2.02x	2.49x	1.89x	1.12x	0.89x
Thai	1.79x	2.36x	1.99x	1.32x	1.05x
Chinese	2.01x	2.65x	2.02x	1.40x	0.97x
Malay	2.56x	2.57x	2.08x	1.28x	0.92x

Column headers are concurrency values. Prompts are sampled from SEA-Instruct. Each language contains 128 prompts.

More Information

This is the repository for the commercial instruction-tuned model. The model has not been aligned for safety. Developers and users should perform their own safety fine-tuning and related security measures. In no event shall the authors be held liable for any claims, damages, or other liabilities arising from the use of the released weights and codes.

For more info, please contact us at sealion@aisingapore.org

Team

Ahmed Dabeer, Ahn Jeongmi, Antonyrex Sajeban, Chan Hok Teng Adwin, Cheng Zi Yi Nicholas, Choa Hsueh Mei Esther, Heng Jonathan, Huang Yuli, Jann Railey Estrada Montalan, Lee Chwan Ren, Leong Wai Yi, Leong Wei Qi, Liew Rachel, Limkonchotiwat Peerat, Muhammad Ridzuan Bin Mokhtar, Nagarajan Karthik, Ng Boon Cheong Raymond, Ngee Chia Tai, Ngui Jian Gang, Nguyen Thanh Ngan, Ong Tat-Wee David, Ong Zhi Hao, Pereira Mark, Poon Joseph, Rengarajan Hamsawardhini, Siow Wei Kang Bryan, Susanto Yosephine, Sutaveephamochanon Anocha, Tan Choon Meng, Tan Chor Phin Evelyn, Tan Siao Wei Jessica, Tan Yixian, Tee Jun Yun, Teng Kok Wai Walter, Teo Eng Sipp Leslie, Tjhi William, Wu Donghang, Yeo Yeow Tong, Yong Xianbin, Zhang Haoyang, Zhang Zhou

Acknowledgement

This project is supported by the National Research Foundation Singapore and Infocomm Media Development Authority (IMDA), Singapore under its National Large Language Model Funding Initiative.