Arcade-3B — SmolReasoner


Arcade-3B is a 3B instruction-following and reasoning model built on SmolLM3-3B. It is the first public release from the ARCADE project at NoesisLab, which investigates the State–Constraint Orthogonality Hypothesis: standard Transformer hidden states conflate factual content and reasoning structure in the same subspace, and explicitly decoupling them improves generalization.


Method: SC-Orthogonal Training

Standard Transformer hidden states conflate two distinct functions:

| Half | Symbol | Role |
|---|---|---|
| `H[..., :D/2]` | S (State) | What the model knows — factual content |
| `H[..., D/2:]` | C (Constraint) | How to retrieve it — reasoning structure |

ARCADE's `SCOrthoTrainer` injects an orthogonality penalty on the final hidden layer, encouraging S and C to decouple in representation space without modifying any attention operator:

$$
\mathcal{L}_{\text{total}} = \mathcal{L}_{\text{CE}} + \frac{\lambda}{B \cdot L} \sum_{b,l} \left( \mathbf{S}_{b,l} \cdot \mathbf{C}_{b,l} \right)^2
$$

with λ = 0.1. This soft regularization reduces divergence errors at inference time at zero architectural cost.
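The penalty term above can be sketched as follows. This is a minimal NumPy illustration of the formula, not the actual `SCOrthoTrainer` implementation (which is not published); `sc_orth_penalty` is a hypothetical helper name.

```python
import numpy as np

def sc_orth_penalty(hidden, lam=0.1):
    """Sketch of the SC-orthogonality penalty.

    hidden: array of shape (B, L, D) — final-layer hidden states.
    The first D/2 channels are treated as S (state), the last D/2
    as C (constraint). Returns
        lam / (B * L) * sum_{b,l} (S_{b,l} . C_{b,l})^2
    """
    B, L, D = hidden.shape
    S, C = hidden[..., : D // 2], hidden[..., D // 2 :]
    dots = np.einsum("bld,bld->bl", S, C)  # per-position dot product S·C
    return lam / (B * L) * np.square(dots).sum()
```

When S and C are exactly orthogonal at every position the penalty vanishes, so the cross-entropy term alone drives training; otherwise the squared dot products softly push the two halves apart.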

*Figure: SC-Orthogonal optimization loop.*


Training Details

| Setting | Value |
|---|---|
| Base model | HuggingFaceTB/SmolLM3-3B |
| λ (orthogonality penalty) | 0.1 |
| Max sequence length | 2048 |
| Learning rate | 2e-4 (cosine schedule) |
| Steps | 10,000 |
| Effective batch size | 16 sequences/step |
| Hardware | 1 × A100-80 GB |
| Precision | bfloat16 |

Training Data

| Dataset | Split | Sampling weight |
|---|---|---|
| nohurry/Opus-4.6-Reasoning-3000x-filtered | train (2.3 K) | 10% |
| HuggingFaceTB/smol-smoltalk | train (460 K) | 45% |
| OpenDataArena/ODA-Mixture-500k | train (500 K) | 45% |

Reasoning samples are wrapped in `<think>…</think>` tags and upsampled 10× to compensate for the small dataset size.
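The wrap-and-upsample step might look like the following sketch. The helper and field names (`wrap_and_upsample`, `prompt`, `reasoning`, `answer`) are hypothetical; the actual preprocessing pipeline is not published.

```python
def wrap_and_upsample(samples, factor=10):
    """Wrap each reasoning trace in <think>...</think> and repeat the
    sample `factor` times, so the small reasoning set carries more
    weight relative to the larger chat corpora."""
    out = []
    for s in samples:
        wrapped = {
            "prompt": s["prompt"],
            "completion": f"<think>{s['reasoning']}</think>\n{s['answer']}",
        }
        out.extend([wrapped] * factor)
    return out
```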


Evaluation

Results from `lm-evaluation-harness`:

Comparison with Peer Models

Scores below 10% are reported as < 10%.

| Benchmark | Arcade-3B | Gemma-2-2B | Llama-2-7B | Qwen1.5-1.8B | OpenLLaMA-v2-3B |
|---|---|---|---|---|---|
| MMLU | 52.9% | 52.4% | 45.3% | 46.8% | 41.0% |
| GSM8K | 62.9% | 50.9% | 14.6% | 37.8% | < 10% |
| HumanEval | 41.5% | 32.3% | 12.8% | 27.4% | < 10% |
| ARC-Challenge | 52.6% | 53.1% | 46.2% | 41.2% | 34.2% |
| ARC-Easy | 74.4% | 75.9% | 75.3% | 66.8% | 68.1% |

Arcade-3B Detailed Scores

| Benchmark | Few-shot | Metric | Score | ± |
|---|---|---|---|---|
| GSM8K | 5 | flexible-extract / exact_match | 0.6293 | 0.0133 |
| HumanEval | 0 | pass@1 | 0.4146 | 0.0386 |
| ARC-Challenge | 25 | acc_norm | 0.5256 | 0.0146 |
| ARC-Easy | 0 | acc | 0.7437 | 0.0090 |
| MMLU | 0 | acc | 0.5293 | 0.0040 |

Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "NoesisLab/Arcade-3B"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [{"role": "user", "content": "Solve step by step: If a train travels 120 km in 1.5 hours, what is its average speed?"}]
input_ids = tok.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)

output = model.generate(input_ids, max_new_tokens=512, temperature=0.7, do_sample=True)
print(tok.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

For step-by-step reasoning, the model may emit a `<think>…</think>` block before the final answer.
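If you want to separate the reasoning trace from the final answer, a small parser along these lines works (a sketch assuming at most one `<think>…</think>` block per generation, as described above; `split_think` is a hypothetical helper, not part of the model's API):

```python
import re

def split_think(text):
    """Split a generation into (reasoning, answer).

    If no <think>...</think> block is present, the whole text is
    returned as the answer with an empty reasoning string.
    """
    m = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if m is None:
        return "", text.strip()
    return m.group(1).strip(), text[m.end():].strip()
```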


Citation

```bibtex
@misc{noesislab2025arcade,
  title        = {ARCADE: State-Constraint Orthogonal Training},
  author       = {NoesisLab},
  year         = {2025},
  howpublished = {\url{https://huggingface.co/NoesisLab/Arcade-3B}},
}
```

License

Apache 2.0 — inherited from SmolLM3-3B.
