Active filters: grpo
codelion/Qwen3-4B-execution-world-model-lora • Text Generation • 12 • 5
Danau5tin/calculator_agent_qwen2.5_0.5b • 0.5B • 1 • 1
openmed-community/granite-4.0-micro-OpenMed • Text Generation • 3B • 7 • 6
Guilherme34/True-Qwen2.5-14B-Instruct • Text Generation • 15B • 6 • 3
oberbics/llama-3.1-8B-newspaper_argument_mining • Text Generation • 8B • 3 • 1
Text Generation • 4B • 75 • 10
Text Generation • 4B • 12 • 2
zyc-zju/Qwen3-Embedding-4B-GRPO • Text Generation • 297 • 2
aquiffoo/neo-3-3B-A400M-Thinking • Text Generation • 2
aquiffoo/neo-3-1B-A90M-Instruct • Text Generation • 2
bigatuna/Qwen3-0.6B-Sushi-Coder • Text Generation • 0.6B • 45 • 1
ehcalabres/lfm2.5-1.2b-instruct-grpo-lora
ielabgroup/Autobool-Qwen4b-No-reasoning • Reinforcement Learning • 4B • 13 • 1
ielabgroup/Autobool-Qwen4b-Reasoning • Reinforcement Learning • 4B • 17 • 1
ielabgroup/Autobool-Qwen4b-Reasoning-objective • Reinforcement Learning • 4B • 17 • 1
Chun121/Qwen3-4B-RPG-Roleplay-V2 • Text Generation • 4B • 9.45k • 33
Text Generation • 0.1B • 1
8B
sergiopaniego/Qwen2-0.5B-GRPO-test
Novaciano/ESP-NSFW-GRPO-1B-Sin_Censura-GGUF • 1B • 65 • 3
nbd22/Llama-3.1-8B-Instruct-GRPO-gsm8k-ft-lora
sergiopaniego/Qwen2-0.5B-GRPO
philschmid/qwen-2.5-3b-r1-countdown • Text Generation • 3B • 7 • 8
spinech/qwen-2.5-3b-r1-countdown • Text Generation • 3B • 5
Dongwei/Qwen2.5-1.5B-Open-R1-GRPO • Text Generation • 2B • 2 • 1
spinech/qwen2.5-3b-r1-rearc-stage1 • Text Generation • 3B • 4
Dongwei/DeepSeek-R1-Distill-Qwen-7B-GRPO • Text Generation • 8B • 7 • 1