citrinegui/Qwen2.5-1.5B-Instruct_countdown2345_grpo_vrex_0.5_0.5_SEC1.0DRO0.0G0.0_minpTrue_FT10000_800
citrinegui/Qwen2.5-3B-Instruct_countdown2345_grpo_vrex_0.25_0.75_SEC0.0DRO0.0G1.0_minpTrue_1600
Text Generation • 242k • 9
citrinegui/Qwen2.5-3B-Instruct_countdown2345_grpo_vrex_0.5_0.5_SEC1.0DRO0.0G0.0_minp0.0_1600
Text Generation • 242k • 11
citrinegui/Llama-3.2-3B-Instruct_countdown2345_grpo_vrex_0.5_0.5_SEC1.0DRO0.0G0.0_minp0.0_1600
Text Generation • 175k • 2
citrinegui/Qwen2.5-1.5B-Instruct_blocksworld1246_grpo_vrex_0.5_0.5_SEC0.99DRO0.0G0.0_minp0.0_1200
Text Generation • 2B • 4
citrinegui/Llama-3.2-3B-Instruct_blocksworld1246_grpo_vrex_0.5_0.5_SEC1.0DRO0.0G0.0_minp0.0_1200
Text Generation • 175k • 3
citrinegui/Llama-3.2-3B-Instruct_blocksworld1246_grpo_vrex_0.5_0.5_SEC0.3DRO0.0G0.0_minp0.0_1200
citrinegui/Qwen2.5-1.5B-Instruct_blocksworld1246_grpo_vrex_0.5_0.5_SEC1.0DRO0.0G0.0_minp0.0_1200
Text Generation • 2B • 2
citrinegui/Qwen2.5-1.5B-Instruct_blocksworld1246_grpo_vrex_0.5_0.5_SEC1.0DRO0.0G0.0_minp0.0_1600
citrinegui/Qwen2.5-1.5B-Instruct_countdown2345_grpo_vrex_0.5_0.5_SEC1.0DRO0.0G0.0_minpTrue_10000
Text Generation • 2B • 6
citrinegui/Qwen2.5-1.5B-Instruct_countdown2345_grpo_vrex_0.5_0.5_SEC0.0DRO1.0G0.0_minpTrue_FT4800_800
Text Generation • 2B • 3
citrinegui/Qwen2.5-1.5B-Instruct_countdown2345_grpo_vrex_0.5_0.5_SEC1.0DRO0.0G0.0_minpTrue_FT4800_800
Text Generation • 2B • 3
citrinegui/Qwen2.5-1.5B-Instruct_countdown2345_grpo_vrex_0.5_0.5_SEC1.0DRO0.0G0.0_minpTrue_4800
Text Generation • 2B • 4
citrinegui/Qwen2.5-1.5B-Instruct_countdown2345_grpo_vrex_0.5_0.5_SEC1.0DRO0.0G0.0_minp0.0_1600
Text Generation • 2B • 4
citrinegui/Qwen2.5-1.5B-Instruct_countdown2345_grpo_vrex_0.5_0.5_SEC1.0DRO0.0G0.0_minpTrue_1600
Text Generation • 2B • 4
citrinegui/Qwen2.5-1.5B-Instruct_countdown2345_grpo_vrex_0.5_0.5_SEC0.0DRO1.0G0.0_minpTrue_1600
Text Generation • 2B • 3
citrinegui/Qwen2.5-1.5B-Instruct_countdown2345_grpo_vrex_0.5_0.5_SEC0.0DRO0.0G1.0_minpTrue_1600
Text Generation • 2B • 4
citrinegui/Qwen2.5-1.5B-Instruct_countdown2345_grpo_vrex_0.5_0.5_SEC0.0DRO1.0G1.0_minpTrue_1600
Text Generation • 2B • 3
citrinegui/Qwen2.5-1.5B-Instruct_countdown2345_grpo_vrex_0.5_0.5_SEC0.0DRO1.0G0.2_minpTrue_1600
Text Generation • 2B • 5
citrinegui/Qwen2.5-1.5B-Instruct_countdown2345_grpo_variance_regularized_0.5_0.5_DRO0.0G1.0_True_1600
Text Generation • 2B • 7
citrinegui/Qwen2.5-1.5B-Instruct_countdown2345_grpo_variance_regularized_0.5_0.5_DRO1.0G0.0_True_1600
Text Generation • 2B • 5
citrinegui/Qwen2.5-1.5B-Instruct_countdown2345_grpo_gaussian_0.5_0.5_G0.0_True_1600
Text Generation • 2B • 5
citrinegui/Qwen2.5-1.5B-Instruct_countdown2345_grpo_variance_regularized_0.5_0.5_G0.0_True_1600
Text Generation • 2B • 5
citrinegui/Qwen2.5-1.5B-Instruct_countdown2345_grpo_variance_regularized_0.5_0.5_G1.0_True_1600
Text Generation • 2B • 3
citrinegui/Qwen2.5-1.5B-Instruct_countdown2345_grpo_variance_regularized_0.5_0.5_True_1600
Text Generation • 2B • 4
citrinegui/Qwen2.5-1.5B-Instruct_countdown2345_grpo_variance_regularized_0.5_0.5_True_5
Text Generation • 2B • 5
citrinegui/Qwen2.5-1.5B-Instruct_countdown2345_grpo_variance_regularized_0.5_0.5_True_3
citrinegui/Qwen2.5-1.5B-Instruct_countdown2345_grpo_variance_regularized_0.5_0.5_True_10
Text Generation • 2B • 4
citrinegui/Qwen2.5-1.5B-Instruct_countdown2345_grpo_variance_regularized_0.5_0.5_True_400
Text Generation • 2B • 6
citrinegui/Qwen2.5-1.5B-Instruct_countdown2345_grpo_variance_regularized_0.5_0.5_True_20