Collections including paper arxiv:2509.09675

Collection 1:
- Can Large Language Models Understand Context?
  Paper • 2402.00858 • Published • 23
- OLMo: Accelerating the Science of Language Models
  Paper • 2402.00838 • Published • 85
- Self-Rewarding Language Models
  Paper • 2401.10020 • Published • 151
- SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity
  Paper • 2401.17072 • Published • 25

Collection 2:
- Open Data Synthesis For Deep Research
  Paper • 2509.00375 • Published • 70
- Beyond Correctness: Harmonizing Process and Outcome Rewards through RL Training
  Paper • 2509.03403 • Published • 22
- LMEnt: A Suite for Analyzing Knowledge in Language Models from Pretraining Data to Representations
  Paper • 2509.03405 • Published • 23
- SATQuest: A Verifier for Logical Reasoning Evaluation and Reinforcement Fine-Tuning of LLMs
  Paper • 2509.00930 • Published • 4

Collection 3:
- RL + Transformer = A General-Purpose Problem Solver
  Paper • 2501.14176 • Published • 28
- Towards General-Purpose Model-Free Reinforcement Learning
  Paper • 2501.16142 • Published • 30
- SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training
  Paper • 2501.17161 • Published • 123
- MaxInfoRL: Boosting exploration in reinforcement learning through information gain maximization
  Paper • 2412.12098 • Published • 4

Collection 4:
- CDE: Curiosity-Driven Exploration for Efficient Reinforcement Learning in Large Language Models
  Paper • 2509.09675 • Published • 28
- LongEmotion: Measuring Emotional Intelligence of Large Language Models in Long-Context Interaction
  Paper • 2509.07403 • Published • 58
- Scaling Agents via Continual Pre-training
  Paper • 2509.13310 • Published • 117

Collection 5:
- NousResearch/Hermes-4-70B
  Text Generation • 71B • Updated • 2.05k • 161
- unsloth/Kimi-K2-Instruct-0905-GGUF
  1T • Updated • 2.65k • 51
- securemy/PHOENIX.V
  Text-to-Image • Updated
- CDE: Curiosity-Driven Exploration for Efficient Reinforcement Learning in Large Language Models
  Paper • 2509.09675 • Published • 28
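
The listing above is the rendered form of a Hugging Face Hub collections query. As a minimal sketch, assuming the huggingface_hub Python client and its list_collections helper (the item="paper/2509.09675" filter format is an assumption based on how the Hub names collection items), the same listing could be fetched programmatically:

    # Minimal sketch: list community collections that include a given paper.
    # Assumes `pip install huggingface_hub`; the "paper/<arxiv-id>" item
    # format is an assumption, not confirmed by this page.
    from huggingface_hub import list_collections

    for collection in list_collections(item="paper/2509.09675", limit=10):
        # Each Collection object carries a display title and a unique slug.
        print(collection.title, "•", collection.slug)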