Probing Scientific General Intelligence of LLMs with Scientist-Aligned Workflows Paper • 2512.16969 • Published 7 days ago • 103
MME-Reasoning: A Comprehensive Benchmark for Logical Reasoning in MLLMs Paper • 2505.21327 • Published May 27 • 83
NovelSeek: When Agent Becomes the Scientist -- Building Closed-Loop System from Hypothesis to Verification Paper • 2505.16938 • Published May 22 • 120
Adversarial Data Collection: Human-Collaborative Perturbations for Efficient and Robust Robotic Imitation Learning Paper • 2503.11646 • Published Mar 14 • 34
SurveyForge: On the Outline Heuristics, Memory-Driven Generation, and Multi-dimensional Evaluation for Automated Survey Writing Paper • 2503.04629 • Published Mar 6 • 19
SurveyForge: On the Outline Heuristics, Memory-Driven Generation, and Multi-dimensional Evaluation for Automated Survey Writing Paper • 2503.04629 • Published Mar 6 • 19
Dolphin: Closed-loop Open-ended Auto-research through Thinking, Practice, and Feedback Paper • 2501.03916 • Published Jan 7 • 16
Dolphin: Closed-loop Open-ended Auto-research through Thinking, Practice, and Feedback Paper • 2501.03916 • Published Jan 7 • 16
GeoX: Geometric Problem Solving Through Unified Formalized Vision-Language Pre-training Paper • 2412.11863 • Published Dec 16, 2024 • 4