WeEdit: A Dataset, Benchmark and Glyph-Guided Framework for Text-centric Image Editing Paper • 2603.11593 • Published 5 days ago • 22
One Model, Many Budgets: Elastic Latent Interfaces for Diffusion Transformers Paper • 2603.12245 • Published 5 days ago • 17
Strategic Navigation or Stochastic Search? How Agents and Humans Reason Over Document Collections Paper • 2603.12180 • Published 5 days ago • 58
Mistral Small 4 Collection A state-of-the-art model, open-weight, with a granular Mixture-of-Experts architecture that fuses instruct, reasoning and agentic skills. • 3 items • Updated about 11 hours ago • 32
XSkill: Continual Learning from Experience and Skills in Multimodal Agents Paper • 2603.12056 • Published 5 days ago • 25
CodePercept: Code-Grounded Visual STEM Perception for MLLMs Paper • 2603.10757 • Published 6 days ago • 13
LLM2Vec-Gen: Generative Embeddings from Large Language Models Paper • 2603.10913 • Published 6 days ago • 34
Decoupling Reasoning and Confidence: Resurrecting Calibration in Reinforcement Learning from Verifiable Rewards Paper • 2603.09117 • Published 7 days ago • 9
MiniAppBench: Evaluating the Shift from Text to Interactive HTML Responses in LLM-Powered Assistants Paper • 2603.09652 • Published 7 days ago • 14
Thinking to Recall: How Reasoning Unlocks Parametric Knowledge in LLMs Paper • 2603.09906 • Published 7 days ago • 64
InternVL-U: Democratizing Unified Multimodal Models for Understanding, Reasoning, Generation and Editing Paper • 2603.09877 • Published 7 days ago • 41
Article Granite 4.0 1B Speech: Compact, Multilingual, and Built for the Edge 8 days ago • 10
Reading, Not Thinking: Understanding and Bridging the Modality Gap When Text Becomes Pixels in Multimodal LLMs Paper • 2603.09095 • Published 7 days ago • 26
WildActor: Unconstrained Identity-Preserving Video Generation Paper • 2603.00586 • Published 17 days ago • 35