- RAISECity: A Multimodal Agent Framework for Reality-Aligned 3D World Generation at City-Scale
  Paper • 2511.18005 • Published • 1
- SynCity: Training-Free Generation of 3D Worlds
  Paper • 2503.16420 • Published • 27
- CityDreamer4D: Compositional Generative Model of Unbounded 4D Cities
  Paper • 2501.08983 • Published • 22
- WorldGrow: Generating Infinite 3D World
  Paper • 2510.21682 • Published • 42
Collections
Collections including paper arxiv:2508.14879
- Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context
  Paper • 2403.05530 • Published • 66
- Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order
  Paper • 2404.00399 • Published • 42
- Rho-1: Not All Tokens Are What You Need
  Paper • 2404.07965 • Published • 93
- Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing
  Paper • 2406.08464 • Published • 71
- Describe Anything: Detailed Localized Image and Video Captioning
  Paper • 2504.16072 • Published • 63
- EmbodiedCity: A Benchmark Platform for Embodied Agent in Real-world City Environment
  Paper • 2410.09604 • Published
- Geospatial Mechanistic Interpretability of Large Language Models
  Paper • 2505.03368 • Published • 11
- Scenethesis: A Language and Vision Agentic Framework for 3D Scene Generation
  Paper • 2505.02836 • Published • 8