LLM
TravelPlanner: A Benchmark for Real-World Planning with Language Agents Paper • 2402.01622 • Published Feb 2, 2024 • 38
User-LLM: Efficient LLM Contextualization with User Embeddings Paper • 2402.13598 • Published Feb 21, 2024 • 22
Leaderboards
Running Featured 601 Image Arena Leaderboard 📊 601 Image Generation and Image Editing Arena & Leaderboard
Running on CPU Upgrade 7.46k MTEB Leaderboard 📊 7.46k Embedding Leaderboard
Running on CPU Upgrade 14k Open LLM Leaderboard 🏆 14k Track, rank and evaluate open LLMs and chatbots
Running 4.91k Arena Leaderboard 🏆 4.91k View the LMArena leaderboard in full‑screen