Models
Datasets
Spaces
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2404.18796

Papers - Training - Multi-Model Evaluation - PoLL

Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models

Paper • 2404.18796 • Published Apr 29, 2024 • 71

LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models

Paper • 2403.13372 • Published Mar 20, 2024 • 174
Secrets of RLHF in Large Language Models Part II: Reward Modeling

Paper • 2401.06080 • Published Jan 11, 2024 • 28
Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models

Paper • 2404.18796 • Published Apr 29, 2024 • 71
Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection

Paper • 2310.11511 • Published Oct 17, 2023 • 78

Papers - Command-R

SnapKV: LLM Knows What You are Looking for Before Generation

Paper • 2404.14469 • Published Apr 22, 2024 • 27
Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models

Paper • 2404.18796 • Published Apr 29, 2024 • 71

Natural Language (LLM, NLP etc)

Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing

Paper • 2404.12253 • Published Apr 18, 2024 • 55
FlowMind: Automatic Workflow Generation with LLMs

Paper • 2404.13050 • Published Mar 17, 2024 • 34
How Far Can We Go with Practical Function-Level Program Repair?

Paper • 2404.12833 • Published Apr 19, 2024 • 7
Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models

Paper • 2404.18796 • Published Apr 29, 2024 • 71

Collection of resources related to Agents.

Communicative Agents for Software Development

Paper • 2307.07924 • Published Jul 16, 2023 • 6
Self-Refine: Iterative Refinement with Self-Feedback

Paper • 2303.17651 • Published Mar 30, 2023 • 2
ReST meets ReAct: Self-Improvement for Multi-Step Reasoning LLM Agent

Paper • 2312.10003 • Published Dec 15, 2023 • 44
ReAct: Synergizing Reasoning and Acting in Language Models

Paper • 2210.03629 • Published Oct 6, 2022 • 30

Papers - Training - Multi-Model Evaluation

Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models

Paper • 2404.18796 • Published Apr 29, 2024 • 71
Stylus: Automatic Adapter Selection for Diffusion Models

Paper • 2404.18928 • Published Apr 29, 2024 • 15

Papers - Cohere

SnapKV: LLM Knows What You are Looking for Before Generation

Paper • 2404.14469 • Published Apr 22, 2024 • 27
Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models

Paper • 2404.18796 • Published Apr 29, 2024 • 71

Synthetic Data and Self-Improvement

Training Software Engineering Agents and Verifiers with SWE-Gym

Paper • 2412.21139 • Published Dec 30, 2024 • 24
Evaluating Language Models as Synthetic Data Generators

Paper • 2412.03679 • Published Dec 4, 2024 • 48
Self-Rewarding Language Models

Paper • 2401.10020 • Published Jan 18, 2024 • 151
Self-Discover: Large Language Models Self-Compose Reasoning Structures

Paper • 2402.03620 • Published Feb 6, 2024 • 117

Freditor: High-Fidelity and Transferable NeRF Editing by Frequency Decomposition

Paper • 2404.02514 • Published Apr 3, 2024 • 11
Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models

Paper • 2404.18796 • Published Apr 29, 2024 • 71

Papers - Rag - Multi-hop Queries

MultiHop-RAG: Benchmarking Retrieval-Augmented Generation for Multi-Hop Queries

Paper • 2401.15391 • Published Jan 27, 2024 • 6
Superposition Prompting: Improving and Accelerating Retrieval-Augmented Generation

Paper • 2404.06910 • Published Apr 10, 2024 • 3
Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models

Paper • 2404.18796 • Published Apr 29, 2024 • 71

Papers - Training - Multi-Model Evaluation - PoLL

Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models

Paper • 2404.18796 • Published Apr 29, 2024 • 71

Papers - Training - Multi-Model Evaluation

Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models

Paper • 2404.18796 • Published Apr 29, 2024 • 71
Stylus: Automatic Adapter Selection for Diffusion Models

Paper • 2404.18928 • Published Apr 29, 2024 • 15

LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models

Paper • 2403.13372 • Published Mar 20, 2024 • 174
Secrets of RLHF in Large Language Models Part II: Reward Modeling

Paper • 2401.06080 • Published Jan 11, 2024 • 28
Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models

Paper • 2404.18796 • Published Apr 29, 2024 • 71
Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection

Paper • 2310.11511 • Published Oct 17, 2023 • 78

Papers - Cohere

SnapKV: LLM Knows What You are Looking for Before Generation

Paper • 2404.14469 • Published Apr 22, 2024 • 27
Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models

Paper • 2404.18796 • Published Apr 29, 2024 • 71

Papers - Command-R

SnapKV: LLM Knows What You are Looking for Before Generation

Paper • 2404.14469 • Published Apr 22, 2024 • 27
Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models

Paper • 2404.18796 • Published Apr 29, 2024 • 71

Synthetic Data and Self-Improvement

Training Software Engineering Agents and Verifiers with SWE-Gym

Paper • 2412.21139 • Published Dec 30, 2024 • 24
Evaluating Language Models as Synthetic Data Generators

Paper • 2412.03679 • Published Dec 4, 2024 • 48
Self-Rewarding Language Models

Paper • 2401.10020 • Published Jan 18, 2024 • 151
Self-Discover: Large Language Models Self-Compose Reasoning Structures

Paper • 2402.03620 • Published Feb 6, 2024 • 117

Natural Language (LLM, NLP etc)

Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing

Paper • 2404.12253 • Published Apr 18, 2024 • 55
FlowMind: Automatic Workflow Generation with LLMs

Paper • 2404.13050 • Published Mar 17, 2024 • 34
How Far Can We Go with Practical Function-Level Program Repair?

Paper • 2404.12833 • Published Apr 19, 2024 • 7
Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models

Paper • 2404.18796 • Published Apr 29, 2024 • 71

Freditor: High-Fidelity and Transferable NeRF Editing by Frequency Decomposition

Paper • 2404.02514 • Published Apr 3, 2024 • 11
Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models

Paper • 2404.18796 • Published Apr 29, 2024 • 71

Collection of resources related to Agents.

Communicative Agents for Software Development

Paper • 2307.07924 • Published Jul 16, 2023 • 6
Self-Refine: Iterative Refinement with Self-Feedback

Paper • 2303.17651 • Published Mar 30, 2023 • 2
ReST meets ReAct: Self-Improvement for Multi-Step Reasoning LLM Agent

Paper • 2312.10003 • Published Dec 15, 2023 • 44
ReAct: Synergizing Reasoning and Acting in Language Models

Paper • 2210.03629 • Published Oct 6, 2022 • 30

Papers - Rag - Multi-hop Queries

MultiHop-RAG: Benchmarking Retrieval-Augmented Generation for Multi-Hop Queries

Paper • 2401.15391 • Published Jan 27, 2024 • 6
Superposition Prompting: Improving and Accelerating Retrieval-Augmented Generation

Paper • 2404.06910 • Published Apr 10, 2024 • 3
Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models

Paper • 2404.18796 • Published Apr 29, 2024 • 71

Previous
1
2
3
Next

Company

TOS Privacy About Careers

Website

Models Datasets Spaces Pricing Docs