HANI-LAB/Med-REFL-Huatuo-o1-8B-lora

CityU-Zongxian committed on Jun 19, 2025

Commit 2924e17 · verified · 1 Parent(s): ca9fc8c

Update README.md

Files changed (1)

README.md +7 -0

@@ -30,6 +30,13 @@ tags:
 Instead of focusing solely on the final answer, Med-REFL improves the model's intermediate reasoning process.  It leverages a Tree-of-Thought (ToT) methodology to explore diverse reasoning paths and automatically constructs Direct Preference Optimization (DPO) data.  This trains the model to identify and correct its own reasoning errors, leading to more accurate and trustworthy outputs.
 This repository contains the LoRA weights produced by the Med-REFL framework for various base models.
+# <span>Performance</span>
+| Domain | Benchmark | Original | **+ Med-REFL** |
+| :--- | :--- | :--- | :--- |
+| **In-Domain** | MedQA-USMLE | 69.59 | **73.72** <span style="color: #2E8B57; font-size: small;">(+4.13)</span> |
+| **Out-of-Domain**| MedMCQA | 62.13 | **64.66** <span style="color: #2E8B57; font-size: small;">(+2.53)</span> |
+| **Out-of-Domain**| GPQA (Med+) | 50.67 | **56.80** <span style="color: #2E8B57; font-size: small;">(+6.13)</span> |
+| **Out-of-Domain**| MMLU-Pro (Med+) | 61.87 | **64.97** <span style="color: #2E8B57; font-size: small;">(+3.10)</span> |
 # <span>Available Weights</span>
 The Med-REFL LoRA weights can be applied to the following base models to enhance their medical reasoning abilities.
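
As a rough illustration of how LoRA weights like these are typically applied, the sketch below attaches the adapter to a base model with the `peft` and `transformers` libraries. The adapter ID is this repository's name; the base model ID `FreedomIntelligence/HuatuoGPT-o1-8B` is an assumption inferred from the repository name and should be checked against the list of supported base models. The helper `load_med_refl` is a hypothetical name used only for this example.

```python
# Minimal sketch: attach the Med-REFL LoRA adapter to a base model.
# Requires `transformers` and `peft` to be installed when the function is called.

ADAPTER_ID = "HANI-LAB/Med-REFL-Huatuo-o1-8B-lora"  # this repository
BASE_ID = "FreedomIntelligence/HuatuoGPT-o1-8B"     # ASSUMPTION: verify the base model


def load_med_refl(base_id=BASE_ID, adapter_id=ADAPTER_ID, merge=False):
    """Hypothetical helper: return (tokenizer, model) with the adapter applied."""
    # Imported lazily so the module can be imported without the heavy dependencies.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype="auto")

    # Wrap the base model with the LoRA weights from the adapter repository.
    model = PeftModel.from_pretrained(base, adapter_id)

    if merge:
        # Optionally fold the LoRA deltas into the base weights for inference.
        model = model.merge_and_unload()
    return tokenizer, model


# Example call (downloads several GB of weights, so it is left commented out):
# tokenizer, model = load_med_refl()
```

Passing `merge=True` trades the ability to detach the adapter for a plain model object, which may be simpler to serve.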