artpli committed
Commit 98add2f · verified · 1 Parent(s): 4c60ef1

Update README.md

Files changed (1): README.md (+29 -11)

README.md CHANGED
@@ -12,41 +12,51 @@ tags:
 - text-to-motion
 ---

-<div align="center">
-<h1>FRoM-W1 (机智-W1): Towards General Humanoid Whole-Body Control with Language Instructions</h1>
-</div>
+# FRoM-W1: Towards General Humanoid Whole-Body Control with Language Instructions

 <div align="center">
-The Humanoid Intelligence (Hi) Team at FudanNLP and OpenMOSS Group
+<img src="./assets/hi_logo.png" alt="FRoM-W1" width="7.5%">
+
+The **H**umanoid **I**ntelligence Team from FudanNLP and OpenMOSS
 </div>

 <div align="center">
-<a href="https://github.com/OpenMOSS/FRoM-W1">💻Github</a>
+<a href="https://github.com/OpenMOSS/FRoM-W1">💻Github</a>&emsp;<a href="https://huggingface.co/datasets/OpenMOSS-Team/FRoM-W1-Datasets">🤗Datasets</a>&emsp;<a href="https://huggingface.co/OpenMOSS-Team/FRoM-W1">🤗Models</a>
 </div>

-
 ## Introduction
 <div align="center">
-<img src="https://cdn-uploads.huggingface.co/production/uploads/6208b57eace0f815845c6dbf/cASCQ7yqKP3LJMNFuBZAM.png" alt="FRoM-W1" width="50%">
+<img src="./assets/FRoM-W1-Teaser.png" alt="FRoM-W1" width="50%">
 </div>


 Humanoid robots are capable of performing various actions such as greeting, dancing, and even backflipping. However, these motions are often hard-coded or specifically trained, which limits their versatility.
-In this work, we present **FRoM-W1**, an open-source framework designed to achieve general humanoid whole-body motion control using natural language.
+In this work, we present **FRoM-W1[^1]**, an open-source framework designed to achieve general humanoid whole-body motion control using natural language.
 To universally understand natural language and generate corresponding motions, as well as enable various humanoid robots to stably execute these motions in the physical world under gravity, **FRoM-W1** operates in two stages:
-(a) **H-GPT**: utilizing massive human data, a large-scale language-driven human whole-body motion generation model is trained to generate diverse natural behaviors.
+(a) **H-GPT**: Utilizing massive human data, a large-scale language-driven human whole-body motion generation model is trained to generate diverse natural behaviors.
 We further leverage the Chain-of-Thought technique to improve the model’s generalization in instruction understanding.
 (b) **H-ACT**: After retargeting generated human whole-body motions into robot-specific actions, a motion controller that is pretrained and further fine-tuned through reinforcement learning in physical simulation enables humanoid robots to accurately and stably perform corresponding actions.
 It is then deployed on real robots via a modular simulation-to-reality module.
 We extensively evaluate our framework on the Unitree H1 and G1 robots, demonstrating successful language-to-motion generation and stable execution in both simulation and real-world settings.
 We fully open-source the entire **FRoM-W1** framework and hope it will advance the development of humanoid intelligence.

-## Usage
+[^1]: **F**oundational Humanoid **Ro**bot **M**odel - **W**hole-Body Control, Version **1**
+
+## Release Timeline
+We will gradually release the paper, data, codebase, model checkpoints, and the real-robot deployment framework for **FRoM-W1** in the next week or two.
+
+Here is the current release progress:
+- [**2025/12/14**] We have released the **CoT data** of HumanML3D-X on **[HuggingFace Datasets](https://huggingface.co/datasets/OpenMOSS-Team/FRoM-W1-Datasets)**.
+- [**2025/12/13**] We have uploaded the checkpoints for HGPT, Baselines (SMPL-X version of T2M, MotionDiffuse, MLD, T2M-GPT), and the SMPL-X Motion Generation eval model on **[HuggingFace Models](https://huggingface.co/OpenMOSS-Team/FRoM-W1)**.
+- [**2025/12/10**] We have uploaded the initial version of the code for two core modules, **[H-GPT](./H-GPT/README.md)** and **[H-ACT](./H-ACT/README.md)**!
+- [**2025/12/10**] We have released our lightweight, modular humanoid-robot deployment framework [**RoboJuDo**](https://github.com/HansZ8/RoboJuDo)!
+- [**2025/12/10**] We are thrilled to initiate the release of **FRoM-W1**!


+## Usage

 <div align="center">
-<img src="https://cdn-uploads.huggingface.co/production/uploads/6208b57eace0f815845c6dbf/PDWD8sgkNFCi0movdMkOU.png" alt="overview" width="80%">
+<img src="./assets/FRoM-W1-Overview.png" alt="overview" width="80%">
 </div>

 The complete **FRoM-W1** workflow is illustrated above:
@@ -55,10 +65,18 @@ The complete **FRoM-W1** workflow is illustrated above:
 Deploy **H-GPT** via command-line tools or a web interface to convert natural-language commands into human motion representations.
 This module provides full training, inference, and evaluation code, and pretrained models are available on HuggingFace.

+<div align="center">
+<img src="./assets/FRoM-W1-HGPT.png" alt="fromw1-hgpt" width="80%">
+</div>
+
 - **H-ACT**
 **H-ACT** converts the motion representations from H-GPT into SMPL-X motion sequences and further retargets them to various humanoid robots.
 The resulting motions can be used both for training control policies and executing actions on real robots using our deployment pipeline.

+<div align="center">
+<img src="./assets/FRoM-W1-HACT.png" alt="fromw1-hact" width="80%">
+</div>
+
 ## Citation
 If you find our work useful, please cite it for now in the following way:
 ```bibtex
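
For quick reference, the checkpoints and data announced in the updated README can be fetched from the Hugging Face Hub with the `huggingface_hub` client. The snippet below is a minimal sketch, not part of the commit: it assumes only the repo ids taken from the links in the README, and the local directory names are illustrative. The actual training, inference, and deployment entry points are documented in the module READMEs (`./H-GPT/README.md`, `./H-ACT/README.md`).

```python
# Minimal sketch (not from the commit itself): download the FRoM-W1 model
# checkpoints and the CoT dataset referenced in the README via huggingface_hub.
# Repo ids come from the README links; local directories are arbitrary choices.
from huggingface_hub import snapshot_download

# Pretrained H-GPT / baseline checkpoints (the "🤗Models" link)
snapshot_download(
    repo_id="OpenMOSS-Team/FRoM-W1",
    local_dir="checkpoints/FRoM-W1",
)

# CoT data of HumanML3D-X (the "🤗Datasets" link)
snapshot_download(
    repo_id="OpenMOSS-Team/FRoM-W1-Datasets",
    repo_type="dataset",
    local_dir="data/FRoM-W1-Datasets",
)
```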