MotionEdit: Benchmarking and Learning Motion-Centric Image Editing
✨ Overview
MotionEdit is a novel dataset and benchmark for motion-centric image editing. We also propose MotionNFT (Motion-guided Negative-aware FineTuning), a post-training framework with motion-alignment rewards that guides models on the motion-centric image editing task.
Model Description
- Model type: Image Editing
- Language(s): English
- Finetuned from model: Qwen/Qwen-Image-Edit-2509
Model Sources
- Repository: https://github.com/elainew728/motion-edit/tree/main
- Paper: https://arxiv.org/abs/2512.10284
- Demo Page: https://motion-edit.github.io/
🔧 Usage
🧱 To Start: Environment Setup
Clone our GitHub repository and change into the directory:
git clone https://github.com/elainew728/motion-edit.git
cd motion-edit
Create and activate the conda environment with the dependencies that support inference and training.
- Note: some models, such as UltraEdit, require specific versions of the diffusers library. Please refer to their official repositories to resolve dependencies before running inference.
conda env create -f environment.yml
conda activate motionedit
Finally, configure your own Hugging Face token to access restricted models by replacing YOUR_HF_TOKEN_HERE in inference/run_image_editing.py.
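If you prefer not to edit the script, you can alternatively authenticate once through the huggingface_hub library. A minimal sketch (the token string is a placeholder for your own token):

from huggingface_hub import login

# Placeholder token; generate your own at https://huggingface.co/settings/tokens.
login(token="hf_YOUR_TOKEN_HERE")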
🚀 Inferencing on MotionEdit-Bench with Image Editing Models
We have released MotionEdit-Bench on Hugging Face. This GitHub repository provides code that supports easy inference across open-source image editing models: Qwen-Image-Edit, Flux.1 Kontext [Dev], InstructPix2Pix, HQ-Edit, Step1X-Edit, UltraEdit, MagicBrush, and AnyEdit.
Step 1: Data Preparation
The inference script defaults to using our MotionEdit-Bench, which it downloads from the Hugging Face Hub. You can specify a cache_dir for storing the cached data.
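For reference, below is a minimal sketch of loading the benchmark manually with the datasets library. The dataset ID elaine1wan/MotionEdit-Bench is an assumption for illustration; check our Hugging Face page for the exact repository ID.

from datasets import load_dataset

# Hypothetical dataset ID; substitute the actual MotionEdit-Bench repo ID on the Hub.
bench = load_dataset("elaine1wan/MotionEdit-Bench", cache_dir="./hf_cache")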
Additionally, you can construct your own dataset for inference. Organize all input images into a folder INPUT_FOLDER and create a metadata.jsonl file in the same directory. Each line of metadata.jsonl must contain at least two fields:
{"file_name": "IMAGE_NAME.EXT", "prompt": "PROMPT"}
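As an example, here is a small helper sketch that writes metadata.jsonl for a folder of images; the file names and prompts below are purely illustrative.

import json
from pathlib import Path

input_folder = Path("INPUT_FOLDER")  # folder containing your input images
# Illustrative mapping from image file names to editing prompts; replace with your own.
prompts = {"dancer.png": "make the dancer leap into the air"}

# Write one JSON object per line, matching the template above.
with open(input_folder / "metadata.jsonl", "w") as f:
    for file_name, prompt in prompts.items():
        f.write(json.dumps({"file_name": file_name, "prompt": prompt}) + "\n")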
Then, load your dataset by:
from datasets import load_dataset
dataset = load_dataset("imagefolder", data_dir=INPUT_FOLDER)
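With the imagefolder loader, each example exposes the decoded image alongside the metadata fields, e.g.:

sample = dataset["train"][0]
print(sample["prompt"])  # the editing instruction from metadata.jsonl
sample["image"]          # the corresponding input image as a PIL object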
Step 2: Running Inference
Use the following command to run inference on MotionEdit-Bench with our MotionNFT Hugging Face checkpoint, which was trained on MotionEdit with Qwen-Image-Edit as the base model:
python inference/run_image_editing.py \
-o "./outputs/" \
-m "motionedit" \
--seed 42
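If you want to experiment outside the wrapper script, below is a minimal sketch of calling the checkpoint directly with diffusers. It assumes the MotionNFT checkpoint at elaine1wan/motionedit is diffusers-compatible with the Qwen-Image-Edit pipeline class; verify this against the repository, and prefer inference/run_image_editing.py, which handles model-specific details.

import torch
from PIL import Image
from diffusers import QwenImageEditPipeline  # assumed pipeline class for this checkpoint

# Hypothetical direct usage; the input image and prompt are illustrative.
pipe = QwenImageEditPipeline.from_pretrained("elaine1wan/motionedit", torch_dtype=torch.bfloat16)
pipe.to("cuda")

image = Image.open("input.png").convert("RGB")
edited = pipe(image=image, prompt="the dancer leaps into the air").images[0]
edited.save("edited.png")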
✍️ Citing
@misc{wan2025motioneditbenchmarkinglearningmotioncentric,
  title={MotionEdit: Benchmarking and Learning Motion-Centric Image Editing},
  author={Yixin Wan and Lei Ke and Wenhao Yu and Kai-Wei Chang and Dong Yu},
  year={2025},
  eprint={2512.10284},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2512.10284},
}