Commit bb8800b by ai-forever · verified · 1 Parent(s): 430d66b

Upload 30 files

.gitattributes CHANGED
@@ -33,3 +33,30 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ assets/comfyui_kandinsky5.png filter=lfs diff=lfs merge=lfs -text
+ assets/generation_examples/1036335634.mp4 filter=lfs diff=lfs merge=lfs -text
+ assets/generation_examples/1512407739[[:space:]](1).mp4 filter=lfs diff=lfs merge=lfs -text
+ assets/generation_examples/1512407739.mp4 filter=lfs diff=lfs merge=lfs -text
+ assets/generation_examples/642423904[[:space:]](1).mp4 filter=lfs diff=lfs merge=lfs -text
+ assets/generation_examples/642423904[[:space:]](2).mp4 filter=lfs diff=lfs merge=lfs -text
+ assets/generation_examples/642423904.mp4 filter=lfs diff=lfs merge=lfs -text
+ assets/generation_examples/68941856[[:space:]](1).mp4 filter=lfs diff=lfs merge=lfs -text
+ assets/generation_examples/68941856.mp4 filter=lfs diff=lfs merge=lfs -text
+ assets/generation_examples/distill/1.mp4 filter=lfs diff=lfs merge=lfs -text
+ assets/generation_examples/distill/2.mp4 filter=lfs diff=lfs merge=lfs -text
+ assets/generation_examples/distill/3.mp4 filter=lfs diff=lfs merge=lfs -text
+ assets/generation_examples/distill/4.mp4 filter=lfs diff=lfs merge=lfs -text
+ assets/generation_examples/sft/1.mp4 filter=lfs diff=lfs merge=lfs -text
+ assets/generation_examples/sft/2.mp4 filter=lfs diff=lfs merge=lfs -text
+ assets/generation_examples/sft/3.mp4 filter=lfs diff=lfs merge=lfs -text
+ assets/generation_examples/sft/4.mp4 filter=lfs diff=lfs merge=lfs -text
+ assets/generation_examples/sft/5.mp4 filter=lfs diff=lfs merge=lfs -text
+ assets/generation_examples/sft/6.mp4 filter=lfs diff=lfs merge=lfs -text
+ assets/generation_examples/test[[:space:]](1)[[:space:]](1).mp4 filter=lfs diff=lfs merge=lfs -text
+ assets/generation_examples/test2[[:space:]](1).mp4 filter=lfs diff=lfs merge=lfs -text
+ assets/generation_examples/video5237959401997893857.mp4 filter=lfs diff=lfs merge=lfs -text
+ assets/sbs/kandinsky_5_video_lite_vs_sora.jpg filter=lfs diff=lfs merge=lfs -text
+ assets/sbs/kandinsky_5_video_lite_vs_wan_2.1_14B.jpg filter=lfs diff=lfs merge=lfs -text
+ assets/sbs/kandinsky_5_video_lite_vs_wan_2.2_5B.jpg filter=lfs diff=lfs merge=lfs -text
+ assets/sbs/kandinsky_5_video_lite_vs_wan_2.2_A14B.jpg filter=lfs diff=lfs merge=lfs -text
+ assets/vbench.png filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,327 @@
+ <div align="center">
+ <picture>
+ <img src="assets/KANDINSKY_LOGO_1_BLACK.png">
+ </picture>
+ </div>
+
+ <div align="center">
+ <a href="">Habr</a> | <a href="https://gen-ai-team.github.io/kandinsky-5-inference/">Project Page</a> | Technical Report (soon) | <a href="https://huggingface.co/collections/ai-forever/kandisnky-50-t2v-lite-68d71892d2cc9b02177e5ae5">Models🤗</a>
+ </div>
+
+ <h1>Kandinsky 5.0: A family of diffusion models for Video & Image generation</h1>
+
+ In this repository, we provide a family of diffusion models that generate a video or an image (<em>Coming Soon</em>) from a text prompt, along with distilled models for faster generation.
+
+ https://github.com/user-attachments/assets/b9ff0417-02a4-4f6b-aacc-60c44e7fe6f1
+
+ ## Project Updates
+
+ - 🔥 **Source**: ```2025/09/29```: We have open-sourced `Kandinsky 5.0 T2V Lite`, a lite (2B parameters) version of the `Kandinsky 5.0 Video` text-to-video generation model. Released checkpoints: `kandinsky5lite_t2v_pretrain_5s`, `kandinsky5lite_t2v_pretrain_10s`, `kandinsky5lite_t2v_sft_5s`, `kandinsky5lite_t2v_sft_10s`, `kandinsky5lite_t2v_nocfg_5s`, `kandinsky5lite_t2v_nocfg_10s`, `kandinsky5lite_t2v_distilled16steps_5s`, `kandinsky5lite_t2v_distilled16steps_10s`. They contain weights from pretraining, supervised fine-tuning, CFG distillation, and 16-step diffusion distillation. The 5s checkpoints can generate videos up to 5 seconds long. The 10s checkpoints are faster models trained with the [NABLA](https://huggingface.co/ai-forever/Wan2.1-T2V-14B-NABLA-0.7) algorithm and can generate videos up to 10 seconds long.
+
+ ## Kandinsky 5.0 T2V Lite
+
+ Kandinsky 5.0 T2V Lite is a lightweight video generation model (2B parameters) that ranks #1 among open-source models in its class. It outperforms larger Wan models (5B and 14B) and offers the best understanding of Russian concepts in the open-source ecosystem.
+
+ We provide 8 model variants across four model types, each optimized for a different use case:
+
+ * SFT model: delivers the highest generation quality;
+
+ * CFG-distilled: runs 2× faster;
+
+ * Diffusion-distilled: enables low-latency generation with minimal quality loss (6× faster);
+
+ * Pretrain model: designed for fine-tuning by researchers and enthusiasts.
+
+ All models are available in two versions: for generating 5-second and 10-second videos.
+
+ ## Pipeline
+
+ **Latent diffusion pipeline** with **Flow Matching**.
+
+ **Diffusion Transformer (DiT)** as the main generative backbone with **cross-attention to text embeddings**.
+
+ - **Qwen2.5-VL** and **CLIP** provide text embeddings.
+
+ - **HunyuanVideo 3D VAE** encodes video into a latent space and decodes it back.
+
+ - **DiT** is the main generative module using cross-attention to condition on text.
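As a rough illustration of what sampling with Flow Matching involves, the sketch below integrates a velocity field with fixed-size Euler steps and counts model calls. It is a scalar toy, not the Kandinsky sampler: `euler_flow_sample`, `toy_velocity`, the time direction (t=0 to t=1), and the guidance formula are all assumptions made for illustration. It also assumes that the NFE column in the Model Zoo counts forward passes and that classifier-free guidance costs two passes per step, which the README does not state.

```python
from typing import Callable, Optional, Tuple

# Hypothetical signature: velocity(x, t, conditional) -> dx/dt at state x and time t.
Velocity = Callable[[float, float, bool], float]


def euler_flow_sample(
    velocity: Velocity,
    x0: float,
    steps: int,
    guidance_scale: Optional[float] = None,
) -> Tuple[float, int]:
    """Integrate dx/dt = v(x, t) from t=0 to t=1 with fixed-size Euler steps.

    Returns the final state and the number of velocity evaluations (NFE).
    """
    x, nfe = x0, 0
    dt = 1.0 / steps
    for k in range(steps):
        t = k * dt
        v = velocity(x, t, True)
        nfe += 1
        if guidance_scale is not None:
            # Classifier-free guidance needs a second, unconditional pass.
            v_uncond = velocity(x, t, False)
            nfe += 1
            v = v_uncond + guidance_scale * (v - v_uncond)
        x += dt * v
    return x, nfe


def toy_velocity(x: float, t: float, conditional: bool) -> float:
    """Exact velocity of a straight-line path that ends at a single target point."""
    target = 3.0 if conditional else 0.0
    return (target - x) / (1.0 - t)


if __name__ == "__main__":
    for steps, scale in [(50, 1.0), (50, None), (16, None)]:
        x, nfe = euler_flow_sample(toy_velocity, x0=-1.0, steps=steps, guidance_scale=scale)
        print(f"steps={steps} guidance={scale} nfe={nfe} x={x:.4f}")
```

Under these assumptions, 50 guided steps cost 100 evaluations, 50 unguided steps cost 50, and a 16-step sampler costs 16, which is one way to read the 100 / 50 / 16 NFE values listed for the SFT, no-CFG, and distilled checkpoints.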
+
+ <img width="1600" height="477" alt="Picture1" src="https://github.com/user-attachments/assets/17fc2eb5-05e3-4591-9ec6-0f6e1ca397b3" />
+
+ <img width="800" height="406" alt="Picture2" src="https://github.com/user-attachments/assets/f3006742-e261-4c39-b7dc-e39330be9a09" />
+
+
+ ## Model Zoo
+
+ | Model | Config | Video duration | NFE | Checkpoint | Latency* (H100) | VBench score |
+ |-------|--------|----------------|-----|------------|-----------------|--------------|
+ | Kandinsky 5.0 T2V Lite SFT 5s | configs/config_5s_sft.yaml | 5s | 100 | 🤗 [HF](https://huggingface.co/ai-forever/Kandinsky-5.0-T2V-Lite-sft-5s) | 139 s | 84.02 |
+ | Kandinsky 5.0 T2V Lite SFT 10s | configs/config_10s_sft.yaml | 10s | 100 | 🤗 [HF](https://huggingface.co/ai-forever/Kandinsky-5.0-T2V-Lite-sft-10s) | 224 s | 85.36 |
+ | Kandinsky 5.0 T2V Lite pretrain 5s | configs/config_5s_pretrain.yaml | 5s | 100 | 🤗 [HF](https://huggingface.co/ai-forever/Kandinsky-5.0-T2V-Lite-pretrain-5s) | 139 s | |
+ | Kandinsky 5.0 T2V Lite pretrain 10s | configs/config_10s_pretrain.yaml | 10s | 100 | 🤗 [HF](https://huggingface.co/ai-forever/Kandinsky-5.0-T2V-Lite-pretrain-10s) | 224 s | |
+ | Kandinsky 5.0 T2V Lite no-CFG 5s | configs/config_5s_nocfg.yaml | 5s | 50 | 🤗 [HF](https://huggingface.co/ai-forever/Kandinsky-5.0-T2V-Lite-nocfg-5s) | 77 s | |
+ | Kandinsky 5.0 T2V Lite no-CFG 10s | configs/config_10s_nocfg.yaml | 10s | 50 | 🤗 [HF](https://huggingface.co/ai-forever/Kandinsky-5.0-T2V-Lite-nocfg-10s) | 124 s | |
+ | Kandinsky 5.0 T2V Lite distill 5s | configs/config_5s_distil.yaml | 5s | 16 | 🤗 [HF](https://huggingface.co/ai-forever/Kandinsky-5.0-T2V-Lite-distilled16steps-5s) | 35 s | |
+ | Kandinsky 5.0 T2V Lite distill 10s | | 10s | 16 | | 55 s | |
+
+ *Latency was measured after the second inference run, because the first run of the model can be slower due to compilation. Flash Attention 3 was used for the 5-second models.
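The speedup figures quoted in the variant list can be checked against this table with a few lines of arithmetic. The sketch below copies the NFE and latency values from the table; the dictionary layout and the `speedup_vs_sft` helper are illustrative names, and comparing against the SFT model of the same duration is an assumption about what the quoted speedups refer to.

```python
from typing import Dict, Tuple

# (NFE, latency in seconds on H100), copied from the Model Zoo table.
MODEL_ZOO: Dict[Tuple[str, int], Tuple[int, int]] = {
    ("sft", 5): (100, 139),
    ("sft", 10): (100, 224),
    ("nocfg", 5): (50, 77),
    ("nocfg", 10): (50, 124),
    ("distill", 5): (16, 35),
    ("distill", 10): (16, 55),
}


def speedup_vs_sft(variant: str, duration: int) -> Tuple[float, float]:
    """Return (NFE ratio, latency ratio) of the SFT model relative to `variant`."""
    sft_nfe, sft_latency = MODEL_ZOO[("sft", duration)]
    nfe, latency = MODEL_ZOO[(variant, duration)]
    return sft_nfe / nfe, sft_latency / latency


if __name__ == "__main__":
    for variant in ("nocfg", "distill"):
        for duration in (5, 10):
            nfe_ratio, latency_ratio = speedup_vs_sft(variant, duration)
            print(
                f"{variant} {duration}s: {nfe_ratio:.2f}x fewer NFE, "
                f"{latency_ratio:.2f}x lower latency"
            )
```

With these numbers, the "2×" and "6×" figures line up with the NFE ratios (100/50 and 100/16), while the measured end-to-end latency ratios come out closer to 1.8× and 4×.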
+
+ ### Examples:
+
+ #### Kandinsky 5.0 T2V Lite SFT
+
+ <table border="0" style="width: 200; text-align: left; margin-top: 20px;">
+ <tr>
+ <td>
+ <video src="https://github.com/user-attachments/assets/bc38821b-f9f1-46db-885f-1f70464669eb" width=200 controls autoplay loop></video>
+ </td>
+ <td>
+ <video src="https://github.com/user-attachments/assets/9f64c940-4df8-4c51-bd81-a05de8e70fc3" width=200 controls autoplay loop></video>
+ </td>
+ <tr>
+ <td>
+ <video src="https://github.com/user-attachments/assets/77dd417f-e0bf-42bd-8d80-daffcd054add" width=200 controls autoplay loop></video>
+ </td>
+ <td>
+ <video src="https://github.com/user-attachments/assets/385a0076-f01c-4663-aa46-6ce50352b9ed" width=200 controls autoplay loop></video>
+ </td>
+ <tr>
+ <td>
+ <video src="https://github.com/user-attachments/assets/7c1bcb31-cc7d-4385-9a33-2b0cc28393dd" width=200 controls autoplay loop></video>
+ </td>
+ <td>
+ <video src="https://github.com/user-attachments/assets/990a8a0b-2df1-4bbc-b2e3-2859b6f1eea6" width=200 controls autoplay loop></video>
+ </td>
+ </tr>
+
+ </table>
+
+
+ #### Kandinsky 5.0 T2V Lite Distill
+
+ <table border="0" style="width: 200; text-align: left; margin-top: 20px;">
+ <tr>
+ <td>
+ <video src="https://github.com/user-attachments/assets/861342f9-f576-4083-8a3b-94570a970d58" width=200 controls autoplay loop></video>
+ </td>
+ <td>
+ <video src="https://github.com/user-attachments/assets/302e4e7d-781d-4a58-9b10-8c473d469c4b" width=200 controls autoplay loop></video>
+ </td>
+ <tr>
+ <td>
+ <video src="https://github.com/user-attachments/assets/3e70175c-40e5-4aec-b506-38006fe91a76" width=200 controls autoplay loop></video>
+ </td>
+ <td>
+ <video src="https://github.com/user-attachments/assets/b7da85f7-8b62-4d46-9460-7f0e505de810" width=200 controls autoplay loop></video>
+ </td>
+
+ </table>
+
+ ### Results:
+
+ #### Side-by-Side evaluation
+
+ The evaluation is based on the expanded prompts from the [Movie Gen benchmark](https://github.com/facebookresearch/MovieGenBench), which are available in the `expanded_prompt` column of the `benchmark/moviegen_bench.csv` file.
+
+ <table border="0" style="width: 400; text-align: left; margin-top: 20px;">
+ <tr>
+ <td>
+ <img src="assets/sbs/kandinsky_5_video_lite_vs_sora.jpg" width=400 ></img>
+ </td>
+ <td>
+ <img src="assets/sbs/kandinsky_5_video_lite_vs_wan_2.1_14B.jpg" width=400 ></img>
+ </td>
+ <tr>
+ <td>
+ <img src="assets/sbs/kandinsky_5_video_lite_vs_wan_2.2_5B.jpg" width=400 ></img>
+ </td>
+ <td>
+ <img src="assets/sbs/kandinsky_5_video_lite_vs_wan_2.2_A14B.jpg" width=400 ></img>
+ </td>
+
+ </table>
+
+ #### VBench results
+
+ <div align="center">
+ <picture>
+ <img src="assets/vbench.png">
+ </picture>
+ </div>
+
+ ## Quickstart
+
+ #### Installation
+ Clone the repo:
+ ```sh
+ git clone https://github.com/ai-forever/Kandinsky-5.git
+ cd Kandinsky-5
+ ```
+
+ Install dependencies:
+ ```sh
+ pip install -r requirements.txt
+ ```
+
+ To improve inference performance on NVIDIA Hopper GPUs, we recommend installing [Flash Attention 3](https://github.com/Dao-AILab/flash-attention/?tab=readme-ov-file#flashattention-3-beta-release).
+
+ #### Model Download
+ ```sh
+ python download_models.py
+ ```
+
+ #### Run Kandinsky 5.0 T2V Lite SFT 5s
+
+ ```sh
+ python test.py --prompt "A dog in red hat"
+ ```
+
+ #### Run Kandinsky 5.0 T2V Lite SFT 10s
+
+ ```sh
+ python test.py --config ./configs/config_10s_sft.yaml --prompt "A dog in red hat" --video_duration 10
+ ```
+
+ #### Run Kandinsky 5.0 T2V Lite pretrain 5s
+
+ ```sh
+ python test.py --config ./configs/config_5s_pretrain.yaml --prompt "A dog in red hat"
+ ```
+
+ #### Run Kandinsky 5.0 T2V Lite pretrain 10s
+
+ ```sh
+ python test.py --config ./configs/config_10s_pretrain.yaml --prompt "A dog in red hat" --video_duration 10
+ ```
+
+ #### Run Kandinsky 5.0 T2V Lite no-CFG 5s
+
+ ```sh
+ python test.py --config ./configs/config_5s_nocfg.yaml --prompt "A dog in red hat"
+ ```
+
+ #### Run Kandinsky 5.0 T2V Lite no-CFG 10s
+
+ ```sh
+ python test.py --config ./configs/config_10s_nocfg.yaml --prompt "A dog in red hat" --video_duration 10
+ ```
+
+ #### Run Kandinsky 5.0 T2V Lite distill 5s
+
+ ```sh
+ python test.py --config ./configs/config_5s_distil.yaml --prompt "A dog in red hat"
+ ```
+
+ #### Run Kandinsky 5.0 T2V Lite distill 10s
+
+ Coming soon
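The commands above differ only in the config path and the `--video_duration` flag, so they can be generated from a small lookup. This is a sketch that mirrors the commands exactly as written: `CONFIGS` and `build_command` are hypothetical helper names, the SFT 5s entry has no `--config` because the README relies on the script's default there, and the distilled 10s variant is left out because no config is published for it yet.

```python
import shlex
from typing import Dict, List, Optional, Tuple

# Config paths as they appear in the run commands; None means "use the script default".
CONFIGS: Dict[Tuple[str, int], Optional[str]] = {
    ("sft", 5): None,
    ("sft", 10): "./configs/config_10s_sft.yaml",
    ("pretrain", 5): "./configs/config_5s_pretrain.yaml",
    ("pretrain", 10): "./configs/config_10s_pretrain.yaml",
    ("nocfg", 5): "./configs/config_5s_nocfg.yaml",
    ("nocfg", 10): "./configs/config_10s_nocfg.yaml",
    ("distill", 5): "./configs/config_5s_distil.yaml",
}


def build_command(variant: str, duration: int, prompt: str) -> List[str]:
    """Build the argument list for one of the documented test.py invocations."""
    key = (variant, duration)
    if key not in CONFIGS:
        raise ValueError(f"no documented command for {variant!r} at {duration}s")
    cmd = ["python", "test.py"]
    config = CONFIGS[key]
    if config is not None:
        cmd += ["--config", config]
    cmd += ["--prompt", prompt]
    if duration == 10:
        # The 10-second commands pass the duration explicitly.
        cmd += ["--video_duration", "10"]
    return cmd


if __name__ == "__main__":
    for variant, duration in CONFIGS:
        print(shlex.join(build_command(variant, duration, "A dog in red hat")))
```

The returned list can be passed to `subprocess.run(..., check=True)` from the repository root.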
+
+ ### Inference
+
+ ```python
+ import torch
+ from IPython.display import Video
+ from kandinsky import get_T2V_pipeline
+
+ device_map = {  # place every component on the first GPU
+     "dit": torch.device('cuda:0'),
+     "vae": torch.device('cuda:0'),
+     "text_embedder": torch.device('cuda:0')
+ }
+
+ pipe = get_T2V_pipeline(device_map, conf_path="configs/config_5s_sft.yaml")
+
+ images = pipe(
+     seed=42,
+     time_length=5,
+     width=768,
+     height=512,
+     save_path="./test.mp4",
+     text="A cat in a red hat",
+ )
+
+ Video("./test.mp4")  # preview the saved file in a notebook
+ ```
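To run the same call over several prompts and seeds, the keyword arguments can be prepared up front. The sketch below only builds the argument dictionaries, using the parameter names shown in the call above; `generation_jobs` and the output file naming scheme are hypothetical, and the actual `pipe(**job)` call is left as a comment because it needs the model and a GPU.

```python
from itertools import product
from typing import Any, Dict, List, Sequence


def generation_jobs(
    prompts: Sequence[str],
    seeds: Sequence[int],
    time_length: int = 5,
    width: int = 768,
    height: int = 512,
    out_dir: str = ".",
) -> List[Dict[str, Any]]:
    """Return one kwargs dict per (prompt, seed) pair, each with its own save_path."""
    jobs = []
    for (index, text), seed in product(enumerate(prompts), seeds):
        jobs.append(
            {
                "seed": seed,
                "time_length": time_length,
                "width": width,
                "height": height,
                # Hypothetical naming scheme so outputs do not overwrite each other.
                "save_path": f"{out_dir}/prompt{index}_seed{seed}.mp4",
                "text": text,
            }
        )
    return jobs


if __name__ == "__main__":
    for job in generation_jobs(["A cat in a red hat"], seeds=[42, 43]):
        print(job)
        # With the pipeline from the example above: pipe(**job)
```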
+
+ Please refer to the [inference_example.ipynb](inference_example.ipynb) notebook for more usage details.
+
+ ### Distributed Inference
+
+ For faster inference, we also support running inference in a distributed way:
+ ```sh
+ NUMBER_OF_NODES=1
+ NUMBER_OF_DEVICES_PER_NODE=1  # or 2 / 4
+ python -m torch.distributed.launch --nnodes $NUMBER_OF_NODES --nproc-per-node $NUMBER_OF_DEVICES_PER_NODE test.py
+ ```
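The launch line above can also be assembled programmatically. This is a minimal sketch: `build_launch_command` is a hypothetical helper, restricting the device count to 1, 2, or 4 simply mirrors the values the README lists (whether other counts work is not stated), and forwarding extra arguments such as `--prompt` to `test.py` under the launcher is an assumption.

```python
import shlex
from typing import List, Sequence

# Device counts listed in the README; other values are untested assumptions.
LISTED_DEVICE_COUNTS = (1, 2, 4)


def build_launch_command(
    nproc_per_node: int,
    nnodes: int = 1,
    script_args: Sequence[str] = (),
) -> List[str]:
    """Build the torch.distributed.launch command line for test.py."""
    if nproc_per_node not in LISTED_DEVICE_COUNTS:
        raise ValueError(f"README lists {LISTED_DEVICE_COUNTS}, got {nproc_per_node}")
    return [
        "python", "-m", "torch.distributed.launch",
        "--nnodes", str(nnodes),
        "--nproc-per-node", str(nproc_per_node),
        "test.py",
        *script_args,
    ]


if __name__ == "__main__":
    print(shlex.join(build_launch_command(2, script_args=["--prompt", "A dog in red hat"])))
```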
+
+ ### ComfyUI
+
+ See the instructions [here](comfyui).
+
+ ## 📑 Todo List
+ - Kandinsky 5.0 Lite Text-to-Video
+   - [x] Multi-GPU Inference code of the 2B models
+   - [ ] Checkpoints of the 2B models
+     - [x] pretrain
+     - [x] sft
+     - [ ] rl
+     - [x] cfg distil
+     - [x] distil 16 steps
+     - [ ] autoregressive generation
+   - [x] ComfyUI integration
+   - [ ] Diffusers integration
+   - [ ] Caching acceleration support
+ - Kandinsky 5.0 Lite Image-to-Video
+   - [ ] Multi-GPU Inference code of the 2B model
+   - [ ] Checkpoints of the 2B model
+   - [ ] ComfyUI integration
+   - [ ] Diffusers integration
+ - Kandinsky 5.0 Pro Text-to-Video
+   - [ ] Multi-GPU Inference code of the models
+   - [ ] Checkpoints of the model
+   - [ ] ComfyUI integration
+   - [ ] Diffusers integration
+ - Kandinsky 5.0 Pro Image-to-Video
+   - [ ] Multi-GPU Inference code of the model
+   - [ ] Checkpoints of the model
+   - [ ] ComfyUI integration
+   - [ ] Diffusers integration
+ - [ ] Technical report
+
+ # Authors
+ <B>Project Leader:</B> Denis Dimitrov<br>
+
+ <B>Team Leads:</B> Vladimir Arkhipkin, Vladimir Korviakov, Nikolai Gerasimenko, Denis Parkhomenko<br>
+
+ <B>Core Contributors:</B> Alexey Letunovskiy, Maria Kovaleva, Ivan Kirillov, Lev Novitskiy, Denis Koposov, Dmitrii Mikhailov, Anna Averchenkova, Andrey Shutkin, Julia Agafonova, Olga Kim, Anastasiia Kargapoltseva, Nikita Kiselev<br>
+
+ <B>Contributors:</B> Anna Dmitrienko, Anastasia Maltseva, Kirill Chernyshev, Ilia Vasiliev, Viacheslav Vasilev, Vladimir Polovnikov, Yury Kolabushin, Alexander Belykh, Mikhail Mamaev, Anastasia Aliaskina, Tatiana Nikulina, Polina Gavrilova<br>
+
+ # Citation
+
+ ```
+ @misc{kandinsky2025,
+     author = {Alexey Letunovskiy and Maria Kovaleva and Ivan Kirillov and Lev Novitskiy and Denis Koposov
+               and Dmitrii Mikhailov and Anna Averchenkova and Andrey Shutkin and Julia Agafonova and Olga Kim
+               and Anastasiia Kargapoltseva and Nikita Kiselev and Vladimir Arkhipkin and Vladimir Korviakov
+               and Nikolai Gerasimenko and Denis Parkhomenko and Anna Dmitrienko and Anastasia Maltseva
+               and Kirill Chernyshev and Ilia Vasiliev and Viacheslav Vasilev and Vladimir Polovnikov
+               and Yury Kolabushin and Alexander Belykh and Mikhail Mamaev and Anastasia Aliaskina
+               and Tatiana Nikulina and Polina Gavrilova and Denis Dimitrov},
+     title = {Kandinsky 5.0: A family of diffusion models for Video \& Image generation},
+     howpublished = {\url{https://github.com/ai-forever/Kandinsky-5}},
+     year = 2025
+ }
+
+ @misc{mikhailov2025nablanablaneighborhoodadaptiveblocklevel,
+     title={$\nabla$NABLA: Neighborhood Adaptive Block-Level Attention},
+     author={Dmitrii Mikhailov and Aleksey Letunovskiy and Maria Kovaleva and Vladimir Arkhipkin
+             and Vladimir Korviakov and Vladimir Polovnikov and Viacheslav Vasilev
+             and Evelina Sidorova and Denis Dimitrov},
+     year={2025},
+     eprint={2507.13546},
+     archivePrefix={arXiv},
+     primaryClass={cs.CV},
+     url={https://arxiv.org/abs/2507.13546},
+ }
+ ```
assets/KANDINSKY_LOGO_1_BLACK.png ADDED
assets/KANDINSKY_LOGO_1_WHITE.png ADDED
assets/comfyui_kandinsky5.png ADDED

Git LFS Details

  • SHA256: 4c91961abe51a1fcbd3a35d438ea3b4f652f61a4d9f035c9f10e91dc5c9b79cd
  • Pointer size: 131 Bytes
  • Size of remote file: 474 kB
assets/generation_examples/1036335634.mp4 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0d3657c36760e1694a4d3533b04e0d28ddd16d8d8e6373953e8f754742e2a54b
+ size 4199589
assets/generation_examples/1512407739 (1).mp4 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:54b887c2c5cca6a4b5d8cc7f433a01a1c72c2592f07187b3c530126ac77aa601
+ size 7227407
assets/generation_examples/1512407739.mp4 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:00f48b1d0ddd97c7d802136d1a9090eb397ce34d95569fd4c4d6eb64eb46d06f
+ size 6778347
assets/generation_examples/642423904 (1).mp4 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:612d2a1340475f79b20f90cc5e85a5b9e79193af631e8bd7fa50cdc5fc47dee8
+ size 3038994
assets/generation_examples/642423904 (2).mp4 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:55bed6367e8c4de29f082cfbaa8af357fef81dc339e8abbadea50901ac635d10
+ size 3092127
assets/generation_examples/642423904.mp4 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:464da305ce36af41a1fb7fa842de1357b601008d128f1693f12f9674c906243c
+ size 2466511
assets/generation_examples/68941856 (1).mp4 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8274aa1cbc087465f6a1b72842c8d1cb1860c61232dd29ee0f17ea7ff2d2ac08
+ size 5856930
assets/generation_examples/68941856.mp4 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:759e825e4b6bd05ab92621390b1f5aa97240cd18a76312395ff39fd636dc8a9d
+ size 6942715
assets/generation_examples/distill/1.mp4 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:06953c45987011d08aef79f7ae1368f9a69d480a509554fb55d0d46d4498255f
+ size 6916245
assets/generation_examples/distill/2.mp4 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d199141bd7cffb4cb30ac84f2879da330b83cd2429aa7a95c3406f8dce49134a
+ size 5384100
assets/generation_examples/distill/3.mp4 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9bfdc9c8cf2ada22de4ffc1d0281e56ed7c0d61a66bdff28697cbe6a2a8e97f5
+ size 3957258
assets/generation_examples/distill/4.mp4 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9bbebdd5de15721f082785bbc02dc7422b0d9d9a6ac244c02e1bfbecc22ba22e
+ size 6328091
assets/generation_examples/sft/1.mp4 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1bfc00348d1fdb09c43687e086cd912ba15097ecf3f85b6302827a54eafab3a2
+ size 4486280
assets/generation_examples/sft/2.mp4 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:612eb63051314d50bb333ee1c797e95a5df7522ca5df3dc853266b135a27ce06
+ size 4600755
assets/generation_examples/sft/3.mp4 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c8cc13e558964138ef53a098d4b1174db5a58979f7dbf021788a39ad41c2fcff
+ size 8193301
assets/generation_examples/sft/4.mp4 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:462f02343590e07e5ac2a919f566438d2ddd990d0c5ddcd8c92e15feae63eb11
+ size 7697517
assets/generation_examples/sft/5.mp4 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:4e89d9d0ae048df05fc2f00e9acc009a261b38810f5792967a969bb3147a5e6d
+ size 3528986
assets/generation_examples/sft/6.mp4 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8226569ad637d3b179ce895685e52cfcf81372bbbab623386fddde8e0352c9db
+ size 4109417
assets/generation_examples/test (1) (1).mp4 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:949f1a6c6056307026430aa0809fc7b02ac6deb3732973b305aaad20e6754c76
+ size 1592811
assets/generation_examples/test2 (1).mp4 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0d626185c2e14dcca33b4e315529dd9555fc34f9dc96a327a28935bdb659406b
+ size 1103150
assets/generation_examples/video5237959401997893857.mp4 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:4df397dc2d8032875aaccb58ef563f90c6cd25d50ce73bae40c5776dc9818ac0
+ size 5977768
assets/sbs/kandinsky_5_video_lite_vs_sora.jpg ADDED

Git LFS Details

  • SHA256: 2a5c838cb53a026a57d3037361ad4ed74bae4b31f4d1b11e6474956eca42d412
  • Pointer size: 131 Bytes
  • Size of remote file: 195 kB
assets/sbs/kandinsky_5_video_lite_vs_wan_2.1_14B.jpg ADDED

Git LFS Details

  • SHA256: 80bc261b9afcaf1446228a24a96afe3b5c24b4780f3e2f43e27496077611ec6f
  • Pointer size: 131 Bytes
  • Size of remote file: 196 kB
assets/sbs/kandinsky_5_video_lite_vs_wan_2.2_5B.jpg ADDED

Git LFS Details

  • SHA256: d01f4a73b287541487228939fd505a947b78b6325f76421b2ee5f1523188e08e
  • Pointer size: 131 Bytes
  • Size of remote file: 192 kB
assets/sbs/kandinsky_5_video_lite_vs_wan_2.2_A14B.jpg ADDED

Git LFS Details

  • SHA256: 4f053f7d996112f40e8b49f6440ea75a40f71c02e60d467cff479ced0b54444a
  • Pointer size: 131 Bytes
  • Size of remote file: 198 kB
assets/vbench.png ADDED

Git LFS Details

  • SHA256: 27131bac1ccb83d3d28e8f558c6a7a91ed92816c0814583299b8584f0cda6546
  • Pointer size: 131 Bytes
  • Size of remote file: 170 kB