No ComfyUI support?
Latest ComfyUI gives a black screen on generation :(
Here's a version compatible with ComfyUI, and while it can generate images in 2-3 steps, the results are relatively poor, especially at high resolutions.
https://huggingface.co/azazeal2/TwinFlow-Z-Image-Turbo-repacked/
Seems to work fine in Comfy here.
https://huggingface.co/azazeal2/TwinFlow-Z-Image-Turbo-repacked/tree/main/ComfyUI
I did use a bit more steps, 6 or so. But still super fast ;-)
Standard res. For even higher res, probably just add some more steps.
@Jianqiao1 Hi, for very low inference step counts, it is recommended to use our unified sampler: https://github.com/inclusionAI/TwinFlow/blob/main/unified_sampler.py. But I've noticed that most ComfyUI users use KSampler or other samplers, which may produce poor images.
We are trying to integrate our sampler into diffusers; will this also help ComfyUI users?
The latest version of ComfyUI throws this error:
model_type FLOW
unet missing: ['x_embedder.weight', 'x_embedder.bias', 'noise_refiner.0.attention.qkv.weight', 'noise_refiner.0.attention.out.weight', 'noise_refiner.0.attention.q_norm.weight', 'noise_refiner.0.attention.k_norm.weight', 'noise_refiner.1.attention.qkv.weight', 'noise_refiner.1.attention.out.weight', 'noise_refiner.1.attention.q_norm.weight', 'noise_refiner.1.attention.k_norm.weight', 'context_refiner.0.attention.qkv.weight', 'context_refiner.0.attention.out.weight', 'context_refiner.0.attention.q_norm.weight', 'context_refiner.0.attention.k_norm.weight', 'context_refiner.1.attention.qkv.weight', 'context_refiner.1.attention.out.weight', 'context_refiner.1.attention.q_norm.weight', 'context_refiner.1.attention.k_norm.weight', 'layers.0.attention.qkv.weight', 'layers.0.attention.out.weight', 'layers.0.attention.q_norm.weight', 'layers.0.attention.k_norm.weight', 'layers.1.attention.qkv.weight', 'layers.1.attention.out.weight', 'layers.1.attention.q_norm.weight', 'layers.1.attention.k_norm.weight', 'layers.2.attention.qkv.weight', 'layers.2.attention.out.weight', 'layers.2.attention.q_norm.weight', 'layers.2.attention.k_norm.weight', 'layers.3.attention.qkv.weight', 'layers.3.attention.out.weight', 'layers.3.attention.q_norm.weight', 'layers.3.attention.k_norm.weight', 'layers.4.attention.qkv.weight', 'layers.4.attention.out.weight', 'layers.4.attention.q_norm.weight', 'layers.4.attention.k_norm.weight', 'layers.5.attention.qkv.weight', 'layers.5.attention.out.weight', 'layers.5.attention.q_norm.weight', 'layers.5.attention.k_norm.weight', 'layers.6.attention.qkv.weight', 'layers.6.attention.out.weight', 'layers.6.attention.q_norm.weight', 'layers.6.attention.k_norm.weight', 'layers.7.attention.qkv.weight', 'layers.7.attention.out.weight', 'layers.7.attention.q_norm.weight', 'layers.7.attention.k_norm.weight', 'layers.8.attention.qkv.weight', 'layers.8.attention.out.weight', 'layers.8.attention.q_norm.weight', 'layers.8.attention.k_norm.weight', 'layers.9.attention.qkv.weight', 'layers.9.attention.out.weight', 'layers.9.attention.q_norm.weight', 'layers.9.attention.k_norm.weight', 'layers.10.attention.qkv.weight', 'layers.10.attention.out.weight', 'layers.10.attention.q_norm.weight', 'layers.10.attention.k_norm.weight', 'layers.11.attention.qkv.weight', 'layers.11.attention.out.weight', 'layers.11.attention.q_norm.weight', 'layers.11.attention.k_norm.weight', 'layers.12.attention.qkv.weight', 'layers.12.attention.out.weight', 'layers.12.attention.q_norm.weight', 'layers.12.attention.k_norm.weight', 'layers.13.attention.qkv.weight', 'layers.13.attention.out.weight', 'layers.13.attention.q_norm.weight', 'layers.13.attention.k_norm.weight', 'layers.14.attention.qkv.weight', 'layers.14.attention.out.weight', 'layers.14.attention.q_norm.weight', 'layers.14.attention.k_norm.weight', 'layers.15.attention.qkv.weight', 'layers.15.attention.out.weight', 'layers.15.attention.q_norm.weight', 'layers.15.attention.k_norm.weight', 'layers.16.attention.qkv.weight', 'layers.16.attention.out.weight', 'layers.16.attention.q_norm.weight', 'layers.16.attention.k_norm.weight', 'layers.17.attention.qkv.weight', 'layers.17.attention.out.weight', 'layers.17.attention.q_norm.weight', 'layers.17.attention.k_norm.weight', 'layers.18.attention.qkv.weight', 'layers.18.attention.out.weight', 'layers.18.attention.q_norm.weight', 'layers.18.attention.k_norm.weight', 'layers.19.attention.qkv.weight', 'layers.19.attention.out.weight', 'layers.19.attention.q_norm.weight', 'layers.19.attention.k_norm.weight', 
'layers.20.attention.qkv.weight', 'layers.20.attention.out.weight', 'layers.20.attention.q_norm.weight', 'layers.20.attention.k_norm.weight', 'layers.21.attention.qkv.weight', 'layers.21.attention.out.weight', 'layers.21.attention.q_norm.weight', 'layers.21.attention.k_norm.weight', 'layers.22.attention.qkv.weight', 'layers.22.attention.out.weight', 'layers.22.attention.q_norm.weight', 'layers.22.attention.k_norm.weight', 'layers.23.attention.qkv.weight', 'layers.23.attention.out.weight', 'layers.23.attention.q_norm.weight', 'layers.23.attention.k_norm.weight', 'layers.24.attention.qkv.weight', 'layers.24.attention.out.weight', 'layers.24.attention.q_norm.weight', 'layers.24.attention.k_norm.weight', 'layers.25.attention.qkv.weight', 'layers.25.attention.out.weight', 'layers.25.attention.q_norm.weight', 'layers.25.attention.k_norm.weight', 'layers.26.attention.qkv.weight', 'layers.26.attention.out.weight', 'layers.26.attention.q_norm.weight', 'layers.26.attention.k_norm.weight', 'layers.27.attention.qkv.weight', 'layers.27.attention.out.weight', 'layers.27.attention.q_norm.weight', 'layers.27.attention.k_norm.weight', 'layers.28.attention.qkv.weight', 'layers.28.attention.out.weight', 'layers.28.attention.q_norm.weight', 'layers.28.attention.k_norm.weight', 'layers.29.attention.qkv.weight', 'layers.29.attention.out.weight', 'layers.29.attention.q_norm.weight', 'layers.29.attention.k_norm.weight', 'final_layer.linear.weight', 'final_layer.linear.bias', 'final_layer.adaLN_modulation.1.weight', 'final_layer.adaLN_modulation.1.bias']
unet unexpected: ['all_final_layer.2-1.adaLN_modulation.1.bias', 'all_final_layer.2-1.adaLN_modulation.1.weight', 'all_final_layer.2-1.linear.bias', 'all_final_layer.2-1.linear.weight', 'all_x_embedder.2-1.bias', 'all_x_embedder.2-1.weight', 't_embedder_2.mlp.0.bias', 't_embedder_2.mlp.0.weight', 't_embedder_2.mlp.2.bias', 't_embedder_2.mlp.2.weight', 'noise_refiner.0.attention.norm_k.weight', 'noise_refiner.0.attention.norm_q.weight', 'noise_refiner.0.attention.to_k.weight', 'noise_refiner.0.attention.to_out.0.weight', 'noise_refiner.0.attention.to_q.weight', 'noise_refiner.0.attention.to_v.weight', 'noise_refiner.1.attention.norm_k.weight', 'noise_refiner.1.attention.norm_q.weight', 'noise_refiner.1.attention.to_k.weight', 'noise_refiner.1.attention.to_out.0.weight', 'noise_refiner.1.attention.to_q.weight', 'noise_refiner.1.attention.to_v.weight', 'context_refiner.0.attention.norm_k.weight', 'context_refiner.0.attention.norm_q.weight', 'context_refiner.0.attention.to_k.weight', 'context_refiner.0.attention.to_out.0.weight', 'context_refiner.0.attention.to_q.weight', 'context_refiner.0.attention.to_v.weight', 'context_refiner.1.attention.norm_k.weight', 'context_refiner.1.attention.norm_q.weight', 'context_refiner.1.attention.to_k.weight', 'context_refiner.1.attention.to_out.0.weight', 'context_refiner.1.attention.to_q.weight', 'context_refiner.1.attention.to_v.weight', 'layers.0.attention.norm_k.weight', 'layers.0.attention.norm_q.weight', 'layers.0.attention.to_k.weight', 'layers.0.attention.to_out.0.weight', 'layers.0.attention.to_q.weight', 'layers.0.attention.to_v.weight', 'layers.1.attention.norm_k.weight', 'layers.1.attention.norm_q.weight', 'layers.1.attention.to_k.weight', 'layers.1.attention.to_out.0.weight', 'layers.1.attention.to_q.weight', 'layers.1.attention.to_v.weight', 'layers.2.attention.norm_k.weight', 'layers.2.attention.norm_q.weight', 'layers.2.attention.to_k.weight', 'layers.2.attention.to_out.0.weight', 'layers.2.attention.to_q.weight', 'layers.2.attention.to_v.weight', 'layers.3.attention.norm_k.weight', 'layers.3.attention.norm_q.weight', 'layers.3.attention.to_k.weight', 'layers.3.attention.to_out.0.weight', 'layers.3.attention.to_q.weight', 'layers.3.attention.to_v.weight', 'layers.4.attention.norm_k.weight', 'layers.4.attention.norm_q.weight', 'layers.4.attention.to_k.weight', 'layers.4.attention.to_out.0.weight', 'layers.4.attention.to_q.weight', 'layers.4.attention.to_v.weight', 'layers.5.attention.norm_k.weight', 'layers.5.attention.norm_q.weight', 'layers.5.attention.to_k.weight', 'layers.5.attention.to_out.0.weight', 'layers.5.attention.to_q.weight', 'layers.5.attention.to_v.weight', 'layers.6.attention.norm_k.weight', 'layers.6.attention.norm_q.weight', 'layers.6.attention.to_k.weight', 'layers.6.attention.to_out.0.weight', 'layers.6.attention.to_q.weight', 'layers.6.attention.to_v.weight', 'layers.7.attention.norm_k.weight', 'layers.7.attention.norm_q.weight', 'layers.7.attention.to_k.weight', 'layers.7.attention.to_out.0.weight', 'layers.7.attention.to_q.weight', 'layers.7.attention.to_v.weight', 'layers.8.attention.norm_k.weight', 'layers.8.attention.norm_q.weight', 'layers.8.attention.to_k.weight', 'layers.8.attention.to_out.0.weight', 'layers.8.attention.to_q.weight', 'layers.8.attention.to_v.weight', 'layers.9.attention.norm_k.weight', 'layers.9.attention.norm_q.weight', 'layers.9.attention.to_k.weight', 'layers.9.attention.to_out.0.weight', 'layers.9.attention.to_q.weight', 'layers.9.attention.to_v.weight', 'layers.10.attention.norm_k.weight', 
'layers.10.attention.norm_q.weight', 'layers.10.attention.to_k.weight', 'layers.10.attention.to_out.0.weight', 'layers.10.attention.to_q.weight', 'layers.10.attention.to_v.weight', 'layers.11.attention.norm_k.weight', 'layers.11.attention.norm_q.weight', 'layers.11.attention.to_k.weight', 'layers.11.attention.to_out.0.weight', 'layers.11.attention.to_q.weight', 'layers.11.attention.to_v.weight', 'layers.12.attention.norm_k.weight', 'layers.12.attention.norm_q.weight', 'layers.12.attention.to_k.weight', 'layers.12.attention.to_out.0.weight', 'layers.12.attention.to_q.weight', 'layers.12.attention.to_v.weight', 'layers.13.attention.norm_k.weight', 'layers.13.attention.norm_q.weight', 'layers.13.attention.to_k.weight', 'layers.13.attention.to_out.0.weight', 'layers.13.attention.to_q.weight', 'layers.13.attention.to_v.weight', 'layers.14.attention.norm_k.weight', 'layers.14.attention.norm_q.weight', 'layers.14.attention.to_k.weight', 'layers.14.attention.to_out.0.weight', 'layers.14.attention.to_q.weight', 'layers.14.attention.to_v.weight', 'layers.15.attention.norm_k.weight', 'layers.15.attention.norm_q.weight', 'layers.15.attention.to_k.weight', 'layers.15.attention.to_out.0.weight', 'layers.15.attention.to_q.weight', 'layers.15.attention.to_v.weight', 'layers.16.attention.norm_k.weight', 'layers.16.attention.norm_q.weight', 'layers.16.attention.to_k.weight', 'layers.16.attention.to_out.0.weight', 'layers.16.attention.to_q.weight', 'layers.16.attention.to_v.weight', 'layers.17.attention.norm_k.weight', 'layers.17.attention.norm_q.weight', 'layers.17.attention.to_k.weight', 'layers.17.attention.to_out.0.weight', 'layers.17.attention.to_q.weight', 'layers.17.attention.to_v.weight', 'layers.18.attention.norm_k.weight', 'layers.18.attention.norm_q.weight', 'layers.18.attention.to_k.weight', 'layers.18.attention.to_out.0.weight', 'layers.18.attention.to_q.weight', 'layers.18.attention.to_v.weight', 'layers.19.attention.norm_k.weight', 'layers.19.attention.norm_q.weight', 'layers.19.attention.to_k.weight', 'layers.19.attention.to_out.0.weight', 'layers.19.attention.to_q.weight', 'layers.19.attention.to_v.weight', 'layers.20.attention.norm_k.weight', 'layers.20.attention.norm_q.weight', 'layers.20.attention.to_k.weight', 'layers.20.attention.to_out.0.weight', 'layers.20.attention.to_q.weight', 'layers.20.attention.to_v.weight', 'layers.21.attention.norm_k.weight', 'layers.21.attention.norm_q.weight', 'layers.21.attention.to_k.weight', 'layers.21.attention.to_out.0.weight', 'layers.21.attention.to_q.weight', 'layers.21.attention.to_v.weight', 'layers.22.attention.norm_k.weight', 'layers.22.attention.norm_q.weight', 'layers.22.attention.to_k.weight', 'layers.22.attention.to_out.0.weight', 'layers.22.attention.to_q.weight', 'layers.22.attention.to_v.weight', 'layers.23.attention.norm_k.weight', 'layers.23.attention.norm_q.weight', 'layers.23.attention.to_k.weight', 'layers.23.attention.to_out.0.weight', 'layers.23.attention.to_q.weight', 'layers.23.attention.to_v.weight', 'layers.24.attention.norm_k.weight', 'layers.24.attention.norm_q.weight', 'layers.24.attention.to_k.weight', 'layers.24.attention.to_out.0.weight', 'layers.24.attention.to_q.weight', 'layers.24.attention.to_v.weight', 'layers.25.attention.norm_k.weight', 'layers.25.attention.norm_q.weight', 'layers.25.attention.to_k.weight', 'layers.25.attention.to_out.0.weight', 'layers.25.attention.to_q.weight', 'layers.25.attention.to_v.weight', 'layers.26.attention.norm_k.weight', 'layers.26.attention.norm_q.weight', 
'layers.26.attention.to_k.weight', 'layers.26.attention.to_out.0.weight', 'layers.26.attention.to_q.weight', 'layers.26.attention.to_v.weight', 'layers.27.attention.norm_k.weight', 'layers.27.attention.norm_q.weight', 'layers.27.attention.to_k.weight', 'layers.27.attention.to_out.0.weight', 'layers.27.attention.to_q.weight', 'layers.27.attention.to_v.weight', 'layers.28.attention.norm_k.weight', 'layers.28.attention.norm_q.weight', 'layers.28.attention.to_k.weight', 'layers.28.attention.to_out.0.weight', 'layers.28.attention.to_q.weight', 'layers.28.attention.to_v.weight', 'layers.29.attention.norm_k.weight', 'layers.29.attention.norm_q.weight', 'layers.29.attention.to_k.weight', 'layers.29.attention.to_out.0.weight', 'layers.29.attention.to_q.weight', 'layers.29.attention.to_v.weight']
Requested to load Lumina2
I'm hoping to see a more stable, well-supported ComfyUI workflow soon for inclusionAI's TwinFlow-Z-Image-Turbo: support for quantized GGUF Z-Image-Turbo versions, LoRA support, and of course the one- or few-step image generation it is meant for, still producing decent images with okay world knowledge even if it isn't perfect, plus all the customizability that default Z-Image-Turbo has, such as specific pixel resolutions.
Please update when this stable ComfyUI workflow is ready.
We are trying to integrate our sampler into diffusers; will this also help ComfyUI users?
A bit above my "pay grade", no idea about that ;-) maybe someone else does...
But I saw it was requested as a new feature for ComfyUI. Maybe you want to post something there: https://github.com/comfyanonymous/ComfyUI/issues/11424
And thanks for the models, I'm getting some nice images (despite it not being fully working yet in Comfy).
(I tried both TwinFlow Z-Image and TwinFlow Qwen-Image.)
If better support means even better images, that sounds promising for sure ;-)
We are trying to integrate our sampler into diffusers; will this also help ComfyUI users?
Yes, I've noticed that you're using a rather unique sampler method. This model's sampling relies on an additional target time embedder. To allow ComfyUI users to use your model correctly, you need to provide some proper ComfyUI nodes, or submit your sample code directly to ComfyUI (I've submitted code to ComfyUI before, and this process can take a while; you'll need to write the code according to ComfyUI's coding style and guidelines).
A better approach is to provide a set of custom nodes for users. I've actually already written a set, which I will open-source in the next couple of days. There are also some ComfyUI node implementations that fully integrate your implementation, such as:
https://github.com/smthemex/ComfyUI_TwinFlow
Although smthemex's nodes work, they haven't implemented the nodes according to ComfyUI's style, so their model output and input are specific to TwinFlow. This isn't a good implementation method and has many drawbacks, such as not being able to use LoRA nodes, etc.
In short, I will open-source a set of relatively standard TwinFlow ComfyUI nodes that I've written. You can then review the corresponding implementations.
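For reference, by "ComfyUI's style" I mean plain node classes that expose INPUT_TYPES/RETURN_TYPES and pass around the standard MODEL object, so stock LoRA loaders and sampler nodes keep working downstream. Roughly a skeleton like the following (the class and parameter names here are only illustrative, not my actual nodes):

# Illustrative skeleton only -- names and parameters are hypothetical.
class TwinFlowSamplerSelect:
    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "model": ("MODEL",),   # standard ComfyUI model object, so LoRA
                                       # loaders etc. can be applied upstream
                "sampling_style": (["few", "any", "mul"],),
                "steps": ("INT", {"default": 4, "min": 1, "max": 100}),
            }
        }

    RETURN_TYPES = ("MODEL",)
    FUNCTION = "patch"
    CATEGORY = "sampling/twinflow"

    def patch(self, model, sampling_style, steps):
        m = model.clone()  # patch a clone, per ComfyUI convention
        # ... attach the TwinFlow target-time sampling logic here ...
        return (m,)


NODE_CLASS_MAPPINGS = {"TwinFlowSamplerSelect": TwinFlowSamplerSelect}
NODE_DISPLAY_NAME_MAPPINGS = {"TwinFlowSamplerSelect": "TwinFlow Sampler Select"}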
Here are some discussions:
https://huggingface.co/azazeal2/TwinFlow-Z-Image-Turbo-repacked/discussions/1
@Jianqiao1 That is great! Thank you very much for your effort in adapting TwinFlow to ComfyUI.
A TwinFlow-accelerated model is actually an any-step model, which means it can be sampled with an arbitrary number of steps, allowing flexible speed/quality trade-offs.
We support 3 sampling styles: few-step (currently used), any-step, and classic multi-step. Our latest sampler supports switching between these styles.
L429 - L444:
if sampling_style == "few":
    t_tgt = torch.zeros_like(t_cur)
elif sampling_style == "mul":
    t_tgt = t_cur
elif sampling_style == "any":
    t_tgt = t_next
else:
    raise ValueError(f"Unknown sampling style: {sampling_style}")
x_hat, z_hat, _, _ = self.forward(
    sampling_model,
    x_cur.to(input_dtype),
    t_cur.to(input_dtype),
    t_tgt.to(input_dtype),
    **model_kwargs,
)
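Roughly, the surrounding loop looks like this (a simplified sketch, not the exact code; names like sigmas and forward and the final state update are simplified here, the real loop is at L429-L444 of unified_sampler.py):

import torch

def twinflow_sample(forward, sampling_model, x_init, sigmas,
                    sampling_style="few", input_dtype=torch.bfloat16,
                    **model_kwargs):
    """Simplified driver loop around the style-selection logic above.
    `forward` stands in for the sampler's self.forward; `sigmas` is a
    decreasing list of floats ending at 0.0."""
    x_cur = x_init
    for i in range(len(sigmas) - 1):
        t_cur = torch.full((x_cur.shape[0],), sigmas[i], device=x_cur.device)
        t_next = torch.full((x_cur.shape[0],), sigmas[i + 1], device=x_cur.device)

        if sampling_style == "few":      # jump toward t = 0 in one shot
            t_tgt = torch.zeros_like(t_cur)
        elif sampling_style == "mul":    # classic multi-step: target == current
            t_tgt = t_cur
        elif sampling_style == "any":    # any-step: target the next grid point
            t_tgt = t_next
        else:
            raise ValueError(f"Unknown sampling style: {sampling_style}")

        x_hat, z_hat, _, _ = forward(
            sampling_model,
            x_cur.to(input_dtype),
            t_cur.to(input_dtype),
            t_tgt.to(input_dtype),
            **model_kwargs,
        )
        # Placeholder update: the real sampler combines x_hat / z_hat
        # depending on the style and on sampling_order.
        x_cur = x_hat
    return x_cur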
Besides, due to an initial uploading issue, for Qwen-Image you need to modify this line in my code (diffusers_patch/transformer_qwenimage.py):
temb = temb + temb_2 * timestep.unsqueeze(1)
into
temb = temb + temb_2 * (timestep - target_timestep).unsqueeze(1)
to support all sampling styles.
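To see why this supports all styles, here is a tiny dummy check (only the patched line is real; the tensors are placeholders):

import torch

B, D = 2, 8
temb = torch.randn(B, D)       # stand-in for the base timestep embedding
temb_2 = torch.randn(B, D)     # stand-in for the extra target-time branch
timestep = torch.tensor([0.7, 0.7])

# "mul" style: target_timestep == timestep, so the extra branch cancels out
target_timestep = timestep.clone()
out_mul = temb + temb_2 * (timestep - target_timestep).unsqueeze(1)
assert torch.allclose(out_mul, temb)   # plain multi-step behaviour recovered

# "few" style: target_timestep == 0, so the branch is scaled by the full gap
target_timestep = torch.zeros_like(timestep)
out_few = temb + temb_2 * (timestep - target_timestep).unsqueeze(1)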
Yes, I noticed this. The "few" setting works best with a small number of steps, while "any" works better with more steps. I'm not sure when "mul" takes effect, but it produces the worst results, almost identical to ComfyUI's default KSampler, with a lot of jagged edges in the details.
Additionally, the "few" mode with qwen_image shows some oversampling with multiple steps, resulting in a noticeable increase in image contrast, but this is likely expected. Strangely, z-image doesn't seem to have this problem, or perhaps it does, but it's not as noticeable.
Another point is the sampling method with sampling order = 2. I understand you intended to implement a second-order Heun solver, but it doesn't seem to be used in your code. I tried the original method and encountered some numerical overflows, resulting in strange image colors. Therefore, I implemented a similar method myself, which doesn't require introducing more noise with stochast_ratio when using the Heun method.
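For clarity, the kind of deterministic second-order step I mean looks roughly like this (a generic Heun sketch assuming the model returns a flow-matching velocity; it is not the code from unified_sampler.py):

def heun_step(velocity_fn, x_cur, t_cur, t_next):
    """One deterministic second-order Heun step for a flow-matching ODE
    dx/dt = v(x, t). t_cur and t_next are scalars; no extra noise
    (stochast_ratio) is needed."""
    dt = t_next - t_cur
    v_cur = velocity_fn(x_cur, t_cur)            # slope at the current point
    x_pred = x_cur + dt * v_cur                  # Euler predictor
    if float(t_next) == 0.0:                     # final step: keep the Euler result
        return x_pred
    v_next = velocity_fn(x_pred, t_next)         # slope at the predicted point
    return x_cur + dt * 0.5 * (v_cur + v_next)   # trapezoidal corrector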
TwinFlowScheduler: good timer values for a, b, c are a = 1.1-1.27, b = 0.8-0.85, c = 1.0-1.5 (higher values give a blurry image).
I tune it like this: I write the prompt "A close-up of a wall clock with Arabic numerals, showing hour, minute, and second hands and the brand name "BuBaLoM"", turn off the random generator (fixed seed), and then dial in the numbers.
Yes. The few setting is suitable for low-step sampling, typically under 8 steps; any can be used for both low-step and multi-step sampling, but usually requires >=4 steps, while mul is the standard Euler scheduler, where >=8 steps is generally more reasonable.
TwinFlowScheduler: good timer values for a, b, c are a = 1.1-1.27, b = 0.8-0.85, c = 1.0-1.5 (higher values give a blurry image).
Yes, [1.17, 0.8, 1.1] are theoretical values that can be used for most cases. 👏🏻👏🏻
'a' seems to be related to the overall composition, 'b' seems to be related to details, and 'c' seems to be related to lighting and shadows. Here, I've defaulted to linear values of 1.0, 1.0, and 1.0, but users can adjust these interesting parameters.
Mathematically, a and b are the parameters that control the beta distribution, and c is a shifting parameter.
Yes. The few setting is suitable for low-step sampling, typically under 8 steps; any can be used for both low-step and multi-step sampling, but usually requires >=4 steps, while mul is the standard Euler scheduler, where >=8 steps is generally more reasonable.
Don't forget that these are new models. Their high processing and generation speed at 5-8 steps strongly depends on the shape of the noise-level (sigma) distribution. A linear distribution is inefficient in this case, so the code uses the Kumaraswamy transform to create a non-linear schedule.
It's like driving at 200 km/h through a city of narrow alleys.
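Roughly, such a schedule could be built like this (a sketch assuming a and b are the Kumaraswamy shape parameters and c is applied as a timestep shift; the exact formula in TwinFlowScheduler may differ):

import torch

def twinflow_like_sigmas(steps, a=1.17, b=0.8, c=1.1):
    """Sketch of a non-linear sigma schedule: a uniform grid warped by the
    Kumaraswamy CDF (shape parameters a, b), then shifted by c. This is a
    reconstruction of the idea, not the scheduler's actual code."""
    t = torch.linspace(1.0, 0.0, steps + 1)          # uniform grid, 1 -> 0
    s = 1.0 - (1.0 - t.clamp(0, 1) ** a) ** b        # Kumaraswamy CDF warp
    sigmas = c * s / (1.0 + (c - 1.0) * s)           # flow-style shift by c
    return sigmas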