No ComfyUI support?
Latest ComfyUI gives a black screen on generation :(
Here's a version compatible with ComfyUI, and while it can generate images in 2-3 steps, the results are relatively poor, especially at high resolutions.
https://huggingface.co/azazeal2/TwinFlow-Z-Image-Turbo-repacked/
Seems to work fine in Comfy here.
https://huggingface.co/azazeal2/TwinFlow-Z-Image-Turbo-repacked/tree/main/ComfyUI
I did use a bit more steps, 6 or so. But still super fast ;-)
Standard res. For even higher res, probably just add some more steps.
@Jianqiao1 Hi, for very low inference step counts, it is recommended to use our unified sampler: https://github.com/inclusionAI/TwinFlow/blob/main/unified_sampler.py. But I've noticed that most ComfyUI users use KSampler or other samplers, which may produce poor images.
We are trying to integrate our sampler into diffusers; will this also help ComfyUI users?
The latest version of ComfyUI throws this error:
model_type FLOW
unet missing: ['x_embedder.weight', 'x_embedder.bias', 'noise_refiner.0.attention.qkv.weight', 'noise_refiner.0.attention.out.weight', 'noise_refiner.0.attention.q_norm.weight', 'noise_refiner.0.attention.k_norm.weight', 'noise_refiner.1.attention.qkv.weight', 'noise_refiner.1.attention.out.weight', 'noise_refiner.1.attention.q_norm.weight', 'noise_refiner.1.attention.k_norm.weight', 'context_refiner.0.attention.qkv.weight', 'context_refiner.0.attention.out.weight', 'context_refiner.0.attention.q_norm.weight', 'context_refiner.0.attention.k_norm.weight', 'context_refiner.1.attention.qkv.weight', 'context_refiner.1.attention.out.weight', 'context_refiner.1.attention.q_norm.weight', 'context_refiner.1.attention.k_norm.weight', 'layers.0.attention.qkv.weight', 'layers.0.attention.out.weight', 'layers.0.attention.q_norm.weight', 'layers.0.attention.k_norm.weight', 'layers.1.attention.qkv.weight', 'layers.1.attention.out.weight', 'layers.1.attention.q_norm.weight', 'layers.1.attention.k_norm.weight', 'layers.2.attention.qkv.weight', 'layers.2.attention.out.weight', 'layers.2.attention.q_norm.weight', 'layers.2.attention.k_norm.weight', 'layers.3.attention.qkv.weight', 'layers.3.attention.out.weight', 'layers.3.attention.q_norm.weight', 'layers.3.attention.k_norm.weight', 'layers.4.attention.qkv.weight', 'layers.4.attention.out.weight', 'layers.4.attention.q_norm.weight', 'layers.4.attention.k_norm.weight', 'layers.5.attention.qkv.weight', 'layers.5.attention.out.weight', 'layers.5.attention.q_norm.weight', 'layers.5.attention.k_norm.weight', 'layers.6.attention.qkv.weight', 'layers.6.attention.out.weight', 'layers.6.attention.q_norm.weight', 'layers.6.attention.k_norm.weight', 'layers.7.attention.qkv.weight', 'layers.7.attention.out.weight', 'layers.7.attention.q_norm.weight', 'layers.7.attention.k_norm.weight', 'layers.8.attention.qkv.weight', 'layers.8.attention.out.weight', 'layers.8.attention.q_norm.weight', 'layers.8.attention.k_norm.weight', 'layers.9.attention.qkv.weight', 'layers.9.attention.out.weight', 'layers.9.attention.q_norm.weight', 'layers.9.attention.k_norm.weight', 'layers.10.attention.qkv.weight', 'layers.10.attention.out.weight', 'layers.10.attention.q_norm.weight', 'layers.10.attention.k_norm.weight', 'layers.11.attention.qkv.weight', 'layers.11.attention.out.weight', 'layers.11.attention.q_norm.weight', 'layers.11.attention.k_norm.weight', 'layers.12.attention.qkv.weight', 'layers.12.attention.out.weight', 'layers.12.attention.q_norm.weight', 'layers.12.attention.k_norm.weight', 'layers.13.attention.qkv.weight', 'layers.13.attention.out.weight', 'layers.13.attention.q_norm.weight', 'layers.13.attention.k_norm.weight', 'layers.14.attention.qkv.weight', 'layers.14.attention.out.weight', 'layers.14.attention.q_norm.weight', 'layers.14.attention.k_norm.weight', 'layers.15.attention.qkv.weight', 'layers.15.attention.out.weight', 'layers.15.attention.q_norm.weight', 'layers.15.attention.k_norm.weight', 'layers.16.attention.qkv.weight', 'layers.16.attention.out.weight', 'layers.16.attention.q_norm.weight', 'layers.16.attention.k_norm.weight', 'layers.17.attention.qkv.weight', 'layers.17.attention.out.weight', 'layers.17.attention.q_norm.weight', 'layers.17.attention.k_norm.weight', 'layers.18.attention.qkv.weight', 'layers.18.attention.out.weight', 'layers.18.attention.q_norm.weight', 'layers.18.attention.k_norm.weight', 'layers.19.attention.qkv.weight', 'layers.19.attention.out.weight', 'layers.19.attention.q_norm.weight', 'layers.19.attention.k_norm.weight', 
'layers.20.attention.qkv.weight', 'layers.20.attention.out.weight', 'layers.20.attention.q_norm.weight', 'layers.20.attention.k_norm.weight', 'layers.21.attention.qkv.weight', 'layers.21.attention.out.weight', 'layers.21.attention.q_norm.weight', 'layers.21.attention.k_norm.weight', 'layers.22.attention.qkv.weight', 'layers.22.attention.out.weight', 'layers.22.attention.q_norm.weight', 'layers.22.attention.k_norm.weight', 'layers.23.attention.qkv.weight', 'layers.23.attention.out.weight', 'layers.23.attention.q_norm.weight', 'layers.23.attention.k_norm.weight', 'layers.24.attention.qkv.weight', 'layers.24.attention.out.weight', 'layers.24.attention.q_norm.weight', 'layers.24.attention.k_norm.weight', 'layers.25.attention.qkv.weight', 'layers.25.attention.out.weight', 'layers.25.attention.q_norm.weight', 'layers.25.attention.k_norm.weight', 'layers.26.attention.qkv.weight', 'layers.26.attention.out.weight', 'layers.26.attention.q_norm.weight', 'layers.26.attention.k_norm.weight', 'layers.27.attention.qkv.weight', 'layers.27.attention.out.weight', 'layers.27.attention.q_norm.weight', 'layers.27.attention.k_norm.weight', 'layers.28.attention.qkv.weight', 'layers.28.attention.out.weight', 'layers.28.attention.q_norm.weight', 'layers.28.attention.k_norm.weight', 'layers.29.attention.qkv.weight', 'layers.29.attention.out.weight', 'layers.29.attention.q_norm.weight', 'layers.29.attention.k_norm.weight', 'final_layer.linear.weight', 'final_layer.linear.bias', 'final_layer.adaLN_modulation.1.weight', 'final_layer.adaLN_modulation.1.bias']
unet unexpected: ['all_final_layer.2-1.adaLN_modulation.1.bias', 'all_final_layer.2-1.adaLN_modulation.1.weight', 'all_final_layer.2-1.linear.bias', 'all_final_layer.2-1.linear.weight', 'all_x_embedder.2-1.bias', 'all_x_embedder.2-1.weight', 't_embedder_2.mlp.0.bias', 't_embedder_2.mlp.0.weight', 't_embedder_2.mlp.2.bias', 't_embedder_2.mlp.2.weight', 'noise_refiner.0.attention.norm_k.weight', 'noise_refiner.0.attention.norm_q.weight', 'noise_refiner.0.attention.to_k.weight', 'noise_refiner.0.attention.to_out.0.weight', 'noise_refiner.0.attention.to_q.weight', 'noise_refiner.0.attention.to_v.weight', 'noise_refiner.1.attention.norm_k.weight', 'noise_refiner.1.attention.norm_q.weight', 'noise_refiner.1.attention.to_k.weight', 'noise_refiner.1.attention.to_out.0.weight', 'noise_refiner.1.attention.to_q.weight', 'noise_refiner.1.attention.to_v.weight', 'context_refiner.0.attention.norm_k.weight', 'context_refiner.0.attention.norm_q.weight', 'context_refiner.0.attention.to_k.weight', 'context_refiner.0.attention.to_out.0.weight', 'context_refiner.0.attention.to_q.weight', 'context_refiner.0.attention.to_v.weight', 'context_refiner.1.attention.norm_k.weight', 'context_refiner.1.attention.norm_q.weight', 'context_refiner.1.attention.to_k.weight', 'context_refiner.1.attention.to_out.0.weight', 'context_refiner.1.attention.to_q.weight', 'context_refiner.1.attention.to_v.weight', 'layers.0.attention.norm_k.weight', 'layers.0.attention.norm_q.weight', 'layers.0.attention.to_k.weight', 'layers.0.attention.to_out.0.weight', 'layers.0.attention.to_q.weight', 'layers.0.attention.to_v.weight', 'layers.1.attention.norm_k.weight', 'layers.1.attention.norm_q.weight', 'layers.1.attention.to_k.weight', 'layers.1.attention.to_out.0.weight', 'layers.1.attention.to_q.weight', 'layers.1.attention.to_v.weight', 'layers.2.attention.norm_k.weight', 'layers.2.attention.norm_q.weight', 'layers.2.attention.to_k.weight', 'layers.2.attention.to_out.0.weight', 'layers.2.attention.to_q.weight', 'layers.2.attention.to_v.weight', 'layers.3.attention.norm_k.weight', 'layers.3.attention.norm_q.weight', 'layers.3.attention.to_k.weight', 'layers.3.attention.to_out.0.weight', 'layers.3.attention.to_q.weight', 'layers.3.attention.to_v.weight', 'layers.4.attention.norm_k.weight', 'layers.4.attention.norm_q.weight', 'layers.4.attention.to_k.weight', 'layers.4.attention.to_out.0.weight', 'layers.4.attention.to_q.weight', 'layers.4.attention.to_v.weight', 'layers.5.attention.norm_k.weight', 'layers.5.attention.norm_q.weight', 'layers.5.attention.to_k.weight', 'layers.5.attention.to_out.0.weight', 'layers.5.attention.to_q.weight', 'layers.5.attention.to_v.weight', 'layers.6.attention.norm_k.weight', 'layers.6.attention.norm_q.weight', 'layers.6.attention.to_k.weight', 'layers.6.attention.to_out.0.weight', 'layers.6.attention.to_q.weight', 'layers.6.attention.to_v.weight', 'layers.7.attention.norm_k.weight', 'layers.7.attention.norm_q.weight', 'layers.7.attention.to_k.weight', 'layers.7.attention.to_out.0.weight', 'layers.7.attention.to_q.weight', 'layers.7.attention.to_v.weight', 'layers.8.attention.norm_k.weight', 'layers.8.attention.norm_q.weight', 'layers.8.attention.to_k.weight', 'layers.8.attention.to_out.0.weight', 'layers.8.attention.to_q.weight', 'layers.8.attention.to_v.weight', 'layers.9.attention.norm_k.weight', 'layers.9.attention.norm_q.weight', 'layers.9.attention.to_k.weight', 'layers.9.attention.to_out.0.weight', 'layers.9.attention.to_q.weight', 'layers.9.attention.to_v.weight', 'layers.10.attention.norm_k.weight', 
'layers.10.attention.norm_q.weight', 'layers.10.attention.to_k.weight', 'layers.10.attention.to_out.0.weight', 'layers.10.attention.to_q.weight', 'layers.10.attention.to_v.weight', 'layers.11.attention.norm_k.weight', 'layers.11.attention.norm_q.weight', 'layers.11.attention.to_k.weight', 'layers.11.attention.to_out.0.weight', 'layers.11.attention.to_q.weight', 'layers.11.attention.to_v.weight', 'layers.12.attention.norm_k.weight', 'layers.12.attention.norm_q.weight', 'layers.12.attention.to_k.weight', 'layers.12.attention.to_out.0.weight', 'layers.12.attention.to_q.weight', 'layers.12.attention.to_v.weight', 'layers.13.attention.norm_k.weight', 'layers.13.attention.norm_q.weight', 'layers.13.attention.to_k.weight', 'layers.13.attention.to_out.0.weight', 'layers.13.attention.to_q.weight', 'layers.13.attention.to_v.weight', 'layers.14.attention.norm_k.weight', 'layers.14.attention.norm_q.weight', 'layers.14.attention.to_k.weight', 'layers.14.attention.to_out.0.weight', 'layers.14.attention.to_q.weight', 'layers.14.attention.to_v.weight', 'layers.15.attention.norm_k.weight', 'layers.15.attention.norm_q.weight', 'layers.15.attention.to_k.weight', 'layers.15.attention.to_out.0.weight', 'layers.15.attention.to_q.weight', 'layers.15.attention.to_v.weight', 'layers.16.attention.norm_k.weight', 'layers.16.attention.norm_q.weight', 'layers.16.attention.to_k.weight', 'layers.16.attention.to_out.0.weight', 'layers.16.attention.to_q.weight', 'layers.16.attention.to_v.weight', 'layers.17.attention.norm_k.weight', 'layers.17.attention.norm_q.weight', 'layers.17.attention.to_k.weight', 'layers.17.attention.to_out.0.weight', 'layers.17.attention.to_q.weight', 'layers.17.attention.to_v.weight', 'layers.18.attention.norm_k.weight', 'layers.18.attention.norm_q.weight', 'layers.18.attention.to_k.weight', 'layers.18.attention.to_out.0.weight', 'layers.18.attention.to_q.weight', 'layers.18.attention.to_v.weight', 'layers.19.attention.norm_k.weight', 'layers.19.attention.norm_q.weight', 'layers.19.attention.to_k.weight', 'layers.19.attention.to_out.0.weight', 'layers.19.attention.to_q.weight', 'layers.19.attention.to_v.weight', 'layers.20.attention.norm_k.weight', 'layers.20.attention.norm_q.weight', 'layers.20.attention.to_k.weight', 'layers.20.attention.to_out.0.weight', 'layers.20.attention.to_q.weight', 'layers.20.attention.to_v.weight', 'layers.21.attention.norm_k.weight', 'layers.21.attention.norm_q.weight', 'layers.21.attention.to_k.weight', 'layers.21.attention.to_out.0.weight', 'layers.21.attention.to_q.weight', 'layers.21.attention.to_v.weight', 'layers.22.attention.norm_k.weight', 'layers.22.attention.norm_q.weight', 'layers.22.attention.to_k.weight', 'layers.22.attention.to_out.0.weight', 'layers.22.attention.to_q.weight', 'layers.22.attention.to_v.weight', 'layers.23.attention.norm_k.weight', 'layers.23.attention.norm_q.weight', 'layers.23.attention.to_k.weight', 'layers.23.attention.to_out.0.weight', 'layers.23.attention.to_q.weight', 'layers.23.attention.to_v.weight', 'layers.24.attention.norm_k.weight', 'layers.24.attention.norm_q.weight', 'layers.24.attention.to_k.weight', 'layers.24.attention.to_out.0.weight', 'layers.24.attention.to_q.weight', 'layers.24.attention.to_v.weight', 'layers.25.attention.norm_k.weight', 'layers.25.attention.norm_q.weight', 'layers.25.attention.to_k.weight', 'layers.25.attention.to_out.0.weight', 'layers.25.attention.to_q.weight', 'layers.25.attention.to_v.weight', 'layers.26.attention.norm_k.weight', 'layers.26.attention.norm_q.weight', 
'layers.26.attention.to_k.weight', 'layers.26.attention.to_out.0.weight', 'layers.26.attention.to_q.weight', 'layers.26.attention.to_v.weight', 'layers.27.attention.norm_k.weight', 'layers.27.attention.norm_q.weight', 'layers.27.attention.to_k.weight', 'layers.27.attention.to_out.0.weight', 'layers.27.attention.to_q.weight', 'layers.27.attention.to_v.weight', 'layers.28.attention.norm_k.weight', 'layers.28.attention.norm_q.weight', 'layers.28.attention.to_k.weight', 'layers.28.attention.to_out.0.weight', 'layers.28.attention.to_q.weight', 'layers.28.attention.to_v.weight', 'layers.29.attention.norm_k.weight', 'layers.29.attention.norm_q.weight', 'layers.29.attention.to_k.weight', 'layers.29.attention.to_out.0.weight', 'layers.29.attention.to_q.weight', 'layers.29.attention.to_v.weight']
Requested to load Lumina2
I'm hoping to see a more stable, well-supported ComfyUI workflow soon for inclusionAI's TwinFlow-Z-Image-Turbo: support for quantized GGUF Z-Image-Turbo versions, LoRA support, and of course the one- or few-step image generation it is meant for, still producing decent images with okay world knowledge even if it isn't perfect, plus all the customizability that default Z-Image-Turbo has, such as specific pixel resolutions.
Please update when this stable ComfyUI workflow is ready.
We are trying to integrate our sampler into diffusers; will this also help ComfyUI users?
A bit above my "pay grade", no idea about that ;-) maybe someone else does...
But I saw it was requested as a new feature for ComfyUI. Maybe you want to post something there: https://github.com/comfyanonymous/ComfyUI/issues/11424
And thanks for the models, I'm getting some nice images (despite it not being fully working yet in Comfy).
(I tried both TwinFlow Z-Image and TwinFlow Qwen-Image.)
If better support means even better images, that sounds promising for sure ;-)
We are trying to integrate our sampler into diffusers; will this also help ComfyUI users?
Yes, I've noticed that you're using a rather unique sampler method. This model's sampling relies on an additional target time embedder. To allow ComfyUI users to use your model correctly, you need to provide some proper ComfyUI nodes, or submit your sample code directly to ComfyUI (I've submitted code to ComfyUI before, and this process can take a while; you'll need to write the code according to ComfyUI's coding style and guidelines).
A better approach is to provide a set of custom nodes for users. I've actually already written a set, which I will open-source in the next couple of days. There are also some ComfyUI node implementations that fully integrate your implementation, such as:
https://github.com/smthemex/ComfyUI_TwinFlow
Although smthemex's nodes work, they haven't implemented the nodes according to ComfyUI's style, so their model output and input are specific to TwinFlow. This isn't a good implementation method and has many drawbacks, such as not being able to use LoRA nodes, etc.
In short, I will open-source a set of relatively standard TwinFlow ComfyUI nodes that I've written. You can then review the corresponding implementations.
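For reference, by "ComfyUI's style" I mean plain node classes that expose INPUT_TYPES/RETURN_TYPES and pass around the standard MODEL object, so stock LoRA loaders and sampler nodes keep working downstream. Roughly a skeleton like the following (the class and parameter names here are only illustrative, not my actual nodes):

# Illustrative skeleton only -- names and parameters are hypothetical.
class TwinFlowSamplerSelect:
    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "model": ("MODEL",),   # standard ComfyUI model object, so LoRA
                                       # loaders etc. can be applied upstream
                "sampling_style": (["few", "any", "mul"],),
                "steps": ("INT", {"default": 4, "min": 1, "max": 100}),
            }
        }

    RETURN_TYPES = ("MODEL",)
    FUNCTION = "patch"
    CATEGORY = "sampling/twinflow"

    def patch(self, model, sampling_style, steps):
        m = model.clone()  # patch a clone, per ComfyUI convention
        # ... attach the TwinFlow target-time sampling logic here ...
        return (m,)


NODE_CLASS_MAPPINGS = {"TwinFlowSamplerSelect": TwinFlowSamplerSelect}
NODE_DISPLAY_NAME_MAPPINGS = {"TwinFlowSamplerSelect": "TwinFlow Sampler Select"}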
Here are some discussions:
https://huggingface.co/azazeal2/TwinFlow-Z-Image-Turbo-repacked/discussions/1
@Jianqiao1 That is great! Thank you very much for your effort in adapting TwinFlow to ComfyUI.
A TwinFlow-accelerated model is actually an any-step model, which means it can be sampled with an arbitrary number of steps, allowing flexible speed/quality trade-offs.
We support 3 sampling styles: few-step (currently used), any-step, and classic multi-step. Our latest sampler supports switching between these styles.
L429 - L444:
if sampling_style == "few":
    t_tgt = torch.zeros_like(t_cur)
elif sampling_style == "mul":
    t_tgt = t_cur
elif sampling_style == "any":
    t_tgt = t_next
else:
    raise ValueError(f"Unknown sampling style: {sampling_style}")
x_hat, z_hat, _, _ = self.forward(
    sampling_model,
    x_cur.to(input_dtype),
    t_cur.to(input_dtype),
    t_tgt.to(input_dtype),
    **model_kwargs,
)
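Roughly, the surrounding loop looks like this (a simplified sketch, not the exact code; names like sigmas and forward and the final state update are simplified here, the real loop is at L429-L444 of unified_sampler.py):

import torch

def twinflow_sample(forward, sampling_model, x_init, sigmas,
                    sampling_style="few", input_dtype=torch.bfloat16,
                    **model_kwargs):
    """Simplified driver loop around the style-selection logic above.
    `forward` stands in for the sampler's self.forward; `sigmas` is a
    decreasing list of floats ending at 0.0."""
    x_cur = x_init
    for i in range(len(sigmas) - 1):
        t_cur = torch.full((x_cur.shape[0],), sigmas[i], device=x_cur.device)
        t_next = torch.full((x_cur.shape[0],), sigmas[i + 1], device=x_cur.device)

        if sampling_style == "few":      # jump toward t = 0 in one shot
            t_tgt = torch.zeros_like(t_cur)
        elif sampling_style == "mul":    # classic multi-step: target == current
            t_tgt = t_cur
        elif sampling_style == "any":    # any-step: target the next grid point
            t_tgt = t_next
        else:
            raise ValueError(f"Unknown sampling style: {sampling_style}")

        x_hat, z_hat, _, _ = forward(
            sampling_model,
            x_cur.to(input_dtype),
            t_cur.to(input_dtype),
            t_tgt.to(input_dtype),
            **model_kwargs,
        )
        # Placeholder update: the real sampler combines x_hat / z_hat
        # depending on the style and on sampling_order.
        x_cur = x_hat
    return x_cur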
Besides, due to an initial uploading issue, for Qwen-Image you need to modify this line in my code (diffusers_patch/transformer_qwenimage.py):
temb = temb + temb_2 * timestep.unsqueeze(1)
into
temb = temb + temb_2 * (timestep - target_timestep).unsqueeze(1)
to support all sampling styles.
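To see why this supports all styles, here is a tiny dummy check (only the patched line is real; the tensors are placeholders):

import torch

B, D = 2, 8
temb = torch.randn(B, D)       # stand-in for the base timestep embedding
temb_2 = torch.randn(B, D)     # stand-in for the extra target-time branch
timestep = torch.tensor([0.7, 0.7])

# "mul" style: target_timestep == timestep, so the extra branch cancels out
target_timestep = timestep.clone()
out_mul = temb + temb_2 * (timestep - target_timestep).unsqueeze(1)
assert torch.allclose(out_mul, temb)   # plain multi-step behaviour recovered

# "few" style: target_timestep == 0, so the branch is scaled by the full gap
target_timestep = torch.zeros_like(timestep)
out_few = temb + temb_2 * (timestep - target_timestep).unsqueeze(1)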
Yes, I noticed this. The "few" setting works best with a small number of steps, while "any" works better with more steps. I'm not sure when "mul" takes effect, but it produces the worst results, almost identical to ComfyUI's default KSampler, with a lot of jagged edges in the details.
Additionally, the "few" mode with qwen_image shows some oversampling with multiple steps, resulting in a noticeable increase in image contrast, but this is likely expected. Strangely, z-image doesn't seem to have this problem, or perhaps it does, but it's not as noticeable.
Another point is the sampling method with sampling order = 2. I understand you intended to implement a second-order Heun solver, but it doesn't seem to be used in your code. I tried the original method and encountered some numerical overflows, resulting in strange image colors. Therefore, I implemented a similar method myself, which doesn't require introducing more noise with stochast_ratio when using the Heun method.
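For clarity, the kind of deterministic second-order step I mean looks roughly like this (a generic Heun sketch assuming the model returns a flow-matching velocity; it is not the code from unified_sampler.py):

def heun_step(velocity_fn, x_cur, t_cur, t_next):
    """One deterministic second-order Heun step for a flow-matching ODE
    dx/dt = v(x, t). t_cur and t_next are scalars; no extra noise
    (stochast_ratio) is needed."""
    dt = t_next - t_cur
    v_cur = velocity_fn(x_cur, t_cur)            # slope at the current point
    x_pred = x_cur + dt * v_cur                  # Euler predictor
    if float(t_next) == 0.0:                     # final step: keep the Euler result
        return x_pred
    v_next = velocity_fn(x_pred, t_next)         # slope at the predicted point
    return x_cur + dt * 0.5 * (v_cur + v_next)   # trapezoidal corrector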
TwinFlowScheduler: good timer values for a, b, c are a = 1.1-1.27, b = 0.8-0.85, c = 1.0-1.5 (higher values give a blurry image).
I tune it like this: I write the prompt "A close-up of a wall clock with Arabic numerals, showing hour, minute, and second hands and the brand name "BuBaLoM"", turn off the random generator (fixed seed), and then dial in the numbers.
Yes. The few setting is suitable for low-step sampling, typically under 8 steps; any can be used for both low-step and multi-step sampling, but usually requires >=4 steps, while mul is the standard Euler scheduler, where >=8 steps is generally more reasonable.
TwinFlowScheduler: good timer values for a, b, c are a = 1.1-1.27, b = 0.8-0.85, c = 1.0-1.5 (higher values give a blurry image).
Yes, [1.17, 0.8, 1.1] are theoretical values that can be used for most cases. 👏🏻👏🏻
'a' seems to be related to the overall composition, 'b' seems to be related to details, and 'c' seems to be related to lighting and shadows. Here, I've defaulted to linear values of 1.0, 1.0, and 1.0, but users can adjust these interesting parameters.
Mathematically, a and b are the parameters that control the beta distribution, and c is a shifting parameter.
Yes. The few setting is suitable for low-step sampling, typically under 8 steps; any can be used for both low-step and multi-step sampling, but usually requires >=4 steps, while mul is the standard Euler scheduler, where >=8 steps is generally more reasonable.
Don't forget that these are new models. Their high processing and generation speed at 5-8 steps strongly depends on the shape of the noise-level (sigma) distribution. A linear distribution is inefficient in this case, so the code uses the Kumaraswamy transform to create a non-linear schedule.
It's like driving at 200 km/h through a city of narrow alleys.
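Roughly, such a schedule could be built like this (a sketch assuming a and b are the Kumaraswamy shape parameters and c is applied as a timestep shift; the exact formula in TwinFlowScheduler may differ):

import torch

def twinflow_like_sigmas(steps, a=1.17, b=0.8, c=1.1):
    """Sketch of a non-linear sigma schedule: a uniform grid warped by the
    Kumaraswamy CDF (shape parameters a, b), then shifted by c. This is a
    reconstruction of the idea, not the scheduler's actual code."""
    t = torch.linspace(1.0, 0.0, steps + 1)          # uniform grid, 1 -> 0
    s = 1.0 - (1.0 - t.clamp(0, 1) ** a) ** b        # Kumaraswamy CDF warp
    sigmas = c * s / (1.0 + (c - 1.0) * s)           # flow-style shift by c
    return sigmas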