Could you provide an official, working ComfyUI reference workflow? I currently cannot reproduce the quality of the example images

#12
by catmeat - opened

I used ComfyUI's built-in standard workflow with the 2.1 8-step model.
image

euler sampler + 0.7 strength

z-image-turbo_00028_

res_multistep sampler + 0.7 strength

z-image-turbo_00027_

The results are usable, but they still fall somewhat short of your example images. Other control types such as canny show a similar gap.

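Could you describe the recommended generation parameters in more detail? Could you also provide the prompts and other parameters behind the example images, so that I can reproduce them and verify my local results?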

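I looked at the sampling code in VideoX-Fun but could not follow it, and I don't know how to map it onto ComfyUI. If possible, I hope you can provide a reference example that produces good results in ComfyUI!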

Could you share the workflow? I ran 2.1 with the official workflow and got a shape error: RuntimeError: shape '[64, -1, 64, 64]' is invalid for input of size 4096
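For readers unfamiliar with this kind of error, here is a minimal sketch of the arithmetic behind it. It is not code from ComfyUI or VideoX-Fun, and `can_reshape` is a hypothetical helper written only for illustration. It assumes the usual reshape rule: at most one dimension may be `-1`, and the product of the known dimensions must divide the number of elements. The sketch does not identify which tensor in the workflow triggers the error.

```python
from math import prod


def can_reshape(numel: int, shape: list[int]) -> bool:
    """Return True if `numel` elements can be viewed as `shape`.

    Hypothetical helper: mirrors the usual reshape rule, where at most
    one dimension may be -1 and is inferred from the remaining ones.
    """
    if shape.count(-1) > 1:
        return False
    # Product of all explicitly given dimensions.
    known = prod(d for d in shape if d != -1)
    if -1 in shape:
        # The inferred dimension must be a whole number.
        return known > 0 and numel % known == 0
    return known == numel


# The known dimensions multiply to 64 * 64 * 64 = 262144, which is
# larger than 4096, so no value of the -1 dimension can work.
print(can_reshape(4096, [64, -1, 64, 64]))
```

In other words, the tensor reaching that reshape is much smaller than the shape the code expects, which is consistent with a version mismatch between the model and the installed nodes, although the sketch alone cannot confirm that.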

Alibaba-PAI org

Please use this prompt:
A photo of Sakura, a 17-year-old high school student from Japan, captured in a candid, high-fidelity cinematic moment on a rainy evening. She is squatting low on the rain-slicked asphalt of an urban sidewalk, holding a transparent vinyl umbrella with a white handle resting over her shoulder in one hand, her other hand resting on her knee. The clear plastic canopy is streaked with rivulets of water and beaded with droplets that catch the ambient city light. A profound, silent interaction defines the scene: Sakura is looking directly downward, her expression gentle and focused, locking eyes with a small black cat sitting on the wet ground in front of her.

Sakura has long, lustrous black hair styled in a precise hime cut with blunt bangs across her forehead and sidelocks framing her cheeks, damp strands clinging subtly to her jacket, with a single red ribbon tied on the left side. Her visible pores on her nose, and a soft sheen of moisture on her cheeks. She wears a dark navy sailor-style school uniform (seifuku) featuring a white collar with red linear detailing and a bright red necktie loosely knotted at the chest; a simple black choker encircles her neck. The uniform jacket has oversized sleeves. Her lower body features a short, dark pleated miniskirt that fans slightly over clean white ankle socks that provide a stark contrast to the wet asphalt, ending in dark leather loafers that gleam with moisture.

The black cat sits upright in a shallow puddle, its short fur slicked by the rain, tilting its head back to stare intently up into Sakura's face, establishing a clear line of sight. The background is anchored by a large, illuminated red vending machine standing against the darkness, its cool bluish-white interior light spilling onto Sakura's profile and the umbrella. The ground reflects the red chassis and the neon streetlights in distorted patches on the wet pavement. Additional cool rain streaks fall through the frame, some caught in sharp focus and others blurred into vertical lines against the background lights. The scene is rendered with a wide-aperture lens creating a shallow depth of field, keeping the girl and cat in sharp focus while softening the background into gentle bokeh, with the texture of fine-grain 35mm film stock.

Alibaba-PAI org

Is it enough to just use the workflow that ships with ComfyUI?


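Yes. As you can see in my screenshot, it simply loads ControlNet on top of the base model, nothing special. This is the Z-Image ControlNet workflow officially provided by ComfyUI. I don't know whether this is the correct way to use it.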


About the RuntimeError with 2.1: you need to update ComfyUI.

Alibaba-PAI org


Your usage should be correct. That said, I do almost all of my testing in VideoX-Fun. Take a look at the prompt I gave you and try it.

Alibaba-PAI org


ComfyUI should already support 2.1.

image
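I used the prompt you provided and left all other parameters unchanged. The image above compares my result with the example image. I can't pinpoint exactly what is wrong, but the texture and fineness of detail are clearly worse than in the example.
If your example images are not specially cherry-picked, I suspect the basic ComfyUI workflow differs from your code in its parameters or pipeline.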

z-image-turbo_00032_
The euler image. It is somewhat better, but I still see a difference. I will test more cases.

z-image-turbo_00031_
res_multistep is noticeably worse.

image
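pose + inpaint is also unstable. I tested directly with the example images from the repository 👆. With 0.7 model strength + 0.8 denoise strength, the hands often cannot be controlled.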

z-image-turbo_00053_

z-image-turbo_00054_

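0.85 model strength + 0.8 denoise strength. Two images; evidently 0.85 does not reliably hold the pose either. Raising the denoise strength further breaks the image.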

image
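I painted the mask by hand and added a feathered transition region. The inpainted area looks more natural, but the change at the boundary is still somewhat visible, and the pose is still not controlled.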

z-image-turbo_00055_

Result with the hand-painted mask.
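To make the idea of a feathered transition region concrete, here is a minimal sketch that softens a hard binary mask with a plain box blur. It is only an illustration: `feather_mask` is a hypothetical helper, the box blur is an assumed stand-in for whatever feathering the mask editor actually applies, and it is not how ComfyUI or VideoX-Fun process masks.

```python
def feather_mask(mask, radius):
    """Soften a binary mask (rows of 0.0/1.0) with a box blur.

    Each output pixel is the mean of the input pixels inside a
    (2 * radius + 1) square window, clipped at the image border.
    """
    height, width = len(mask), len(mask[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total, count = 0.0, 0
            for yy in range(max(0, y - radius), min(height, y + radius + 1)):
                for xx in range(max(0, x - radius), min(width, x + radius + 1)):
                    total += mask[yy][xx]
                    count += 1
            out[y][x] = total / count
    return out


# A hard edge between the kept region (0.0) and the inpainted region (1.0).
hard = [[0.0, 0.0, 0.0, 1.0, 1.0, 1.0]]
soft = feather_mask(hard, radius=1)
print(soft[0])
```

After feathering, the values near the edge ramp up gradually instead of jumping from 0 to 1, which is the transition region described above. A larger `radius` gives a wider ramp.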

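Also, the key point: image quality here again differs from the repository examples. It is not bad, but the examples are very good, which puzzles me. I suggest you try the model in ComfyUI yourselves and provide officially tuned usage instructions or nodes. After all, most users run your models through ComfyUI.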

The 8-step inpaint is somehow worse than the normal one: the masked area comes out at a higher resolution and looks sharper than the reference image.
pose_inpaint_v5

This is the best I can get. I used my own Swift port (https://github.com/mzbac/zimage.swift), which follows the VideoX-Fun Python implementation.

Let me test it first.


With the latest code, the 8-step result still shows some color difference, but it is less pronounced than in yours, and the shoreline stays continuous without breaking. The number following each image is the control strength. All tests were run in batches.
From left to right: 0.6, 0.65, 0.7, 0.75, 0.8, 0.85

image


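I tested the ComfyUI workflow and there is indeed some difference. I'll try to fix it; if I can't align the results, I'll put together a VideoX-Fun version instead.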
image

VideoX-Fun result:
image
ComfyUI workflow result:
z-image-turbo_00010_

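When running inference, make sure the height and width match the 1328 resolution.
Also, in the official ComfyUI workflow the euler ancestral sampler seems to give the closest match, though some difference remains. If needed, I can add a ComfyUI workflow to VideoX-Fun that matches my results more closely.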
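The thread does not spell out what matching the 1328 resolution means for non-square images. The sketch below shows one possible reading, offered purely as an assumption: keep the pixel area close to 1328 × 1328 and snap each side to a multiple of 16. Both the area interpretation and the multiple of 16 are guesses for illustration, and `size_for_area` is a hypothetical helper, not part of VideoX-Fun or ComfyUI.

```python
import math


def size_for_area(aspect_w, aspect_h, base=1328, multiple=16):
    """Pick (width, height) with area near base * base for an aspect ratio.

    Assumptions (not confirmed by the thread): the target is a pixel
    area of base * base, and each side is snapped to `multiple`.
    """
    target_area = base * base
    ratio = aspect_w / aspect_h
    # Solve width * height = target_area with width / height = ratio.
    width = math.sqrt(target_area * ratio)
    height = width / ratio

    def snap(value):
        # Round to the nearest multiple, never below one multiple.
        return max(multiple, round(value / multiple) * multiple)

    return snap(width), snap(height)


print(size_for_area(1, 1))   # square image
print(size_for_area(16, 9))  # landscape image
```

For a square image this simply returns 1328 on both sides; for other aspect ratios it trades width against height while keeping the total pixel count roughly constant.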


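I looked closely at your comparison. The VideoX-Fun results are better overall in saturation and fine texture, while the ComfyUI image matches my own test results. I'm not sure whether you can see this perceptual difference too.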

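Also, with this prompt the ComfyUI results very often show raindrops on the face, whereas the VideoX-Fun results you showed, which I zoomed in on and inspected, appear to have none. This also affects the overall impression. I wonder whether it is a misprediction or a sign of incomplete inference. After seeing your test results, I am even more convinced that the two inference paths differ substantially.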

Alibaba-PAI org

I understand what you mean. It looks like it makes more sense for me to release my own ComfyUI version.


Yes, please!
