AI & ML interests

✨Lightweight specialist models for flawless image edits✨

piercus posted an update 2 months ago
🚧 Reproducing LBM-Eraser… in the open [1]!

A major recent paper on object erasing is OmniEraser [2].
They open-sourced an evaluation dataset [3] (and I'm using it to evaluate our LBM-Eraser 😉).

It's not a big dataset (70 samples), but the pairs are high quality, and that's what matters!
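
If you want to poke at it, here is a hypothetical loader; the split name and the size check are assumptions about the dataset layout:

```python
# Hypothetical loader for RemovalBench [3]; split name is an assumption.
from datasets import load_dataset

bench = load_dataset("BaiLing/RemovalBench", split="test")
print(len(bench))  # expecting the 70 evaluation pairs
```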

cc @BaiLing

[1] Finegrain LBM fork: https://github.com/finegrain-ai/LBM
[2] OmniEraser: VDOR: A Video-based Dataset for Object Removal via Sequence Consistency (2501.07397)
[3] BaiLing/RemovalBench
[4] LBM paper: LBM: Latent Bridge Matching for Fast Image-to-Image Translation (2503.07535)
piercus posted an update 2 months ago
🚧 Reproducing LBM-Eraser… in the open [1]!

Today we trained an LBM [2] promptless inpainter on Re-LAION-Caption19M [3].

We used a subset of 1.25M images with aesthetic_score > 5.6 and pwatermark < 0.2, with masks generated following LaMa [4].
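
For reference, a minimal sketch of that filtering step, assuming the LAION-style metadata columns (aesthetic_score, pwatermark) are exposed as-is in the dataset schema:

```python
# Sketch of the subset selection; the column names follow LAION metadata
# conventions and are an assumption about this dataset's schema.
from datasets import load_dataset

ds = load_dataset("supermodelresearch/Re-LAION-Caption19M", split="train")
subset = ds.filter(
    lambda ex: ex["aesthetic_score"] > 5.6 and ex["pwatermark"] < 0.2
)
```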

Two takeaways:
🖼 Inpainting quality is better than in our RORD experiments [5]
🦶 4-step sampling outperforms single-step (generic sketch below)

[1] Finegrain LBM fork: https://github.com/finegrain-ai/LBM
[2] LBM: Latent Bridge Matching for Fast Image-to-Image Translation (2503.07535)
[3] supermodelresearch/Re-LAION-Caption19M
[4] LaMa: Resolution-robust Large Mask Inpainting with Fourier Convolutions (2109.07161)
[5] https://huggingface.co/posts/piercus/778833977889788

cc @supermodelresearch @presencesw
piercus posted an update 2 months ago
🚧 Reproducing LBM-Eraser… in progress! [1]

When repurposing a T2I model into a pure I2I model, there's always that orphaned text path: what do we do with it? 🤔

You can reuse it as learnable embeddings in multi-task setups [2], freeze it on an empty text prompt, or distill or prune the corresponding weights.

In LBM, they take a clever route: zeroing [3] and reshaping [4] the text-related cross-attentions into self-attentions.
This gives you fresh weights for I2I computation, nicely integrated into the SD architecture.
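
Here's a hedged sketch of that idea with diffusers (not the actual LBM code, see [3][4]; the processor trick, the model id, and the zero-init-as-no-op reasoning are my assumptions):

```python
# Sketch: turn each text cross-attention (attn2) of an SD UNet into a
# zero-initialized self-attention over image features. Not the LBM code.
import torch.nn as nn
from diffusers import UNet2DConditionModel
from diffusers.models.attention_processor import AttnProcessor2_0

class SelfAttnFromCrossAttn(AttnProcessor2_0):
    """Ignore the text embeddings: attend over the image hidden states."""
    def __call__(self, attn, hidden_states, encoder_hidden_states=None, **kwargs):
        return super().__call__(attn, hidden_states,
                                encoder_hidden_states=hidden_states, **kwargs)

unet = UNet2DConditionModel.from_pretrained(
    "runwayml/stable-diffusion-v1-5", subfolder="unet"
)

for name, module in unet.named_modules():
    if name.endswith("attn2"):  # the text cross-attention blocks
        dim = module.to_q.in_features
        # "reshaping": K/V now project from image features (per-block width)
        module.to_k = nn.Linear(dim, module.to_k.out_features, bias=False)
        module.to_v = nn.Linear(dim, module.to_v.out_features, bias=False)
        # "zeroing": V = 0 makes the block start as (almost) a no-op
        nn.init.zeros_(module.to_k.weight)
        nn.init.zeros_(module.to_v.weight)
        module.set_processor(SelfAttnFromCrossAttn())
```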

📎 References
[1] Our LBM Fork: https://github.com/finegrain-ai/LBM
[2] OmniPaint: OmniPaint: Mastering Object-Oriented Editing via Disentangled Insertion-Removal Inpainting (2503.08677)
[3] LBM Zeroing: https://github.com/gojasper/LBM/blob/cafebc46a9ac16dcc61691d289cc4676b5c75380/examples/training/train_lbm_surface.py#L147-L148
[4] LBM Reshaping: https://github.com/gojasper/LBM/blob/cafebc46a9ac16dcc61691d289cc4676b5c75380/examples/training/train_lbm_surface.py#L100
piercus posted an update 3 months ago
In the LBM paper, the noise and the conditioning image are merged into a single composite image.

Unlike other inpainting methods (which typically grey-mask the missing area), LBM replaces the masked region with uniformly sampled random pixels.

Intuitively, since LBM is trained from a text-to-image (T2I) model, those random pixels act as a strong signal to the pretrained model, essentially saying: “This is where you can do your generative magic.”
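
A minimal sketch of that composite, assuming (B, 3, H, W) image tensors in [-1, 1] and a binary mask where 1 marks the region to erase:

```python
import torch

def make_composite(image: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """Replace the masked region with uniform random pixels (no grey fill)."""
    noise = torch.rand_like(image) * 2.0 - 1.0  # uniform in [-1, 1]
    return image * (1.0 - mask) + noise * mask
```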

LBM Paper: LBM: Latent Bridge Matching for Fast Image-to-Image Translation (2503.07535)
Our fork (work in progress): https://github.com/finegrain-ai/LBM