r/comfyui

▲ 2 r/comfyui+1 crossposts

Training without images

Hello,

So some time ago I found a way to "train" embeddings without using an image dataset.

I didn't give it much attention, but I searched and asked around and couldn't find this existing anywhere, so I just want to double-check: is this something novel?

Without getting into too much detail on how this works for now: I take a text and compress it into a small reusable identity file. The trained embedding is a standard textual inversion that works with any SD / SDXL model.

It takes about 1-2 minutes to train, needs 2 GB of VRAM, and the file is about 70 KB.

I made this because I wanted to increase character consistency and to diminish prompt bleeding. And it does the job.

I usually don't engage in posting, heck this is my first reddit post ever, but I'm really curious if this is something new or if I just reinvented the wheel. Also I'm curious if y'all find this useful.

If you got questions I'll gladly provide more details.

reddit.com
u/Disastrous-Many-9653 — 5 hours ago
▲ 93 r/comfyui+12 crossposts

I created an agentic orchestration pipeline for music video generation - [More info in comments]

I’ve been building Uisato Studio, a workflow-based AI creation platform for audiovisual work.

This is the Music Video mode: upload an image + audio, and the system analyzes the input, generates visual direction, creates clips, handles b-roll / lip-sync when needed, and assembles everything into a finished music video through a guided pipeline.

I’m trying to move AI video from isolated generation into orchestration; an agentic production system built for more coherent, edit-ready audiovisual output.

I’ve been building this suite for the past year, hope you guys enjoy it: https://uisato.studio/

u/TasTepeler — 6 hours ago
▲ 207 r/comfyui+2 crossposts

I wish they still made anime like this

Using an old SDXL Lora + NB + Seedance 2.0

u/32bit_badman — 8 hours ago
▲ 21 r/comfyui+1 crossposts

I wish they still made anime like this pt 2.

Since you guys really liked the first one, here is Pt. 2 of my attempt to make OVA-style retro anime great again. Using an old SDXL Lora + NB + Seedance 2.0.

u/32bit_badman — 8 hours ago
▲ 76 r/comfyui

Flux 2 Klein distilled: my workflow, following numerous requests after yesterday's post.

I'm sharing my workflow that I use for basically any task.

It has easy aspect ratio selection: just pick the one you want.

Sage Attention is enabled for quick generation; if you don't have it, just disable it.

LoRA Manager: this is where you can store all your LoRAs. Hovering the cursor over one shows a cover image from the store, which helps a lot with identifying styles. When a LoRA is activated, it pulls in all of its activation keys, so there's no need to search for them, since it's synced directly with Civitai.

It's a straightforward, simple workflow with high-resolution image generation and very fast speed.

Workflow
https://civitai.com/models/2640066?modelVersionId=2964326

The link to the loras used for realism is in my other post.
https://www.reddit.com/r/StableDiffusion/comments/1tiwruj/comment/on1d4fh/?screen_view_count=2

As promised, here is the workflow. After this post I received many, many messages requesting it, both on Reddit and on Civitai.

I'll share my I2I workflow for realism in any image soon.

u/Puzzled-Valuable-985 — 9 hours ago

What's the current take on SageAttention?

Last I tried to install it a few months ago it completely broke my comfyui, and AI chats keep saying "comfyui has built in attention mechanisms that give the same speedup" which... might or might not be true?

I'm on a 4090 running fp8 models, mostly F2K 9b.

What is your experience with SageAttention today? Is there any more foolproof way of installing it?

u/Embarrassed-Deal9849 — 5 hours ago

Node name searching by ID or Name?

How in the world can I find a node by name or ID when a workflow has like 9485934 nodes? There used to be an option from Easy Use on the left side with a node map you could search, then click a little eye icon to jump to the node, but that doesn't seem to work for me anymore.
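Outside the UI, one possible workaround is to search the saved workflow file directly. Below is a minimal Python sketch, assuming the workflow was saved in ComfyUI's regular (non-API) JSON format with a top-level "nodes" list whose entries carry "id", "type", an optional "title", and "pos". The function names `load_workflow` and `find_nodes` are made up for this example, and nodes nested inside subgraphs are not handled.

```python
import json


def load_workflow(path):
    """Load a workflow JSON file saved from the ComfyUI interface."""
    with open(path, "r", encoding="utf-8") as handle:
        return json.load(handle)


def find_nodes(workflow, query):
    """Return (id, type, title, pos) for every node matching the query.

    A node matches when the query equals its id exactly, or appears
    (case-insensitively) inside its type or its title.
    Assumption: nodes live in a top-level "nodes" list.
    """
    needle = str(query).lower()
    hits = []
    for node in workflow.get("nodes", []):
        node_id = str(node.get("id", ""))
        node_type = str(node.get("type") or "")
        title = str(node.get("title") or "")
        if (
            needle == node_id
            or needle in node_type.lower()
            or needle in title.lower()
        ):
            hits.append((node.get("id"), node_type, title, node.get("pos")))
    return hits
```

Calling `find_nodes(load_workflow("my_workflow.json"), "ksampler")` (the file name is a placeholder) lists each match with its canvas position, which at least tells you where on the canvas to look.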

u/OddJob001 — 5 hours ago

LTX 2.3 question about LoRA teeth training

Hi everybody,

Most likely you all already know that LTX 2.3 sucks when it comes to teeth, especially lower teeth. Even some high-quality generations (over 1440p) give crooked or AI-looking teeth, which bothers me a lot. (I tried prompting something like "clear, defined teeth" etc., but it didn't work well.)

I have never trained any LoRAs for LTX 2.3 before. So my question is: do you think it is possible to train a teeth LoRA for LTX 2.3? For example, if I find some high-quality photos of people smiling, crop the teeth area to 512x512 px (or even 1024 px), and train on all of them, will I get good teeth in videos?

Any other suggestions?

Thanks for your time!

u/Ok-Option-6683 — 8 hours ago

Any Object Removal Workflow, like Photoshop’s Content-Aware Fill?

Hey guys, is there a workflow/model, local (hopefully) or non-local, that lets you remove objects as well as Photoshop’s Content-Aware Fill does, for VFX cleanups and general-purpose use? Ideally it would remove objects only within a mask-selected area (not regenerate the whole image) and not replace them with something else, just straight-up removal. Help is appreciated. This is literally the only reason I pay for the whole Adobe subscription, and I feel there must be another way.

u/Beginning_Expert_970 — 7 hours ago

Stable Audio 3.0 Showcase

Hey y'all! Stable Audio 3.0 Base and Distilled are available in Comfy's templates. Just update your Comfy and they'll be there. Pretty small models, around 9 GB in size. The encoders take less than 5 GB during a run, so it all fits in around 16 GB of memory. It offers full song generation, sectional editing, extending a section into a full song, and just straight-up instrument or SFX generation as well.

VERY fast: it generates a 2 minute 40 second song in about 60 seconds, or less in some runs. Very coherent but VERY limited in seed variation. I noticed that running the same prompt on 3 different seeds gives essentially the same output with a SLIGHTLY different melody. Rhythm and percussion will be pretty much identical. Kind of sad, but changing the prompt slightly can rearrange the output.

Full Youtube video showcase:

https://youtu.be/TU3PvItvSO0

u/MFGREBEL — 8 hours ago

Could ComfyUI process queries like LLMs?

So, for example, I can create some characters in 3D on a white background, upload them to, say, Gemini, and ask it to place those characters in a specific environment and make them realistic while preserving their clothes, poses, etc. With this request, Gemini generates exactly what I asked for, and the characters are put into the environment with correct lighting, shadows, etc.

When I use an image-to-image flow in ComfyUI, I'm unable to get the same results.

I understand why it happens: LLMs use multimodal models where text and images are processed together, while ComfyUI processes each media type separately. But is it possible to recreate a similar experience in ComfyUI?

u/stealth_nsk — 8 hours ago

About anima.

What is all the hype and should it bother me?

I assembled an 80gb lora library for Illustrious and don't feel like moving to a new model. Would it be stupid for me to stick to it, or is it just another hype cycle that will die in half a year? I've seen people who still use sd1.5/sdxl1.0 get dunked on so how do we determine the life cycle of a model?

u/Son-Airys — 13 hours ago
▲ 57 r/comfyui

Can someone please explain how these creators maintain the background, outfit, and all other details?

Can anyone please explain to me what these Insta AI creators use to keep the background, outfit, and hairstyle all intact while changing just the pose or angle, with the room or place staying entirely the same? I haven't been able to achieve this level of background and outfit matching no matter what I tried with open-source AI models like Z Image Turbo or Flux Klein. And if they use Nano Banana or Gpt 2.0, how are they creating almost-NSFW images like that, like large boobs or bikini stuff?

u/Reckless_Venom1507 — 18 hours ago

Wildcard/Randomize prompt not working on ComfyUI Wan2.2 T2V template.

I am currently using the Wan 2.2 template from ComfyUI's 'Template' menu. Its workflow is simple, with only two nodes, and it has 'Turbo Mode', which helps tremendously with speed compared to the previous workflow I used with KSampler and all that.

However, I noticed the prompt box does not seem to recognize wildcard or randomized text, the kind that uses { } and the | character.

For example:

A young {American|Japanese|blonde|Polish} female is sitting in {studio with black backdrop|sunny outdoor|well-lit cafe} wearing {red|blue|black} dress.

If I queue it, the prompt should randomly pick one of the options I wrote for each wildcard, but judging from the rendered videos, it only ever uses the first option from each wildcard and never the others.

May I know what I need in order to get wildcards working in the prompt with this workflow?
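For reference, the { | } behaviour described above can be sketched in a few lines of Python, for example to pre-expand prompts before pasting them in. This is only an illustration of the expected semantics, not what any ComfyUI node actually runs; `expand_wildcards` is a made-up name.

```python
import random
import re
from typing import Optional

# Matches an innermost {a|b|c} group (one with no braces inside it).
_GROUP = re.compile(r"\{([^{}]*)\}")


def expand_wildcards(prompt: str, rng: Optional[random.Random] = None) -> str:
    """Replace every {a|b|c} group with one randomly chosen option."""
    rng = rng or random.Random()

    def pick(match):
        # Split the group body on | and choose one option uniformly.
        return rng.choice(match.group(1).split("|"))

    # Repeat until no groups are left, so nested groups resolve inside-out.
    while True:
        prompt, replaced = _GROUP.subn(pick, prompt)
        if replaced == 0:
            return prompt
```

Each call makes a fresh random pick per group, so queueing the same template several times should give different combinations. Passing a seeded `random.Random` makes the picks reproducible.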

u/a2z0417 — 10 hours ago
▲ 58 r/comfyui

Stable Audio 3.0 Day-0 Support in ComfyUI: From Sound Effects to Longer, More Musical Tracks

We’re excited to share that Stable Audio 3.0—Stability AI’s new family of music models built for artistic experimentation—is coming to ComfyUI. Trained on fully licensed data, these models bring variable-length generation, on-device-friendly small checkpoints, and stronger musicality for longer structure—so you can go from quick SFX to extended tracks inside the workflows you already use.

Download Workflow

Model highlights

  • Licensed for commercial use — trained on fully licensed music data.
  • Flexible clip length — from quick SFX and short loops to longer tracks (up to about two minutes on Small, six minutes on Medium).
  • Lightweight, small models — run SFX and short music on a CPU, no big GPU required.
  • Medium for longer music — fuller tracks with stronger structure when you have a GPU.

Available Models

  • Small-SFX: Sound effects and short ambiance, up to 2:00
  • Small-Music: Short music and on-device-friendly loops, up to 2:00
  • Medium: Longer tracks with stronger structure and musicality, up to ~6:20

Small reaches two minutes (vs. 11s / 47s on Stable Audio Open). Medium goes beyond six minutes when you need length.

Get started

  1. Update ComfyUI to v0.22.0 or go to Comfy Cloud
  2. Go to the left sidebar → Template → Audio category → Choose Stable Audio 3.0 Template
  3. For local users, please follow the note in the workflow to download the models and place them in the correct directory
  4. Write a prompt, set the duration in seconds, then hit run.


More Info and Examples on our Blog

As always, enjoy creating!

u/PurzBeats — 18 hours ago

How to get ComfyUI to work with a setup of AMD integrated graphics CPU + AMD discrete GPU?

Hello,
I have an AMD laptop

CPU: AMD Ryzen 9 5900HX

GPU: 6800M (12 GB VRAM)

RAM: 16 GB

Hooked via USB-C to an external monitor (not sure if this is relevant; maybe it would only work using the laptop's screen?)

And running the latest Fedora Linux and ComfyUI

I have tried installing and reinstalling ComfyUI in many different ways: I tried things like Pinokio, I tried reinstalling PyTorch and other packages with specific ROCm versions, I tried adding the user to different groups, and I tried messing with JSON or JS files to add stuff like

HSA_OVERRIDE_GFX_VERSION: "10.3.0"

export HSA_OVERRIDE_GFX_VERSION=10.3.0

HIP_VISIBLE_DEVICES: "1"

ROCR_VISIBLE_DEVICES: "1"

But whatever I do does not seem to work. I run a lightweight test on purpose (e.g. generating an image that would take 30 seconds on 8 GB of VRAM), but it seems that instead of using the VRAM fully (although some of it does seem to be used, which is confusing), the RAM is fully used, which I think means that ComfyUI must be using the CPU's integrated graphics + RAM instead of the GPU's VRAM.

Using the entire RAM eventually makes the app crash. I made a huge swap partition, and while it stopped crashing, something that should take 30 seconds to generate still hasn't finished after 20+ minutes (not surprising given how slow swap is). And even if it worked, not being able to use my fast VRAM is still not an acceptable solution.

(I am open to dual boot with windows if the OS is the issue).
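On the variables listed above: they are environment variables, so they need to be set in the environment of the process that launches Python, before PyTorch loads. Below is a minimal Python sketch for checking which device PyTorch ends up seeing under each setting. Assumptions: PyTorch is installed in the same interpreter, the 10.3.0 override from the post is the right one for this GPU, and the discrete GPU is at index 0 or 1 (which one is not known here). `build_rocm_env`, `probe_gpu` and `main` are names made up for this example, and only HIP_VISIBLE_DEVICES is set, to keep the test to one variable at a time.

```python
import os
import subprocess
import sys


def build_rocm_env(base_env, gfx_version="10.3.0", device_index="0"):
    """Return a copy of base_env with the ROCm overrides from the post applied."""
    env = dict(base_env)
    env["HSA_OVERRIDE_GFX_VERSION"] = gfx_version
    env["HIP_VISIBLE_DEVICES"] = device_index  # assumption: index of the discrete GPU
    return env


# Small script run in a child interpreter. torch.version.hip is None on
# builds of PyTorch that were not compiled for ROCm.
PROBE = (
    "import torch;"
    "print('hip build:', torch.version.hip);"
    "ok = torch.cuda.is_available();"
    "print('gpu visible:', ok);"
    "print('device:', torch.cuda.get_device_name(0) if ok else None)"
)


def probe_gpu(env):
    """Run the probe in a fresh process so the variables exist before torch loads."""
    result = subprocess.run(
        [sys.executable, "-c", PROBE],
        env=env,
        capture_output=True,
        text=True,
    )
    # Show the error text instead if the probe failed (e.g. torch is missing).
    return result.stdout or result.stderr


def main():
    """Try both candidate device indices and print what PyTorch reports."""
    for index in ("0", "1"):
        print(f"HIP_VISIBLE_DEVICES={index}")
        print(probe_gpu(build_rocm_env(os.environ, device_index=index)))
```

Running `main()` prints the reported device name for each index. If "hip build" comes back as None, the installed PyTorch is not a ROCm build and no variable will make it use the GPU.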

u/Beneficial_Fish_7509 — 13 hours ago
▲ 151 r/comfyui+1 crossposts

Angelo - A Unified Sampler / Inpainter / Refiner (fix hands etc) for ComfyUI

https://github.com/shootthesound/ComfyUI-Angelo

I'm a photographer who kept hitting the same wall in ComfyUI: generate an image, then to fix one thing I'd save it, open a Mask Editor or Photoshop, and fix it. It works, but it's not smooth.

I've been editing photos for longer than I've been building nodes, so I wanted to bring some of that to Comfy in the way I like to work. If it works for you too, or if you have ideas, let me know.

Right now the smart modes are Klein 9B focused, but they should work with other edit models. Again, let me know!

Here is a really shitty Youtube demo I just recorded: https://www.youtube.com/watch?v=x0Un3OkEHFA

Pete

UPDATE: Load Image button now included

u/shootthesound — 24 hours ago
▲ 26 r/comfyui

ggufy: easy quantization for the GPU poor

Hello.

I was frustrated by the lack of tooling around image model conversion / quantization, or the extreme RAM requirements and complexity of the scant existing tooling, so I wrote my own. People have said I should post it here, so here it is:

https://github.com/qskousen/ggufy

It has a CLI and a GUI. The GUI is easy to use: you can drag and drop files in. Both CLI and GUI are single-file executables, written in Zig because I like writing in Zig. It's pretty efficient with RAM, and takes about 1.5 minutes to quantize ZiT on my machine.

It supports all the main models that I am aware of, and you can convert to/from gguf or safetensors. It supports, I think, all the datatypes that are generally supported, such as q3_k through q8_0, f32, bf16, f16, f8_e4m3, f8_e5m2, scaled fp8, mxfp8, and nvfp4. It doesn't do SDNQ yet, but I would like to add it if I can get some time to figure out the format.
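For anyone wondering what one of these datatypes does to the weights, here is a rough pure-Python sketch of q8_0 as GGML defines it: weights are split into blocks of 32, and each block stores one fp16 scale plus 32 signed 8-bit values (34 bytes per block, so 8.5 bits per weight). This is an illustration of the format only, not ggufy's code, and the function names are made up for the example.

```python
import math
import struct

BLOCK_SIZE = 32  # q8_0 quantizes weights in blocks of 32


def _to_fp16(x):
    """Round a float to IEEE half precision, as the stored scale would be."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def _round_half_away(x):
    """Round to nearest integer, ties away from zero (like C's roundf)."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))


def quantize_q8_0(values):
    """Quantize a flat list of floats into a list of (scale, int8 list) blocks."""
    if len(values) % BLOCK_SIZE:
        raise ValueError("length must be a multiple of 32")
    blocks = []
    for start in range(0, len(values), BLOCK_SIZE):
        block = values[start:start + BLOCK_SIZE]
        amax = max(abs(v) for v in block)
        scale = amax / 127.0               # largest magnitude maps to +/-127
        inv = 1.0 / scale if scale else 0.0
        quants = [_round_half_away(v * inv) for v in block]
        blocks.append((_to_fp16(scale), quants))
    return blocks


def dequantize_q8_0(blocks):
    """Rebuild approximate floats: each value is scale * quant."""
    return [scale * q for scale, quants in blocks for q in quants]
```

Round-tripping a tensor through `quantize_q8_0` and `dequantize_q8_0` shows the error introduced per weight, which is at most about half a scale step for each block.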

It's cross-platform, and builds for Linux, Windows, and macOS (both ARM64 and x86). Pre-built binaries from GitHub Actions are available on the releases page.

If there are features you think are in scope and would be useful, or additional models or formats that it doesn't support yet, please open an issue or let me know here. Thanks.

Cross-posted to r/StableDiffusion.

u/exeunt_bits — 19 hours ago