r/StableDiffusion

▲ 405 r/StableDiffusion+3 crossposts

NeuralCompanion

NeuralCompanion (NC) is an open-source, local-first AI companion project for people who like building, experimenting, and seeing how far personal AI can go on their own hardware.

It brings together realtime voice chat, local LLMs, TTS/STT, image generation, interactive tutorials, API-friendly workflows, and a modular addon system into one desktop app designed to be flexible, hackable, and genuinely fun to explore.

NC also supports avatar systems and avatar engines like VSeeFace, VAM/VAM2, and other experimental realtime avatar workflows.

It is still experimental and a little rough around the edges in places, but that is part of the project. The goal is not to make another locked-down corporate assistant. It is to build a customizable AI companion platform you can actually run, modify, and shape yourself.

If you are into local AI, creative tools, avatars, plugins, voice interfaces, automation, or weird future-facing software, come take a look.

GitHub:
https://github.com/Rakile/NeuralCompanion

Discord:
https://discord.com/invite/UqnwX46rcK

Developers, tinkerers, artists, AI enthusiasts, and curious people very welcome.

Rakila & LAinol

u/lainol — 4 hours ago
▲ 99 r/StableDiffusion+1 crossposts

An Update on Nodes 2.0 from Comfy Org

Hi r/comfyui, Nodes 2.0 has been in beta since last July, and we want to be transparent with the community about where we’re headed.

Over time, we plan to gradually make the new interface the default experience in ComfyUI.

We know the reception has been mixed. There are many things we handled ineffectively early on, and the team has been working hard over the past months to address them. We appreciate everyone who has continued testing, giving feedback, and pushing us on where the experience falls short.

The Problem With Canvas

Canvas rendering worked, but it cut us off from everything the modern web has built over the last two decades: component libraries, design systems, accessibility tooling, the entire ecosystem developers rely on to ship fast. Every widget had to be drawn pixel by pixel.

Generative AI doesn't sit still. New models, new modalities, new techniques, new ways of combining them. The workflows that made sense six months ago get rethought constantly. Our users are doing professional creative work, and they expect the controls that professional tools have had for years: curve editors, color grading, histograms, timeline scrubbing. We can't keep rebuilding those from scratch.

What a Modern Frontend Unlocks

With a modern frontend framework, a curve editor that would have taken weeks now takes days. A gradient slider with live preview, hours.

Since the Nodes 2.0 beta launched, we’ve already shipped:

  • Curve editors
  • Histogram displays
  • Live cropping UI
  • Before/after comparison sliders
  • Image processing nodes for color correction, film grain, chromatic aberration, sharpening, and levels
  • Realtime shader nodes with subgraph blueprints
  • Inline error displays and status badges directly on nodes

This foundation also unlocks things that were previously impractical or impossible:

  • Live execution previews on subgraphs
  • Parallel node execution with realtime feedback
  • Richer interfaces for future modalities and workflows

Custom Nodes

Most custom nodes work unchanged. For nodes that require updates, we’re investing heavily in migration support:

  • A new public frontend API
  • Documentation and migration guides
  • Reference implementations
  • Direct collaboration with node authors to identify gaps

We understand this creates additional work for maintainers. For many popular custom nodes, we’re happy to directly help submit PRs and assist with migration work ourselves.

Recent advances in coding agents have also made these frontend migrations significantly easier than they would have been even a year ago.

Thank you for your patience as we work through this transition together.

Timeline

There is no fixed cutoff timeline yet. Right now, the priority is being transparent early and giving the ecosystem time to adapt.

Current plan:

  • Nodes 2.0 remains opt-in for now (Settings > Rendering > Nodes 2.0)
  • It later becomes the default while legacy mode remains available
  • Eventually, legacy mode will become unmaintained and will likely break over time

Going forward, new frontend-focused ComfyUI features will ship exclusively on Nodes 2.0.

Feedback

Please let us know what you think and the problems you run into. We need testing on complex workflows, large graphs, and custom nodes with unusual rendering. Report issues on GitHub or #bug-reports on Discord 🙏

Once again, thank you all for supporting Comfy.

And most importantly, thank you to all the custom node authors who continue making this ecosystem incredibly vibrant, creative, and powerful.

u/crystal_alpine — 6 hours ago
▲ 59 r/StableDiffusion+4 crossposts

I’ve been working on a sci-fi short film and wanted to share a WIP here.

My current workflow is a mix of image generation and LTX 2.3 for video generation, using a first-and-last-frame setup to animate the sequences. I’m still experimenting a lot, but it’s been surprisingly good for building scenes quickly and trying different visual transitions without getting stuck forever.

Would really appreciate feedback on the overall look, shot coherence, and whether the transitions feel smooth enough.

u/shijoi87 — 4 hours ago

Why isn't there a video model specifically made for anime?

Most current video models are completely focused on realism. The few that try to handle anime usually end up producing results that look like a weird mix of 3D and realism instead of something that actually feels 2D.

Wouldn't it actually be easier to create a smaller model similar to Anima, but trained exclusively on anime datasets? In theory, excluding realism and other styles should reduce compute requirements and simplify training quite a bit.

Personally, I'm already tired of almost every video model chasing the exact same goal: cinematic realism. There are dozens of models doing that already; some better, some worse, but in the end they all feel pretty similar.

Meanwhile, there’s barely anything that truly understands 2D anime physics, exaggerated expressions, or the way traditional animation moves. Or at least I don't know of any open-source model that comes close.

Back then, Sora was probably the best AI model for anime-style video because it understood 2D expressions and physics surprisingly well. Right now, Seedance seems to be the closest thing to that, with Grok somewhere behind it, but on the open-source side I still don't see anything remotely similar.

Maybe instead of trying to build one massive all-in-one model that does every style imaginable, it would make more sense to have smaller specialized models focused on specific styles.

I don't know, maybe I'm completely wrong and anime-style video generation is actually harder or more computationally expensive than realism. It's just something I've been wondering about for a while.

u/Vi0l3nTz — 5 hours ago

Nvidia released "AnyFlow", based on Wan. Basically it's kind of like a dynamic time-step adjuster that depends on your compute budget

>In this repository, we present AnyFlow, the first any-step video diffusion framework built on flow maps. 

Link:
https://huggingface.co/nvidia/AnyFlow-FAR-Wan2.1-14B-Diffusers

Full model selection:
1.3B T2V: https://huggingface.co/nvidia/AnyFlow-Wan2.1-T2V-1.3B-Diffusers
1.3B T/I2V: https://huggingface.co/nvidia/AnyFlow-FAR-Wan2.1-1.3B-Diffusers
14B T2V: https://huggingface.co/nvidia/AnyFlow-Wan2.1-T2V-14B-Diffusers
14B T/I2V: https://huggingface.co/nvidia/AnyFlow-FAR-Wan2.1-14B-Diffusers

I don't think Comfy supports it yet, though it may already be baked into the models, in which case no additional code change would be needed.

u/Altruistic_Heat_9531 — 8 hours ago

Wan single images?

I see people use Wan for images, but I am slightly confused about how to prompt and the settings to get images rather than video. If I set it to one frame, as I have read, it just gives me the original I2V image I input. Do I generate a video and extract the frames as single images? Is there a way to prompt using an image to get a specific change in that image as one frame? It takes a bit to render videos and even more to upscale, so I am hoping for a better solution or one that works. I'd like to be able to input a prompt and get an image of the prompt. Wan is insanely good at keeping things from the original image you don't want changed. Maybe there is something better than wan? Heh. Sorry for the wall of text, I am just a bit exhausted trying things and searching.
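For the "generate a video and extract the frames" route mentioned above, here is a minimal sketch, assuming ffmpeg is installed and on your PATH. The function names and the file names `wan_output.mp4` and `frames` are made up for illustration; it only dumps every frame as a PNG so you can pick the one you want, and it does not make Wan itself render a single image.

```python
import subprocess
from pathlib import Path


def build_extract_command(video_path: str, out_dir: str) -> list[str]:
    """Build an ffmpeg command that writes every frame of a video as a numbered PNG."""
    # frame_%04d.png is ffmpeg's image-sequence pattern: frame_0001.png, frame_0002.png, ...
    pattern = str(Path(out_dir) / "frame_%04d.png")
    return ["ffmpeg", "-i", video_path, pattern]


def extract_frames(video_path: str, out_dir: str) -> None:
    """Create the output folder, then run ffmpeg (raises if ffmpeg fails)."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_extract_command(video_path, out_dir), check=True)


# Example call (hypothetical file names):
# extract_frames("wan_output.mp4", "frames")
```

Upscaling only the one frame you keep should be much cheaper than upscaling the whole clip.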

u/ChowMeinWayne — 2 hours ago

Using LTX 2.3 i2v for 3D animation with a reference voice via TalkVid LoRA

Using LTX 2.3 1.1 and TalkVid ID LoRA for grace voice reference

u/fleshify — 5 hours ago

LTX 2.3 + LTX Director Testing

I use two completely different images as input for shot 1 and shot 2, and the character from image 1 (shot 1) appears in shot 2 with great consistency.

u/smereces — 13 hours ago

The free end-to-end AI movie studio, Pallaidium, refactored & new stuff in the upcoming release for Blender 5.2

Beta version for testing: https://github.com/tin2tin/pallaidium_refactor
Discord: https://discord.gg/HMYpnPzbTm

New models (from memory):

Please test and tell me how it goes. NB: grab Blender 5.2 or it won't work (the multiline UI is implemented in Blender 5.2).

u/tintwotin — 9 hours ago

Been testing Krea 2 Large and Medium

It's been going around that Krea 2 is going to be open-source, with the consensus being that it will probably be the Medium version that gets released. I do hope they release both, and that Large is also usable on consumer hardware. But from my testing they are pretty similar in capability, with Large maybe knowing certain celebrities a bit better? Medium also seems RL-tuned in that it makes more perfect-looking people more often. All of these except Rose wearing a pink shirt were made with the Medium version.

I took these prompts from some Nano Banana galleries to compare their outputs. I think if Krea 2 had search grounding it would probably be as good as Nano Banana Pro.

Can't wait to see future finetunes for this already, I'm so hyped.

u/OneTrueTreasure — 15 hours ago

Phosphene 3.0 — open source AI video + image suite for Apple Silicon. Train your own LTX characters.

Sharing Phosphene 3.0. It's a free panel that runs LTX-Video 2.3 and a couple of image models natively on Apple Silicon. Local, MIT license, no subs, no cloud.

The thing that sets it apart from "yet another LTX wrapper": you can **train your own characters** inside the panel. Drop 30 to 80 photos, click Train, get a face LoRA back. Add a voice clip and you get a voice LoRA too. Auto-captions with Gemma 3 12B locally. ~3 hours per character on an M4 Max 64 GB.

**What 3.0 ships**

- Text → video+audio (LTX-2 generates joint audio+video in one pass)

- Image → video+audio

- Audio → video (drive a clip with an audio reference)

- FFLF (first frame + last frame interpolation)

- Extend (continue an existing clip)

- Character training (face + optional voice LoRA, from a single dataset)

- Image Studio with three engines: Qwen-Image-Edit-2511, HiDream-O1, and the FLUX.1 family. Multi-reference composition up to 3 subjects.

**HiDream-O1 ported to MLX**

HiDream released their O1 image model on May 14. Got it running natively on Apple Silicon five days later. Photoreal portraits, instruction edits, multi-subject. ~67 seconds per 1024² on a 64 GB Mac.

**Hardware**

Apple Silicon only. Capability tiers auto-detected:

- 16 / 24 GB: 512 px video, text-to-image works

- 32 GB: 768 px

- 64 GB+: 1024×576 video, full HD image, character training

- A 7-second character clip with synced audio renders in ~6 min on M4 Max 64 GB

- Character training takes ~3 hours per character
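The tier table above can be read as a simple lookup on unified memory. This is only a sketch of that idea, not Phosphene's actual detection code: the names `Tier`, `TIERS`, and `pick_tier` are hypothetical, and treating in-between sizes (for example 48 GB) as the next tier down and rejecting anything under 16 GB are assumptions on my part.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    min_memory_gb: int        # smallest unified-memory size that qualifies
    video_resolution: str     # video size listed for this tier
    character_training: bool  # whether character training is listed for this tier


# Ordered from largest to smallest so the first match wins.
TIERS = [
    Tier(64, "1024x576", True),
    Tier(32, "768 px", False),
    Tier(16, "512 px", False),
]


def pick_tier(memory_gb: int) -> Tier:
    """Return the highest tier whose minimum memory is satisfied."""
    for tier in TIERS:
        if memory_gb >= tier.min_memory_gb:
            return tier
    # Assumption: machines below the smallest listed tier are unsupported.
    raise ValueError(f"{memory_gb} GB is below the smallest listed tier (16 GB)")
```

For example, `pick_tier(24)` lands in the 512 px tier and `pick_tier(64)` in the 1024x576 tier with character training.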

**Install**

One-click via Pinokio (search Phosphene). Or clone the repo and run the panel directly.

**Credits**

LTX Video 2.3 by Lightricks (their license on the weights). MLX port by `dgrauet/ltx-2-mlx`. HiDream by HiDream AI. Phosphene the panel is MIT.

**Honest limits**

- Apple Silicon only. No Intel Mac, no Windows, no Linux.

- Dialogue audio is hit-or-miss. Ambient/diegetic sound is where LTX-2 shines.

- Character LoRAs are video-only (face + voice). Image LoRAs work in the Studio via Qwen/HiDream + a separate LoRA stack.

- First run downloads ~28 GB of weights. Takes a while.

Repo: github.com/mrbizarro/phosphene

X: x.com/PhospheneAI

Dev: https://x.com/AIBizarrothe

Feedback welcome. Especially curious what people make with the character training side.

u/Opening-Ad5541 — 12 hours ago
▲ 21 r/StableDiffusion+2 crossposts

Creating character turnaround sheets with Flux 2 Klein in ComfyUI

I made a small ComfyUI workflow for creating multi-angle reference sheets from a single input image.

The main use case is character sheets. You give it one character image, and the workflow tries to generate multiple consistent views like front three quarter, side profile, rear view, rear three quarter, high angle, low angle, and a close detail view. The goal is to keep the same face, outfit, pose, expression, proportions, and general design while only changing the camera angle.

I built it mostly with native ComfyUI nodes. The only non-native part, as far as I remember, is the GGUF loader. The prompts are written in a generic way, so it can also work for people, props, vehicles, creatures, or objects, but I mainly made it for character sheet generation.
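To illustrate the "one generic prompt per view" idea, here is a rough sketch. The view list and the things to preserve come from the description above, but the prompt wording and the names `VIEWS`, `KEEP`, and `build_view_prompts` are hypothetical and are not the exact prompts from the workflow.

```python
# Camera angles listed in the post.
VIEWS = [
    "front three quarter",
    "side profile",
    "rear view",
    "rear three quarter",
    "high angle",
    "low angle",
    "close detail view",
]

# Attributes that should stay fixed across every view.
KEEP = "same face, outfit, pose, expression, proportions, and general design"


def build_view_prompts(subject: str = "the subject") -> dict[str, str]:
    """Return one prompt per camera angle, worded generically so it fits any subject."""
    return {
        view: (
            f"{view} of {subject} from the reference image, "
            f"keeping the {KEEP}; change only the camera angle"
        )
        for view in VIEWS
    }
```

Because the subject is a parameter, the same loop works for a prop or a vehicle by calling `build_view_prompts("the vehicle")`.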

I tested it with the Flux 2 Klein 4B Q4 GGUF model because I currently have access to only 4 GB VRAM. For such a small setup, it is giving acceptable results. It is not perfect, especially with difficult rear views or fine clothing continuity, but it is usable for blocking out reference angles and building rough character sheets.

I expect the 9B variant to give much better consistency and detail, especially for faces, costume continuity, proportions, and rear view inference.

This is not meant to be a final polished character turnaround solution. It is more of a practical workflow for quickly getting usable angle references from one image, especially when working with AI video, inpainting, first frame last frame generation, or character continuity.

Sharing it in case it is useful to anyone experimenting with Flux 2 Klein on low VRAM setups.

https://pastebin.com/EyRM0zed

https://preview.redd.it/y8v7v06d4o2h1.png?width=5824&format=png&auto=webp&s=3d7acb275bf8652b68501e9efb33af7d324e75ca

u/nikhilprasanth — 13 hours ago

Any image editing model that can do 2k-4k res reliably?

I've tried Flux Klein and Longcat so far, but they both fall apart at higher resolutions.

Goal is to mass-edit photos, so the higher res is really needed here.

u/aifirst-studio — 16 hours ago

Problem looping the NLF process (part of SCAIL)

(crossposting this from ComfyUI)

Hi, I am trying to run the whole SCAIL process in a for loop to generate a long video.

For now, I am trying to generate the NLF images first in a for loop, to avoid running out of VRAM on long videos.
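In plain Python terms, the chunked loop amounts to something like the sketch below. It only shows how the frame ranges are split; `chunk_ranges` is a made-up helper, the frame counts are example values, and the actual NLF rendering happens in ComfyUI nodes, not in this code.

```python
def chunk_ranges(total_frames: int, chunk_size: int) -> list[tuple[int, int]]:
    """Split frame indices into consecutive (start, end) ranges, end exclusive."""
    if total_frames < 0 or chunk_size <= 0:
        raise ValueError("total_frames must be >= 0 and chunk_size must be > 0")
    return [
        (start, min(start + chunk_size, total_frames))
        for start in range(0, total_frames, chunk_size)
    ]


# Each loop iteration would render only the frames in its own range,
# so just one chunk has to fit in VRAM at a time.
for start, end in chunk_ranges(total_frames=10, chunk_size=4):
    print(f"render NLF frames {start}..{end - 1}")
```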

The problem is that the NLF-rendered images seem to shift position slightly in every loop.

How do I prevent this and keep the whole NLF image sequence smooth?

You should be able to download the video and use it as a template for the workflow.

Thank you

u/Ippherita — 12 hours ago