Hi everyone,

Recently, Ostris, the creator and maintainer of AI Toolkit, released a new LoRA training method and custom ComfyUI node that make it possible to use Krea2 for image editing, despite Krea2 being a text-to-image model.

I trained several detail enhancement LoRAs with this method, and I am sharing the best one from my experiments.

Full-resolution versions of the images can be found in the Hugging Face repo below.

Hugging Face:
https://huggingface.co/reverentelusarca/krea2-detail-enhancer-edit-lora

Civitai:
https://civitai.com/models/2756809/krea-2-detail-enhancer-or-edit-lora?modelVersionId=3102079

Please keep in mind that both this LoRA and the underlying Krea2 editing method are highly experimental, and they do not produce great results every time. I am mainly sharing this experiment in the hope that it inspires other developers and community members to explore the method further.

A few important notes:

Krea2 is not an edit model, so do not expect the precision or consistency of Flux.2 Klein or Qwen Image Edit. It can alter the input image.
It sometimes produces faulty results with horizontal aspect ratios.
It can slightly change the lighting and colors.

Trigger word:

enhance this image

Prompt I am using:

My ComfyUI workflow:

https://huggingface.co/reverentelusarca/krea2-detail-enhancer-edit-lora/blob/main/workflow-comfyui-krea2-detail-enhancer-edit-lora.json

Ostris' Krea2 Edit node:

https://github.com/ostris/ComfyUI-Krea2-Ostris-Edit

Ostris' explanatory post about the method:

https://x.com/ostrisai/status/2073428647273447480

Hi folks. I recently saw a post about using Krea + PiD, but there was no workflow attached, so I decided to write a more detailed post based on my own testing. The following content was polished with an LLM, based on my raw experiment notes.

This is not meant to be a perfect technical paper. These are practical notes from testing different PiD checkpoints, VAEs, and Krea 2 outputs in ComfyUI.

What is NVIDIA PiD?

PiD is NVIDIA’s Pixel Diffusion Decoder. In simple terms, it can replace the normal VAE decode step with a learned pixel-space diffusion decoder/upscaler.

Normally, a workflow does something like:

latent → normal VAE decode → image

PiD does something closer to:

latent → pixel diffusion decoder → higher-resolution image

So in practice, for ComfyUI users, PiD is useful because it can act as a very fast 4× upscale / detail decode stage.

It supports several latent/model families, including Flux.1, Flux.2, SD3, Z-Image, SDXL, Qwen-Image, and others. The important part is that you should match the PiD checkpoint with the correct VAE / latent format.

For example:

Flux.1 PiD → Flux.1 VAE / ae.safetensors
Flux.2 PiD → Flux.2 VAE
Qwen PiD → Qwen Image VAE
SDXL PiD → SDXL VAE
SD3 PiD → SD3 VAE

The final decode VAE node in the PiD branch should stay as pixel_space. That part does not change.
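To make the pairing rule concrete, here is a minimal Python sketch of the table above as a lookup plus a sanity check. The labels are just descriptive strings taken from this post, not real file names or ComfyUI identifiers, and check_pairing is a hypothetical helper, not part of ComfyUI or PiD.

```python
# Pairings as listed in this post. The strings are descriptive labels only
# (assumption: they are not actual checkpoint or VAE file names).
PID_TO_VAE = {
    "Flux.1": "Flux.1 VAE (ae.safetensors)",
    "Flux.2": "Flux.2 VAE",
    "Qwen-Image": "Qwen Image VAE",
    "SDXL": "SDXL VAE",
    "SD3": "SD3 VAE",
}

# The final decode node in the PiD branch stays on pixel_space.
FINAL_DECODE_VAE = "pixel_space"


def check_pairing(pid_family, encode_vae, decode_vae):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    expected = PID_TO_VAE.get(pid_family)
    if expected is None:
        problems.append(f"unknown PiD family: {pid_family}")
    elif encode_vae != expected:
        problems.append(f"{pid_family} PiD expects {expected}, got {encode_vae}")
    if decode_vae != FINAL_DECODE_VAE:
        problems.append(f"final decode should be {FINAL_DECODE_VAE}, got {decode_vae}")
    return problems


if __name__ == "__main__":
    # Mismatched on purpose: Flux.1 PiD with the Flux.2 VAE.
    for problem in check_pairing("Flux.1", "Flux.2 VAE", "pixel_space"):
        print(problem)
```

The point is only that there are two separate things to keep straight: the encode-side VAE follows the PiD family, while the final decode never changes.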

There are probably better posts than this one explaining what PiD is, so don't hesitate to dig deeper.

You can find the diffusion models and text encoders here: https://huggingface.co/Comfy-Org/PixelDiT/tree/main

What this workflow actually does

The workflow can generate an image with Krea 2, then pass it into a PiD upscale branch.

But you do not need to generate the image inside this workflow.

You can also use:

Load Image → VAE Encode → PiD → pixel_space decode

So you can take any already-existing image, encode it with the matching VAE, and run it through PiD as an upscale/detail stage.

In other words, this is not only a Krea workflow. Krea is just what I personally used for the base image.

Why use PiD?

Speed.

On my setup, using an RTX 6000 PRO, I can generate a 4K image with Krea 2 Raw FP8 + Turbo LoRA + 16 steps + PiD upscale in around 10 seconds.

This is much faster than pushing Krea 2 itself to larger latent sizes. Krea 2 handles smartphone-style photography very well on its own, but generating at larger latent sizes is much slower (roughly 3-4x). PiD gives you a fast route to a detailed 4K result.

You also do not have to use Krea 2 Raw + Turbo LoRA. You can use Krea 2 Turbo + PiD directly as well.

Color drift / tint caveats

This is the main thing people need to know.

From my tests, every PiD checkpoint + VAE combination changes color and lighting a little differently.

Some combinations make the image brighter.
Some shift it toward blue.
Some add green tint.
Some preserve the original image better.

For example, in my testing:

Flux.1 PiD was better for preserving the original image/color.

Flux.2 PiD was sometimes better for outdoor/high-exposure images, but it can shift the image colder or bluer depending on the input.

Qwen-Image PiD often gave me green-tinted results.

So do not assume one PiD checkpoint is universally best.

The official Flux2 _2606 checkpoint is supposed to fix the old Flux2 color-drift issue, but that does not mean it will be the most neutral choice for every image. In my own tests, Flux.1 sometimes preserved the original image better.

My current practical preference is Flux.1 VAE + Flux.1 PiD when I want to preserve the original image as much as possible.
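If you want to put a rough number on the tint instead of eyeballing it, one simple option is to compare the average R, G and B values before and after PiD. The sketch below is my own illustration, not something from the PiD tooling: mean_rgb and color_shift are hypothetical helpers, and they take plain iterables of (r, g, b) tuples, so you would feed them pixels from whatever image library you use.

```python
def mean_rgb(pixels):
    """Average each channel over an iterable of (r, g, b) tuples."""
    totals = [0, 0, 0]
    count = 0
    for r, g, b in pixels:
        totals[0] += r
        totals[1] += g
        totals[2] += b
        count += 1
    if count == 0:
        raise ValueError("no pixels given")
    return tuple(total / count for total in totals)


def color_shift(original_pixels, upscaled_pixels):
    """Per-channel mean difference (upscaled minus original).

    Means are compared, so the two images may have different sizes.
    A positive value means that channel got stronger after PiD.
    """
    before = mean_rgb(original_pixels)
    after = mean_rgb(upscaled_pixels)
    return tuple(a - b for a, b in zip(after, before))


if __name__ == "__main__":
    # Synthetic example: a grey input and a 4x-per-side output with extra green.
    original = [(100, 100, 100)] * 4
    upscaled = [(100, 110, 100)] * 64
    print(color_shift(original, upscaled))
```

This only captures a global shift. It will not tell you about local grading changes, so treat it as a quick comparison aid between PiD + VAE combinations on the same input image.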

Resolution rule of thumb

This part is based on my experiments, not an official hard rule.

PiD gets much more fragile when the input becomes too large or too extreme in aspect ratio.

My current rule of thumb:

Do not go over 1024 px on the input side.

So if you are using the PiD branch, keep the longer side of the loaded image / empty latent image at about 1024 px or less.

Then let PiD do the 4× upscale.

Examples:

1024×1024 → 4096×4096
768×1024 → 3072×4096
576×1024 → 2304×4096

For very vertical ratios like 9:16, I noticed tinting appears much more easily, even with Flux.1 PiD. So for vertical images, I recommend starting smaller, for example:

576×1024 → 2304×4096

rather than pushing something like:

1024×1824

If you go too large or too tall, PiD may still work, but the chance of color tint / weird grading goes up.
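The arithmetic behind this rule of thumb can be sketched in a few lines of Python. This is only an illustration of my 1024 px limit and the 4× factor: fit_input_for_pid and pid_output_size are hypothetical helpers, and snapping to a multiple of 16 is my own assumption for latent-friendly sizes, not a documented PiD requirement.

```python
PID_SCALE = 4          # PiD upscale factor used in this post
MAX_INPUT_SIDE = 1024  # rule-of-thumb limit from my tests, not an official one


def fit_input_for_pid(width, height, max_side=MAX_INPUT_SIDE, multiple=16):
    """Shrink (width, height) so the longer side is at most max_side.

    Uses integer math to avoid float rounding, then rounds each side down
    to a multiple of `multiple` (assumption: 16 is a safe step size).
    """
    longest = max(width, height)
    if longest > max_side:
        width = width * max_side // longest
        height = height * max_side // longest
    width = max(multiple, width // multiple * multiple)
    height = max(multiple, height // multiple * multiple)
    return width, height


def pid_output_size(width, height, factor=PID_SCALE):
    """Size of the image after the PiD upscale."""
    return width * factor, height * factor


if __name__ == "__main__":
    for size in [(1024, 1024), (768, 1024), (1080, 1920)]:
        fitted = fit_input_for_pid(*size)
        print(size, "->", fitted, "->", pid_output_size(*fitted))
```

For a 1080×1920 phone image this lands on the 576×1024 input size mentioned above, which PiD then takes to 2304×4096.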

Workflow notes

In my workflow, I painted the important switchable nodes green.

Those are the nodes you change if you want to test different PiD families:

Flux.1
Flux.2
Qwen-Image
SDXL
SD3

The node marked in red should always stay as pixel_space. That is the final PiD decode side.

If you want to use your own image instead of generating with Krea inside the workflow, just use a Load Image node, encode it with the matching VAE (VAE Encode), and connect the result into the PiD branch.

My current recommendation

For best color preservation:

Flux.1 VAE + Flux.1 PiD + keep input max side around 1024 px

For outdoor / bright / high-exposure images:

Flux.2 PiD may be worth testing

For Qwen/Krea images:

Qwen PiD sounds logical, but in my tests it often produced green tint, so I would not blindly assume it is the best option.

Final note

If you are chasing a smartphone photography look, Krea 2 is already very capable out of the box. Most of the work is prompting and choosing the right base resolution.

PiD is mainly useful because it gives you a very fast 4K/detail route without making the base generation much slower.

The images I attached were made with different PiD checkpoints. In the workflow folder, I also included a comparison image of different PiD models I tested. It is around 30 MB because it combines multiple 4K outputs into one comparison image.

Workflow and comparison image: https://drive.google.com/drive/folders/1m4o2J3-Y7p1tAuabuHYrHe7gWziXNinm

u/sktksm

Krea 2 Edit LoRA: Detail Enhancer

Krea 2 + NVIDIA PiD workflow notes: fast 4K upscale, caveats, color drift, and model choices