Your local LLM node isn't frozen. The AI is thinking. I built a plugin so you can see it.
I spent 3 hours debugging a workflow that wasn't broken.
Qwen models have an internal reasoning mode. Before they answer, they sometimes stop and think, silently. Zero output. Zero progress bar. You're just staring at a node that looks frozen, wondering if it crashed.
It didn't crash. It's reasoning. And there was absolutely no way to see it.
So I forked the Qwen plugin and built ThinkingLLM.
What it does:
Live token streaming — every word appears in the terminal as the model generates it. You can literally watch it think in real time.
RAW_TRACE output — the full inner monologue preserved. Sometimes it's brilliant chain-of-thought. Sometimes the model decides the prompt is too easy and skips reasoning entirely. Now you can tell which is which.
Thinking toggle — let it reason before answering, or push for a direct one-shot response.
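For anyone curious how a raw generation can be split into a trace and an answer, here is a minimal sketch. It is not the plugin's actual code. It assumes the model wraps its reasoning in <think>...</think> tags, the way Qwen3 does; other models may mark reasoning differently. The function name split_trace is made up for illustration.

```python
import re

# Assumption: reasoning is delimited by <think>...</think>, as in Qwen3 output.
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_trace(raw_output: str) -> tuple[str, str]:
    """Split a raw generation into (trace, response).

    Hypothetical helper for illustration, not the plugin's implementation.
    """
    match = THINK_BLOCK.search(raw_output)
    if match is None:
        # No think block at all: the model answered directly.
        return "", raw_output.strip()
    trace = match.group(1).strip()
    # Everything outside the think block is treated as the answer.
    response = (raw_output[:match.start()] + raw_output[match.end():]).strip()
    return trace, response


trace, response = split_trace("<think>\n2+2 is 4\n</think>\n\nThe answer is 4.")
print("TRACE:", trace)
print("RESPONSE:", response)
```

An empty trace is the "model decided the prompt was too easy" case: either there was no think block at all or it came back blank.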
Supported models:
Qwen3.5, Qwen3-VL, Qwen2.5-VL, Qwen3, and Gemma 4, on both the Hugging Face Transformers and GGUF/llama.cpp backends.
Tips for using it:
Pre-process input images with a resize node so large images don't blow up the context window.
Connect the RESPONSE output to a Show Text or Show Anything node to read the answer.
Connect RAW_TRACE to a second Show Text node to see what the model was thinking.
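On the resize tip: vision models generally spend more tokens on bigger images, so capping the longest side before the image reaches the model keeps the prompt small. Here is a rough sketch of the arithmetic a resize step does. The function name fit_within and the 1024-pixel default are assumptions for illustration, not values taken from the plugin or from any particular resize node.

```python
def fit_within(width: int, height: int, max_side: int = 1024) -> tuple[int, int]:
    """Scale (width, height) down so the longest side is at most max_side.

    Aspect ratio is preserved; images that already fit are left unchanged.
    The 1024 default is an arbitrary illustrative value.
    """
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    # Multiply before dividing so exact ratios stay exact.
    new_width = max(1, round(width * max_side / longest))
    new_height = max(1, round(height * max_side / longest))
    return new_width, new_height


print(fit_within(4000, 3000))  # a 12-megapixel photo scaled to fit
print(fit_within(800, 600))    # already small enough, unchanged
```

Pick the cap to suit your model and VRAM; a smaller cap means fewer image tokens but less visual detail.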
It's free, open source (GPL-3.0), and installable through ComfyUI Manager.