Been hacking on this for a while and I’m starting to wonder if I’m just reinventing a wheel someone smarter already finished. Hoping one of you has been down this road.
The dream is one local dashboard sitting in front of every model I have access to, smart enough to figure out for itself which ones to use. I type one short sentence, not a thousand-word system prompt, and it actually gets me. It picks the right combo of engines, runs them in parallel where it makes sense, stitches the output back together, and does it fast enough that I don’t lose my train of thought.
The thing that keeps breaking down for me is real orchestration. Not “call this API then that API,” but actually chaining things across a local LLM, a frontier API, ComfyUI, a voice clone, a video generator, a lipsync model, and having the system handle the whole pipeline.
Concrete example: I want to type one line asking for a short clip of a specific character speaking in their recognisable voice, and have the thing produce script, voice, face, lipsync and final render without me babysitting it.
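To make the shape of that concrete, here’s a rough sketch of the flow I have in mind. Every function name is a made-up placeholder for a real engine call (none of this is an existing library’s API), and the stubs just return strings so the control flow is visible: script first, voice and face in parallel, then lipsync and the final render.

```python
import asyncio

# Hypothetical stage stubs. In a real setup each would wrap an engine
# (local LLM, voice clone, ComfyUI, lipsync model, video renderer).
async def write_script(request: str) -> str:
    return f"script for: {request}"

async def clone_voice(script: str) -> str:
    return f"audio({script})"

async def render_face(request: str) -> str:
    return f"face({request})"

async def lipsync(audio: str, face: str) -> str:
    return f"lipsync({audio}, {face})"

async def final_render(clip: str) -> str:
    return f"render({clip})"

async def make_clip(request: str) -> str:
    # The script has to exist before the voice can be generated.
    script = await write_script(request)
    # Voice and face don't depend on each other, so run them concurrently.
    audio, face = await asyncio.gather(clone_voice(script), render_face(request))
    # Lipsync needs both, and the final render needs the synced clip.
    synced = await lipsync(audio, face)
    return await final_render(synced)

result = asyncio.run(make_clip("character says hello"))
```

The part I can’t get right is the system deciding this graph by itself from one sentence, instead of me hard-coding it like this.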
I want output that’s actually shippable. Photos, video, design, documents that don’t scream generated. The bar is “would this pass in a pitch deck or on a client landing page,” not “look mom, AI made it.” That gap is where most of the open source stacks fall apart for me.
I want a context layer that learns me over time so I can stop writing prompt essays. “Make a moody product shot for the new drop” should be enough context. The system should know my brand, my tone, my last twenty references, and the engines I prefer for which job.
I want it uncensored where it matters. Not for anything weird, just because I’m tired of getting a lecture every third reply when I’m trying to write copy or ideate something edgy. I want at least the freedom of the less filtered chat models out there, preferably more.
Local first wherever the hardware can keep up, cloud APIs only as a fallback when local genuinely can’t match the quality. I’ve got the machine for it.
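In code terms, the routing rule I’m after is roughly the sketch below. It assumes each engine returns its output plus some quality score between 0 and 1; how you would actually compute that score is the hard part and is not shown. `local_first`, the engine callables and the 0.7 threshold are all placeholders I made up for illustration.

```python
from typing import Callable, Tuple

# Assumption: an engine takes a prompt and returns (output, quality score in 0..1).
Engine = Callable[[str], Tuple[str, float]]

def local_first(prompt: str, local: Engine, cloud: Engine,
                min_quality: float = 0.7) -> Tuple[str, str]:
    """Try the local engine first; fall back to the cloud engine only if
    the local one fails or scores below min_quality."""
    try:
        output, score = local(prompt)
        if score >= min_quality:
            return "local", output
    except Exception:
        # Local engine crashed or ran out of memory: fall through to cloud.
        pass
    output, _ = cloud(prompt)
    return "cloud", output

# Stub engines standing in for real models.
def good_local(prompt: str) -> Tuple[str, float]:
    return f"local:{prompt}", 0.9

def weak_local(prompt: str) -> Tuple[str, float]:
    return f"local:{prompt}", 0.3

def cloud_api(prompt: str) -> Tuple[str, float]:
    return f"cloud:{prompt}", 0.95

kept_local = local_first("moody product shot", good_local, cloud_api)
fell_back = local_first("moody product shot", weak_local, cloud_api)
```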
I’ve already tried wiring this together with open source pieces, a workflow tool in the middle, an LLM proxy, the usual suspects. It works on paper. In practice it’s fragile, the routing between engines is dumb, the chaining never feels seamless, and there’s no quality control between steps so garbage in one stage poisons the next.
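What I mean by quality control between steps is something like a gate after every stage: check the output, retry if it fails, and stop the pipeline instead of passing junk downstream. A minimal sketch, where `run_with_gate` and the check function are names I invented and the check itself is deliberately trivial:

```python
from typing import Callable, TypeVar

T = TypeVar("T")

def run_with_gate(stage: Callable[[], T], check: Callable[[T], bool],
                  max_attempts: int = 3) -> T:
    """Run a stage, validate its output, and retry on failure.
    Raises instead of handing a bad result to the next stage."""
    for _ in range(max_attempts):
        output = stage()
        if check(output):
            return output
    raise RuntimeError(f"stage failed quality check after {max_attempts} attempts")

# Stub stage that produces two unusable results before a good one.
attempts = iter(["", "   ", "a usable script"])
script = run_with_gate(lambda: next(attempts), lambda s: bool(s.strip()))
```

A real check would be much harder than "is the string non-empty" (a judge model, an audio or image quality metric), which is exactly the piece I haven’t seen anyone package up.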
So my actual question: does anything like this already exist as a real product or open source project? I keep finding excellent pieces of the puzzle but nobody who’s solved the whole thing in one place. If you’ve built something close, I’d love to hear what stack you landed on and where you hit walls. Honestly even a “yeah this exists, it’s called X” would save me a few months of my life.