Wrong crossroads, traveler.

Generated with:

  1. ChatGPT

  2. Google Gemini

  3. Stable Diffusion

The Prompt

Cinematic photorealistic wide shot, night scene. A young woman in her mid-20s, slender and strikingly beautiful, with white-blonde hair, brown eyes, and an almost angelic appearance that contradicts everything she is. She wears a deep violet dark magic gown, elaborately cut, fitting her silhouette. She points a dark wooden staff directly at the viewer, not as a warning but as a verdict. Her expression is calm and superior, the look of someone who finds this entire situation mildly amusing, fully aware that whoever stands before her is no equal. Behind her: a crossroads at midnight, three skeletal figures holding hand axes, standing motionless like a personal guard. A full moon dominates the sky above, casting cold silver light across the scene. The angelic and the sinister occupy the same face. Shallow depth of field, film grain, single-frame narrative, movie still.

u/GelliusAI — 4 days ago

Three Faces of Manon Grel: When AI Gets a Revolutionary Right, and Wrong

A few days ago I had a simple image concept: a female musketeer, long black hair, ice-blue eyes, black dress, pale skin, red lips. She stands alone before a fortress as its last defender. A fantasy setting inspired by the French Revolution.

I brought the concept to Claude, and together we built a prompt. This approach had worked well on earlier projects. But the generated images felt generic and flat, the kind of result you have seen a hundred times before.

So I gave the character a backstory.

Manon Grel spent years torn between the Crown and the revolution. She is not a moral figure, but she has a good heart. Her methods are unconventional and not always ethical. She is the kind of woman who makes her own rules and lives with the consequences.

The image shows the exact moment she chose a side.

That shift in thinking produced the two sentences that changed everything in the prompt: "She raises a flintlock musket in her right hand, not as a threat but as a declaration. Her expression is cold and absolute, a decision already made and impossible to take back."

The results across different tools varied significantly. Stable Diffusion delivered a strong range of images. ChatGPT produced technically convincing results but pushed hard toward sexualization of the figure. Gemini managed the setting well but Manon herself looked blank, almost vacant. Nightcafe was a complete failure: the figure looked like a store mannequin in every single image, none of them usable.

The takeaway: a character with a story generates a better image than a character with a description.

The Prompt

Cinematic photorealistic wide shot, golden hour. A woman in her mid-20s, slender, pale skin, red lips, long black hair, sharp blue eyes. She wears a tight-fitting black brocade dress with a low neckline, dark lace at the sleeves, a wide-brimmed hat with a black feather. She raises a flintlock musket in her right hand, not as a threat but as a declaration. Her expression is cold and absolute, a decision already made and impossible to take back. She looks directly into the camera without hesitation, without doubt. Behind her: massive ancient fortress walls, completely empty, no other figures. The fortress is hers now. Warm golden light from the left, long shadows across weathered stone, shallow depth of field, film grain, single-frame narrative, movie still.

u/GelliusAI — 7 days ago

Come with me into the darkness... Would you accept Aristella's invitation?

This was the image concept I had in mind: Aristella the Dark is a princess who lives in a dark castle. She is eternally young and immortal. A slender woman of around 25 with green eyes, long red hair, and black clothing. She spends most of her life in complete solitude, but occasionally lures men into her castle.

Aristella the Dark slowly walks down the corridor. She keeps turning around, softly saying "Come with me". She looks at the viewer with a look of seduction; no other person can be seen in the video. The curtains look like ghosts due to the wind, and some candles go out. Something dark and at the same time enchanting emanates from Aristella.

u/GelliusAI — 16 days ago
▲ 1 r/AI_ART

This is all I gave it. A few lines. Gemini Omni Flash did the rest.

Yesterday I gave Google's new video model Gemini Omni Flash a simple line drawing to work with, something that vaguely resembled a cat or a fox.

The prompt was deliberately open: "I want to run an experiment with you. I'll upload an abstract image and you turn it into a video. I want a character to emerge, with music in the background."

What genuinely surprised me was how the model brought those few simple lines to life.

u/GelliusAI — 21 days ago

Am I the Only One Visualizing My Novel Characters with AI?

I started working on my novel with AI support in mid-2025, and from the beginning I visualized my characters. Back then the results were often unconvincing; consistency was the main problem. Today the situation has improved significantly. Generating consistent character images is much easier, and creating short videos is no longer an issue.

I find it valuable to see my own characters visually. What I'm curious about: How many authors in this sub actually visualize their characters?

Right now I only see image forums where almost no one cares about the context of the images or the story behind them. And in writing forums, no one seems interested in AI-generated character visuals.

I'm starting to feel like visualization of novel content plays no role at all, and there's no sub on Reddit to discuss it seriously. What relevance do you see in this topic?

u/GelliusAI — 26 days ago

Staring Back: The moment the observed becomes the observer

Nova has been assigned to a scientist in a research complex as a companion. The situation puts her under intense pressure, yet she is also a character who understands the rigid bureaucratic system and exploits every loophole or contradiction to her advantage. At the same time, she is quick to seek confrontation with the surveillance system of the complex.

I asked Gemini and ChatGPT to create an image of her in the kitchen of the facility, looking directly into the camera and inviting that confrontation. I then used the ChatGPT image to create a short video with Google Flow.

The Prompt

Cinematic close-up of a fictional female character in a sterile, clinical environment. She has pale skin and light blonde hair. For the first 3 seconds, she remains motionless, staring intensely and defiantly into the camera. At second 5, she slowly speaks the words: "Are you enjoying the show?" Her expression is cold and calculating, maintaining eye contact throughout. Bright, overexposed lighting. High-quality digital cinematography, realistic facial animation, English speech.

u/GelliusAI — 28 days ago

Visual Character Fluidity: Why I stopped fighting AI inconsistency and started using it

Since mid-2025, I've been creating images of my character Nova in a dystopian setting. Over that time, I've encountered countless iterations of her, a phenomenon shaped by the distinct aesthetics of various AI tools, but also by the simple fact that image generation is inherently inconsistent.

I never saw this as a weakness. Instead, it led me to coin the concept of 'visual character fluidity', the idea that virtual characters never settle into a single, definitive form. I find this especially valuable for novelists looking to visually explore their characters. But really, anyone who creates a character stands to gain from this ambiguity: inconsistent image generation isn't a flaw, it's an invitation to discover your character in all their complexity.

Now, in mid-2026, with AI producing increasingly consistent results, I've deliberately chosen a different approach: I run the same prompt through multiple tools to capture a character across their many facets.

How do you handle this? Do you prioritize a consistent look for your characters, or do you see the variation as a creative opportunity?

u/GelliusAI — 28 days ago

Staring Back: The moment the observed becomes the observer

I recently posted some images in this sub based on a dystopian short story centered on a specific conflict.

Nova has been assigned to a scientist in a research complex as a companion. The situation puts her under intense pressure, yet she is also a character who understands the rigid bureaucratic system and exploits every loophole or contradiction to her advantage. At the same time, she is quick to seek confrontation with the surveillance system of the complex.

I asked Gemini and ChatGPT to create an image of her in the kitchen of the facility, looking directly into the camera and inviting that confrontation. In my view, Gemini delivers the stronger result here.

u/GelliusAI — 1 month ago

Vulnerable vs. Hollywood: Gemini and ChatGPT’s wildly different takes on my dystopian protagonist

I fed Gemini and ChatGPT my dystopian short story and asked both to visualize three scenes featuring the main character, Nova—a white-blonde woman under immense pressure in a sterile research complex.

  • Nova observing a surveillance camera
  • Nova in the complex's kitchen
  • Nova in conversation with her training academy

The difference is striking: Gemini portrays Nova as stressed and vulnerable, the complex feels sparse, and the communication device looks almost retro. ChatGPT, on the other hand, turns her into a tough Hollywood heroine in a sleek, modern environment, maintaining that vision more consistently across all three images.

Which interpretation truly captures the essence of Nova?

u/GelliusAI — 1 month ago

The historical novel "Ein Kampf um Rom" (A Struggle for Rome) by Felix Dahn depicts the dramatic struggle for survival of the Goths and the inevitable fall of their kingdom. The work is marked by heroism, tragedy and a profound melancholy.

I attempted to capture this feeling of imminent decline in a single motif: Princess Amala in a white gown stands at a towering harp by the fireside and gazes lost into the shadowed room. I wanted to capture the moment when she, surrounded by the shadows of history, sees the fate of her people approaching.

The Prompt

Princess Amala softly plays her harp, the fire crackles in the background. Suddenly she starts, hearing a sound like a sword striking against armor and breaking. She looks up and interprets this as an evil omen. Amala says nothing, the sound of the sword blow is clearly audible.

u/GelliusAI — 1 month ago
▲ 1 r/AI_ART

The historical novel "Ein Kampf um Rom" (A Struggle for Rome) by Felix Dahn depicts the dramatic struggle for survival of the Goths and the inevitable fall of their kingdom. The work is marked by heroism, tragedy and a profound melancholy. The images are inspired by the style of James Archer and John Garrick.

I attempted to capture this feeling of imminent decline in a single motif: Princess Amala in a white gown stands at a towering harp by the fireside and gazes lost into the shadowed room. I wanted to capture the moment when she, surrounded by the shadows of history, sees the fate of her people approaching.

The melancholic, introspective gaze of the princess proved particularly challenging. While some models captured the scenery well, Yeri AI and Nightcafe in particular struggled with correctly rendering the eyes. Leonardo AI failed completely even after adjusting the prompt.

Here are the results for comparison:

ChatGPT (DALL-E 3)

Gemini

Nightcafe

Yeri AI (formerly Stable Diffusion)

The Prompt

A horizontal narrative painting in the style of 19th-century Realism, inspired by 'A Struggle for Rome' by Felix Dahn, rendered with the texture and moody lighting of James Archer and John Garrick. The scene is a dimly lit castle chamber, shrouded in deep shadows, with only a solitary, roaring fireplace on the side illuminating the scene. The central focus is a 25-year-old Gothic princess, slender and attractive, with very long, wavy blonde hair that falls over her shoulder, and deep, melancholic water-blue eyes. She stands motionless beside a large, mannshoch (man-height) wooden Celtic harp, her hands barely resting on the strings, with a lost, introspective expression as if looking past the fire into the doomed future of her kingdom. She is wearing a simple, elegant gown of heavy white wool-silk, accented with a massive Ostrogothic silver fibula brooch on her shoulder. The firelight creates a warm glow on her profile and the dark wood of the harp. At her feet, spreading across the stone floor, is a large, shaggy grey wolf skin rug. Beside the harp, on a dark, rough-hewn wooden stool, an ornate, tarnished silver goblet sits empty and overturned, symbolizing the fading glory and decay of the kingdom. The background dissolves into soft, dark brushstrokes and abstract, ancient stone masonry.

u/GelliusAI — 1 month ago

The Prompt

The pilot sits calmly at the controls. The ship shakes from a hit. She reacts to the attack instantly and says: "See you in hell, bastard."

Negative Prompt

No fire inside the spacecraft, no flashes of light, no smoke, and no red alarm bells appearing suddenly.

u/GelliusAI — 1 month ago
▲ 5 r/AI_ART

A few weeks ago, an idea came to me: What would a bridal boutique look like in a bleak, dystopian world? Inspired by the sense of disorientation in Kafka and the melancholy of Pessoa, I brought this vision to life using Google Gemini and later ChatGPT. It is a world where resplendent white meets grey concrete.

I was surprised by how quickly the AI was able to manifest this world. I attribute this to having a clear picture in my mind that I could describe precisely. The next logical step was to bring the story to life. I created a video using Google Flow, where it became apparent that translating an image into video is far more challenging.

My idea was for a gentle breeze to drift through the broken window and make the wedding dress billow softly. However, keywords like 'ill omen,' 'surprising gust of wind,' and 'none of the people say a word' resulted in a video considerably more dramatic than intended.

The resulting drama left a lasting impression on me. As a creator, one should always strive for control over AI, yet sometimes it is chance that opens the door to new perspectives.

The full video prompt

The shop assistant lightly lifts the bride’s dress when suddenly, a surprising gust of wind—like an ill omen—blows through the window; the bride looks up, irritated. The robot stands still the entire time and does not react. None of the people say a word. The atmosphere is eerie.

u/GelliusAI — 1 month ago

A white-blonde woman in front of a surveillance camera. A bridal boutique with dirty concrete walls. Three female astronauts in a retrofuturistic 1950s spaceship. How do you translate these mental images into visuals without losing days to prompt engineering?

Over the past few weeks, I tested three different approaches: the direct route from manuscript to image, visualizing an initial sketch, and methodically creating prompts through a separate AI.

Here are my experiences with ChatGPT, Gemini, and Claude.

Workflow 1: From Manuscript to Scene Visualization

I fed ChatGPT a dystopian short story of 3,600 words. The AI analyzed the document immediately and wrote: "The story already provides very strong visual elements: [...]. The character descriptions are also already quite clearly laid out in the file."

I then named individual scenes from the story and prompted the AI to visualize them. My focus was on the main character Nova: a white-blonde woman in a sterile research complex. I wanted to see how precisely the AI could capture the character in relation to the clinical atmosphere.

From manuscript to image: Visualizing Nova's encounter with the surveillance systems.

The visual execution largely matched my imagination. ChatGPT created consistent images within the same chat while also understanding the nature of the character. In the image with the camera, Nova's direct gaze is convincing. It perfectly mirrors how early in the story she seeks confrontation with the surveillance around her.

At the same time, this image illustrates the opportunities that AI visualization offers. In ChatGPT's images, her hair is pinned up. This deviation from my original idea of loose hair is worth considering: hair pinned up on duty and worn down in private time could visually underscore that contrast between functional drill and those rare pockets of personal freedom.

Workflow 2: AI as a Sparring Partner for Early Visions

Another way to visualize your novel ideas is to describe your concept to the AI in detail. The prerequisite is a precise vision: the AI can only interpret a world coherently if the author already has it clearly in mind.

I recently had the idea of Mia, a young woman working in a bridal boutique in a bleak, dystopian world. The atmosphere is inspired by Kafka's sense of disorientation and Pessoa's desolation, creating a setting where the magnificent white of the gowns meets bare, grey concrete.

I was struck by how effortlessly Gemini brought this vision to life, as it required no elaborate prompts at all. Here too, the images largely matched my imagination, once again showing how AI can fuel an author's creative vision.

Concept Visualization: Capturing the tension between Kafka's strangeness and Pessoa's bleakness.

Originally, I had planned the shop to feel less grim, with appealing furniture preserving the illusion of an intact world. But Gemini pulled the bleakness of the outside world deep into the interior, right down to the dirty concrete walls.

This moment captures the AI's role as a sparring partner: it delivers a more radical interpretation of my idea, and now I'm faced with the choice of whether to adapt my visual concept accordingly.

Workflow 3: Methodical Prompt Development

In my third approach, I used Claude as a specialized prompt architect: I described my vision and had the AI generate precise prompts from it. My goal was a retrofuturistic 1950s setting featuring three female astronauts in their early thirties. To achieve maximum precision, I gave each character a distinctive name, a specific role on board, and a clear visual signature.

A result of systematic, prompt-driven visualization: Catherine 'Cat' Sterling, 'The Elegant Explorer'

One example is the navigator Catherine 'Cat' Sterling, whom I dubbed 'The Elegant Explorer'. Comparing different image AIs was fascinating: while all models interpreted her elegance through her uniform, ChatGPT made the strongest statement with striking gold accents.

This workflow demonstrates a key principle: the more vivid your world-building, the more effectively Claude acts as a bridge to consistent prompts.

Three Approaches — Different Paths to Visualization

As different as the three approaches are, they all lead to a valuable visualization of your novel idea:

  1. From Word to Image (ChatGPT): This workflow is ideal when a text already exists and you want to test the internal logic and consistency of your story.
  2. The Intuitive Sketch (Gemini): This approach is perfect for the early stages — to explore the atmosphere of your world and let the AI's interpretation challenge you as a sparring partner.
  3. Methodical Prompt Creation (Claude): This workflow is the best option for authors who want to maintain stylistic control and stabilize their vision across different tools.

These workflows are designed to be modular: you can adapt these methods to any AI model, and I encourage you to experiment with your own toolkit to see what sticks.

What Is the Next Step?

After the initial visualization of the world and characters, the logical next step is to bring the story to life. Short clips of about eight seconds—created with tools like Flow, for example—are often enough to develop a deep sense of a project's atmosphere. For one video, I used an image of the dystopian bridal shop that I had previously created with ChatGPT.

The Ill Omen – Gone with the Wind

My idea was for a light breeze to blow through a broken window pane and gently lift the wedding dress. However, poorly chosen keywords like "ill omen," "surprising gust of wind," and "none of the people say a word" resulted in the video becoming significantly more dramatic than I had planned. While an author should always strive to maintain control over AI, in some cases, these "accidents" actually help us gain new and unexpected impressions of our own work.

u/GelliusAI — 1 month ago

Security footage from Sector 7. At 00:12, Subject Nova (23) was caught on camera before intentionally moving into the unmonitored maintenance corridors. Her current whereabouts are unknown.

Prompt:

Create the video from the perspective of a security camera; the images are in color. Nova looks briefly into the camera and then moves away; the camera follows her for as long as possible until she enters a side hallway and disappears from the camera's field of vision. Only the sounds of the camera can be heard; Nova says nothing.

u/GelliusAI — 1 month ago