u/Individual_Hand213

I Used GPT Image 2.0 + Seedance 2.0 to Create a Netflix-Style Luxury Food Commercial Entirely Inside a Fridge (Prompt Below)

To get started, use GPT Image 2 and Seedance 2 from https://ai.vadoo.tv

Just created a hyper-realistic, Netflix/Apple-level food commercial… but shot from inside a refrigerator. No crew. No set. No camera. Just AI.

I treated the fridge like a real film set — complete with condensation, practical lighting, cinematic fog, and dramatic camera moves. The result looks like a high-end FMCG ad you’d see during the Super Bowl.

How I made it:

Generated the base visuals with GPT Image 2.0

Animated them with Seedance 2.0 for realistic motion, handheld camera work, and insane detail

The full prompt is below — copy, paste, and try it yourself.

Prompt:

"Ultra-premium cinematic food commercial, inspired by high-end Netflix, Apple, and luxury FMCG advertising. Entire video shot from deep inside a real refrigerator using layered foreground objects, practical fridge lighting, and cinematic depth.

OPENING SHOT — Extreme close-up inside the fridge. Glossy tomatoes, fresh lettuce, carrots, grapes, beverage cans, and condensation-covered glass shelves fill the foreground. Cold cinematic fog rolls across the frame as the refrigerator door opens dramatically. Bright cool light floods in. A beautiful female model appears outside, smiling softly while looking straight at the camera.

SHOT 2 — Behind-the-scenes commercial vibe. The model carefully arranges drinks and fresh produce on glowing glass shelves while cinematic studio lights reflect beautifully. Handheld camera movement, ultra-detailed water droplets everywhere.

SHOT 3 — She leans in close. Her hand reaches right past the lens to grab a cold can, creating natural foreground blur and realistic depth. Subtle breathing, hair movement, fabric flow — everything feels alive.

SHOT 4-6 — [Full detailed shots including camera rig reveal, hero product moment with Prasuma Vegetable Momos, dramatic slow-motion door open with escaping fog, and epic final close-up push-in]

Visual style: Hyper-realistic, ARRI Alexa Mini LF, anamorphic lenses, 18mm ultra-wide POV, shallow depth of field, realistic film grain, moody blue fridge lighting mixed with warm skin tones, Netflix-quality cinematography, 8K, blockbuster commercial aesthetic."

Why this slaps so hard:

True inside-the-fridge POV (super rare and immersive)

Insane condensation, reflections & practical lighting

Realistic micro-movements (breathing, fabric, hand blur)

Perfect hero product shots with premium lighting

Seedance 2.0’s camera intelligence makes it feel like a real DP shot it

This “inside environment POV” style is currently one of the strongest use cases for AI video. It feels less like AI and more like a $100K+ ad concept.

Who else is making wild AI commercials right now? Drop your results below 👇

u/Individual_Hand213 — 13 hours ago

▲ 15 r/GeminiOmniAI+3 crossposts

Gemini Omni Flash vs Seedance 2.0: Which One Handled This Insane Prompt Better?

We ran the exact same insane prompt through Gemini Omni Flash and Seedance 2.0 to see which model handles absurd cinematic scenes better.

The setup is complete chaos:

A robot cowboy scavenger

On a luxury yacht

Fishing a bluefin tuna

Blood all over the deck

Cash scattered everywhere

A dopey ostrich just vibing in the middle of it all

Plus: atomic punk 1960s retro-futurist styling, zombie apocalypse aftermath, harsh summer sunlight, photorealistic physics, single-take fixed camera, and a full action sequence where the robot lands the fish then pulls a revolver and shoots it.

Here is the full original prompt we used:

"【Basic Settings】 Robot Scavenger: A slender male humanoid robot, 180cm tall, designed in a 1960s atomic punk style. Possesses self-awareness. Facial LED display screen replaces facial features, showing low-resolution pixel-style expressions (expressions have no dynamic effects, remain in static frame display, with sci-fi sound effects during expression switches). Wears an American Western cowboy retro-style natural brown cowboy hat, black matte high-waisted design leather jacket, black matte leather gloves, cowboy belt, and holster. Mannequin Model: A posable realistic fashion mannequin, 170cm tall, in a 1960s American retro golden age style. Jointed assembly structure for changing poses, specifically for clothing display. African Ostrich: An adult male African ostrich, with a dopey-looking appearance, eyes skewed asymmetrically, tongue lolling out crookedly Scene: 1960s American retro atomic punk coastal city, zombie crisis outbreak, private yacht after a major battle, steeped in deathly silence, 6 PM with glaring sunlight scattering, sea surface shimmering with waves, air slightly distorted by heat waves under the blazing sun. Overall atmosphere luxurious and relaxed, blending vacation laziness with high-end retro futurism. Distant background is the atomic punk-style coastal city shoreline, main frame subject is an extremely luxurious large yacht on the sea surface, with zombie corpses and bloodstains from dragging and smearing scattered everywhere on the yacht, littered with banknotes, perfume bottles, wine glasses, red wine bottles, broken glass, and tattered dirty swimsuits. Sound: No background music needed, no ambient sound, retain only diegetic sound. 【Atmosphere and Image Quality】 Style Core: Atomic punk, zombie crisis, doomsday romance, black humor, cinematic texture, hyper-realistic, ultra-lifelike, Photorealism- real person real scene shooting, eliminate game CG feel, prohibit sluggish stiff movements, prohibit logical confusion. 
Visual Tone: Deformed widescreen cinematic texture. Shot using IMAX film camera paired with Panavision C-series lenses (add motion blur). Leisurely comfortable atmosphere forms a stark absurd contrast with the doomsday environment. Color and Tone: 1960s retro sci-fi atomic punk aesthetic, retro warm orange + sea salt blue high-contrast palette, film grain texture, high-contrast tones, minimalist futurist vacation vibe, American 1960s retro utopia, details maxed out, architectural textures sharp, lighting and shadows in layered depth. Midsummer blazing sun overhead, high-saturation intense daylight, hard-edged light and shadow contrasts, frame retains shadow details, highlights with subtle soft focus and moderate film grain. 【Frame Content】 Shot Breakdown: Single continuous take, one shot to the end. Shot Scale: Eye-level full shot. Composition: Centered composition. Camera Movement: Fixed position. Frame Content: The mannequin model sits on the stairs on the right side of the frame, posture dignified, sweetly elegant, remaining completely still. The robot stands at the center of the deck, facial LED screen fixed on a white thinking expression, facing the left side of the frame, engaged in sea fishing. 
Robot's legs spread wide in a firm horse stance, body leaning back to its limit with full exertion, both hands gripping the sea fishing rod tightly, rod bent into a full extreme bow, high-tension fishing line taut straight into the sea surface; robot steadily turns the reel to retrieve line, sea surface churns violently with white foam splashing everywhere, a 60cm-long silver-blue bluefin tuna bursts out of the water, thrashing wildly with sprays of water flying, line remains perpetually taut without slack; under steady control from the fishing stance, continues reeling in, slowly pulling the still slightly struggling tuna to the yacht's gunwale, fully presenting the entire process from exerting to battle the fish to successfully landing the tuna, splashing water droplets gleaming golden in the sunset, dynamic tension and visual impact maxed out. Then the robot holds the rod single-handed in the left hand, draws the revolver from the waist holster with the right, pulls the trigger to shoot the tuna dead with one shot, tuna instantly ceases movement, blood seeping into the seawater. Upon seeing this, the robot's facial LED screen switches to a green smiling expression, throws head back in a scheming triumphant "haha" laugh. Sea surface reflections, splashing water physics effects precisely recreated, character actions conform to human kinematics, rod, line, tuna movements fully match real physics logic, no stuttering, no clipping, no frame skips, full consistency in scene, characters, props throughout"

Honestly, this is one of those prompts where you can immediately see which model starts falling apart and which one keeps the scene under control.

Watch till the end before you pick a side.

Which one is better? Gemini Omni Flash or Seedance 2.0?

Drop your verdict below 👇

u/Individual_Hand213 — 13 hours ago

▲ 3 r/GeminiOmniAI+1 crossposts

Gemini Omni API now available for developer access worldwide

Available for access from https://github.com/Anil-matcha/Awesome-Gemini-Omni-API-Prompts

https://muapi.ai/gemini-omni

u/Individual_Hand213 — 21 hours ago

▲ 12 r/GeminiOmniAI+7 crossposts

A few developers have reverse-engineered the Gemini Omni model and built an API for it even before Google released it

github.com

u/Individual_Hand213 — 14 hours ago

▲ 7 r/GeminiOmniAI+3 crossposts

I created a GitHub Repo with top Gemini Omni prompts. This model absolutely blew my mind😱 Gemini Omni is insanely powerful and much better than Veo 3. And Veo 3 was already so good!

So I collected top prompts and examples from top X creators and put them in a GitHub repo.

The prompts were categorized into:

Cinematic Text-to-Video

Image-to-Video (Animate a Still)

Character Consistency & Multi-Scene Stories

Product Ads & Commercials

Lifestyle, Travel & B-Roll

Anime & Stylized Animation

Scientific & Educational Visualization

Action, Combat & VFX Sequences

Conversational Edits & Remixes

Audio-Driven & Lip-Sync

If you see a nice prompt and want to contribute, just create a pull request here: https://github.com/Anil-matcha/Awesome-Gemini-Omni-API-Prompts

u/Individual_Hand213 — 1 day ago

▲ 49 r/Seedance_v2+3 crossposts

Seedance 2.1 and Seedance 2.0 Mini are reportedly coming soon — with a 20% quality jump and pricing as low as ~$0.073/sec

ByteDance is moving ridiculously fast in AI video right now.

Rumors suggest:

Seedance 2.1 improves generation quality by ~20% over 2.0

Seedance 2.0 Mini outperforms 2.0 Fast despite being much cheaper

Mini pricing could land around $0.073/sec

If true, this could seriously shake up the AI video model market.
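To put the rumored Mini rate in perspective, here is a rough cost calculation. The $0.073/sec figure is the rumor quoted above, not a confirmed price, and the clip lengths are simply the 10s and 15s durations that show up in typical Seedance workflows.

```python
# Rumored Seedance 2.0 Mini rate from the post; unconfirmed.
RUMORED_MINI_USD_PER_SEC = 0.073


def clip_cost(seconds: float, usd_per_sec: float = RUMORED_MINI_USD_PER_SEC) -> float:
    """Return the cost of one clip in USD, rounded to three decimals."""
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    return round(seconds * usd_per_sec, 3)


if __name__ == "__main__":
    for seconds in (10, 15):
        print(f"{seconds}s clip: ${clip_cost(seconds):.3f}")
    # Forty 15s attempts, e.g. a long trial-and-error session
    print(f"40 x 15s: ${clip_cost(40 * 15):.2f}")
```

If the rumor holds, iteration cost is dominated by how many retries a prompt needs rather than by the length of any single clip.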

u/Individual_Hand213 — 1 day ago

▲ 2 r/GeminiOmniAI

Gemini Omni is Nano Banana but for videos

u/Individual_Hand213 — 2 days ago

▲ 2 r/GeminiOmniAI+1 crossposts

Gemini Omni updates! Characters and Scenes are now available there, too!

Users can create characters that can later be reused for video generation. Different character voices are also available.

u/Individual_Hand213 — 2 days ago

▲ 14 r/GeminiOmniAI+1 crossposts

Gemini Omni test 🔥

One of the best "Cyberpunk hacker robot" videos I've seen so far. It handled scene composition much better than the latest Veo model.

u/Individual_Hand213 — 2 days ago

▲ 65 r/GeminiOmniAI+7 crossposts

Google Gemini Omni AI has been announced at Google I/O

u/Individual_Hand213 — 2 days ago

▲ 61 r/Seedance_v2+3 crossposts

How to Create Viral Japanese Harajuku GRWM Videos with GPT Image 2 + Seedance 2.0 (Full Prompt Below)

Used Vadoo AI from https://vadoo.tv to combine GPT Image 2 scene generation with Seedance 2.0 video animation workflows.

The idea was to recreate the chaotic neon “Tokyo fashion creator” aesthetic you usually see on Japanese TikTok / IG Reels:

layered Harajuku outfits

glitter makeup + glossy lips

plushie-filled bedroom setup

VHS overlays & animated Japanese text

fast zoom transitions + mirror selfies

neon pink/cyan lighting

Shibuya street ending with crowded city energy

Prompt used:

"Stylized Japanese Harajuku street fashion “Get Ready With Me” vertical video featuring a trendy Japanese fashion creator preparing for a day out in Shibuya. Bright colorful bedroom filled with posters, plushies, neon signs, accessories, and stacked fashion magazines. She energetically talks in Japanese while applying glitter makeup, colored eyeliner, glossy lips, and styling layered Harajuku outfits. Include fast-paced cuts of oversized jackets, fishnet sleeves, platform sneakers, rings, dyed hair streaks, kawaii handbags, and mirror selfies. Dynamic camera angles, quick zoom transitions, spinning outfit reveals, flashing photo booth effects, VHS overlays, animated Japanese text graphics, energetic J-pop inspired pacing. Neon pink and cyan lighting mixed with daylight from the window. Scenes of her checking outfits in front of a full-length mirror, taking selfies, spraying perfume, grabbing headphones, then leaving the apartment into busy Tokyo streets. Highly detailed fashion textures, youthful trendy atmosphere, anime-inspired realism, social media reel aesthetic Japan"

A few things that surprisingly made a huge difference:

“anime-inspired realism” helped keep the characters stylized without looking too cartoonish

“social media reel aesthetic Japan” improved pacing + framing a lot

specifying fashion accessories individually gave much better outfit layering

adding “photo booth effects” and “VHS overlays” created more authentic Gen Z edit energy

neon pink/cyan lighting mixed with daylight gave a much more cinematic Tokyo vibe
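One way to check which of these phrases actually matters is a leave-one-out ablation: generate the same prompt with one phrase removed at a time and compare the results. The sketch below only builds the prompt variants. The base prompt is a shortened stand-in for the full one, and how each variant gets submitted depends on your tool.

```python
def leave_one_out(base_prompt: str, phrases: list[str]) -> dict[str, str]:
    """Map each phrase to a copy of the prompt with that phrase removed."""
    variants = {}
    for phrase in phrases:
        if phrase not in base_prompt:
            raise ValueError(f"phrase not found in prompt: {phrase!r}")
        without = base_prompt.replace(phrase, "")
        # Tidy up the separators left behind by the removal.
        parts = [p.strip() for p in without.split(",")]
        variants[phrase] = ", ".join(p for p in parts if p)
    return variants


# Shortened stand-in for the full prompt (illustrative only).
BASE = "GRWM vertical video, VHS overlays, anime-inspired realism, social media reel aesthetic Japan"
PHRASES = ["VHS overlays", "anime-inspired realism", "social media reel aesthetic Japan"]

if __name__ == "__main__":
    for removed, prompt in leave_one_out(BASE, PHRASES).items():
        print(f"without {removed!r}: {prompt}")
```

Run each variant with the same seed where the tool allows it, so that differences come from the phrase and not from sampling noise.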

The most impressive part was honestly the motion consistency during outfit transitions and mirror shots. The fashion textures also came out way more detailed than I expected for this type of aesthetic-heavy content.

Feels like this workflow is insanely good for:

GRWM reels

fashion creator content

anime-realism influencer edits

Tokyo/Shibuya aesthetic videos

idol/J-pop inspired short-form clips

Curious if anyone else here is experimenting with GPT Image 2 + Seedance for fashion/social media style generations.

u/Individual_Hand213 — 3 days ago

▲ 29 r/comfyui

The TikTok "color analysis" trend, but as a one-node ComfyUI workflow — drop in a single portrait, get back a 4K Dior-style editorial board with your best colors, undertone, makeup guide, hair, jewelry, and capsule wardrobe in one shot🎨👗💄✨

Workflow link: https://github.com/SamurAIGPT/muapi-comfyui/blob/main/workflows/MuAPI_Skill_ColorAnalysisBoard.json

If you've been on TikTok in the last year you've seen the Korean / Japanese **color analysis** trend — women flying to Seoul or paying NYC stylists $300–$500/hr to sit in a chair with draped fabric swatches while a consultant pronounces them a "Soft Autumn" or a "Deep Winter," then hands them a printed board of best colors, undertone, makeup palette, and capsule wardrobe.

I tried to fake the output with regular ComfyUI workflows for two days and got nowhere. Standard pipelines fumble it three ways: (a) `flux-dev` "color analysis board for this person" gives you a Pinterest moodboard of unrelated stock photos, (b) `nano-banana-edit` keeps the face but renders the "palette swatches" as blurred rectangles with hallucinated nonsense hex codes, (c) anything 1K or below makes the small magazine-style typography unreadable — the whole point of the board is the *legible labels* under each panel.

The fix is one specific edit model, one very specific aesthetic anchor, and 4K resolution.

**The Winning Workflow:**

**Step 1** — Single node: `MuAPIImageToImage` with model `gpt-image-2-image-to-image`. This is the only edit model I tested that holds the reference identity *and* renders dozens of small legible labels ("Your Best Colors," "Undertone: Cool," "Capsule Wardrobe," "Hair," "Jewelry") in the same image without text drift. Flux Kontext gets the face but garbles text. Nano-Banana gets text but loses the face. GPT-Image 2 does both.

**Step 2** — The load-bearing aesthetic anchor: prompt it as *"high-end editorial Color Analysis Board in a luxury fashion magazine style (Dior / Ralph Lauren aesthetic), clean beige/ivory background, minimal elegant typography, grid-based layout."* Without "Dior / Ralph Lauren" the model defaults to scrapbook-y Pinterest energy with mismatched fonts. Without "grid-based layout" you get a single hero panel instead of the 8-panel magazine spread. Those two phrases are the entire vibe.

**Step 3** — Output at `image_size: 3840x2160` (already wired in the workflow's `extra_params_json`). The board has 8+ small labeled panels — swatches, undertone strip, makeup grid, capsule wardrobe — and at 1024 res the labels under each swatch turn to mush. At 4K every fabric name and undertone label is readable, *and* the board doubles as a desktop wallpaper / Pinterest landscape pin without re-cropping.
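Steps 1 to 3 boil down to three settings: the model name, the aesthetic-anchor prompt, and the 4K size passed through `extra_params_json`. The sketch below collects them in a plain dictionary. The model name, prompt text, `image_size` value, and the `extra_params_json` name come from the post; the other keys (`model`, `prompt`) and the exact JSON shape are assumptions, so check the workflow file for the real wiring.

```python
import json

MODEL = "gpt-image-2-image-to-image"  # from Step 1
ANCHOR_PROMPT = (
    "high-end editorial Color Analysis Board in a luxury fashion magazine style "
    "(Dior / Ralph Lauren aesthetic), clean beige/ivory background, "
    "minimal elegant typography, grid-based layout."
)  # from Step 2


def build_board_settings(width: int = 3840, height: int = 2160) -> dict:
    """Collect the node settings; keys other than extra_params_json are hypothetical."""
    extra = {"image_size": f"{width}x{height}"}  # Step 3: 4K output
    return {
        "model": MODEL,
        "prompt": ANCHOR_PROMPT,
        "extra_params_json": json.dumps(extra),
    }


if __name__ == "__main__":
    settings = build_board_settings()
    print(settings["extra_params_json"])
    # Linear scale-up relative to a 1024-wide render
    print(3840 / 1024)
```

The 3.75x linear scale-up over a 1024-wide render is what gives the small labels enough pixels to stay readable.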

**The trick most people skip:** the input portrait matters more than the prompt. Bad lighting = bad palette read. The model literally reads your skin, hair, and eye color off the source image to pick swatches, so:

- front-facing, eyes open, natural light (not blue-hour, not sodium-lamp, not a TikTok filter)

- no sunglasses, no heavy makeup, no color-cast (the orange glow from a sunset will push you "warm autumn" even if you're a cool winter)

- hair visible, not in a cap

Give it a clean portrait and the board reads correctly — your actual undertone gets marked, the "best colors" panel skews to your real palette, and the makeup grid recommends shades that would actually look good on you. Give it a blue-tinted phone selfie and the model thinks you're an Icy Winter regardless of reality.

The crazy part: the board includes panels the model wasn't even explicitly asked for in the prompt — it adds "Colors to Avoid," "Prints that Flatter," "Style notes," sometimes a small Pantone-style color number under each swatch — because it's been trained on enough actual fashion magazine spreads to know what belongs there. The Dior/Ralph Lauren reference primes it for *all* the editorial conventions, not just the literal layout.

Side by side, the "consultant board" the AI ships in ~30 seconds reads more polished than the printed PDFs most $300 in-person consultants hand you. The fabric swatches are fabric, not flat rectangles. The makeup palette looks like actual makeup product photography. The capsule wardrobe outfits are styled, not stock.

Drop in one portrait, hit Queue Prompt, get a 4K board. Use it as: a personal style reference, a Pinterest landscape board, a desktop wallpaper, a gift to the friend who keeps asking "do I look better in warm or cool tones?"

Highly recommend the open-source ComfyUI workflow — it ships pre-wired with the gpt-image-2 model, the editorial prompt, and the 3840x2160 resolution baked into the node. Three nodes in total (LoadImage → MuAPIImageToImage → SaveImage), only one of which calls a model. One queue, one board.

Who else is doing personal-styling outputs in ComfyUI? Drop your best color analysis boards, capsule wardrobes, or "you in your colors" outfit grids below 👇

Let's see whose AI consultant out-styles the $300/hr human one the hardest 🎨👗💄✨

u/Individual_Hand213 — 3 days ago

▲ 23 r/Seedance_v2+3 crossposts

I cracked the AI-UGC-that-doesn't-look-like-AI trick — one selfie + one product photo + Seedance 2.0 VIP image-to-video = a 10s vertical "real creator talking to camera" ad with native synced dialogue 📱🛍️ 🎬✨

I am using https://muapi.ai, which has the most capable Seedance 2 tier with realistic-face support, along with the Claude skill from here: https://github.com/SamurAIGPT/Generative-Media-Skills/blob/main/library/motion/ugc-video-factory/SKILL.md

After about 50 failed runs, I finally cracked the "TikTok creator talking about a product" effect in pure AI — the one where it's actually *your* face, actually *your* product (logo legible, not gibberish), actually a synced voice saying the exact line you wrote, with the casual handheld energy that AI spokesperson clips never have.

Standard pipelines fumble this three ways: (a) text-to-video gives you a stock woman holding a hallucinated bottle labeled "BRNAD," (b) a one-shot image edit slams the product into a hand but the face drifts into a different person, or (c) static photo + bolted-on lipsync gives you a moving mouth on a dead-eyed face that screams AI spokesperson. Recognizable face + legible logo + synced voice is too many "don't look fake" constraints for one i2v call.

The fix is three layered stages the ugc-video-factory skill bakes in.

**The Winning Workflow:**

**Step 1** — GPT writes a *photography* brief, not a video brief. Temperature 0, with hard rules: wearable → person wears it; handheld → person holds it; logo must stay legible; face must not change; 9:16 lifestyle composition, soft daylight, shallow DoF. Casting + composition only — no video grammar yet.

**Step 2** — `nano-banana-pro-edit` fuses selfie + product at 1K, 9:16. **Person first in `image_urls`, product second** — order matters. It's the only edit model that holds the reference face *and* keeps small product text legible in the same pass.

**Step 3** — `seedance-2-vip-image-to-video` animates that frame for 10s with `generate_audio: true`, `cfg_scale: 0.5`. The line lives inside the prompt as a quoted block: *They say in a natural, conversational tone: "{{script}}"*. **VIP tier is non-negotiable** — it's the only Seedance 2.0 tier that accepts realistic human faces in the reference, so it's the only path where your actual selfie shows up in the final video.
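Stages 2 and 3 can be sketched as plain request payloads. The model names, `image_urls`, `generate_audio`, `cfg_scale`, and the quoted-dialogue template come from the steps above. Every other key (`model`, `prompt`, `aspect_ratio`, `resolution`, `duration`, `image_url`) and the example URLs are hypothetical placeholders, so consult the skill file for the real schema.

```python
def edit_payload(selfie_url: str, product_url: str, brief: str) -> dict:
    """Step 2: fuse selfie and product. Person first, product second."""
    return {
        "model": "nano-banana-pro-edit",
        "prompt": brief,
        "image_urls": [selfie_url, product_url],  # order matters
        "aspect_ratio": "9:16",  # hypothetical key name
        "resolution": "1K",      # hypothetical key name
    }


def video_payload(frame_url: str, script: str, motion_prompt: str = "") -> dict:
    """Step 3: animate the fused frame with native audio."""
    dialogue = f'They say in a natural, conversational tone: "{script}"'
    return {
        "model": "seedance-2-vip-image-to-video",
        "prompt": f"{motion_prompt} {dialogue}".strip(),
        "image_url": frame_url,  # hypothetical key name
        "duration": 10,          # hypothetical key name; 10s per the post
        "generate_audio": True,
        "cfg_scale": 0.5,
    }


if __name__ == "__main__":
    # Placeholder URLs for illustration only.
    print(edit_payload("https://example.com/selfie.jpg",
                       "https://example.com/product.jpg",
                       "9:16 lifestyle photo, person holds the product"))
    print(video_payload("https://example.com/fused.jpg", "I use this every morning."))
```

Keeping the reference order inside one function means the person-first rule cannot be broken by accident at the call site.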

**The load-bearing trick most people skip:** keep the script to 1–2 sentences, max ~25 words. Seedance generates audio across a fixed 10s window; cram a 4-sentence read in there and the model compresses — words clip, syllables drop, lipsync drifts. The skill's default sample is 26 words for exactly this reason. Need a 30s read? Generate three 10s clips and cut them — don't fight the duration.
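The length guardrail is easy to enforce before you spend a generation. The sketch below splits a longer read at sentence boundaries into chunks that each fit one 10s clip. The 25-word limit is the post's rough figure, and the regex sentence splitter is a simplification that will trip on abbreviations.

```python
import re

MAX_WORDS_PER_CLIP = 25  # the post's approximate ceiling for a 10s clip


def split_script(script: str, max_words: int = MAX_WORDS_PER_CLIP) -> list[str]:
    """Greedily pack whole sentences into chunks of at most max_words words."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", script.strip()) if s]
    chunks: list[str] = []
    current: list[str] = []
    count = 0
    for sentence in sentences:
        n = len(sentence.split())
        if n > max_words:
            raise ValueError(f"sentence exceeds {max_words} words: {sentence!r}")
        if current and count + n > max_words:
            # Current chunk is full; start a new clip.
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Each returned chunk becomes the `{{script}}` for one 10s generation. Note that the packing only limits words, so a chunk of several very short sentences can exceed the 1–2 sentence guideline.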

The crazy part: I expected to need a separate ElevenLabs + lipsync + foley pass. Nope. Seedance VIP generates voice, mouth shapes, head tilts, hand gestures, and room ambience together in one pass — synced because they were planned in the same latent. Mouth hits phonemes. Head tilts on stressed words. Hands move on emphasis beats. It just shows up correct.

Side by side it's not even close — text-to-video gives you a stock woman with a 2022 robotic voice. This pipeline reads as an actual creator with an actual product in an actual room. The logo is *readable*. The face is *your* face.

And it's not just hats — beauty serum on a bathroom counter, headphones at a coffee shop window, supplement bottle in a gym mirror, sunglasses on a boardwalk, candle on a couch next to a book. As long as the product is wearable or handheld and the environment is consistent with use, the pipeline ships a usable ad on the first or second seed.

Highly recommend the open-source UGC Video Factory skill — it ships with the GPT director-brief template, the Nano-Banana Pro reference-order spec, the Seedance VIP parameters, and the script-length guardrail baked in. Drop in a selfie, a product photo, and a one-liner. Ship it.

Who else is making UGC ads with one selfie + one product photo? Drop your best AI-creator clips, weirdest product fits, or proudest "I can't believe the logo is readable" wins below 👇

Let's see whose AI creator passes the "is this an ad or just a person?" test the hardest 📱🛍️✨

u/Individual_Hand213 — 4 days ago

▲ 247 r/Seedance_v2+5 crossposts

I cracked the time-freeze cinematic trick — one selfie + Seedance 2.0 reference-to-video = a 15s "snap → frozen world → snap" hero clip with native sound design ❄️ 🎬✨

After about 40 failed runs, I finally cracked the "Quicksilver / Zack Snyder time-stop" effect in pure AI — the one where the character snaps their fingers, the world freezes mid-explosion (beer droplets hanging in midair, popcorn floating, people locked mid-cheer), they stroll through the frozen scene, snap again, and reality slams back to life.

Standard image-to-video completely fumbles this. Either (a) the whole shot freezes including the protagonist so nothing happens, (b) you get this jittery half-motion glitch where the "frozen" extras are doing weird micro-twitches that scream AI, or (c) the model just ignores you and renders a normal bar scene with vibes. 15 seconds of "one person moves, 47 other people don't, but the scene still feels alive" is too many physics-violating instructions for a single vague i2v prompt to hold together.

The fix turned out to be three layered tricks that the freeze-effect-video skill bakes in by default.

The Winning Workflow:

Step 1 — bytedance-seedance-2-0-reference-to-video-fast takes ONE reference photo of the subject (the only person who'll actually move) as @Image1. That identity anchor is what survives the full 15s without face drift, and crucially it tells the model "everyone else in frame is not @Image1, therefore freeze them." The selfie does double duty as casting and as a hard masking signal.

Step 2 — Time-segmented director brief with FIVE explicit beats, hard timecoded:

- [0:00–0:03] Sports bar packed, blurred TVs showing a championship celebration, subject walks confidently through the chaos and snaps their fingers

- [0:03–0:06] A spherical shockwave bursts from the fingertips, air distortion + light refraction rippling outward, EVERYTHING freezes — golden arcs of beer suspended midair, popcorn floating, neon catching dust and liquid, absolute silence

- [0:06–0:09] Only @Image1 moves. Soft echoing footsteps. Camera tracks backward as they duck under a suspended arc of beer and pluck a single floating popcorn kernel from the air

- [0:09–0:11] They stop in front of a frozen fan locked mid-scream, mid-high-five, tilt their head, adjust the brim of their cap, whisper "perfect"

- [0:11–0:15] Snap again, reverse shockwave ripples outward, motion explodes back — beer splashes, cheers return, people land mid-jump, camera pushes through the celebrating crowd, fade to black
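The five-beat brief can be generated from structured data, which makes it easy to swap venues while keeping the timecodes gap-free. The sketch below formats beats as `[m:ss–m:ss]` lines and checks that they cover the full 15s with no gaps or overlaps. The beat descriptions are shortened from the list above, and the function names are illustrative.

```python
def timecode(seconds: int) -> str:
    """Format whole seconds as m:ss."""
    return f"{seconds // 60}:{seconds % 60:02d}"


def build_brief(beats: list[tuple[int, int, str]], total: int = 15) -> str:
    """Render (start, end, description) beats as timecoded lines.

    Raises ValueError if the beats do not tile 0..total exactly.
    """
    expected_start = 0
    lines = []
    for start, end, text in beats:
        if start != expected_start or end <= start:
            raise ValueError(f"gap or overlap at {start}-{end}s")
        lines.append(f"[{timecode(start)}–{timecode(end)}] {text}")
        expected_start = end
    if expected_start != total:
        raise ValueError(f"beats end at {expected_start}s, expected {total}s")
    return "\n".join(lines)


BEATS = [
    (0, 3, "Subject walks through the packed sports bar and snaps their fingers"),
    (3, 6, "Shockwave bursts outward, EVERYTHING freezes, absolute silence"),
    (6, 9, "Only @Image1 moves, ducking under a suspended arc of beer"),
    (9, 11, "They stop at a frozen fan, adjust their cap, whisper \"perfect\""),
    (11, 15, "Snap again, reverse shockwave, motion explodes back, fade to black"),
]

if __name__ == "__main__":
    print(build_brief(BEATS))
```

Changing the venue then means editing only the description strings; the validation catches a beat that was accidentally stretched or dropped.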

Step 3 — The load-bearing trick most people skip: an explicit Sound Design line at the bottom of the prompt — "deafening bar celebration → snap → deep shockwave bass drop → absolute silence → footsteps → sharp popcorn crunch → 'perfect' → snap → reverse shockwave → deafening celebration returns." Seedance 2.0 generates audio natively, and if you omit this, the model fills the silent freeze section with random ambient noise that completely murders the effect.
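Since the sound-design line mirrors the visual beats, it can be kept as an ordered list of cues and appended to the prompt last. A minimal sketch, with the cue list copied from the post and the `Sound Design:` prefix assumed as the label for that line:

```python
SOUND_CUES = [
    "deafening bar celebration", "snap", "deep shockwave bass drop",
    "absolute silence", "footsteps", "sharp popcorn crunch",
    "'perfect'", "snap", "reverse shockwave", "deafening celebration returns",
]


def with_sound_design(prompt: str, cues: list[str]) -> str:
    """Append a single Sound Design line to the end of the prompt."""
    if not cues:
        raise ValueError("at least one cue is required")
    arc = " → ".join(cues)
    return f"{prompt.rstrip()}\n\nSound Design: {arc}."


if __name__ == "__main__":
    print(with_sound_design("(visual brief goes here)", SOUND_CUES))
```

Raising on an empty cue list is deliberate: per the post, a missing sound line is exactly the case where the model fills the freeze with ambient noise.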

The crazy part: I expected to have to comp the bass-drop and the dead-air myself in DaVinci with a separate foley pass. Nope. Seedance writes the silence into the timeline at the exact frame the shockwave hits. The cheer cuts off mid-syllable. The popcorn crunch is on a clean track. The reverse-snap re-explodes the crowd noise. It just shows up correct.

Side by side it's not even close — generic "snap fingers time stops" i2v gives you something that looks like a video buffering bug by second 4. The freeze-effect skill version genuinely looks like a 15s hero shot pulled from a superhero teaser.

And it's not just bars. Swap the scene in the skill — frozen wedding reception with rice and confetti hanging in midair, freeze-walking through a nightclub at peak drop, freeze a stadium during the championship goal with foam suspended above the crowd, freeze a busy NYC crosswalk with cabs caught mid-honk, freeze a paintball arena with pellets hanging in midair. The five-beat snap → freeze → walk → snap → resume structure holds for any high-energy crowd scene where the contrast between chaos and absolute stillness carries the shot. I think this is currently one of the strongest pipelines for hero-character cinematic moments where you need a physics-violating effect to read as intentional instead of as an AI artifact.

Highly recommend the open-source Freeze Effect Video skill — it ships with the 5-beat director brief, the shockwave/reverse-shockwave symmetry, the "only @Image1 moves" identity lock, and the native sound-design arc baked in. Drop in any selfie, change the venue, ship it.

Who else is making time-stop or bullet-time style hero clips with this stack? Drop your best freeze moments, snap-and-stop scenes, or wildest "everyone but me is paused" experiments below 👇

Let's see who can freeze the wildest scene! ❄️ 🎬⏸️

u/Individual_Hand213 — 3 days ago

▲ 18 r/Seedance_v2+2 crossposts

I cracked the storyboard-first trick for AI cooking videos — GPT Image 2 builds a 9-panel reference sheet, Seedance 2.0 turns it into a 15s cinematic pasta tutorial 🍝🔥

I am using https://muapi.ai along with the Claude skill from here: https://github.com/SamurAIGPT/Generative-Media-Skills/blob/main/library/motion/storyboard-to-cooking-video/SKILL.md

After a stupid amount of failed runs, I finally got AI cooking tutorials to feel like a real Bon Appétit clip instead of a glitchy AI loop where the chef's face liquifies between cracking the egg and plating the dish.

Standard image-to-video gives you maybe one decent beat — pour the flour, look good, then the second the hands move to knead, the face drifts, the apron changes color, and suddenly the marble counter is a wooden table. 15 seconds of cooking choreography is just too many distinct actions for a single i2v prompt to hold together.

The fix turned out to be weirdly simple: stop asking the video model to invent the choreography, and hand it a pre-baked storyboard image as a second reference.

The Winning Workflow:

Step 1 — gpt-image-v2-edit builds ONE big 3840x2160 composite reference sheet from the selfie. Not a final frame — a production board with 9 numbered action panels across the top (flour well → crack egg → mix → knead → rest → roll → cut → lift → plate), a character sheet of the same person from 4 angles in the middle-left, and a kitchen location reference on the right. Basically the same thing a real cooking show art director would tape to the wall.

Step 2 — bytedance-seedance-2-0-reference-to-video-fast gets TWO image references in fixed order: the original selfie as @Image1 (identity anchor) and the reference sheet as @Image2 (choreography + environment anchor). Order is load-bearing — swap them and the model treats the storyboard as the person and renders a cubist nightmare.
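Because the order is load-bearing, it helps to build the reference list in one place and derive the `@ImageN` tags from position. The sketch below does that with named roles. The role names, file names, and dictionary keys are illustrative, not the skill's actual schema.

```python
def tag_references(ordered_refs: list[tuple[str, str]]) -> dict:
    """Assign @Image1, @Image2, ... by list position.

    ordered_refs is a list of (role, url) pairs; the first entry
    must be the identity anchor.
    """
    if not ordered_refs or ordered_refs[0][0] != "identity":
        raise ValueError("the identity reference (selfie) must come first")
    tags = {role: f"@Image{i}" for i, (role, _) in enumerate(ordered_refs, start=1)}
    urls = [url for _, url in ordered_refs]
    return {"image_urls": urls, "tags": tags}


refs = tag_references([
    ("identity", "selfie.jpg"),             # @Image1: face, hair, skin tone
    ("storyboard", "reference_sheet.png"),  # @Image2: choreography + environment
])

if __name__ == "__main__":
    print(refs)
```

A swapped list fails fast with a `ValueError` instead of costing a generation to find out.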

Strict-identity prompt block — explicit "preserve face, hair, eye color, skin tone with 100% accuracy throughout entire video" tied to @Image1. This is what kills the mid-knead face drift.

9-beat single-take timeline — exact 15s sequence: 0–2s flour well on marble, 2–4s cracking egg, 4–6s mixing with fork, 6–8s kneading, 8–9s resting dough, 9–11s rolling, 11–13s cutting noodles, 13–14s lifting strands from copper pot with tongs, 14–15s plating close-up.
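The nine beats are easier to retune if you store durations rather than absolute timestamps and derive the ranges from them. A sketch under that assumption, using the durations implied by the timeline above:

```python
from itertools import accumulate

# (duration in seconds, action) taken from the 9-beat timeline
STEPS = [
    (2, "flour well on marble"), (2, "cracking egg"), (2, "mixing with fork"),
    (2, "kneading"), (1, "resting dough"), (2, "rolling"),
    (2, "cutting noodles"), (1, "lifting strands from copper pot with tongs"),
    (1, "plating close-up"),
]


def timeline(steps: list[tuple[int, str]], total: int = 15) -> list[str]:
    """Turn durations into 'start–end' entries and verify the total length."""
    if not steps:
        raise ValueError("at least one step is required")
    ends = list(accumulate(d for d, _ in steps))
    if ends[-1] != total:
        raise ValueError(f"durations sum to {ends[-1]}s, expected {total}s")
    starts = [0] + ends[:-1]
    return [f"{s}–{e}s {action}" for s, e, (_, action) in zip(starts, ends, steps)]


if __name__ == "__main__":
    print(", ".join(timeline(STEPS)))
```

Lengthening one beat now raises an error until another beat is shortened, which keeps the sequence pinned to the 15s window.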

The crazy part: Seedance 2.0 generates the audio natively too — pouring flour, the wet slap of dough on marble, water boiling, a faint warm jazz underscore. No ffmpeg sound design pass, no separate TTS layer, no foley library. It just shows up correct.

Side by side it's not even close — single-image i2v gives you something that screams AI by second 4, the reference-sheet version genuinely looks like a 15s teaser someone cut from a longer cooking show.

And it's not just pasta. Swap the dish in the skill — sushi rolls, wood-fired pizza, matcha latte, cocktail mixing — the 9-panel reference sheet pattern holds for any sequential prep workflow. I think this is currently one of the strongest pipelines for any multi-step process video where character identity has to survive a lot of distinct actions: cooking, makeup tutorials, craft demos, mechanic walkthroughs, anything with a procedure.

Highly recommend the open-source Storyboard to Cooking Video skill — it ships with the full reference-sheet generator prompt, the dual-reference identity lock, and the 9-beat director brief baked in.

Who else is making cooking or tutorial-style videos with this stack? Drop your best chef clips, recipe reels, or weirdest cuisine experiments below 👇

Let's see some plates! 🍝🍣🍕

u/Individual_Hand213 — 7 days ago

▲ 13 r/Seedance_v2+3 crossposts

I just made a Grammy-level AI Award Ceremony Video with a host announcing the winner, spotlight reveal, and LED stage display all in 15 seconds using Seedance 2.0 🏆🔥

I am using https://muapi.ai along with the Claude skill from here

https://github.com/SamurAIGPT/Generative-Media-Skills/blob/main/library/motion/award-ceremony-video/SKILL.md

After a lot of testing, I finally cracked how to make AI ceremony videos actually feel like a real broadcast instead of a flat AI render with two strangers standing on a stage.

Normal Seedance 2.0 i2v with two people usually breaks identity halfway through: the winner morphs into someone else by the time they hit the podium, or the host changes outfits mid-shot.

The fix? Lock both faces with @image_1 / @image_2 strict-identity tags AND segment the 15 seconds into 5 hard broadcast beats with explicit camera grammar: close-up, spotlight cut, handheld follow, stage hand-off, wide hero.

The Winning Workflow:

Seedance 2.0 (reference-to-video-fast): feed it TWO reference images in fixed order, Winner first (@image_1), Host second (@image_2). Order is load-bearing: swap them and the wrong person walks up to the podium.

Strict-identity prompt block: explicit "no modifications to face or build" lines for both characters. This is what kills the mid-shot face drift.

5-beat broadcast timeline: 0–3s host announcement close-up → 3–6s spotlight snaps onto winner in the crowd → 6–9s handheld follows them up the aisle → 9–12s stage hand-off + LED reveal → 12–15s wide hero shot with standing ovation.

LED display callout: the prompt literally instructs Seedance to render the winner's name on the stage screen with "THE BEST ACTOR" beneath it. It actually holds the typography.
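To make the "order is load-bearing" point concrete, here is a rough Python sketch that derives the @image_N tags from the reference order and assembles the identity lines, the LED callout, and the five beats into one prompt string. The exact sentence wording inside the real skill is not shown in this post, so the phrasing below, the function names, and the placeholder winner name are all assumptions for illustration.

```python
# Reference order from the post: winner first, host second.
REFERENCE_ROLES = ["Winner", "Host"]

# Five broadcast beats as listed in the post: (start_s, end_s, description).
BROADCAST_BEATS = [
    (0, 3, "host announcement close-up"),
    (3, 6, "spotlight snaps onto winner in the crowd"),
    (6, 9, "handheld follows them up the aisle"),
    (9, 12, "stage hand-off + LED reveal"),
    (12, 15, "wide hero shot with standing ovation"),
]


def reference_tags(roles):
    """Map each role to its positional tag, e.g. 'Winner' -> '@image_1'."""
    return {role: f"@image_{i}" for i, role in enumerate(roles, start=1)}


def build_prompt(winner_name, award_title, roles=REFERENCE_ROLES, beats=BROADCAST_BEATS):
    """Assemble identity lines, LED callout, and timecoded beats (hypothetical wording)."""
    tags = reference_tags(roles)
    lines = []
    for role in roles:
        # One identity-lock line per character, tied to its reference tag.
        lines.append(f"{role} is {tags[role]}: no modifications to face or build.")
    lines.append(f'Stage LED screen shows "{winner_name}" with "{award_title}" beneath it.')
    for start, end, description in beats:
        lines.append(f"{start}-{end}s: {description}")
    return "\n".join(lines)


# "JANE DOE" is a placeholder name, not from the original post.
prompt = build_prompt("JANE DOE", "THE BEST ACTOR")
print(prompt)
```

Because the tags come from list position, reordering `REFERENCE_ROLES` changes which image each role points at, which is exactly the failure described above when the uploads and the prompt disagree.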

The crazy part: it also generates the audio natively (host voice through venue speakers, crowd murmur turning into thundering applause, footsteps on stage). No separate TTS or sound design pass needed.

The difference is massive: one version feels like two AI photos in front of a stage, the other feels like a real awards broadcast clip.

Highly recommend the open-source Award Ceremony Skill linked above: it ships with the full 15-second director brief, the strict-identity lock pattern, and the LED-display naming trick baked in.

This setup (Seedance 2.0 reference-to-video + identity-locked dual-character prompts + timecoded beat structure) is currently one of the strongest pipelines for any 2-character broadcast scene: awards, interviews, debates, talk shows.

Who else is making ceremony or broadcast-style videos with this stack?

Drop your best winners, hosts, or trailer clips below 👇

Let's see some standing ovations!

u/Individual_Hand213 — 7 days ago

▲ 5 r/Seedance_v2+3 crossposts

I just made a Hollywood-level AI Fight Scene with 16 dense cuts in 15 seconds using GPT Image 2 + Nano Banana 2 + Seedance 2.0 🔥

I am using https://muapi.ai along with the Claude skill from here https://github.com/SamurAIGPT/Generative-Media-Skills/blob/main/library/motion/ai-fight-scene/SKILL.md

After testing heavily, I finally cracked how to make AI fight scenes actually feel intense instead of slow and empty.

Normal Seedance 2.0 outputs usually give you only 3-4 lazy beats in 15 seconds.

The fix? Use GPT Image 2 to create a dense 4x4 (16-cell) storyboard first with camera moves, shot sizes, and rhythm notes — then feed it into Seedance 2.0.

The Winning Workflow:

GPT Image 2 — Generate character sheets + full 16-shot storyboard (with shot types, camera arrows, and pacing notes).

Nano Banana 2 — Create strong scene concepts and environments.

Seedance 2.0 — Turn the storyboard into a high-energy 15-second video with proper cut density and choreography.
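For a sense of how dense 16 cuts in 15 seconds is, here is a small Python sketch that lays the 4x4 grid out in reading order and gives each cell an equal time slice. Equal-length cuts are my simplifying assumption (a real storyboard's rhythm notes would vary them), and `storyboard_cells` is a hypothetical helper, not something from the skill.

```python
def storyboard_cells(rows=4, cols=4, total_seconds=15.0):
    """Return one dict per storyboard cell, in reading order, with an equal time slice."""
    count = rows * cols
    slice_len = total_seconds / count
    cells = []
    for index in range(count):
        cells.append({
            "shot": index + 1,            # 1-based shot number
            "row": index // cols + 1,     # 1-based grid row
            "col": index % cols + 1,      # 1-based grid column
            "start": round(index * slice_len, 4),
            "end": round((index + 1) * slice_len, 4),
        })
    return cells


cells = storyboard_cells()
print(len(cells))                                            # 16
print(cells[0]["end"])                                       # 0.9375
print(cells[-1]["row"], cells[-1]["col"], cells[-1]["end"])  # 4 4 15.0
```

So each cut averages just under one second, versus roughly 4 to 5 seconds per beat when a 15-second clip only has 3-4 beats.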

They even tested it on a crazy asymmetric character (Ranx with one black thigh-high sock, red holster, cyan knee piping, and weird cable details) and GPT Image 2 still held perfect consistency.

The difference is massive — one version feels like a basic demo, the other feels like a real trailer.

Highly recommend the open-source AI Fight Scene Skill linked above: it includes battle-tested prompt templates and structure for exactly this kind of dense action choreography.

This combo (GPT Image 2 + Nano Banana 2 + Seedance 2.0) is currently one of the strongest pipelines for action shorts and fight scenes.

Who else is making fight scenes or trailers with this stack?

Drop your best results, clips, or tips below 👇

Let’s see some chaos!

u/Individual_Hand213 — 9 days ago

▲ 48 r/Seedance_v2+2 crossposts

I finally figured out how to get insane results with GPT Image 2 + Seedance 2.0 🔥

I prefer using GPT Image 2 and Seedance 2 from https://vadoo.tv or https://muapi.ai, as they are the best budget-friendly platforms supporting AI models with full access

After testing non-stop for the past few days, I finally cracked the perfect workflow combining GPT Image 2 and Seedance 2.0.

The magic isn’t just using either tool alone — it’s using GPT Image 2 as the ultimate visual brain and feeding it straight into Seedance 2.0 for cinematic motion.

GPT Image 2 × Seedance 2.0 Workflow:

Generate your base assets with GPT Image 2 — character sheets, storyboard panels, style references, and key scenes. (It’s ridiculously good at consistency, text, and complex compositions.)

Upload 2–4 strong reference images from GPT Image 2 into Seedance 2.0 along with your motion prompt.

Generate 8–12 second cinematic clips with native audio and smooth camera moves.

Extend the clip using the previous video + the same GPT Image 2 references for near-perfect consistency.

Why this combo slaps so hard:

GPT Image 2 fixes the usual AI video problems (character drift, bad anatomy, weak composition)

Seedance 2.0 adds beautiful motion, physics, audio, and director-level cinematography

My Go-To Prompt Style for Seedance:

Cinematic continuation, ultra consistent character and style from reference images.

Photorealistic, dramatic lighting, smooth camera movement.

[Describe the action here]...

Maintain exact same character appearance, clothing details, art style, and lighting from the reference images.
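If you reuse this prompt style a lot, a tiny Python helper keeps the fixed lines identical across clips and only swaps the action. This is just a sketch: `build_motion_prompt` is a hypothetical name, and the example action text is made up for illustration.

```python
# Fixed lines copied from the prompt style above; only {action} changes per clip.
PROMPT_TEMPLATE = (
    "Cinematic continuation, ultra consistent character and style from reference images.\n"
    "Photorealistic, dramatic lighting, smooth camera movement.\n"
    "{action}\n"
    "Maintain exact same character appearance, clothing details, art style, "
    "and lighting from the reference images."
)


def build_motion_prompt(action):
    """Fill the action slot of the template; reject an empty action description."""
    action = action.strip()
    if not action:
        raise ValueError("action description must not be empty")
    return PROMPT_TEMPLATE.format(action=action)


# Example action text is invented for this demo.
print(build_motion_prompt("The character walks through a rainy neon street and looks up."))
```

Keeping the consistency lines byte-identical between the first clip and each extension is the design choice here, since the extension step relies on the same references and the same style instructions.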

I’ve been getting trailer-level quality in minutes. Characters actually stay on-model, lighting is consistent, and the motion looks way more natural than raw Seedance alone.

This combo is genuinely next-level for short films, ads, storytelling, and UGC content.

Who else is playing with GPT Image 2 + Seedance 2.0?

Drop your best results, prompts, or tips below 👇

u/Individual_Hand213 — 10 days ago

▲ 15 r/Seedance_v2+3 crossposts

Seedance 2 Chinese version supports NSFW content

I am using Seedance 2 from the https://muapi.ai/sd-2 VIP models, and it supports NSFW content, which is crazy since the global version of Seedance 2 doesn't support NSFW content

u/Individual_Hand213 — 12 days ago

▲ 2 r/SoraAi

Exact trick for near-perfect text in GPT Image 2 (99% readable now)

u/Individual_Hand213 — 13 days ago