
I built a full 3D Artemis II mission explorer with AI agents writing 100% of the code. The scope got much bigger than expected.
I wanted to share a real-world Codex / GPT-5.5 coding case study: an interactive 3D Artemis II Mission Explorer.
https://artemis.astrography.com/
The important part for this subreddit: 100% of the application code was written through AI-agent workflows under my direction. I handled the product vision, research direction, visual taste, iteration, QA, architecture decisions, and a lot of painful “this still feels wrong” feedback loops. But the code itself was produced by agents.
The main split was this:
Codex / GPT-5.5 handled the hard technical core. It was responsible for the physics-adjacent logic, trajectory handling, 3D viewport mechanics, camera systems, rendering logic, coordinate transforms, model loading, performance-oriented rendering decisions, fallback systems, and the more algorithmic parts of the project.
Claude Code was used more for the UI layer and for assembling the application into a usable product. It helped with interface structure, controls, layout, panels, state wiring, component organization, and integrating the pieces into a coherent Vite + Three.js app.
What started as a small experiment became a surprisingly large browser-based spatial application.
The app is a static Vite + Three.js project with no backend, database, auth, or environment variables. It includes an interactive Earth-Moon-Orion 3D scene, Artemis II free-return trajectory visualization, timeline scrubbing, MET/UTC readouts, playback speed from 1x to 10,000x, mission checkpoint cards, telemetry-style range/velocity/altitude readouts, DSN-style light-delay context, and multiple camera modes.
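For anyone curious what the timeline plumbing involves, here is a minimal sketch of a mission clock with MET/UTC conversion, playback-rate clamping, and a one-way light-delay estimate. It is not the app's actual code: every name here (`MissionClock`, `advanceClock`, `formatMet`, and so on) is hypothetical, and the launch epoch is a value the caller supplies, not the real liftoff time.

```typescript
// Hypothetical sketch: names and structure are assumptions, not the app's code.
const SPEED_OF_LIGHT_KM_S = 299_792.458;
const MIN_RATE = 1;
const MAX_RATE = 10_000;

interface MissionClock {
  launchUtcMs: number;  // liftoff as a UTC epoch in milliseconds (caller-supplied)
  metSeconds: number;   // mission elapsed time in seconds
  playbackRate: number; // simulated seconds per real second
}

// Keep the playback rate inside the 1x to 10,000x range.
function clampRate(rate: number): number {
  return Math.min(MAX_RATE, Math.max(MIN_RATE, rate));
}

// Advance MET by one frame of real time, clamped to the mission window.
function advanceClock(
  clock: MissionClock,
  realDeltaSeconds: number,
  missionDurationSeconds: number
): MissionClock {
  const next = clock.metSeconds + realDeltaSeconds * clampRate(clock.playbackRate);
  const metSeconds = Math.min(missionDurationSeconds, Math.max(0, next));
  return { ...clock, metSeconds };
}

// MET is an offset from the launch epoch, so UTC is a single addition.
function metToUtc(clock: MissionClock): Date {
  return new Date(clock.launchUtcMs + clock.metSeconds * 1000);
}

// Format MET as "T+Dd HH:MM:SS".
function formatMet(metSeconds: number): string {
  const total = Math.floor(metSeconds);
  const days = Math.floor(total / 86_400);
  const hours = Math.floor((total % 86_400) / 3_600);
  const minutes = Math.floor((total % 3_600) / 60);
  const seconds = total % 60;
  const pad = (n: number): string => (n < 10 ? "0" + n : String(n));
  return `T+${days}d ${pad(hours)}:${pad(minutes)}:${pad(seconds)}`;
}

// One-way signal delay for a given range, ignoring everything except distance.
function oneWayLightDelaySeconds(rangeKm: number): number {
  return rangeKm / SPEED_OF_LIGHT_KM_S;
}
```

At 10,000x, one real second advances MET by 10,000 seconds, which `formatMet` renders as `T+0d 02:46:40`. Keeping MET as the single source of truth and deriving UTC from it is one way to avoid the two readouts drifting apart while scrubbing.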
The technically demanding parts were mostly Codex-driven:
- building the Earth-Moon-Orion 3D scene
- implementing the trajectory layer and fallback curves
- constructing the mission timeline logic
- handling MET/UTC conversion and playback speed
- building camera systems for overview, Orion follow, Earth, Moon, capsule-in, and capsule-out modes
- implementing first-person capsule view controls
- handling camera roll, roll reset, animated aim targets, and smooth transitions
- wiring telemetry-style values into the simulation state
- building the 3D model loading pipeline for NASA Orion assets
- working through coordinate-space issues and scene scaling
- patching Three.js shaders with onBeforeCompile
- optimizing live clouds, night lights, texture LOD, and GPU-heavy branches
- implementing Earthshine, cloud shadows, eclipse/corona handling, Sun glow, lens flare, and occultation logic
- building the Developer HUD for FPS, camera position, rotation, quaternion, FOV, target, and copyable viewport state
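To make the `onBeforeCompile` item above concrete, here is a rough sketch of the general pattern: inject a uniform, then splice extra GLSL after one of Three.js's built-in shader chunks. This is an illustration under assumptions, not the project's shader code. The function name, the `uSunDirection` uniform, and the masking formula are all made up for the example, and `PatchableShader` is a minimal stand-in for the object Three.js passes to the callback, so the snippet runs without importing Three.js. It assumes a `MeshStandardMaterial`-style fragment shader, where `#include <emissivemap_fragment>` and `totalEmissiveRadiance` exist.

```typescript
// Minimal structural stand-in for the shader object passed to onBeforeCompile.
interface PatchableShader {
  uniforms: Record<string, { value: unknown }>;
  fragmentShader: string;
}

const EMISSIVE_CHUNK = "#include <emissivemap_fragment>";

// Hypothetical patch: fade the emissive (night lights) term out on the day side.
// Assumes sunDirectionViewSpace is expressed in view space, like vNormal.
function patchNightLights(
  shader: PatchableShader,
  sunDirectionViewSpace: [number, number, number]
): void {
  if (shader.fragmentShader.indexOf(EMISSIVE_CHUNK) < 0) {
    // Fail loudly: a silent no-op patch is hard to spot visually.
    throw new Error("emissivemap_fragment chunk not found; shader layout may have changed");
  }

  shader.uniforms["uSunDirection"] = { value: sunDirectionViewSpace };

  const patchedBody = shader.fragmentShader.replace(
    EMISSIVE_CHUNK,
    `${EMISSIVE_CHUNK}
  // 1.0 on the night side, 0.0 in full daylight, soft edge at the terminator
  float nightMask = 1.0 - smoothstep(-0.1, 0.2, dot(normalize(vNormal), normalize(uSunDirection)));
  totalEmissiveRadiance *= nightMask;`
  );

  // Declare the new uniform ahead of the original shader source.
  shader.fragmentShader = "uniform vec3 uSunDirection;\n" + patchedBody;
}
```

In a real scene this would be attached with something like `material.onBeforeCompile = (shader) => patchNightLights(shader, sunDir)`. The explicit chunk check matters because string-based patches depend on Three.js internals and can stop matching after a library upgrade.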
The rendering layer became its own rabbit hole. The app has a textured Earth with a day map, night lights, an atmosphere rim, Earthshine, live clouds, and cloud shadows. The Moon uses NASA texture data, LOLA terrain detail, Earthshine response, eclipse handling, and distance-aware LOD. Orion is assembled from NASA OBJ/MTL assets. The scene also includes a real star field from the Yale Bright Star Catalog, a Milky Way background, Sun glow, lens flare, eclipse corona, and occultation logic.
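As one small example of what a catalog-driven star field involves, this sketch converts right ascension and declination into a unit direction vector and turns apparent magnitude into a relative brightness. The type and function names are hypothetical, and the output is in the standard equatorial frame (Z toward the north celestial pole); how that maps onto the scene's axes is app-specific and not shown.

```typescript
// Hypothetical catalog record; field names are assumptions for this sketch.
interface CatalogStar {
  raHours: number;    // right ascension, 0 to 24 hours
  decDegrees: number; // declination, -90 to +90 degrees
  magnitude: number;  // apparent visual magnitude (lower is brighter)
}

const DEG_TO_RAD = Math.PI / 180;

// Unit vector in the equatorial frame: X toward RA 0h, Z toward the north pole.
function starDirection(star: CatalogStar): [number, number, number] {
  const ra = star.raHours * 15 * DEG_TO_RAD; // 1 hour of RA = 15 degrees
  const dec = star.decDegrees * DEG_TO_RAD;
  const cosDec = Math.cos(dec);
  return [cosDec * Math.cos(ra), cosDec * Math.sin(ra), Math.sin(dec)];
}

// Flux relative to a reference magnitude: 5 magnitudes is a factor of 100.
function relativeFlux(magnitude: number, referenceMagnitude: number): number {
  return Math.pow(10, -0.4 * (magnitude - referenceMagnitude));
}
```

Each direction can then be scaled to a large sky-sphere radius and the flux mapped to point size or opacity, with whatever clamping looks right on screen.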
There is also a gallery reconstruction system. Artemis II images are connected to the simulation, so selected photos can move the 3D scene to the approximate mission moment and apply a matching camera preset. Each gallery item keeps NASA source metadata, credit lines, image IDs, dates, and source links where available.
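A minimal sketch of how a gallery item could drive the scene, assuming each item stores a capture timestamp and a camera preset. The interfaces and `resolveGalleryJump` are hypothetical names for illustration, not the app's real data model, and the launch epoch is passed in by the caller.

```typescript
type SceneVec3 = [number, number, number];

interface CameraPreset {
  position: SceneVec3;
  target: SceneVec3;
  fovDegrees: number;
}

// Hypothetical gallery record; metadata fields may be missing for some images.
interface GalleryItem {
  imageId: string;
  credit: string;
  capturedUtc: string | null; // ISO 8601 timestamp, if known
  sourceUrl: string | null;
  preset: CameraPreset;
}

interface SceneJump {
  metSeconds: number;
  preset: CameraPreset;
}

// Map a photo to a timeline position plus camera preset.
// Returns null when the capture time is unknown or unparseable.
function resolveGalleryJump(
  item: GalleryItem,
  launchUtcMs: number,
  missionDurationSeconds: number
): SceneJump | null {
  if (item.capturedUtc === null) return null;
  const capturedMs = Date.parse(item.capturedUtc);
  if (isNaN(capturedMs)) return null;

  // Clamp so photos taken before liftoff or after splashdown still land on the timeline.
  const rawMet = (capturedMs - launchUtcMs) / 1000;
  const metSeconds = Math.min(missionDurationSeconds, Math.max(0, rawMet));
  return { metSeconds, preset: item.preset };
}
```

Returning `null` instead of guessing a time keeps photos with missing metadata viewable in the gallery without pretending the scene can reconstruct them.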
From a Codex perspective, the most interesting part was not “can it generate code?” The interesting part was whether it could sustain difficult work across a large, messy, visually sensitive 3D application where correctness depends on geometry, rendering, performance, UI feel, source handling, and many small interactions that all need to work together.
My experience: Codex / GPT-5.5 was strongest when I treated it like a serious technical partner rather than autocomplete. It was very good at turning a complex spatial problem into working implementation steps, especially when I gave it clear constraints, concrete expected behavior, screenshots, logs, and precise feedback about what felt physically or visually wrong.
It was also very useful for debugging “almost correct” behavior: camera motion that looked plausible but was wrong, coordinate transforms that were subtly off, scene scale problems, state synchronization issues, model orientation problems, and performance regressions hidden behind impressive visuals.
Generating code was not the hard part. The hardest parts were:
- keeping the spatial and visual logic coherent
- preventing “almost correct” 3D math from becoming invisible technical debt
- keeping performance reasonable in the browser
- making the UI feel like an exploratory mission interface rather than a demo
- getting agents to respect existing architecture instead of inventing new patterns
- turning vague visual judgment into actionable engineering instructions
- deciding when to use Codex for deep technical work and when to use Claude Code for UI/product assembly
- continuously reducing scope while the project kept wanting to grow
The biggest lesson for me: Codex makes ambitious solo projects possible, but it does not remove the need for architecture, taste, QA, and obsession. In fact, it makes those things more important, because the speed of implementation increases the cost of unclear thinking.
The workflow that worked best was: use Codex / GPT-5.5 for the heavy algorithmic and 3D engineering problems, then use Claude Code to help shape the UI, connect the layers, and make the project feel like a complete app rather than a pile of impressive systems.
I’m sharing this here because I’d love feedback from people building seriously with Codex and agentic coding tools.