u/Crazy-Carob-6361

Building a Long-Term AI DM Exposed Serious LLM Architecture Problems

I'm working on what started as an AI Dungeon Master project for D&D 5e, but it has gradually turned into a much larger LLM architecture problem and I need advice from people who understand long-term agent systems better than I do.

What I'm trying to build is NOT:

  • a single giant prompt
  • a chatbot persona
  • an “Act as a DM” setup
  • a lightweight RPG assistant

What I'm trying to build is effectively a persistent AI-operated campaign runtime system.

Core goals:

  • long-term campaign continuity
  • stable world-state tracking
  • rules-as-written prioritization
  • modular architecture
  • procedural NPC generation
  • autonomous companions/players
  • persistent memory
  • scalable extensibility
  • external persistence and reconstruction

Current architecture direction:

  • governance layer
  • operational doctrine
  • dependency structure
  • reconstruction system
  • anti-drift systems
  • modular file governance
  • external persistence to Obsidian
  • layered retrieval hierarchy

One major realization: ChatGPT itself cannot reliably function as the memory layer once system complexity increases.

So now I’m attempting to externalize cognition into structured documents and retrieval systems.
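As a rough illustration of what "externalizing" state could look like in code, here is a minimal Python sketch. It assumes the campaign state fits in a small record saved as a JSON file; the `WorldState` fields, the file name, and the temporary folder standing in for a notes vault are all hypothetical, not part of any existing tool.

```python
import json
import tempfile
from dataclasses import dataclass, field, asdict
from pathlib import Path


@dataclass
class WorldState:
    """Hypothetical campaign state; every field name here is illustrative."""
    session: int = 0
    location: str = "unknown"
    party_hp: dict[str, int] = field(default_factory=dict)
    open_quests: list[str] = field(default_factory=list)


def save_state(state: WorldState, path: Path) -> None:
    # Write the whole state as indented JSON so a human can read and edit it.
    path.write_text(json.dumps(asdict(state), indent=2), encoding="utf-8")


def load_state(path: Path) -> WorldState:
    # Rebuild the state from disk alone; the chat history is never consulted.
    return WorldState(**json.loads(path.read_text(encoding="utf-8")))


if __name__ == "__main__":
    vault = Path(tempfile.mkdtemp())  # stand-in for a real notes folder
    state = WorldState(session=3, location="Phandalin", party_hp={"Mira": 21})
    state.open_quests.append("Find the lost mine")
    save_state(state, vault / "world_state.json")
    print(load_state(vault / "world_state.json") == state)
```

The point of the design is that the file, not the conversation, is the source of truth: a fresh chat can be handed the loaded state and pick up where the last one stopped.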

The rough architecture I’m exploring is:

LAYER 1 — “Book Smart” System

  • Core D&D 5e rules intelligence.
  • PDFs uploaded into ChatGPT Projects.
  • Project instructions designed specifically to communicate with those PDFs.
  • Sourcebooks/modules/campaigns treated as PRIMARY AUTHORITY.
  • AI must prioritize RAW before any inference or improvisation.
  • AI should retrieve rules instead of hallucinating or relying on latent memory.

The goal is: The uploaded sourcebooks become the backbone cognition layer.
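A "retrieve before answering" layer can be sketched in a few lines. This is only an illustration under stated assumptions: the sourcebook text has already been split into passages, plain word overlap stands in for real retrieval (embeddings or a search index would normally replace it), and `Passage`, `retrieve`, and `build_rules_prompt` are hypothetical names.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    source: str  # label supplied by whoever prepared the text, e.g. a book and section
    text: str


def tokenize(text: str) -> set[str]:
    # Lowercase the text and keep alphabetic words only.
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(question: str, passages: list[Passage], min_overlap: int = 2) -> list[Passage]:
    """Return passages sharing at least min_overlap words with the question, best first."""
    q = tokenize(question)
    scored = [(len(q & tokenize(p.text)), p) for p in passages]
    hits = [(score, p) for score, p in scored if score >= min_overlap]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in hits]


def build_rules_prompt(question: str, passages: list[Passage]) -> str:
    found = retrieve(question, passages)
    if not found:
        # Surface the gap instead of letting the model guess from latent memory.
        return f"NO RULE FOUND for: {question}. Ask the table before ruling."
    quoted = "\n".join(f"[{p.source}] {p.text}" for p in found)
    return f"Answer using ONLY these passages and cite them:\n{quoted}\n\nQuestion: {question}"


if __name__ == "__main__":
    demo = [
        Passage("placeholder-A", "Placeholder passage about grappling and speed"),
        Passage("placeholder-B", "Placeholder passage about long rests and hit dice"),
    ]
    print(build_rules_prompt("How does grappling affect speed?", demo))
```

The explicit "no rule found" branch is the part that matters for RAW priority: the model is only asked to answer when there is quoted text to answer from.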

LAYER 2 — “Table Smart” System

  • Community-derived 5e operational knowledge from 2014–2024 ONLY.
  • No 5.5e content.
  • Table heuristics.
  • Encounter balancing realities.
  • DM wisdom.
  • Emergent gameplay patterns.
  • Unofficial but battle-tested practices.

Basically: “what experienced tables actually discovered after a decade of play.”

LAYER 3 — Persona Runtime System

  • DM personalities.
  • player personalities.
  • autonomous companions.
  • behavioral sliders.
  • dynamic personality synthesis instead of static presets.
  • companions function like independent players rather than puppets.
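The behavioral sliders above could be stored as plain numbers and turned into prompt text when a character acts. The following Python sketch assumes sliders range from 0.0 to 1.0; the two slider names, the thresholds, and the `Persona` class are invented for illustration.

```python
from dataclasses import dataclass


def describe(value: float, low: str, high: str) -> str:
    """Map a 0.0-1.0 slider to a plain-language trait description."""
    if not 0.0 <= value <= 1.0:
        raise ValueError("slider values must be between 0.0 and 1.0")
    if value < 0.34:
        return f"strongly {low}"
    if value > 0.66:
        return f"strongly {high}"
    return f"balanced between {low} and {high}"


@dataclass
class Persona:
    name: str
    caution: float        # 0 = reckless, 1 = cautious
    talkativeness: float  # 0 = terse, 1 = chatty

    def to_prompt(self) -> str:
        # Render the stored numbers as instructions for a single model call.
        return (
            f"{self.name} is {describe(self.caution, 'reckless', 'cautious')} "
            f"and {describe(self.talkativeness, 'terse', 'chatty')}. "
            "Decide this character's actions independently of the player."
        )


if __name__ == "__main__":
    print(Persona("Bram", caution=0.9, talkativeness=0.1).to_prompt())
```

Keeping the numbers in a file and regenerating the text each turn means a personality cannot drift just because an earlier description scrolled out of context.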

LAYER 4 — Creativity Engine

  • Attempts to compensate for creative flattening and safety homogenization in ChatGPT.
  • Should allow tonal flexibility, experimental campaign structures, emergent storytelling styles, unconventional worldbuilding, etc.
  • Goal is preventing the model from collapsing into generic assistant outputs.

The major issues I keep hitting:

  • memory drift
  • instruction degradation
  • retrieval instability
  • continuity collapse
  • context poisoning
  • overlapping systems
  • document retrieval failure
  • abstraction creep
  • the model reverting back to “generic helpful assistant”
  • giant prompts becoming unstable

At this point I’m trying to figure out:

  • Is ChatGPT fundamentally the wrong tool for this?
  • Is this actually an agent/orchestration problem?
  • Would local models + RAG + vector DBs make more sense?
  • Is there a standard architecture pattern for persistent simulation systems?
  • Am I accidentally rebuilding existing tooling badly?
  • At what point does this require actual software engineering rather than advanced prompting?

I’m a non-programmer currently, but I’m willing to learn if necessary.

What I’m looking for:

  • architectural guidance
  • framework recommendations
  • retrieval/memory advice
  • orchestration patterns
  • persistence approaches
  • anti-drift strategies
  • long-context management
  • agent system design advice

The D&D side is almost secondary now. The project became a stress test for long-term LLM continuity and modular cognition systems.

reddit.com
u/Crazy-Carob-6361 — 7 days ago

I Think I Found the Limits of Prompt Engineering

I started building a large-scale AI Dungeon Master system for D&D 5e and I think I’ve gradually discovered where prompt engineering starts breaking down entirely.

At first I assumed: “better prompts = better system.”

Now I’m no longer convinced.

The more complex the system became, the more I encountered:

  • memory drift
  • instruction degradation
  • continuity collapse
  • retrieval inconsistency
  • overlapping instructions
  • abstraction creep
  • the AI reverting to generic assistant behavior
  • unstable giant prompts

So the architecture slowly evolved into:

  • modular documents
  • governance systems
  • external persistence
  • reconstruction systems
  • retrieval hierarchy
  • operational doctrine
  • anti-drift structures

What I want:

  • uploaded PDFs to act as authoritative cognition sources
  • project instructions that explicitly coordinate with those PDFs
  • sourcebooks/modules/campaigns treated as RAW authority
  • persistent continuity
  • autonomous NPCs/companions
  • dynamic personality systems
  • long-term stable campaigns

The deeper I go, the more it feels like: prompt engineering alone cannot reliably support persistent modular cognition systems.

At this point I’m trying to figure out whether:

  • advanced prompting is still the correct path
  • this should become a true agent system
  • memory/state must exist externally
  • orchestration frameworks are required
  • ChatGPT Projects are insufficient for this scale

I’m curious whether others hit this same wall when trying to build larger persistent systems.


At What Point Does a Prompt-Based System Become an Actual Agent Architecture?

I’ve been building what started as a ChatGPT-based AI Dungeon Master system, but I’ve reached the point where it no longer feels like a prompting problem and instead feels like an orchestration/state-management problem.

I’m trying to understand whether what I’m building maps onto existing agent architecture patterns or whether I’m approaching this incorrectly.

The system requirements:

  • persistent long-term continuity
  • modular subsystems
  • rules retrieval
  • structured source authority hierarchy
  • world-state persistence
  • autonomous NPC/player behaviors
  • external memory persistence
  • reconstruction after context degradation
  • anti-drift mechanisms

The architecture I’m moving toward currently includes:

  • governance layer
  • file governance
  • dependency declarations
  • reconstruction systems
  • layered cognition
  • operational doctrine
  • external persistence to Obsidian
  • modular runtime structures

The retrieval hierarchy is especially important.

I want uploaded D&D 5e sourcebooks/modules/campaigns to act as PRIMARY AUTHORITY. Meaning: the AI should consult uploaded material before relying on latent model knowledge or improvisation.
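One way to picture a source authority hierarchy is as an ordered list of lookups that is walked from most to least authoritative. The Python sketch below assumes each tier can be wrapped in a function that returns an answer or `None`; the tier names, the `resolve` function, and the toy exact-match lookup are hypothetical.

```python
from typing import Callable, Optional

# A lookup takes a question and returns an answer, or None if it has nothing.
Lookup = Callable[[str], Optional[str]]


def resolve(question: str, tiers: list[tuple[str, Lookup]]) -> tuple[str, str]:
    """Try each tier in priority order; return (tier_name, answer) from the first hit."""
    for name, lookup in tiers:
        answer = lookup(question)
        if answer is not None:
            return name, answer
    return "unresolved", "No source covers this; flag it for a human ruling."


def table_lookup(data: dict[str, str]) -> Lookup:
    # Toy lookup: exact match on a lowercase key. A real system would retrieve passages.
    return lambda q: data.get(q.lower())


if __name__ == "__main__":
    tiers = [
        ("sourcebook", table_lookup({"topic a": "placeholder official text"})),
        ("table-smart", table_lookup({"topic a": "placeholder community advice",
                                      "topic b": "placeholder community advice"})),
    ]
    print(resolve("Topic A", tiers))
    print(resolve("Topic B", tiers))
    print(resolve("Topic C", tiers))
```

Because the tier name travels with the answer, the DM output can say where a ruling came from, and model improvisation only happens after every listed source has come up empty.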

I also want:

  • a second “table-smart” layer containing community-derived operational knowledge from 2014–2024
  • personality systems
  • autonomous companions
  • dynamic DM behavioral synthesis
  • creativity systems to avoid generic assistant flattening

Major failure points:

  • memory drift
  • retrieval instability
  • context poisoning
  • instruction degradation
  • assistant reversion
  • continuity collapse
  • prompt instability at scale

The key realization: ChatGPT Projects + PDFs seem useful as a retrieval layer, but not sufficient as a true long-term architecture.

So I’m trying to understand:

  • Should this become a proper RAG pipeline?
  • Should orchestration frameworks like LangGraph be involved?
  • Should state management exist entirely outside the model?
  • Is there a clean architecture pattern for persistent simulation systems?
  • How would you structure source authority hierarchies?
  • Is agent decomposition necessary here?

I’m a non-programmer currently but trying to understand the correct architectural direction before I continue scaling the system incorrectly.


AI Dungeon Master Matrix

I've been working on a project for a while now and honestly I'm starting to hit a wall and I really need advice from people who understand AI systems better than I do.

So what I'm trying to build is a full-fledged AI Dungeon Master system for D&D 5e: not just some "DM prompt" or chatbot character, but something that can actually run stably over the long term.

Here's what I want it to do:

  • Operate a campaign for an extended period.
  • Understand and correctly use sourcebooks.
  • Maintain consistency session to session.
  • Keep track of the state of the world.
  • Procedurally generate characters.
  • Have intelligent, believable NPCs/companions.
  • Actually use and retrieve rules rather than rely on nebulous memory.
  • Have modular components that I can continue adding to without the whole system collapsing.

I quickly figured out that the typical "just make a better prompt" approach wasn't cutting it once things got to a certain size.

Here are the issues I keep hitting:

  • Memory drift.
  • Instructions degrading over long conversations.
  • The AI slowly starting to abstract or simplify the system in question.
  • Uploaded PDFs not being reliably used.
  • Huge prompts becoming unstable.
  • Systems overlapping each other.
  • Continuity breaking.
  • The AI defaulting back to generic assistant behavior instead of DM.
  • Retrieval failing once too many documents were uploaded.

As a result, I've gone far deeper into the architecture than I intended.

What I'm basically building is:

  • Governance layer.
  • Operating doctrine.
  • Reconstruction system.
  • File governance.
  • State tracker.
  • Dependency structure.
  • Modular documentation.
  • Anti-drift system.
  • External persistence to Obsidian.

The AI itself almost acts as the processor of a highly structured external system rather than the memory itself.
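The "model as processor, not memory" idea can be shown as a turn loop in which the prompt is rebuilt from stored state every time. This Python sketch makes several assumptions: state is a flat dictionary, the model returns a dictionary with `narration` and `updates` keys, and `fake_model` is a stand-in so the example runs without any real LLM API.

```python
from typing import Callable


def build_context(state: dict, recent_events: list[str], keep_last: int = 5) -> str:
    """Rebuild the full prompt context from stored state on every turn."""
    lines = [f"{key}: {value}" for key, value in sorted(state.items())]
    events = recent_events[-keep_last:]
    return "CURRENT STATE\n" + "\n".join(lines) + "\nRECENT EVENTS\n" + "\n".join(events)


def run_turn(state: dict, events: list[str], player_input: str,
             model: Callable[[str], dict]) -> str:
    """One turn: the model sees only rebuilt context and returns narration plus changes."""
    prompt = build_context(state, events) + f"\nPLAYER: {player_input}"
    reply = model(prompt)                   # assumed shape: {"narration": str, "updates": dict}
    state.update(reply.get("updates", {}))  # code, not the model, owns the state
    events.append(f"{player_input} -> {reply['narration']}")
    return reply["narration"]


def fake_model(prompt: str) -> dict:
    # Stand-in for a real LLM call so the sketch runs offline.
    return {"narration": "The door creaks open.", "updates": {"location": "cellar"}}


if __name__ == "__main__":
    state = {"location": "tavern", "gold": 12}
    events: list[str] = []
    print(run_turn(state, events, "I open the door", fake_model))
    print(build_context(state, events))
```

Since nothing depends on what the model remembers from earlier turns, a degraded or brand-new conversation can be reconstructed from the saved state and the event log alone.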

I know this is going to sound incredibly over-engineered but the few times I've tried to simplify it, it has inevitably collapsed later on.

I'm a non-programmer, but I'm perfectly happy to learn about things that I need direction on. However, I'm extremely concerned I'm trying to brute force something that should have a proper solution.

So, has anyone here tried building something similar? Am I massively overengineering this? Is ChatGPT the right tool for this job in the long run? Would a different model, system, or framework be better? Is there a standard architecture for something like this? Would this eventually require actual coding and custom tooling? Is there a cleaner way to handle the memory, retrieval, and long-term continuity that I'm missing?

I'm hoping to connect with people who understand LLM architecture, agent systems, prompt engineering, RAG, memory, etc., because I feel like I'm somewhere between "this is an interesting idea" and "I'm completely re-implementing software incorrectly."

I've tried the traditional "Act as a Dungeon Master" prompts that you can find on the internet. A lot of those are not bad, but I feel like they don't encapsulate what I am really trying to build here. It sometimes feels as if ChatGPT (that's the model I use) will say, "Oh wow, this is a great idea!" and then secretly kinda give me the middle finger. At least, that's how it feels on my end. I'm looking for help with this project.
