u/InevitableSeason8749

Hey everyone,

I wanted to share an open-source project I’ve been building. I wanted a clean, lightweight way to handle local AI roleplay using Ollama and Phi-3, but I kept running into issues with character drift, streaming lag, and context management.

Instead of dealing with massive, bloated AI agent frameworks, I built a dedicated, lightweight Python framework specifically for this use case. It handles context-locking and LLM streaming natively so the character stays locked in without eating up unnecessary token RAM.

Features built-in:

Ollama Integration: Tailored directly for local models like Phi-3 (Other model are also supported but I use phi-3).
Native LLM Streaming: No long wait times for generations.
Context Preservation: Keeps the persona intact even over longer multi-turn chats.
MIT License: Purely open-source.

Because my Reddit account is brand new, the automod will delete my post if I include a direct GitHub hyperlink.

If you want to check out the code, look at the files, or help test it, the direct link is pinned on my Reddit profile bio, or you can search GitHub for: tegetgoofficial-bot/ai-roleplay-framework

I’m looking to see if the structure makes sense to other developers here. What do you think of handling local roleplay state this way?

Built a lightweight Python framework for local LLM roleplay (Ollama/Phi-3) to stop context drift. Looking for feedback.