u/Hot-Necessary-4945 — reddlx

I’ve been thinking about the role of the context window in LLMs and why it isn’t used more directly as a way to teach models new knowledge—essentially turning it into a form of memory.

In theory, if this were possible, users could “train” a model on the fly by feeding it knowledge through the context window, rather than relying only on its pretraining. This would allow highly customized models tailored to specific tasks (math, coding, niche domains, etc.). Instead of using massive general-purpose models (which are costly and require data center-scale resources), we could move toward smaller models that users customize with only the knowledge they need.

The problem is that the context window is inherently static, linear, and limited. So I started experimenting with ways to make it behave more like working memory.

Here’s what I built:

First, a RAG system—but not in the usual sense. I designed custom construction and retrieval algorithms inspired by how human memory works. I call this the “memory window.”
Second, a pipeline that converts datasets (e.g., from Hugging Face) into what I’d describe as artificial memories, which can then be injected into the model.
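
For a concrete picture of the two pieces above, here is a minimal Python sketch of the general idea: turn dataset records into stored "memories", recall the most relevant ones for a question, and place them in the context window. This is not the actual memory window algorithm, which isn't described in this post. The memory format, the bag-of-words cosine scoring, and every name in it (`to_memory`, `retrieve`, `build_prompt`, the toy dataset) are assumptions chosen only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and split into alphanumeric tokens.
    return re.findall(r"[a-z0-9]+", text.lower())


def to_memory(record):
    # Hypothetical "artificial memory" format: the text to inject into the
    # prompt plus a term-count vector of the question, used for recall.
    text = f"Problem: {record['question']}\nSolution: {record['answer']}"
    return {"text": text, "terms": Counter(tokenize(record["question"]))}


def cosine(a, b):
    # Cosine similarity between two term-count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(memories, query, k=2):
    # Return the k stored memories most similar to the query.
    q = Counter(tokenize(query))
    ranked = sorted(memories, key=lambda m: cosine(q, m["terms"]), reverse=True)
    return ranked[:k]


def build_prompt(memories, query, k=2):
    # Put the recalled memories ahead of the question in the context window.
    recalled = "\n\n".join(m["text"] for m in retrieve(memories, query, k))
    return f"Relevant memories:\n{recalled}\n\nQuestion: {query}\nAnswer:"


# Toy stand-in for a dataset loaded from elsewhere (assumed fields).
dataset = [
    {"question": "Solve 2x + 3 = 11 for x.", "answer": "x = 4"},
    {"question": "What is the area of a circle with radius 3?", "answer": "9*pi"},
    {"question": "Solve 5x - 10 = 0 for x.", "answer": "x = 2"},
]
memories = [to_memory(r) for r in dataset]
prompt = build_prompt(memories, "Solve 3x + 1 = 10 for x.", k=2)
print(prompt)
```

The resulting `prompt` string would then be passed to the model. A real system would likely replace the term-count scoring with embeddings or a more structured recall step, but the shape of the pipeline stays the same.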

Initial testing:

Model: Qwen3.5 2B
Dataset: 2,701 medium-difficulty math problems, converted into artificial memory format

Results:

Without the memory system: the model produced mostly incorrect or nonsensical answers
With the memory system enabled: it was able to answer correctly

This raised an important question: is it actually learning, or just memorizing?

To test this, I used Claude to generate new questions based on the same underlying mathematical concepts, rather than reusing questions from the dataset. The model was still able to answer them correctly, which suggests some level of generalization.
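
One way to make the "learning vs. memorizing" check more convincing is to verify that the new questions are not near-copies of anything already stored. The sketch below is an assumption about how that could be done, not something that was run for this post: it measures word trigram overlap (Jaccard similarity) between a new question and every dataset question. The function names and the example questions are hypothetical.

```python
import re


def word_ngrams(text, n=3):
    # Set of word n-grams in the text, case-insensitive.
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a, b):
    # Size of the intersection divided by size of the union.
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def max_overlap(new_question, dataset_questions, n=3):
    # Highest n-gram overlap between the new question and any stored question.
    new = word_ngrams(new_question, n)
    return max(
        (jaccard(new, word_ngrams(q, n)) for q in dataset_questions),
        default=0.0,
    )


dataset_questions = [
    "Solve 2x + 3 = 11 for x.",
    "What is the area of a circle with radius 3?",
]
copied = max_overlap("Solve 2x + 3 = 11 for x.", dataset_questions)
fresh = max_overlap(
    "A rectangle has sides 4 and 7. Find its perimeter.", dataset_questions
)
print(copied, fresh)
```

A high score flags a question that is essentially a stored memory being replayed, while a low score means the wording is new. Low textual overlap alone does not prove generalization, since a problem can be reworded with the same numbers, so this is only a first filter.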

Next steps:

This is still an early experiment. I plan to:

Test on larger datasets
Try different domains beyond math
Share results and (if possible) release the project for others to try

I’d really appreciate any feedback, criticism, or related ideas—especially if you’ve explored something similar.