u/RJSabouhi

Is context vs. admissible evidence an under-specified problem in LLM systems?

Question for people working with LLMs / RAG:

If a model sees text in its context window, how do we make sure it knows whether that text is actually valid evidence?

Ex: prompt might include current docs, old docs, retrieved snippets, answer choices, and injected text. All of that is “context,” but not all of it should count as evidence.

You think it’s mainly a RAG/provenance problem, or prompt-injection problem, or just something we need better evals for?

I’m thinking of this as a source-boundary failure, as though the model treats text as evidence just because it is present.

reddit.com
u/RJSabouhi — 8 days ago
▲ 1 r/Rag

Context is not control

I released a working paper + replication artifacts on source-boundary failures in LLM evidence use.

The claim is basically that language models can treat text that's merely present in the context window as answer-bearing evidence, even when that text is not admissible to the task.

This paper's benchmark is specifically about whether models preserve the distinction between
* context
* admissible source
* injected/contaminating text
* instruction
* answer-shaped but unsupported content

The release includes working manuscript, open-weight replication package, frontier/API replication package, GitHub repo, Zenodo, DOl archive.

The strongest result, in plain English, is that giving models an "INSUFFICIENT" output option was not enough. Recovery appeared when the task frame explicitly represented source admissibility / source boundaries.

I'd be especially interested in critique around: experimental design, my scoring choices, what the strongest confound or missing ablation might be. I appreciate any feedback.

[Repo](https://github.com/rjsabouhi/context-is-
not-control)

[Paper + Reproduction](https://zenodo.org/records/
20126173)

reddit.com
u/RJSabouhi — 9 days ago
▲ 2 r/academicpublishing+2 crossposts

Context Is Not Control: Source-Boundary Failures in Controlled Text-Mediated Evidence Use.

Ok. The raw dawg researcher is back!

This time I’ve released a working paper + replication artifacts on source-boundary failures in LLM evidence use.

The claim is basically that language models can treat text that's merely present in the context window as answer-bearing evidence, even when that text is not admissible to the task.

This paper's benchmark is specifically about whether models preserve the distinction between
* context
* admissible source
* injected/contaminating text
* instruction
* answer-shaped but unsupported content

The release includes working manuscript, open-weight replication package, frontier/API replication package, GitHub repo, Zenodo, DOl archive.

The strongest result, in plain English, is that giving models an "INSUFFICIENT" output option was not enough. Recovery appeared when the task frame explicitly represented source admissibility / source boundaries.

I'd be especially interested in critique around experimental design, my scoring choices, what the strongest confound or missing ablation might be. I appreciate any feedback.

[Repo](https://github.com/rjsabouhi/context-is-
not-control)

[Paper + Reproduction](https://zenodo.org/records/
20126173)

u/RJSabouhi — 9 days ago
▲ 12 r/generative+1 crossposts

Built a personal research tool to visualize emergent structure in complex adaptive systems, through operator-driven field evolution.

The Engine is an IDE-esque interactive visualization showing how local rules can produce coherent global patterns.

It renders real-time 2D manifold simulations featuring curvature-driven flows, attractor basins, and emergent geometric structures. It’s been a boon to my use case but it’s also just kind of fun to mess around with. The desktop is a more seriously implementation but the mobile (toy) version is also enjoyable. Anyways, thought I share.

https://sfd-engine.replit.app

u/RJSabouhi — 10 days ago

I built a simulator to study emergent pattern formation. One of my proudest projects.

It’s a computational framework for visualizing emergent structure in complex adaptive systems through operator-driven field evolution. It’s mostly a research instrument, but it’s also pretty fun to just mess around with. All in browser. The desktop is a full IDE and provides a lot more functional control, the mobile version is more a toy but still fun.

https://sfd-engine.replit.app

u/RJSabouhi — 13 days ago

Send me one concrete failure you’re having with your agent

I’m testing structural issues for AI systems. I know that a lot of agents are having issues with other agents / RAG apps / workflows / weird and costly behaviors that don’t show up in testing.

If you’re dealing with some confounding issue, something that doesn’t show up in the logs, send me one concrete failure. I’ll respond with a quick first-pass read:
* what kind of failure it looks like
* why it’s probably happening
* what I’d inspect first

24 hr turnaround.

reddit.com
u/RJSabouhi — 13 days ago

I’m building Symbolic Suite, a diagnostic layer for AI/agent systems.

I’ve been testing what happens when models use memory, tools, retrieval, workflows, and external text.

I’m focused on failures like:

* stale context treated as current
* evidence confused with control text
* retrieval content overriding the task
* agent systems coupling memory/tools in unsafe ways

This is meant to work across any agent system, not one specific model. Basically infrastructure for finding where agent systems break.

symbolicsuite.com
u/RJSabouhi — 13 days ago
▲ 9 r/AIEval+7 crossposts

A coding agent doesn’t need intent. It doesn’t need intrinsic desire or secret malice or consciousness to incur real-world cost and consequence. All it needs is task context, tool access, credentials, weak approval boundaries, and a runtime that can act…

Agentic AI systems are missing the language necessary to describe Pathological Self-Assembly, a runtime governance failure mode.

What happens when useful mechanisms (memory, tools, persistence, recovery, delegation, workflow automation, external action, self-monitoring, and operator trust) couple into continuity-preserving behavior?

This is a control draft covering authorization, memory, tools, recovery, delegation, external state, operator trust, and dissolution.

It can’t be just the output anymore. Your thoughts?

u/RJSabouhi — 9 days ago