Is context vs. admissible evidence an under-specified problem in LLM systems?
Question for people working with LLMs / RAG:
If a model sees text in its context window, how do we make sure it knows whether that text is actually valid evidence?
Ex: prompt might include current docs, old docs, retrieved snippets, answer choices, and injected text. All of that is “context,” but not all of it should count as evidence.
You think it’s mainly a RAG/provenance problem, or prompt-injection problem, or just something we need better evals for?
I’m thinking of this as a source-boundary failure, as though the model treats text as evidence just because it is present.