Taking a stab at writing up the basics of LLM / GPT / Claude / Gemini to see if we can bridge the knowledge gap.

Concepts covered: LLM, tokens, context window

What is an LLM?

Actually, before answering that...let me explain this first. ChatGPT, Claude, and Gemini are all the same kind of thing: an app built on top of an LLM. The LLM is the model doing the actual work, and the model has a name like GPT-5.5, Claude Opus 4.7, or Gemini 3. So if someone asks "which model are you using?", the answer is "Opus 4.7," not "Claude." Claude is the app; Opus 4.7 is the LLM inside it. I'll mostly just say "the model" below.

So, what is an LLM? It’s an autocomplete trained on a bonkers amount of text. It doesn’t know things. It predicts the next chunk of text given what came before. Everything downstream (chat, agents, RAG, reasoning, your company’s shiny AI strategy deck) is built on top of that one move. You don't need to know what those all mean (yet).

The model is not answering your question. It is continuing your text. It’s just that the most likely continuation of a question is usually an answer. That is the whole trick.

But why does it feel like it knows things?

AI is just really good at pattern matching.

Case in point...finish this sentence: “I am not afraid of storms, for I am learning how to sail my ______.” You probably said ship. Not necessarily because you’ve read Little Women, but the pattern is overwhelming. Maybe you said boat. Fine. What matters is that you definitely didn’t say biscuits.

Now imagine you’ve read everything. Every book, every reddit post, every tweet, every recipe blog with paragraphs about the blogger’s grandma you don’t really care to know about. That’s what the model is trained on. It has a really, really, really ridiculously good sense of which words live near which other words in which contexts. Like ships and boats but not biscuits.
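If you want to see the "which words live near which other words" idea in miniature, here is a toy sketch in Python. It is not how a real LLM works inside (those are neural networks trained on vastly more text); it just counts which word followed each pair of words in a tiny made-up corpus and picks the most common one. The corpus, `train`, and `predict_next` are all invented for illustration.

```python
from collections import Counter, defaultdict

# Tiny made-up stand-in for "everything ever written".
corpus = (
    "i am learning how to sail my ship . "
    "she learned to sail my ship . "
    "he wants to sail my boat . "
    "they ate my biscuits ."
)

def train(text, context_size=2):
    """Count which word followed each run of `context_size` words."""
    words = text.split()
    follows = defaultdict(Counter)
    for i in range(len(words) - context_size):
        context = tuple(words[i:i + context_size])
        follows[context][words[i + context_size]] += 1
    return follows

def predict_next(follows, prompt, context_size=2):
    """Return the word seen most often after the prompt's last words, or None."""
    context = tuple(prompt.split()[-context_size:])
    if context not in follows:
        return None
    return follows[context].most_common(1)[0][0]

model = train(corpus)
print(predict_next(model, "i am learning how to sail my"))  # ship
print(dict(model[("sail", "my")]))  # {'ship': 2, 'boat': 1}
```

After "sail my", the counts say ship, sometimes boat, never biscuits. Same move, just a few billion times smaller.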

This is also why it sounds sure of itself when it’s wrong. It’s optimizing for plausibility, not truth. A made-up research paper looks exactly like a real one, sentence by sentence. The model can’t tell the difference. Neither can you, until you check.

I've been using GPT / Gemini / Claude to draft emails or have a chat. Some days, it's reading my mind. Other days it's an idiot.

So there are a few things happening under the hood that might be causing this. I'll go over three so that you can actually start to steer it.

1. Getting something saltless, generic, beige?

What's happening? Remember how it doesn't know things? Yeah, the LLM isn't answering you. It's continuing your text (again, fancy autocomplete). So it looks at what you wrote and predicts what plausibly comes next. Give it a short, generic prompt and the most plausible continuation is a generic answer.
To fix: don't just ask; tee up the situation. Instead of instructing it to "write a follow up email" or "say no to attending the family reunion next weekend", paste the original thread, say who you're responding to, and give it more context. The more real material it has to continue from, the less it falls back on tepid, average responses.

2. "How did you already forget what I told you? Aren't you listening?"

What's happening? The model can only "see" a limited amount of text at once. It has memory, but that memory has a hard cliff. As a conversation grows, the oldest stuff just falls off and poofs away.
Two concepts to go over that are relevant here.
(1) Tokens: the LLM doesn't see words. First it chops text into chunks called tokens (sometimes a whole word, sometimes a snippet, sometimes a single character). Then it converts those tokens into numbers (each token gets an ID, and the model turns that ID into a list of numbers called an embedding). Under the hood it's all math, not language. When we say it "predicts the next chunk," it's really doing that prediction on the numbers. This matters here because the context window is measured in tokens.
(2) Context window: how much the model can hold at once, measured in tokens. Memory with a giant ass cliff. Bigger window = more expensive to run.
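To make the token idea concrete, here is a toy tokenizer in Python. Big caveat: the vocabulary below is made up and tiny, and the matching rule (always grab the longest chunk that's in the vocabulary) is a simplification. Real tokenizers learn their vocabulary from data and have tens of thousands of entries. Treat `VOCAB` and `tokenize` as illustration only.

```python
# Made-up vocabulary mapping a chunk of text to an ID number.
VOCAB = {"sail": 0, "ing": 1, "ship": 2, "s": 3, " ": 4, "my": 5}

def tokenize(text):
    """Split text into chunks, always taking the longest chunk found in VOCAB."""
    tokens = []
    i = 0
    while i < len(text):
        for length in range(len(text) - i, 0, -1):
            chunk = text[i:i + length]
            if chunk in VOCAB:
                tokens.append(chunk)
                i += length
                break
        else:
            # This toy vocabulary cannot represent every character.
            raise ValueError(f"no token for {text[i]!r}")
    return tokens

tokens = tokenize("sailing my ships")
ids = [VOCAB[t] for t in tokens]
print(tokens)       # ['sail', 'ing', ' ', 'my', ' ', 'ship', 's']
print(ids)          # [0, 1, 4, 5, 4, 2, 3]
print(len(tokens))  # 7
```

Three words turned into seven tokens. That gap is why "how long is my chat?" is better answered in tokens than in words.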
To fix: treat long chats as disposable. If something matters, restate it as the thread continues. When the quality drops deep into a long convo, that's probs because your early context aged out. You can ask it to summarize the conversation so far, then start a fresh chat and paste that summary in. Also put the most important instruction at the top or bottom because...
...Similar to us, stuff at the beginning and the end is more memorable than the middle bits for the model, too.
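Here is a rough Python sketch of the "oldest stuff falls off" behavior. Assumptions up front: `count_tokens` just counts words as a stand-in for a real tokenizer, `fit_to_window` is a hypothetical helper, and actual chat apps may handle overflow differently (some summarize instead of dropping). It's only meant to show the cliff.

```python
def count_tokens(text):
    """Crude stand-in: one token per whitespace-separated word."""
    return len(text.split())

def fit_to_window(messages, max_tokens):
    """Keep the newest messages whose combined token count fits the window."""
    kept = []
    used = 0
    for message in reversed(messages):  # walk from newest to oldest
        cost = count_tokens(message)
        if used + cost > max_tokens:
            break  # everything older than this falls off the cliff
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept

chat = [
    "My name is Sam and I am allergic to peanuts.",
    "Suggest a snack for a road trip.",
    "Make it something I can buy at a gas station.",
    "Actually, make it two snacks.",
]

visible = fit_to_window(chat, max_tokens=25)
print(len(visible))  # 3
print(visible[0])    # Suggest a snack for a road trip.
```

The peanut allergy was the first thing said and the first thing dropped. That's the reason to restate what matters.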

3. Flat out wrong

What's happening? Again, it's the fancy autocomplete doing autocomplete thangs. It's optimizing for what sounds plausible, not what's true. A made-up stat and a real one look pretty similar word by word...the model can't tell them apart.
To fix: use it as a writer, not a primary source. It's good at phrasing, structuring, summarizing, drafting, thinking out loud. If a fact matters, verify it yourself. Not because it's usually wrong, but because when it is wrong, it'll give you zero signal.

Does this help? + Where this can go next

So I wanted to cover the absolute basics by starting with a common complaint from people new to this (chats feeling inconsistent) and using it to explain what an LLM actually is, plus the ideas of a token and a context window.

If this is helpful, I can continue in a similar format and cover progressively more in-depth topics. As an example...

Next: how to stop re-typing the same context every time you ask something (concepts covered: prompt, prompt engineering, projects, custom GPT, Claude skills)

See title. I'm building something to make AI agents more consistent and reliable, but people's understanding varies wildly from "I've used GPT" to "I'm building an agent army in n8n." Curious where y'all are at.

And if you fall more within the former category, would a writeup bringing you up to speed on LLM basics (MCP, context window, agents, prompt injection, etc.) be helpful? I've been writing a (pretty long) essay on it and trying to figure out if there's an audience before I share it.

Edit: thanks for all your responses. I got the answer I was looking for. Confirming it's wildly varied. 😂 and I think I will go ahead and write up a longer piece that explains the basics. Thank you all.

u/ChloeAroundTheCorner

LLM Part 1 - What is it and why does it seem so stupid AND smart?

What's your comfort level with AI / LLM?