So… how does LLM actually work?
LLM doesn’t think. It doesn’t read your files. It doesn’t decide anything.
All it does is generate the next token (a token is a chunk of text; sometimes a word, sometimes part of a word, sometimes just punctuation). One token at a time. With some randomness built in.
It’s predicting. That’s it.
But what about file that you asked it to read the other day and it did!
Well… it didn’t. When you send a prompt, the LLM doesn’t just receive your message. It also gets a list of tools available to it: read a file, write a file, search the web, run code, etc.
So when you say “read my file”, the LLM doesn’t understand your request and decide to help. It predicts: user said “read file” + instructions say tool “read file” exists → statistically, the next tokens should be a call to that tool.
Prediction + Memory in play.
Next time you use ChatGPT or Claude and it does something that looks smart, remember: it didn’t choose to help you. It predicted that helping is what comes next.
graph TD A["Tools list"] --> C["LLM - Predicts next token"] B["Your prompt"] --> C C --> D{"Prediction:<br/>tool or text?"} D -- "Tool" --> E["Tool executes"] D -- "Text" --> F["Response to you"] E -- "Result back" --> C classDef prompt fill:#dee3c6,stroke:#758879,stroke-width:1.5px,color:#2d2d30 classDef toolsList fill:#e3d7c6,stroke:#8b7e7a,stroke-width:1.5px,color:#2d2d30 classDef llm fill:#7c838e,stroke:#5a6f8f,stroke-width:1.5px,color:#f5f5f0 classDef decision fill:#ddc6e3,stroke:#7e7a8b,stroke-width:1.5px,color:#2d2d30 classDef toolExec fill:#e3c6d2,stroke:#8b7a87,stroke-width:1.5px,color:#2d2d30 classDef response fill:#c5e0cb,stroke:#758879,stroke-width:1.5px,color:#2d2d30 class A toolsList class B prompt class C llm class D decision class E toolExec class F response