Anatomy of an AI query: queries, chunks, and tokens
A plain-language tour of the three words you keep hearing — what they mean, how they fit together, and how fast they add up once AI starts working on your behalf.
Three words come up over and over when people talk about AI: query, chunk, and token. They sound technical, but they don't have to be. They're three layers of the same simple thing — a question, the content used to answer it, and the math underneath. Once you've got them straight, almost everything else about AI gets easier to follow.
1. The query
A query is whatever the user asks the AI.
That's it. If you type a question, the question is the query. If you upload a document and say "summarize this," the document plus your instruction is the query. Whatever input the user hands the model — that's the query.
The query is the starting point. Everything else follows from it.
2. The chunk
When the AI receives a query, it goes looking for relevant content. As people, we tend to think in whole documents — a book, an article, a magazine issue. AI doesn't.
AI systems work in chunks.
A chunk is a small slice of a document — usually a few paragraphs. When publisher content gets indexed for AI use, each article is cut into chunks. A 20-paragraph article might become five or six chunks, with three or four paragraphs in each. Every chunk is stored separately, and every chunk can be retrieved on its own.
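The slicing described above can be sketched in a few lines of Python. This is an illustration, not how any particular system does it: the function name `chunk_article` and the four-paragraph chunk size are our assumptions, and real pipelines vary chunk sizes and often overlap chunks.

```python
# A minimal sketch of paragraph-based chunking, assuming a fixed chunk
# size of four paragraphs. Real pipelines vary sizes and often overlap
# adjacent chunks so context isn't lost at the boundaries.

def chunk_article(paragraphs, per_chunk=4):
    """Group consecutive paragraphs into chunks of up to per_chunk each."""
    return [
        "\n\n".join(paragraphs[i:i + per_chunk])
        for i in range(0, len(paragraphs), per_chunk)
    ]

article = [f"Paragraph {n}" for n in range(1, 21)]  # a 20-paragraph article
chunks = chunk_article(article)
print(len(chunks))  # 5 chunks, each holding 4 paragraphs
```

Each chunk is then stored and indexed on its own, which is what makes the retrieval step below possible.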
When a query comes in, the AI doesn't fetch the whole article. It fetches the chunks most relevant to the question — usually somewhere between five and a hundred of them — and writes its answer from just those.
So when you read an AI response that says "based on three sources," those sources are almost always three chunks. Not three full articles. Just the slices that matched.
3. The token
Chunks are made of text. But computers don't read text — they read numbers and patterns. So before the AI can do anything with a chunk, the text has to be translated.
The translation is called tokenization, and the pieces it produces are called tokens.
Roughly, you can think of a token as a word. "Publishing" is one token. "The" is one token. Punctuation marks and even leading spaces sometimes get their own. Longer or unusual words get broken into pieces: "unbelievable" might split into "un", "believ", "able". Each token has a numeric ID drawn from a vocabulary of about 200,000.
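A toy greedy tokenizer makes the splitting concrete. The tiny vocabulary and its IDs below are made up for illustration; real tokenizers are built with byte-pair encoding over vocabularies of roughly 200,000 pieces.

```python
# Toy subword tokenizer (not any real model's tokenizer). A made-up
# vocabulary maps text pieces to numeric IDs; words the vocabulary
# doesn't know whole are split greedily into the longest known pieces.

VOCAB = {
    "the": 0, "publishing": 1, "un": 2, "believ": 3, "able": 4,
    " ": 5, ".": 6,
}

def tokenize(text):
    """Greedy longest-match tokenization over the toy vocabulary."""
    ids = []
    lower = text.lower()
    i = 0
    while i < len(lower):
        for j in range(len(lower), i, -1):  # try the longest piece first
            if lower[i:j] in VOCAB:
                ids.append(VOCAB[lower[i:j]])
                i = j
                break
        else:
            i += 1  # skip characters the vocabulary can't encode
    return ids

print(tokenize("Unbelievable."))  # [2, 3, 4, 6] -> un / believ / able / .
```

"The" comes out as a single ID, while "unbelievable" becomes three, which is exactly the behavior described above.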
You can see exactly how this works for yourself: paste a sentence into any online tokenizer playground, and it will show each token as its own chip, next to its numeric ID.
That's it. To the model, every sentence in this article is just a sequence of those numbers.
The math: how fast it adds up
A single query might cause the AI to pull in anywhere from 5 to 100 chunks of text to write its answer.
If a typical chunk is around 500 tokens, that means a single AI query can ingest somewhere between 2,500 and 50,000 tokens — for one question.
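That range is worth computing directly, using the article's round numbers:

```python
# Back-of-envelope token math for a single query.
TOKENS_PER_CHUNK = 500            # rough size of one chunk
low_chunks, high_chunks = 5, 100  # chunks a query might pull

print(low_chunks * TOKENS_PER_CHUNK)   # 2500 tokens at the low end
print(high_chunks * TOKENS_PER_CHUNK)  # 50000 tokens at the high end
```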
Some familiar reference points, to give a sense of scale (using the common rule of thumb that one token is about three-quarters of an English word):

- A 1,000-word article is roughly 1,300 tokens.
- A 50,000-word short novel is roughly 65,000 tokens.
- 50,000 tokens — the top of the range above — works out to about 37,500 words.
So one heavily researched query can consume the equivalent of an entire short novel — every time it runs.
When one query becomes a thousand
Now picture this. You ask an AI agent:
"What's the most popular restaurant in each of the lower 48 states?"
That looks like one query. But to a modern AI agent, it's actually 48. The agent fans out:
"Most popular restaurant in each of the lower 48 states?"
│
├─ sub-agent → Alabama
├─ sub-agent → Arizona
├─ sub-agent → Arkansas
├─ sub-agent → California
├─ sub-agent → Colorado
├─ … 43 more states …
└─ sub-agent → Wyoming
Each sub-agent runs its own query. Each sub-query pulls its own chunks. If each sub-query pulls 30 chunks at ~500 tokens each, that's about 15,000 tokens per sub-agent — and roughly 720,000 tokens for the full task. From one user question.
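The fan-out arithmetic above can be checked with a short sketch. The per-sub-query figures are the article's estimates, and summing over sub-agents mirrors the way each one runs as a full query of its own:

```python
# Fan-out arithmetic: every sub-agent is a complete query, so the totals
# multiply. Numbers are the article's rough estimates.
CHUNKS_PER_SUBQUERY = 30
TOKENS_PER_CHUNK = 500
STATES = 48  # one sub-agent per lower-48 state

per_subagent = CHUNKS_PER_SUBQUERY * TOKENS_PER_CHUNK
total = sum(per_subagent for _ in range(STATES))

print(per_subagent)  # 15000 tokens per sub-agent
print(total)         # 720000 tokens for the whole task
```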
This is where AI is heading. Agents will increasingly read, research, and reason on people's behalf, and the chunks they consume will multiply far beyond what any one person could ever read. Every one of those chunks is somebody's work.
That's why these three words matter:
- Queries are how questions enter the system.
- Chunks are how content participates in the answer.
- Tokens are the underlying unit — the currency — that all of it gets paid in.
See it in numbers
Tokens become a lot more concrete once there's a price tag attached. Our ROI calculator walks through what AI access could earn against a real catalog. After this article, the inputs will make sense — and the outputs will start to feel less abstract.
Need help applying this to your catalog, contracts, or AI roadmap?
Our team works with publishers to navigate AI strategy, licensing, and implementation.
Talk to Us