METHODOLOGY · July 1, 2026
'RAG is dead' is a category error
Every time context windows get bigger, the headline returns: 'RAG is dead, just put everything in the prompt.' It's wrong every time, and it's wrong for an instructive reason — it confuses a technique with a problem. Retrieval didn't die; naive vector-database RAG did, and it's being replaced by smarter retrieval, not by no retrieval. 'X is dead' is almost always the sound of someone mistaking the current implementation for the permanent need underneath it.
There's a headline that comes back like clockwork every time context windows grow: "RAG is dead." The million-token window shipped, so surely you can just stuff everything into the prompt and delete your retrieval pipeline. It's a tidy story, it trends every few months, and it's been wrong every single time. Worth understanding why — because the mistake is a thinking pattern, not a fact about RAG.
Technique vs. problem
"RAG is dead" confuses a technique with the problem it solves. RAG — retrieval-augmented generation — is one implementation. The problem underneath it is permanent: get the right context in front of the model, cheaply and traceably. That problem does not go away when the window gets bigger. If anything it gets harder, because now you can dump everything in — and the model quietly ignores the middle of it while your bill balloons.
The economics alone kill the "just use long context" plan: retrieval is reportedly 8–82x cheaper than stuffing a long context for typical workloads, with better latency. Paying to shove 400k tokens through the model on every call, so it can skim past the one paragraph that mattered, is not an upgrade over fetching that paragraph.
What actually died
Something did die — just not retrieval. Naive RAG died: chunk everything, embed it, top-k cosine similarity, cross your fingers. That specific recipe was always brittle, and it's being replaced by retrieval that's smarter, not absent. The emerging consensus is "naive RAG is dead, sophisticated RAG is thriving": hybrid search, retrieval as an agentic action the model chooses when it needs it, grep and structured queries alongside vectors, high-recall fetch with the model doing the final filtering. The technique evolved. The need didn't move an inch.
"X is dead" is almost always someone mistaking the current implementation for the permanent problem. The plumbing churns constantly. The thing the plumbing is for rarely changes at all.
The pattern is the point
This is worth internalizing beyond RAG, because you'll hear "X is dead" about something every quarter: prompt engineering, fine-tuning, agents, RAG, whatever's next. Run it through the same filter every time:
- Separate the need from the tool. What permanent problem was this technique solving? Retrieval: getting the right context in. That need is not on the deprecation list.
- Ask what specifically broke. Usually it's the naive version, and the "death" is really an evolution. "Naive X is dead, good X is thriving" is the honest headline nine times out of ten.
- Follow the economics and the failure modes. The cheaper, more traceable, more debuggable approach wins in production regardless of what's trending. Long context lost on cost and on the-middle-is-a-graveyard, not on capability.
- Be suspicious of "just put it all in." Any pitch that ends in "you don't need the discipline anymore" is usually selling you a bigger bucket, not a better answer.
For me this ties straight to grounding: getting the right, verifiable context in front of the model is the whole game, and retrieval is how you do it. That requirement is not dying — the tools for meeting it just keep getting better, which is the opposite of dead.
The bottom line
"RAG is dead" comes back every time the window grows and misses the point every time: it kills a technique and declares the problem solved, when the problem — right context, cheap, traceable — is exactly as alive as it ever was.
Don't retire the need when the implementation churns. When you hear "X is dead," ask what problem X solved — and whether that problem actually went anywhere. It usually didn't.
Comments
No comments yet
Sign in to join the conversation.
Be the first to share a thought.