The RAG Pipeline Is Dead. Probably. Maybe. I Don't Know.

What RAG Is and Why Anyone Bothered

Retrieval-Augmented Generation, or RAG, was the answer to a straightforward problem: LLMs are smart but amnesiac. They know a lot, but only up to their training cutoff, and they know nothing about your data. RAG solved this by bolting on an external retrieval layer, typically backed by a vector store, that chunks your documents, embeds them into a high-dimensional space, and retrieves semantically relevant chunks at query time to inject into the model’s context window. The LLM never actually “learns” your data. It just gets handed relevant pieces of it right before it answers. Think of it less as teaching the model and more as handing it a briefing document every single time it walks into the room. ...
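
To make the chunk, embed, retrieve, inject loop concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the `chunk`, `embed`, `retrieve`, and `build_prompt` helpers are hypothetical names, the sample documents are made up, and the "embedding" is a toy bag-of-words vector standing in for a real embedding model and vector store. No LLM is called; the sketch stops at the prompt that would be handed to one.

```python
import math
import re
from collections import Counter


def chunk(text, size=12):
    """Split text into fixed-size word windows (a deliberately naive chunker)."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, index, k=2):
    """Return the k chunk texts most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def build_prompt(query, chunks):
    """Inject retrieved chunks into the text the model would receive."""
    context = "\n".join(f"- {c}" for c in chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Hypothetical sample documents standing in for "your data".
documents = [
    "Refunds are processed within five business days of the return request.",
    "The office is closed on public holidays and on weekends.",
    "Support tickets are answered in the order they are received.",
]

# Index time: chunk every document and store (text, vector) pairs.
index = [(c, embed(c)) for doc in documents for c in chunk(doc)]

# Query time: retrieve the best chunk and build the briefing for the model.
question = "How long do refunds take?"
top_chunks = retrieve(question, index, k=1)
prompt = build_prompt(question, top_chunks)
print(prompt)
```

The model's weights never change in this flow. The only thing that varies per question is which chunks end up in `prompt`, which is the "briefing document" idea in code form.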

March 13, 2026 · 8 min · Remington Winters