Start With the Prompt, Not RAG: Giving an AI Feature Access to Your Knowledge

Written by Madalina Turlea
15 Jan 2026
When an AI feature needs to draw on your own documents, like a course curriculum or an internal policy, the instinct is to reach for a retrieval system. The advice that works better is the opposite: build as little infrastructure around the feature as possible, and add complexity only when you actually need it.
The options, from simplest to most involved
There is a ladder of ways to give a model the knowledge it needs, and most teams jump too far up it too soon.
The first option is to extract the most important parts of the source (the key concepts, the lessons, the structure) and put them directly into the prompt. For many features this is enough on its own.
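As a rough illustration of this first option, here is a minimal sketch in Python. The function name build_prompt, the instruction wording, and the course notes are all invented for the example. The resulting string would be sent to whichever model API you use, which is left out here.

```python
def build_prompt(key_points: list[str], question: str) -> str:
    """Embed hand-picked knowledge directly in the prompt text."""
    knowledge = "\n".join(f"- {point}" for point in key_points)
    return (
        "Answer using only the course notes below.\n\n"
        f"Course notes:\n{knowledge}\n\n"
        f"Question: {question}"
    )


# Hypothetical course notes, invented for illustration.
notes = [
    "Lesson 1 covers writing a first prompt.",
    "Lesson 2 covers testing prompts against example cases.",
]

prompt = build_prompt(notes, "What does lesson 2 cover?")
print(prompt)
```

There is nothing to maintain here beyond the list of notes, which is the appeal of this option.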
The second option is to give the model a link to the page and access to the web, so that when it runs the prompt it visits the page and pulls what it needs. This is more expensive, because of how web search works for an LLM: it goes to the page, extracts the text, tries to understand it, and then takes the information out, all of which costs tokens.
The third option applies when you scale beyond a single document, say to ten courses instead of one. You can either write one prompt per course, or keep one generic prompt and pass the course information along with each request, for example as part of the test case, so the user's question arrives together with the course they are asking about.
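The "one generic prompt" variant might look something like the sketch below. The template text, the COURSES catalogue, and the render helper are hypothetical names made up for this example; in practice the course data could come from wherever you already store it.

```python
# One generic template; the course-specific parts are filled in per request.
GENERIC_PROMPT = (
    "You are a tutor for the course '{title}'.\n"
    "Course outline:\n{outline}\n\n"
    "Student question: {question}"
)

# Hypothetical catalogue, invented for illustration.
COURSES = {
    "prompting-101": {
        "title": "Prompting 101",
        "outline": "1. Basics\n2. Testing",
    },
    "ai-product": {
        "title": "AI Product Basics",
        "outline": "1. Scoping\n2. Evaluation",
    },
}


def render(course_id: str, question: str) -> str:
    """Combine the generic prompt with the course the user is asking about."""
    course = COURSES[course_id]
    return GENERIC_PROMPT.format(
        title=course["title"],
        outline=course["outline"],
        question=question,
    )


rendered = render("ai-product", "What is in module 2?")
print(rendered)
```

Only the course the question is about ends up in the prompt, so adding an eleventh course means adding one catalogue entry, not another prompt.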
The fourth option, and the last one to reach for, is a RAG system, which stands for Retrieval-Augmented Generation. You give the AI a database of information that it retrieves from and includes in the prompt. If you build this, it also has to be validated, because you need to check whether it is pulling the right information.
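To make the validation point concrete, here is a deliberately simplified sketch. It ranks chunks by plain word overlap, which is a stand-in chosen for brevity; real RAG systems usually rank with embeddings and a vector database. The chunk texts, the CHECKS list, and the function names are all assumptions made for this example. The part worth noticing is hit_rate: a small set of questions with known correct chunks, used to check whether retrieval pulls the right information.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, chunks: dict[str, str], k: int = 1) -> list[str]:
    """Return the ids of the k chunks sharing the most words with the query."""
    query_words = tokenize(query)
    ranked = sorted(
        chunks,
        key=lambda chunk_id: len(query_words & tokenize(chunks[chunk_id])),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical knowledge base, invented for illustration.
CHUNKS = {
    "refunds": "Refunds are available within 30 days of purchase.",
    "certificates": "A certificate is issued after the final quiz is passed.",
}

# Validation set: each question is paired with the chunk it should retrieve.
CHECKS = [
    ("Can I get a refund within 30 days?", "refunds"),
    ("When is the certificate issued?", "certificates"),
]


def hit_rate(checks: list[tuple[str, str]], chunks: dict[str, str]) -> float:
    """Fraction of questions whose expected chunk appears in the results."""
    hits = sum(expected in retrieve(question, chunks) for question, expected in checks)
    return hits / len(checks)


score = hit_rate(CHECKS, CHUNKS)
print(f"retrieval hit rate: {score:.0%}")
```

Even in this toy form, retrieval is a second thing that can be wrong, separate from the prompt itself, and it needs its own test cases.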
Why the prompt goes further than people expect
The reason to resist jumping to RAG is that you can do a surprising amount with just the prompt and the context window.
One example: a company had an intern rating other companies on their sustainability factors, working from a big Notion document that explained all the logic and what to check. Putting that whole document into the prompt reached 90% accuracy, and the experiment even surfaced errors in the humans' earlier decisions.
So the rule of thumb is to start at the bottom of the ladder. Put the important knowledge in the prompt, see how far that gets you, and only add web access, per-case context, or a retrieval system when the volume of information genuinely demands it. Each step up adds cost and something new to validate, so you want to earn your way there, not start there.
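One way to judge whether the volume of information "genuinely demands" more is a quick size check before building anything. The sketch below assumes roughly four characters per token, a common rule of thumb for English text and not a real tokenizer. The context window size and the reserve are placeholder numbers, so substitute the limits of the model you actually use.

```python
CHARS_PER_TOKEN = 4  # rough rule of thumb for English text, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Very rough token estimate based on character count."""
    return len(text) // CHARS_PER_TOKEN + 1


def fits_in_prompt(document: str, context_window: int, reserve: int = 2000) -> bool:
    """Check whether the document fits, keeping `reserve` tokens free
    for the instructions, the question, and the model's answer."""
    return estimate_tokens(document) <= context_window - reserve


# Placeholder sizes, chosen only to show both outcomes.
small_doc = "x" * 4_000
large_doc = "x" * 40_000

print(fits_in_prompt(small_doc, context_window=8_000))
print(fits_in_prompt(large_doc, context_window=8_000))
```

If the check passes comfortably, the bottom of the ladder is probably still the right place to be.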