Why ChatGPT Gets Dumber the Longer You Talk to It

Written by Madalina Turlea
15 Jan 2026
If you have had a long conversation with ChatGPT or Claude, you have probably noticed it. At the start it is snappy, it answers readily, and it seems to know everything. The longer you talk to it, the slower it gets, and it sometimes focuses on the wrong things. There is a concrete reason for this: the context window.
What the context window is
Every model has a maximum number of tokens it can hold and work with at once. That maximum is the context window. One model, for example, has a context window of 200,000 tokens. Context length also varies between models, and it does not simply track how capable a model is: one powerful model could take only half the maximum context of GPT-5, the same as a much smaller model.
The model does not remember your conversation
Here is the part that surprises people. The AI does not know about your conversation and carry it along. Every time you ask a new question, the chat application sends the model all the history from before, plus your new question, together. There can be some optimisations to this, but that is the basic picture.
So as the conversation grows, the amount of text being sent grows with it. At some point, the accumulated conversation reaches the maximum context window, and you are told that you cannot continue in this conversation because it has reached its limit. The model is simply not able to take in more than that much information.
That is also why the model can start focusing on the wrong things late in a long chat. More and more text is competing for its attention inside that fixed window.
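To make the resend-everything behaviour concrete, here is a minimal sketch of a chat loop. It is an illustration under stated assumptions, not how any particular product works: the 200-token limit is deliberately tiny, `estimate_tokens` uses a rough four-characters-per-token guess instead of a real tokenizer, and all function names are hypothetical.

```python
CONTEXT_WINDOW = 200  # hypothetical, deliberately tiny limit (in tokens)


def estimate_tokens(text: str) -> int:
    """Crude guess: about 4 characters per token. Real tokenizers differ."""
    return max(1, len(text) // 4)


def build_request(history: list[str], new_message: str) -> list[str]:
    """Each request carries every earlier message plus the new one."""
    return history + [new_message]


def request_size(messages: list[str]) -> int:
    """Total estimated tokens sent to the model for one request."""
    return sum(estimate_tokens(m) for m in messages)


def simulate(turns: list[tuple[str, str]], limit: int = CONTEXT_WINDOW) -> list[int]:
    """Return the request size at each turn, stopping once the limit is exceeded."""
    history: list[str] = []
    sizes: list[int] = []
    for user_message, assistant_reply in turns:
        request = build_request(history, user_message)
        size = request_size(request)
        if size > limit:
            break  # at this point the app would refuse to continue the chat
        sizes.append(size)
        # The reply joins the history, so the next request is larger still.
        history = request + [assistant_reply]
    return sizes


if __name__ == "__main__":
    turns = [("a" * 40, "b" * 40)] * 5
    print(simulate(turns))
```

Even though every message in this example is the same length, each request is larger than the one before it, because the whole history travels along every time.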
What to do about it
There are tools and techniques to work around this, built into the larger infrastructure around models, but the simplest thing to understand is the cause. The degradation you feel in a long chat is not the model getting tired or confused in a human sense. It is the context window filling up with everything that came before.
Knowing this changes how you use it. If a conversation has gone long and the answers are getting worse, starting fresh and bringing only the relevant context back in will often work better than pushing further into a window that is already full.
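One way to picture "starting fresh with only the relevant context" is the sketch below. It is a simplification under stated assumptions: relevance is judged by plain keyword matching, token counts use a rough four-characters-per-token guess, and `fresh_context` is a hypothetical helper, not a feature of any chat product. In practice you would often pick the relevant parts by hand or ask the model for a summary.

```python
def estimate_tokens(text: str) -> int:
    """Crude guess: about 4 characters per token. Real tokenizers differ."""
    return max(1, len(text) // 4)


def fresh_context(history: list[str], keywords: list[str], budget: int) -> list[str]:
    """Pick messages that mention any keyword, newest first, within a token budget.

    The selected messages are returned in their original order so the new
    conversation still reads chronologically.
    """
    lowered = [k.lower() for k in keywords]
    kept: list[tuple[int, str]] = []
    used = 0
    for index in range(len(history) - 1, -1, -1):  # walk from newest to oldest
        message = history[index]
        if not any(k in message.lower() for k in lowered):
            continue  # not about the topic we want to carry over
        cost = estimate_tokens(message)
        if used + cost > budget:
            continue  # too large for what is left; an older, shorter one may fit
        kept.append((index, message))
        used += cost
    return [message for _, message in sorted(kept)]


if __name__ == "__main__":
    old_chat = [
        "Plan the launch email for March",
        "What is a good pasta recipe?",
        "The launch email should mention pricing",
        "Thanks!",
    ]
    print(fresh_context(old_chat, ["launch"], budget=100))
```

The new conversation then begins with only those selected messages, leaving the rest of the window free for the work that follows.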