The context window acts as the model's short-term memory: it determines how much text the model can consider when generating a response.
For instance, the original GPT-3 has a context window of 2,048 tokens, while GPT-4 Turbo extends it to 128,000 tokens.
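A common practical consequence is that conversation history must be trimmed to fit within this limit. The sketch below illustrates the idea; the `count_tokens` helper is a hypothetical stand-in that approximates token counts with a whitespace split, whereas real models use subword tokenizers.

```python
def count_tokens(text: str) -> int:
    # Crude approximation of a tokenizer; real models split text
    # into subword tokens, so actual counts differ.
    return len(text.split())

def trim_to_context(messages: list[str], max_tokens: int) -> list[str]:
    """Keep the most recent messages whose combined token count
    still fits inside the context window."""
    kept, total = [], 0
    for msg in reversed(messages):  # walk from newest to oldest
        tokens = count_tokens(msg)
        if total + tokens > max_tokens:
            break  # older messages no longer fit
        kept.append(msg)
        total += tokens
    return list(reversed(kept))  # restore chronological order

history = [
    "System: You are a helpful assistant.",
    "User: Summarize the plot of Moby-Dick.",
    "Assistant: A sailor recounts Captain Ahab's obsessive hunt for a white whale.",
    "User: Now compare it to The Old Man and the Sea.",
]
# With a tiny 20-token budget, only the most recent message survives.
print(trim_to_context(history, max_tokens=20))
```

This is why long conversations with a small-context model "forget" their earliest turns: once the window is full, the oldest text is simply dropped.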