Issue
I noticed that Ollama reloads the model on every single request.
Version and model info
Aider v0.86.1
I can see on the Ollama server that the model is reloaded on every request.
This is probably due to the context size varying between requests, since Aider computes num_ctx from the size of the current messages:
num_ctx = int(self.token_count(messages) * 1.25) + 8192
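For illustration, here is a minimal sketch (the token counts are made-up examples) of how that formula produces a different num_ctx on each request, which Ollama treats as a changed model configuration:

```python
# Minimal sketch: the token counts below are hypothetical examples.
def compute_num_ctx(token_count: int) -> int:
    # Same formula as above: 25% headroom plus an 8192-token buffer.
    return int(token_count * 1.25) + 8192

print(compute_num_ctx(1200))  # 9692  -> first request
print(compute_num_ctx(3400))  # 12442 -> next request: different num_ctx, so the model reloads
```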
This significantly slows down working with Ollama.
I have not been able to set the context window via .aider.model.settings.yml; a sketch of what I tried is below.
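This is roughly the settings file I used (the model name and num_ctx value are just placeholders, and I am assuming extra_params is forwarded to the provider):

```yaml
# .aider.model.settings.yml -- placeholder model name and context size
- name: ollama_chat/llama3:70b
  extra_params:
    num_ctx: 32768
```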
num_ctx is probably passed through litellm; it would be nice if Aider offered a way to just set it, to avoid reloading the model on each request.
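For comparison, when a constant num_ctx is sent in the request options, the model stays loaded between calls. A minimal sketch against the local Ollama HTTP API (the URL, model name, and context size are assumptions for illustration):

```python
# Minimal sketch: calls the local Ollama HTTP API with a fixed num_ctx.
# URL, model name, and num_ctx are assumptions; adjust for your setup.
import requests

def chat(messages, num_ctx=32768):
    resp = requests.post(
        "http://localhost:11434/api/chat",
        json={
            "model": "llama3:70b",            # hypothetical model name
            "messages": messages,
            "stream": False,
            "options": {"num_ctx": num_ctx},  # constant value -> no reload between calls
        },
    )
    return resp.json()["message"]["content"]

print(chat([{"role": "user", "content": "hello"}]))
```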