Ollama reloads the model upon each request #4687

@lhofhansl

Issue

I noticed that Ollama reloads the model on every single request.

Version and model info

Aider v0.86.1

The Ollama server log shows the model being reloaded on every request.
This is probably because a different context size (num_ctx) is sent with each request:

        num_ctx = int(self.token_count(messages) * 1.25) + 8192

This really slows down usage with Ollama.
I have not been able to set the context window via .aider.model.settings.yml.
num_ctx probably comes from litellm. It would be nice if Aider offered a way to set it to a fixed value, so that the model is not reloaded on every request.
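
To illustrate why the value changes from request to request, here is a minimal sketch that applies the expression quoted above to a few prompt sizes. The function names, the sample token counts, and the bucketing idea are assumptions made for illustration only; none of this is Aider or litellm code, and whether a changed num_ctx is what actually triggers the reload remains the hypothesis stated above.

```python
def dynamic_num_ctx(token_count: int) -> int:
    """Apply the quoted expression: 25% headroom plus 8192 tokens."""
    return int(token_count * 1.25) + 8192


def bucketed_num_ctx(token_count: int, bucket: int = 8192) -> int:
    """Hypothetical alternative: round the dynamic value up to a multiple
    of `bucket`, so that similar prompt sizes map to the same num_ctx."""
    needed = dynamic_num_ctx(token_count)
    return -(-needed // bucket) * bucket  # ceiling division, then scale back up


# Made-up prompt sizes (in tokens) for four consecutive requests.
prompt_sizes = [1000, 1200, 1500, 2000]

dynamic = [dynamic_num_ctx(n) for n in prompt_sizes]
bucketed = [bucketed_num_ctx(n) for n in prompt_sizes]

print(dynamic)   # a different num_ctx for every request
print(bucketed)  # the same num_ctx for all four requests
```

With the quoted expression, even a small change in prompt size yields a different num_ctx, whereas a fixed or bucketed value stays constant across requests of similar size.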
