
feat(embedding): make vector dimension configurable #324

Open
Windsander wants to merge 1 commit into EverMind-AI:main from Windsander:feat/configurable-embedding-dim


@Windsander


Motivation

Different embedding models produce vectors with different dimensions (e.g., intfloat/multilingual-e5-small uses 384, while intfloat/multilingual-e5-large uses 1024). Currently EverOS hard-codes _DIM = 1024 in LanceDB table schemas and build_embedding_provider, which prevents using smaller/larger embedding models without patching source code.

Changes

  • Added dim field to EmbeddingSettings with env binding EVEROS_EMBEDDING__DIM.
  • Added dim = 1024 default in default.toml.
  • Updated build_embedding_provider to use settings.dim when no explicit dim is passed.
  • Updated LanceDB table schemas (agent_case, agent_skill, atomic_fact, episode, foresight, knowledge_topic) to read _DIM from load_settings().embedding.dim instead of hard-coding 1024.
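
As a rough illustration of the resolution order described in the list above, here is a minimal sketch. It is not the code in this PR: `EmbeddingSettings` below is a hypothetical stand-in dataclass, and `load_embedding_settings` and `resolve_dim` are assumed helper names used only for this example. Only the `EVEROS_EMBEDDING__DIM` variable and the 1024 default come from the description.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

DEFAULT_DIM = 1024  # default value stated in this PR


@dataclass(frozen=True)
class EmbeddingSettings:
    """Hypothetical stand-in; the real EverOS settings class has more fields."""

    dim: int = DEFAULT_DIM


def load_embedding_settings(
    env: Optional[Mapping[str, str]] = None,
) -> EmbeddingSettings:
    """Read the dimension from EVEROS_EMBEDDING__DIM, else keep the default."""
    env = os.environ if env is None else env
    raw = env.get("EVEROS_EMBEDDING__DIM")
    if raw is None:
        return EmbeddingSettings()
    dim = int(raw)
    if dim <= 0:
        raise ValueError(f"embedding dim must be positive, got {dim}")
    return EmbeddingSettings(dim=dim)


def resolve_dim(settings: EmbeddingSettings, dim: Optional[int] = None) -> int:
    """An explicitly passed dim wins; otherwise fall back to settings.dim."""
    return settings.dim if dim is None else dim
```

With no variable set, `load_embedding_settings({})` keeps the 1024 default; with `EVEROS_EMBEDDING__DIM=384` the same call yields a 384-dim setting, which table schemas could then read instead of a constant.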

Backward compatibility

The default remains 1024, so existing configurations continue to work without changes.

Testing

  • python -m py_compile passes for all modified Python files.
  • The changes are minimal and do not alter behavior unless EVEROS_EMBEDDING__DIM or [embedding].dim is explicitly set.
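
Since `py_compile` only checks syntax, a small runtime guard might be a useful complement. The sketch below is an assumption for illustration and is not part of this PR: `check_vector_dim` is a hypothetical helper that compares a vector's length against the configured dimension before it is written to a table.

```python
from typing import Sequence


def check_vector_dim(vector: Sequence[float], expected_dim: int) -> None:
    """Raise if the embedding length differs from the configured dimension."""
    actual = len(vector)
    if actual != expected_dim:
        raise ValueError(
            f"embedding has {actual} dimensions, expected {expected_dim}"
        )
```

A call such as `check_vector_dim(vec, settings.dim)` would surface a mismatch between the model's output size and the configured value early, rather than at insert time.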

- Add dim field to EmbeddingSettings
- Use settings.dim in build_embedding_provider
- Read dimension from settings in LanceDB table schemas
- Add default dim = 1024 in default.toml

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@Windsander

Author

Hi EverOS team, thanks for the great work on this project.

We're building CodeKeeper Advance on top of EverOS (we plan to release it on GitHub once all features are complete 😊). In our setup, we use embedding models like intfloat/multilingual-e5-small, which outputs 384-dim vectors. The hard-coded _DIM = 1024 in the LanceDB table schemas and build_embedding_provider currently prevents us from using these models without forking or patching EverOS.

This PR makes the embedding dimension configurable via EVEROS_EMBEDDING__DIM or [embedding].dim, while keeping 1024 as the default so existing users are not affected. The changes are intentionally minimal and only touch the embedding pipeline.

Happy to adjust naming or implementation based on your feedback. Thanks for reviewing!
