Description
This PR demonstrates an alternative search approach for sandcastles that uses vector embedding search instead of the existing Pagefind search. It updates the gallery build process to create a vector embedding of each sandcastle, and then compares those embeddings against the user query.
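As a rough illustration of the comparison step, the sketch below ranks precomputed embeddings against a query embedding using cosine similarity. It is a minimal sketch under stated assumptions, not the code in this PR: the `EmbeddedSandcastle` shape and the `cosineSimilarity` and `rankSandcastles` names are hypothetical, cosine similarity is assumed as the metric, and the query is assumed to have already been embedded by the same model used at build time.

```typescript
// Hypothetical shape of one gallery entry produced at build time (not taken from the PR).
interface EmbeddedSandcastle {
  id: string;
  embedding: number[];
}

// Cosine similarity between two vectors of equal length.
// Returns 0 when either vector has zero magnitude.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Embeddings must have the same dimension");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    return 0;
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Scores every entry against the query embedding and returns the
// topK entries, highest similarity first.
function rankSandcastles(
  queryEmbedding: number[],
  entries: EmbeddedSandcastle[],
  topK: number = 5,
): { id: string; score: number }[] {
  return entries
    .map((entry) => ({
      id: entry.id,
      score: cosineSimilarity(queryEmbedding, entry.embedding),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}
```

A brute-force scan like this is likely adequate for a gallery of a few hundred entries; whether an index structure is worthwhile is a separate question from the model download size discussed below.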
To allow everything to run locally, we selected a small MIT-licensed model from Hugging Face. One concern with this PR is the model's download size and hardware performance limitations. In testing, the model's memory usage did not appear to rise above what several of the existing sandcastles consume. However, the download is approximately 30 MB, which is several times larger than the largest file download I have been able to produce from existing sandcastles.
I am opening this PR as a draft for now, as modifications will likely be needed to support a hybrid approach between Pagefind and vector embedding search, which we can discuss further within the PR.
Issue number and link
https://github.com/iTwin/platform-bentley-community/issues/306
Testing plan
My changes were tested manually against a small set of search queries that I found to expose gaps in the current Cesium Sandcastle search.
I did not include any automated tests of the functionality because testing does not appear to be configured in the application, but I would be glad to add them if requested before moving the PR out of draft.
Author checklist
CONTRIBUTORS.md
CHANGES.md with a short summary of my change