[MOD-9685] Introduce SVS Basic Benchmarks #829
Merged
introduce BM_VecSimSVSTrain class with 2 methods: Train and TrainAsync
add GoogleTest to benchmarks so we can use ASSERT_* API
tieredIndexMock: possible to initialize with a specific thread count
add train benchmark to CI benchmark dispatcher
rename svs_training_fp32 -> svs_indices_training_fp32
add to bm_files.sh
move svs params init to CreateTieredSVSIndex
only 5 iterations
add compressed index bm
remove some prints
move UNIT_AND_ITERATIONS and QUANT_BITS_ARGS to bm_vecsim_basics_Svs
… to bm_training_initialize.h
define DATA_TYPE_INDEX_T in bm_svs_training_fp*.cpp
remove the 10k and 50k for arm
fix include header for fp16
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## main #829 +/- ##
=======================================
Coverage 96.63% 96.63%
=======================================
Files 126 126
Lines 7379 7379
=======================================
Hits 7131 7131
Misses 248 248
we hard code the name of the data file based on quantBits
benchmark deletion according to a when we exceed 0.5 index size; this benchmark takes 20K ms (20 s) for 500 vectors, which is a lot. After revert: benchmark only GC to determine
runGC instead of delete label, so as not to depend on consolidation_threshold, which can't be controlled and runs for a very long time
Co-authored-by: BenGoldberger <[email protected]>
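The commits above mention hard-coding the data file name based on quantBits, and the history also replaces `std::format` with `ostringstream` for older GCC versions. A minimal sketch of that idea follows. The helper name `dataFileName`, the `_lvq` infix, and the `.bin` suffix are all hypothetical and chosen only for illustration; they are not the naming scheme used in the repository.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: builds a data file name from a prefix and the number
// of quantization bits. The "_lvq<bits>" infix and ".bin" suffix are
// assumptions for illustration only.
std::string dataFileName(const std::string &prefix, int quantBits) {
    assert(quantBits >= 0); // 0 is used here to mean "no compression"
    std::ostringstream oss; // works on compilers that lack <format>
    oss << prefix;
    if (quantBits > 0) {
        oss << "_lvq" << quantBits;
    }
    oss << ".bin";
    return oss.str();
}
```

Under these assumptions, `dataFileName("svs_index", 8)` yields `svs_index_lvq8.bin` and `dataFileName("svs_index", 0)` yields `svs_index.bin`.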
BenGoldberger approved these changes Nov 12, 2025
Backport failed. Please cherry-pick the changes locally and resolve any conflicts.
git fetch origin 8.2
git worktree add -d .worktree/backport-829-to-8.2 origin/8.2
cd .worktree/backport-829-to-8.2
git switch --create backport-829-to-8.2
git cherry-pick -x 11cdc8b7a8435dfb2e62f37293f6dabd55b7465b
meiravgri added a commit that referenced this pull request Nov 13, 2025
* initial imp of training bm
  introduce BM_VecSimSVSTrain class with 2 methods: Train and TrainAsync
  add GoogleTest to benchmarks so we can use ASSERT_* API
  tieredIndexMock: possible to initialize with a specific thread count
  add train benchmark to CI benchmark dispatcher
* make bm_files general for other algs
  rename svs_training_fp32 -> svs_indices_training_fp32
  add to bm_files.sh
* replace std::format (only supported from gcc13) with ostringstream
* format
* revert assert
* initialize quantBits
  move svs params init to CreateTieredSVSIndex
  only 5 iterations
* move iteration logic to runTrainBMIteration
  add compressed index bm
* assert depending on HAVE_SVS_LVQ
* separate non compression and compression bm
* TO REVERT !!! test abort
* fix if else
* revert timeoutguard changes
* don't pause after training to see how it affects performance
  remove some prints
* fix #ifdef HAVE_SVS_LVQ to #if HAVE_SVS_LVQ
* use pause timers, it's faster
* do 3 iter instead of 5 and test if results are stable
* use 5 again
* fix download all script
* fp16 bm remove 100K
* remove 100K from fp32
* increase timeout
* try bigger machine
* try a bigger machine
* try 2 iter
  move UNIT_AND_ITERATIONS and QUANT_BITS_ARGS to bm_vecsim_basics_Svs
* unify bm_training_initialize_fp32.h and bm_training_initialize_fp16.h to bm_training_initialize.h
  define DATA_TYPE_INDEX_T in bm_svs_training_fp*.cpp
  remove the 10k and 50k for arm
* revert timeout to 10
  fix include header for fp16
* move CreateTieredSVSParams and verifyNumThreads to svs params
* revert increase machine size
* change assert to log
* fix
* fix2
* format
* introduce bm_svs
* add tiered
  add NewIndex from existing svs to tiered factory
  imp AddLabel
  add AddLabelBatches (not implemented)
  take bm_utils from meiravg_svs_training_bm
* introduce setUpdateTriggerThreshold in BUILD_TESTS
  move initialize index to a function
  introduce bm function:
  addlabel: insert one by one
  AddLabelBatches: add in batches with one thread
  AddLabelAsync: add in batches with multiple threads
* fix comment add to yml
* remove lock
* format
* fix num threads in addlabelinplace
  fix assertupdateTriggerThreshold in AddLabelAsync
* use train svs instead
* format
* small fixes
* rename BM_VecSimSVSTrain->BM_VecSimSVS
  bm_vecsim_svs_train.h->bm_vecsim_svs
* remove unrelated
* align with new name
* revert unnecessary changes in bm_vecsim_index
  add LVQ BM if HAVE_SVS_LVQ
* fix include
* fix quantbits
* extract general
* fix missing main on LVQ cpp
* replace vectors file
* run only BENCHMARK_MAIN
* try dummy for mac
* fix DATA_TYPE_INDEX_T definition LVQ
* quantBits is now static and needs to be initialized by the CPP file
  we hard code the name of the data file based on quantBits
* TO REVERT: benchmark deletion according to a when we exceed 0.5 index size; this benchmark takes 20K ms (20 s) for 500 vectors, which is a lot. After revert: benchmark only GC to determine
* REVERT svs.h change consolidation_threshold
  runGC instead of delete label, so as not to depend on consolidation_threshold, which can't be controlled and runs for a very long time
* revert unrelated changes
* fix LVQ8 cpp for non LVQ
* format
* cleanups fix mac
* remove new line in cmake
* Update tests/benchmark/bm_vecsim_svs.h
  Co-authored-by: BenGoldberger <[email protected]>
---------
Co-authored-by: BenGoldberger <[email protected]>
(cherry picked from commit 11cdc8b)
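One commit in this history changes `#ifdef HAVE_SVS_LVQ` to `#if HAVE_SVS_LVQ`. The sketch below shows why the two differ, under the assumption (not confirmed by this page) that the build defines the macro as 0 when LVQ support is unavailable. The constants `kSeenByIfdef` and `kSeenByIf` are hypothetical names used only for this illustration.

```cpp
// Assumption for illustration: the build system always defines the macro,
// using the value 0 when LVQ support is not available.
#define HAVE_SVS_LVQ 0

// #ifdef only checks whether the macro is defined, so it is taken even
// though the value is 0.
#ifdef HAVE_SVS_LVQ
constexpr bool kSeenByIfdef = true;
#else
constexpr bool kSeenByIfdef = false;
#endif

// #if evaluates the macro's value, so a value of 0 disables the block.
#if HAVE_SVS_LVQ
constexpr bool kSeenByIf = true;
#else
constexpr bool kSeenByIf = false;
#endif

static_assert(kSeenByIfdef, "#ifdef is true for a macro defined as 0");
static_assert(!kSeenByIf, "#if is false for a macro defined as 0");
```

With a macro that is defined as 0, code guarded by `#ifdef` would still be compiled, which is the kind of mismatch the `#if` form avoids.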
github-merge-queue bot pushed a commit that referenced this pull request Nov 13, 2025
* [MOD-9685] Introduce SVS Basic Benchmarks (#829)
* initial imp of training bm
  introduce BM_VecSimSVSTrain class with 2 methods: Train and TrainAsync
  add GoogleTest to benchmarks so we can use ASSERT_* API
  tieredIndexMock: possible to initialize with a specific thread count
  add train benchmark to CI benchmark dispatcher
* make bm_files general for other algs
  rename svs_training_fp32 -> svs_indices_training_fp32
  add to bm_files.sh
* replace std::format (only supported from gcc13) with ostringstream
* format
* revert assert
* initialize quantBits
  move svs params init to CreateTieredSVSIndex
  only 5 iterations
* move iteration logic to runTrainBMIteration
  add compressed index bm
* assert depending on HAVE_SVS_LVQ
* separate non compression and compression bm
* TO REVERT !!! test abort
* fix if else
* revert timeoutguard changes
* don't pause after training to see how it affects performance
  remove some prints
* fix #ifdef HAVE_SVS_LVQ to #if HAVE_SVS_LVQ
* use pause timers, it's faster
* do 3 iter instead of 5 and test if results are stable
* use 5 again
* fix download all script
* fp16 bm remove 100K
* remove 100K from fp32
* increase timeout
* try bigger machine
* try a bigger machine
* try 2 iter
  move UNIT_AND_ITERATIONS and QUANT_BITS_ARGS to bm_vecsim_basics_Svs
* unify bm_training_initialize_fp32.h and bm_training_initialize_fp16.h to bm_training_initialize.h
  define DATA_TYPE_INDEX_T in bm_svs_training_fp*.cpp
  remove the 10k and 50k for arm
* revert timeout to 10
  fix include header for fp16
* move CreateTieredSVSParams and verifyNumThreads to svs params
* revert increase machine size
* change assert to log
* fix
* fix2
* format
* introduce bm_svs
* add tiered
  add NewIndex from existing svs to tiered factory
  imp AddLabel
  add AddLabelBatches (not implemented)
  take bm_utils from meiravg_svs_training_bm
* introduce setUpdateTriggerThreshold in BUILD_TESTS
  move initialize index to a function
  introduce bm function:
  addlabel: insert one by one
  AddLabelBatches: add in batches with one thread
  AddLabelAsync: add in batches with multiple threads
* fix comment add to yml
* remove lock
* format
* fix num threads in addlabelinplace
  fix assertupdateTriggerThreshold in AddLabelAsync
* use train svs instead
* format
* small fixes
* rename BM_VecSimSVSTrain->BM_VecSimSVS
  bm_vecsim_svs_train.h->bm_vecsim_svs
* remove unrelated
* align with new name
* revert unnecessary changes in bm_vecsim_index
  add LVQ BM if HAVE_SVS_LVQ
* fix include
* fix quantbits
* extract general
* fix missing main on LVQ cpp
* replace vectors file
* run only BENCHMARK_MAIN
* try dummy for mac
* fix DATA_TYPE_INDEX_T definition LVQ
* quantBits is now static and needs to be initialized by the CPP file
  we hard code the name of the data file based on quantBits
* TO REVERT: benchmark deletion according to a when we exceed 0.5 index size; this benchmark takes 20K ms (20 s) for 500 vectors, which is a lot. After revert: benchmark only GC to determine
* REVERT svs.h change consolidation_threshold
  runGC instead of delete label, so as not to depend on consolidation_threshold, which can't be controlled and runs for a very long time
* revert unrelated changes
* fix LVQ8 cpp for non LVQ
* format
* cleanups fix mac
* remove new line in cmake
* Update tests/benchmark/bm_vecsim_svs.h
  Co-authored-by: BenGoldberger <[email protected]>
---------
Co-authored-by: BenGoldberger <[email protected]>
(cherry picked from commit 11cdc8b)
* fix factory
---------
Co-authored-by: BenGoldberger <[email protected]>
This PR introduces a benchmarking suite for the SVS algorithm. It covers basic operations on loaded indices and builds on the existing training-phase benchmark infrastructure.
New Benchmarks Added
BM_AddLabelOneByOne - Measures time to add individual vectors to a loaded SVS index one-by-one
BM_TriggerUpdateTiered - Measures time to move vectors from the frontend (flat buffer) to the backend (SVS index) in a tiered index
BM_RunGC - Tests graph repair after deletions
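The measurement pattern behind a benchmark like BM_AddLabelOneByOne can be sketched with the standard library alone: prepare the input outside the timed region, then time only the insertion loop. This is an illustrative stand-in, not the PR's code. `ToyIndex`, `addVector`, and `timeAddOneByOne` are hypothetical names, and the real benchmarks run against the SVS index through the repository's benchmark harness rather than a hash map.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a loaded index; NOT the VectorSimilarity API.
class ToyIndex {
public:
    void addVector(std::size_t label, const std::vector<float> &v) { data_[label] = v; }
    std::size_t size() const { return data_.size(); }

private:
    std::unordered_map<std::size_t, std::vector<float>> data_;
};

// Times only the one-by-one insertion loop. Building the input vectors is
// setup work and is deliberately kept outside the timed region.
std::chrono::nanoseconds timeAddOneByOne(ToyIndex &index, std::size_t n, std::size_t dim) {
    // Setup (untimed): n identical vectors of dimension dim.
    std::vector<std::vector<float>> vectors(n, std::vector<float>(dim, 0.5f));
    // Assumes existing labels are 0..size-1, so new labels do not collide.
    const std::size_t firstLabel = index.size();

    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < n; ++i) {
        index.addVector(firstLabel + i, vectors[i]);
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start);
}
```

Calling `timeAddOneByOne(index, 100, 4)` on an empty `ToyIndex` leaves it with 100 entries and returns the elapsed time of the loop only. A benchmark framework would typically repeat this over several iterations and report aggregate statistics.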