Add balancer meter fixes #4470 #4471
I confirmed that, with an uno instance configured to use a LoggingProvider, the meters appear in the manager metrics log:
4598f9c to d0d9997
balancerMetrics.setMigratingCount(params.migrationsOut().size());
balancerMetrics.setNeedMigrationCount(migrations.size());
I believe the migrations variable in the Manager contains all of the in-flight migrations, while params.migrationsOut() contains the tablets that need to be migrated as a result of this call to the balancer. Based on what I'm seeing here, I think it's double counting. I wonder if we just want the size of the migrations list, as that will encompass newly added, newly removed, and migrations waiting to happen.
If you want to capture the newly added migrations, then it might make sense to capture the migration completions (which are in the TabletGroupWatcher) and cancellations (which happen in a couple places in the Manager).
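To illustrate the bookkeeping described in the two comments above, here is a minimal sketch of one in-flight set fed by each balancer call and drained by completions and cancellations. `MigrationTracker` and all of its methods are hypothetical names used for illustration only; this is not the Manager or TabletGroupWatcher code, and tablets and servers are reduced to plain strings.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the Manager's migration bookkeeping; not Accumulo code.
final class MigrationTracker {
  // tablet -> destination server, for every migration not yet completed or cancelled
  private final Map<String, String> migrations = new HashMap<>();
  private long completed = 0;
  private long cancelled = 0;

  // Merge the result of one balancer call; returns how many entries were newly added.
  int addFromBalancerCall(Map<String, String> migrationsOut) {
    int added = 0;
    for (Map.Entry<String, String> e : migrationsOut.entrySet()) {
      if (migrations.putIfAbsent(e.getKey(), e.getValue()) == null) {
        added++;
      }
    }
    return added;
  }

  // Called where a migration finishes (assumed hook, e.g. when the tablet is hosted).
  void complete(String tablet) {
    if (migrations.remove(tablet) != null) {
      completed++;
    }
  }

  // Called where a migration is abandoned (assumed hook).
  void cancel(String tablet) {
    if (migrations.remove(tablet) != null) {
      cancelled++;
    }
  }

  int inFlight() {
    return migrations.size();
  }

  long completedCount() {
    return completed;
  }

  long cancelledCount() {
    return cancelled;
  }

  public static void main(String[] args) {
    MigrationTracker tracker = new MigrationTracker();
    tracker.addFromBalancerCall(Map.of("t1", "s1", "t2", "s2"));
    // A later call proposes one migration that is already tracked and one new one.
    int added = tracker.addFromBalancerCall(Map.of("t2", "s2", "t3", "s1"));
    tracker.complete("t1");
    System.out.println("newly added: " + added + ", in flight: " + tracker.inFlight());
  }
}
```

Because every entry from `migrationsOut` also lands in the in-flight set, the two sizes overlap rather than add up, which is why the in-flight size alone already covers the #4470 ask in this sketch.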
So to complete the #4470 ask, just recording migrations.size() would be sufficient.
migrationsOut().size() seems useful as a secondary metric for alerting if balancer progress stalls.
In some general tests with a single balancer call, both of the currently implemented metrics came back with the same values.
I think keeping migrations.size() under the migrations.needed metric makes sense.
The in-progress wording may need to be tweaked a bit to better describe the last set of migrations generated.
For users, a possible alert condition would be that migrationsOut().size() is 0 while migrations.size() is still greater than 0.
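The alert condition suggested above could be expressed roughly as follows. This is only a sketch: `BalancerStallCheck` and `isStalled` are hypothetical names, and the two arguments are assumed to be the most recently sampled values of the two gauges (the size of the last `migrationsOut()` result and `migrations.size()`).

```java
// Hypothetical helper for an alerting rule; not part of Accumulo.
final class BalancerStallCheck {
  // Stalled: the last balancer call produced no new migrations,
  // yet some migrations are still outstanding.
  static boolean isStalled(int lastMigrationsOut, int inFlightMigrations) {
    return lastMigrationsOut == 0 && inFlightMigrations > 0;
  }

  public static void main(String[] args) {
    System.out.println(isStalled(0, 5)); // nothing new generated, 5 still pending
    System.out.println(isStalled(3, 5)); // balancer is still producing migrations
    System.out.println(isStalled(0, 0)); // nothing pending, balanced
  }
}
```

In practice an alert would probably require the condition to hold across several consecutive samples, since a single reading of 0 new migrations with a few still pending is normal while earlier migrations drain.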
Put up #4699 as a continuation of these changes.
- Sorts and turns off formatting for meter exceptions in MetricsIT
Closed in favor of #4699
Adds two gauges that can be used to monitor balancing
This fixes #4470