Update kv metrics and refactor glog init#325
Conversation
|
Caution Review failedThe pull request is closed. WalkthroughDisabled per-metric telemetry in multiple eloqdoc configs; added public flag Changes
Sequence Diagram(s)sequenceDiagram
participant Conf as eloqdoc.conf
participant GO as EloqGlobalOptions
participant KV as EloqKVEngine
participant Metrics as metrics::module
participant Glog as GoogleLogging
rect rgba(200,230,255,0.25)
Conf->>GO: read storage.eloq.metrics (enableMetrics) and optional enableKVMetrics
note right of GO: enableKVMetrics defaults true\nif provided: combined = enableMetrics && enableKVMetrics
GO->>KV: pass eloqGlobalOptions
end
rect rgba(220,255,220,0.18)
KV->>Metrics: metrics::enable_kv_metrics = eloqGlobalOptions.enableKVMetrics
note right of Metrics: gates KV-specific metric emission
end
rect rgba(255,240,200,0.18)
KV->>Glog: InitGlog(custom prefix, per-level files, symlinks, disable stdout/stderr)
Glog-->KV: logging initialized
KV->>Glog: on shutdown -> google::ShutdownGoogleLogging()
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes
Possibly related PRs
Suggested reviewers
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (12)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (3)
concourse/artifact/ELOQDSS_ROCKSDB_CLOUD_S3/eloqdoc.conf (1)
63-64: Confirm intent to fully disable legacy metrics for this artifact
enableMetrics: falsewill turn off the legacy metrics endpoint while leavingmetricsPort: 18081configured but effectively unused. Given the PR’s move towardenableKVMetricsand removal of per-metric knobs, please double‑check that:
- This Concourse artifact is not relied on by tests or tooling that still scrape the old metrics endpoint.
- The new KV‑metrics wiring covers any observability you still need from this profile (or that it’s intentionally metrics‑silent).
If the endpoint is no longer expected to run at all here, you might also consider dropping
metricsPortin a follow‑up to avoid confusion.concourse/artifact/ELOQDSS_ROCKSDB_CLOUD_S3/eloqdoc_cluster_c.conf (1)
55-56: Verify cluster‑wide observability expectations with metrics disabledFor the
cloud‑cnode,enableMetrics: falsemeans this node will no longer expose the legacy metrics endpoint onmetricsPort: 18081. Since this is part of a 3‑node cluster:
- Confirm that all cluster configs (a/b/c) are aligned on whether legacy metrics are enabled or disabled.
- Ensure any dashboards/alerts that scraped these endpoints are either updated to use the new KV‑metrics path or intentionally retired.
If the whole cluster is expected to be metrics‑silent for this profile, keeping
metricsPortaround is harmless but slightly confusing; you could remove it later for clarity.concourse/artifact/ELOQDSS_ROCKSDB/eloqdoc.conf (1)
35-36: Check CI/test expectations after disabling legacy metrics in RocksDB profileHere
enableMetrics: falsedisables the legacy metrics endpoint for the standard RocksDB artifact, while leavingmetricsPortin place. With the newenableKVMetricswiring in code:
- Please confirm that any Concourse jobs or tests using this profile no longer depend on the old metrics endpoint.
- Ensure they are either updated to rely on KV‑metrics (if exposed elsewhere) or no longer require metrics.
If legacy metrics are fully deprecated for this profile, consider dropping
metricsPortin a later cleanup to avoid implying that an endpoint is available.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
concourse/artifact/ELOQDSS_ROCKSDB/eloqdoc.conf(1 hunks)concourse/artifact/ELOQDSS_ROCKSDB_CLOUD_S3/eloqdoc.conf(1 hunks)concourse/artifact/ELOQDSS_ROCKSDB_CLOUD_S3/eloqdoc_cluster_a.conf(1 hunks)concourse/artifact/ELOQDSS_ROCKSDB_CLOUD_S3/eloqdoc_cluster_b.conf(1 hunks)concourse/artifact/ELOQDSS_ROCKSDB_CLOUD_S3/eloqdoc_cluster_c.conf(1 hunks)
🔇 Additional comments (2)
concourse/artifact/ELOQDSS_ROCKSDB_CLOUD_S3/eloqdoc_cluster_b.conf (1)
53-56: No action required. The configuration is correct as-is.The original review comment incorrectly assumes
enableKVMetricsshould appear in configuration files. However, verification shows:
enableKVMetricsis a code-level parameter defined with a default value oftrueineloq_global_options.h:162, not a required configuration file entry.The logic at
eloq_global_options.cpp:1144shows thatenableKVMetricsis gated byenableMetrics: even ifenableKVMetricswere configured, it would be disabled whenenableMetrics: false.All three cluster configs (a, b, c) consistently omit
enableKVMetricsfrom their configuration files, confirming this is the intended pattern.The configuration change is correct: disabling the legacy metrics system (
enableMetrics: false) appropriately prevents all metrics, including the new KV metrics. The PR successfully adds theenableKVMetricsfeature to the codebase; no additional configuration file changes are needed.Likely an incorrect or invalid review comment.
concourse/artifact/ELOQDSS_ROCKSDB_CLOUD_S3/eloqdoc_cluster_a.conf (1)
53-56: Verify operational impact of metrics disablement and ensure new observability system is active.Metrics are consistently disabled across all cluster configurations (
enableMetrics: false). While the code includes a newenableKVMetricsflag alongside other granular metrics options, none of the cluster configuration files have explicitenableKVMetricsentries—meaning the new metrics system is also disabled via the centralenableMetrics: falsesetting.Before merging, confirm:
- The new
enableKVMetricsflag is intentionally disabled alongside other metrics, or whether it should be explicitly enabled in configs as the observability replacement- Existing monitoring dashboards and alerts have been updated or have fallback mechanisms
- This consistent disablement across all clusters is intentional and documented
- A rollback or re-enablement plan exists if observability needs to be restored
819eb19 to
1b220e6
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
concourse/artifact/ELOQDSS_ROCKSDB/eloqdoc.conf(1 hunks)concourse/artifact/ELOQDSS_ROCKSDB_CLOUD_S3/eloqdoc.conf(1 hunks)concourse/artifact/ELOQDSS_ROCKSDB_CLOUD_S3/eloqdoc_cluster_a.conf(1 hunks)concourse/artifact/ELOQDSS_ROCKSDB_CLOUD_S3/eloqdoc_cluster_b.conf(1 hunks)concourse/artifact/ELOQDSS_ROCKSDB_CLOUD_S3/eloqdoc_cluster_c.conf(1 hunks)src/mongo/db/modules/eloq/src/eloq_global_options.cpp(2 hunks)src/mongo/db/modules/eloq/src/eloq_global_options.h(1 hunks)src/mongo/db/modules/eloq/src/eloq_kv_engine.cpp(4 hunks)src/mongo/db/modules/eloq/store_handler(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- src/mongo/db/modules/eloq/store_handler
🚧 Files skipped from review as they are similar to previous changes (1)
- concourse/artifact/ELOQDSS_ROCKSDB_CLOUD_S3/eloqdoc_cluster_c.conf
🧰 Additional context used
🧬 Code graph analysis (1)
src/mongo/db/modules/eloq/src/eloq_global_options.cpp (1)
src/mongo/db/modules/eloq/src/eloq_global_options.h (1)
enableKVMetrics(162-163)
🔇 Additional comments (8)
concourse/artifact/ELOQDSS_ROCKSDB_CLOUD_S3/eloqdoc_cluster_b.conf (1)
53-56: Disabling metrics for cluster_b is behavior‑changing but consistent with other configsSetting
enableMetrics: falsewill fully disable metrics export for this cluster (given the per‑metric toggles were removed). That matches the other concourse configs in this PR; just confirm this is acceptable for your observability/alerting for cluster_b.concourse/artifact/ELOQDSS_ROCKSDB_CLOUD_S3/eloqdoc.conf (1)
61-64: Root CLOUD_S3 config now disables metrics globally
enableMetrics: falsewill turn off metrics export for this deployment, while keeping the port configured. This is consistent with the cluster_* configs; please double‑check that CI/production expectations don’t rely on these metrics.src/mongo/db/modules/eloq/src/eloq_global_options.h (1)
149-163: NewenableKVMetricsoption is consistent with existing metrics fieldsAdding
bool enableKVMetrics{true};alongside the other metrics toggles fits the existing pattern (public member with an in‑class default) and gives a clear hook for the KV metrics gate.concourse/artifact/ELOQDSS_ROCKSDB_CLOUD_S3/eloqdoc_cluster_a.conf (1)
53-56: Cluster_a metrics disabled to match other environmentsWith
enableMetrics: false, this cluster won’t export metrics. That keeps cluster_a aligned with cluster_b and the ROCKSDB configs; verify this won’t break any monitoring that was expecting per‑cluster metrics.src/mongo/db/modules/eloq/src/eloq_global_options.cpp (1)
578-586: KV metrics option wiring matches existing metrics patternThe new
storage.eloq.metrics.enableKVMetricsoption (defaulttrue) and the store‑side assignmentif (params.count("storage.eloq.metrics.enableKVMetrics")) { enableKVMetrics = enableMetrics && params["storage.eloq.metrics.enableKVMetrics"].as<bool>(); }mirror how the other fine‑grained metrics toggles are gated by
enableMetrics. That keeps KV metrics disabled whenever both the global and per‑metric gates are explicitly turned off, while defaulting to enabled when metrics are globally on and the KV toggle is omitted.Also applies to: 655-660, 1142-1145
concourse/artifact/ELOQDSS_ROCKSDB/eloqdoc.conf (1)
33-36: ROCKSDB config now disables Eloq metrics globally
enableMetrics: falseturns off metrics for this deployment, aligning the non‑cloud ROCKSDB setup with the CLOUD_S3 configs. Please confirm that this is intended and that any dashboards for this environment don’t rely on those metrics.src/mongo/db/modules/eloq/src/eloq_kv_engine.cpp (2)
567-576: KV metrics flag correctly surfaces into metrics namespace
metrics::enable_kv_metrics = eloqGlobalOptions.enableKVMetrics;cleanly connects the new configuration flag through to the metrics subsystem, alongside the existingenable_metricsand per‑metric toggles.
1282-1298: Double-shutdown risk is mitigated; process-wide concerns remain unverifiableCode inspection confirms:
cleanShutdown()sets_txService.reset()(line 1296), and the destructor checksif (_txService)before re-calling it (line 786)—double shutdown is preventedInitGlog()is called once in the constructor;ShutdownGoogleLogging()is called once incleanShutdown()- No logging observed immediately after
cleanShutdown()inshutdownGlobalStorageEngineCleanly()(storage_engine_init.cpp:212–214)However, the broader concern—whether other subsystems in the same process depend on glog after
EloqKVEngineis destroyed—cannot be ruled out without auditing the complete process shutdown sequence. Verify that no other components attempt to log aftershutdownGlobalStorageEngineCleanly()completes.
ebd4a86 to
b415836
Compare
There was a problem hiding this comment.
Actionable comments posted: 0
♻️ Duplicate comments (2)
concourse/artifact/ELOQDSS_ROCKSDB_CLOUD_S3/eloqdoc_cluster_c.conf (1)
55-55: Metrics disabled - consistent with cluster_a and cluster_b.This cluster node also has metrics disabled, consistent with the other cluster configurations. The same operational considerations apply.
concourse/artifact/ELOQDSS_ROCKSDB_CLOUD_S3/eloqdoc_cluster_a.conf (1)
55-55: Metrics disabled - consistent with other cluster nodes.This completes the metrics disabling across all cluster configurations (cluster_a, cluster_b, cluster_c). Ensure this is the intended deployment state.
🧹 Nitpick comments (2)
concourse/artifact/ELOQDSS_ROCKSDB/eloqdoc.conf (1)
35-35: Verify the operational impact of disabling metrics collection.Metrics collection is now completely disabled (
enableMetrics: false). This affects observability and monitoring capabilities for this deployment. Ensure that:
- Monitoring/alerting systems don't depend on these metrics
- This change aligns with operational requirements
- Alternative observability mechanisms are in place if needed
concourse/artifact/ELOQDSS_ROCKSDB_CLOUD_S3/eloqdoc_cluster_b.conf (1)
55-55: Verify the operational impact of disabling metrics collection.Metrics collection is now completely disabled for this cluster node. This affects observability across the cluster deployment. Ensure this change is intentional and coordinated across all cluster nodes, and that monitoring systems are updated accordingly.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
concourse/artifact/ELOQDSS_ROCKSDB/eloqdoc.conf(1 hunks)concourse/artifact/ELOQDSS_ROCKSDB_CLOUD_S3/eloqdoc.conf(1 hunks)concourse/artifact/ELOQDSS_ROCKSDB_CLOUD_S3/eloqdoc_cluster_a.conf(1 hunks)concourse/artifact/ELOQDSS_ROCKSDB_CLOUD_S3/eloqdoc_cluster_b.conf(1 hunks)concourse/artifact/ELOQDSS_ROCKSDB_CLOUD_S3/eloqdoc_cluster_c.conf(1 hunks)src/mongo/db/modules/eloq/src/eloq_global_options.cpp(2 hunks)src/mongo/db/modules/eloq/src/eloq_global_options.h(1 hunks)src/mongo/db/modules/eloq/src/eloq_kv_engine.cpp(6 hunks)src/mongo/db/modules/eloq/store_handler(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- src/mongo/db/modules/eloq/store_handler
🚧 Files skipped from review as they are similar to previous changes (1)
- concourse/artifact/ELOQDSS_ROCKSDB_CLOUD_S3/eloqdoc.conf
🧰 Additional context used
🧠 Learnings (4)
📓 Common learnings
Learnt from: lzxddz
Repo: eloqdata/eloqdoc PR: 325
File: src/mongo/db/modules/eloq/src/eloq_kv_engine.cpp:1362-1384
Timestamp: 2025-11-24T03:11:24.946Z
Learning: The EloqDoc codebase uses an older version of glog that supports the two-parameter signature `google::InitGoogleLogging(const char* argv0, PrefixCallback)` for custom log prefix formatting, as seen in src/mongo/db/modules/eloq/src/eloq_kv_engine.cpp line 1383.
📚 Learning: 2025-11-24T03:11:24.946Z
Learnt from: lzxddz
Repo: eloqdata/eloqdoc PR: 325
File: src/mongo/db/modules/eloq/src/eloq_kv_engine.cpp:1362-1384
Timestamp: 2025-11-24T03:11:24.946Z
Learning: The EloqDoc codebase uses an older version of glog that supports the two-parameter signature `google::InitGoogleLogging(const char* argv0, PrefixCallback)` for custom log prefix formatting, as seen in src/mongo/db/modules/eloq/src/eloq_kv_engine.cpp line 1383.
Applied to files:
src/mongo/db/modules/eloq/src/eloq_kv_engine.cppsrc/mongo/db/modules/eloq/src/eloq_global_options.h
📚 Learning: 2025-11-14T03:10:22.027Z
Learnt from: githubzilla
Repo: eloqdata/eloqdoc PR: 279
File: src/mongo/db/dbmain.cpp:0-0
Timestamp: 2025-11-14T03:10:22.027Z
Learning: In this repository, avoid defining --config via gflags in src/mongo/db/dbmain.cpp; MongoDB’s options_parser owns --config. Prefer gflags flags with a distinct prefix (e.g., --data_substrate_config) parsed before options_parser, with options_parser configured to allow unregistered options.
Applied to files:
src/mongo/db/modules/eloq/src/eloq_kv_engine.cpp
📚 Learning: 2025-09-25T12:24:06.434Z
Learnt from: githubzilla
Repo: eloqdata/eloqdoc PR: 211
File: src/mongo/db/modules/eloq/cmake/build_eloq_log_service.cmake:26-80
Timestamp: 2025-09-25T12:24:06.434Z
Learning: The user githubzilla implemented a fix in commit fe98aaf to address the MEMORY state incorrectly triggering RocksDB discovery in build_eloq_log_service.cmake. This was a control flow issue where MEMORY builds were falling into an else branch that still attempted RocksDB discovery.
Applied to files:
concourse/artifact/ELOQDSS_ROCKSDB/eloqdoc.conf
🧬 Code graph analysis (1)
src/mongo/db/modules/eloq/src/eloq_global_options.cpp (1)
src/mongo/db/modules/eloq/src/eloq_global_options.h (1)
enableKVMetrics(162-163)
🔇 Additional comments (7)
src/mongo/db/modules/eloq/src/eloq_global_options.h (1)
162-162: LGTM! New KV metrics flag added appropriately.The new
enableKVMetricsflag with defaulttruefollows the existing pattern for other metrics flags in this class and provides fine-grained control over KV-level metrics collection.src/mongo/db/modules/eloq/src/eloq_global_options.cpp (2)
655-660: LGTM! KV metrics option registration follows the established pattern.The option registration is consistent with other metrics flags in this file, properly documented, and uses the appropriate default value.
1142-1145: KV metrics gating follows the established pattern correctly.The parsing logic applies the same gating pattern as other per-metric flags (lines 1100-1140):
enableKVMetricsis set toenableMetrics && option_value. This means KV metrics can only be enabled whenenableMetricsis true. Since the current configs disableenableMetrics, KV metrics will be disabled regardless of this flag's default value.src/mongo/db/modules/eloq/src/eloq_kv_engine.cpp (4)
573-573: LGTM! KV metrics configuration and registration implemented correctly.The KV metrics flag is properly set from global options (line 573) and registration (lines 588-596) is correctly gated by both the global
enable_metricsflag and the specificenable_kv_metricsflag. The common labels include appropriate node identification.Also applies to: 588-596
1297-1297: LGTM! Proper cleanup added for log server and glog.Line 1297 explicitly nullifies
_logServerduring shutdown, and line 1308 properly shuts down Google Logging. These cleanup steps are necessary and correctly placed in thecleanShutdown()sequence.Also applies to: 1308-1308
1348-1371: LGTM! Custom log prefix function implemented correctly.The
CustomPrefixfunction properly formats log entries with ISO8601-style timestamps, severity, thread ID, and optional source location. The formatting usesstd::setwandstd::setfillcorrectly (now available via the<iomanip>header). The function signature matches the older glog version in use, as confirmed by previous discussion. Based on learnings, this implementation is correct for your glog version.
1373-1395: Based on my verification of glog's implementation and documentation:Symlink configuration is correct; resolve review comment.
The
SetLogSymlinkfunction operates per severity level, with each severity having its ownlog_destinationinstance. A pair of files is generated for each log severity (INFO, WARNING, ERROR, FATAL), and each severity maintains separate symlinks. Setting all three severity levels to the same symlink base name "txservice" is the standard pattern—it does not cause a collision. Each severity level maintains its own independent symlink (e.g.,txservice.INFO,txservice.WARNING,txservice.ERROR).The log file naming with
.INFO.,.WARNING.,.ERROR.suffixes aligns with glog's per-severity logging design.
b415836 to
46a4862
Compare
46a4862 to
32bde02
Compare
Summary by CodeRabbit
New Features
Chores
✏️ Tip: You can customize this high-level summary in your review settings.