Read core_number from data substrate #421

Merged
liunyl merged 1 commit into main from core_num
Jan 12, 2026

Conversation

@liunyl
Contributor

@liunyl liunyl commented Jan 12, 2026

Eloqdoc should not read directly from the data substrate config file. It should read from the data substrate.

Summary by CodeRabbit

  • Refactor
    • Consolidated data substrate configuration and moved initialization to a single, unified flow.
    • Removed duplicate startup wiring for data substrate and logging; startup sequence and logging initialization streamlined across components.

@liunyl liunyl requested a review from githubzilla January 12, 2026 04:01
@coderabbitai

coderabbitai bot commented Jan 12, 2026

Caution

Review failed

The pull request is closed.

Walkthrough

Consolidates DataSubstrate configuration and Google Logging initialization into eloq_global_options.cpp, replacing INI-based parsing with DataSubstrate-driven initialization and moving bootstrap/core_num assignments to DataSubstrate outputs; removes duplicate init logic from eloq_kv_engine.cpp.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Global options / init: src/mongo/db/modules/eloq/src/eloq_global_options.cpp | Replaces INI parsing with DataSubstrate::Instance().Init(FLAGS_data_substrate_config) flow; includes <filesystem>; initializes Google Logging using serverGlobalParams.logpath parent; sets serverGlobalParams.bootstrap and reservedThreadNum from DataSubstrate::Instance().GetCoreConfig(); changes flag declaration to DEFINE_string(data_substrate_config, ..., ...). |
| Engine startup cleanup: src/mongo/db/modules/eloq/src/eloq_kv_engine.cpp | Removes DEFINE_string(data_substrate_config, ...), the local Google Logging init, and the explicit DataSubstrate::Instance().Init(...) call; startup now relies on centralized init in global options. |

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant Init as Process startup
    participant Global as EloqGlobalOptions
    participant DataSub as DataSubstrate::Instance()
    participant GLog as Google Logging
    participant SGP as serverGlobalParams

    Init->>Global: call global options init
    Global->>GLog: determine log_dir from SGP.logpath parent
    Global->>GLog: InitGoogleLogging(argv_synthetic)
    Global->>DataSub: Init(FLAGS_data_substrate_config)
    DataSub-->>Global: return CoreConfig { bootstrap, core_num }
    Global->>SGP: set bootstrap = CoreConfig.bootstrap
    Global->>SGP: set reservedThreadNum = CoreConfig.core_num
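
The last three arrows of the diagram can be sketched in C++ as follows. This is only an illustration under stated assumptions: `FakeDataSubstrate` and `FakeServerGlobalParams` are hypothetical stand-ins for the real `DataSubstrate` singleton and MongoDB's `serverGlobalParams`, and the values returned by `GetCoreConfig()` are hard-coded rather than parsed from a config file. Only the member names `bootstrap`, `core_num`, and `reservedThreadNum` are taken from the PR.

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-in for the config struct returned by GetCoreConfig().
struct CoreConfig {
    bool bootstrap;
    std::uint32_t core_num;
};

// Hypothetical stand-in for the DataSubstrate singleton.
class FakeDataSubstrate {
public:
    static FakeDataSubstrate& Instance() {
        static FakeDataSubstrate instance;
        return instance;
    }

    // Assumption: Init reads the config and fills in CoreConfig.
    // This sketch hard-codes the values instead of parsing anything.
    void Init(const std::string& config_path) {
        config_path_ = config_path;
        core_config_.bootstrap = false;
        core_config_.core_num = 8;
    }

    const CoreConfig& GetCoreConfig() const { return core_config_; }

private:
    std::string config_path_;
    CoreConfig core_config_{false, 0};
};

// Hypothetical stand-in for the two serverGlobalParams fields the PR sets.
struct FakeServerGlobalParams {
    bool bootstrap = false;
    std::uint32_t reservedThreadNum = 0;
};

// Single init flow: initialize once, then copy values out of CoreConfig.
void StoreOptions(const std::string& config_path, FakeServerGlobalParams& params) {
    FakeDataSubstrate& ds = FakeDataSubstrate::Instance();
    ds.Init(config_path);
    const CoreConfig& core = ds.GetCoreConfig();
    params.bootstrap = core.bootstrap;
    params.reservedThreadNum = core.core_num;
}
```

The point of the shape is that the caller never parses the config file itself; it only consumes what the substrate reports after `Init`.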

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested reviewers

  • githubzilla
  • lzxddz

Poem

🐰 I hopped through code at break of day,
Moved configs and logs the DataSub way,
Bootstrap and cores now come from one tree,
A tidy start for Eloq — from me to thee! 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'Read core_number from data substrate' directly aligns with the main objective stated in the PR description and is supported by the primary code changes. |


📜 Recent review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 32aa3e0 and 30dd807.

📒 Files selected for processing (2)
  • src/mongo/db/modules/eloq/src/eloq_global_options.cpp
  • src/mongo/db/modules/eloq/src/eloq_kv_engine.cpp


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In @src/mongo/db/modules/eloq/src/eloq_global_options.cpp:
- Around line 82-85: DataSubstrate::Instance().Init(...) is called without
checking its return value; modify the code around
DataSubstrate::Instance().Init(FLAGS_data_substrate_config) to capture its
Status/return value, check for non-OK/failure, and on error log an explanatory
message and return that non-OK Status (or otherwise propagate the error) instead
of proceeding to call GetCoreConfig(); ensure subsequent uses like
DataSubstrate::Instance().GetCoreConfig(), serverGlobalParams.bootstrap and
serverGlobalParams.reservedThreadNum only run when Init succeeded.
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 30c7bf6 and 32aa3e0.

📒 Files selected for processing (2)
  • src/mongo/db/modules/eloq/src/eloq_global_options.cpp
  • src/mongo/db/modules/eloq/src/eloq_kv_engine.cpp
💤 Files with no reviewable changes (1)
  • src/mongo/db/modules/eloq/src/eloq_kv_engine.cpp
🧰 Additional context used
🧠 Learnings (4)
📓 Common learnings
Learnt from: githubzilla
Repo: eloqdata/eloqdoc PR: 279
File: src/mongo/db/modules/eloq/CMakeLists.txt:54-55
Timestamp: 2025-11-28T10:55:21.020Z
Learning: In src/mongo/db/modules/eloq/CMakeLists.txt, the CMake options EXT_TX_PROC_ENABLED, FORK_HM_PROCESS, and STATISTICS are used by the data_substrate submodule build system and should be retained as configuration interface options.
Learnt from: lzxddz
Repo: eloqdata/eloqdoc PR: 325
File: src/mongo/db/modules/eloq/src/eloq_kv_engine.cpp:1362-1384
Timestamp: 2025-11-24T03:11:24.955Z
Learning: The EloqDoc codebase uses an older version of glog that supports the two-parameter signature `google::InitGoogleLogging(const char* argv0, PrefixCallback)` for custom log prefix formatting, as seen in src/mongo/db/modules/eloq/src/eloq_kv_engine.cpp line 1383.
Learnt from: githubzilla
Repo: eloqdata/eloqdoc PR: 279
File: src/mongo/db/dbmain.cpp:0-0
Timestamp: 2025-11-14T03:10:22.027Z
Learning: In this repository, avoid defining --config via gflags in src/mongo/db/dbmain.cpp; MongoDB’s options_parser owns --config. Prefer gflags flags with a distinct prefix (e.g., --data_substrate_config) parsed before options_parser, with options_parser configured to allow unregistered options.
Learnt from: githubzilla
Repo: eloqdata/eloqdoc PR: 279
File: concourse/scripts/main.ent.bash:38-39
Timestamp: 2025-11-28T10:52:25.527Z
Learning: In the eloqdoc enterprise build (main.ent.bash), eloq_log_service is fetched separately as the resource `eloq_logservice_src` in the pipeline (main.ent.yml) and then symlinked into the data_substrate directory structure at build time using `ln -s $WORKSPACE/eloq_logservice_src eloq_log_service`. This is the intended architecture for placing eloq_log_service within data_substrate in the enterprise configuration.
📚 Learning: 2025-11-14T03:10:22.027Z
Learnt from: githubzilla
Repo: eloqdata/eloqdoc PR: 279
File: src/mongo/db/dbmain.cpp:0-0
Timestamp: 2025-11-14T03:10:22.027Z
Learning: In this repository, avoid defining --config via gflags in src/mongo/db/dbmain.cpp; MongoDB’s options_parser owns --config. Prefer gflags flags with a distinct prefix (e.g., --data_substrate_config) parsed before options_parser, with options_parser configured to allow unregistered options.

Applied to files:

  • src/mongo/db/modules/eloq/src/eloq_global_options.cpp
📚 Learning: 2025-11-28T10:55:21.020Z
Learnt from: githubzilla
Repo: eloqdata/eloqdoc PR: 279
File: src/mongo/db/modules/eloq/CMakeLists.txt:54-55
Timestamp: 2025-11-28T10:55:21.020Z
Learning: In src/mongo/db/modules/eloq/CMakeLists.txt, the CMake options EXT_TX_PROC_ENABLED, FORK_HM_PROCESS, and STATISTICS are used by the data_substrate submodule build system and should be retained as configuration interface options.

Applied to files:

  • src/mongo/db/modules/eloq/src/eloq_global_options.cpp
📚 Learning: 2025-11-24T03:11:24.955Z
Learnt from: lzxddz
Repo: eloqdata/eloqdoc PR: 325
File: src/mongo/db/modules/eloq/src/eloq_kv_engine.cpp:1362-1384
Timestamp: 2025-11-24T03:11:24.955Z
Learning: The EloqDoc codebase uses an older version of glog that supports the two-parameter signature `google::InitGoogleLogging(const char* argv0, PrefixCallback)` for custom log prefix formatting, as seen in src/mongo/db/modules/eloq/src/eloq_kv_engine.cpp line 1383.

Applied to files:

  • src/mongo/db/modules/eloq/src/eloq_global_options.cpp
🔇 Additional comments (4)
src/mongo/db/modules/eloq/src/eloq_global_options.cpp (4)

24-24: LGTM!

The new includes are appropriate for the added filesystem and logging functionality.

Also applies to: 28-28


37-37: LGTM!

The gflags definition uses the appropriate data_substrate_ prefix as per repository conventions. Based on learnings, this avoids conflicts with MongoDB's options_parser which owns --config.


72-77: LGTM!

The static qualifier on logdir ensures the path persists for the lifetime needed by SetCommandLineOption. The has_parent_path() check correctly handles edge cases where logpath might be empty or a filename without a directory component.
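
A minimal sketch of the parent-directory derivation described above, using only `std::filesystem`. The helper name `LogDirFromLogPath` is hypothetical and the actual code in eloq_global_options.cpp is not shown on this page; the sketch only illustrates how `has_parent_path()` separates a path with a directory component from a bare filename or an empty string.

```cpp
#include <filesystem>
#include <optional>
#include <string>

// Returns the directory component of logpath, or std::nullopt when the
// path is empty or is a bare filename with no directory component.
std::optional<std::string> LogDirFromLogPath(const std::string& logpath) {
    const std::filesystem::path p(logpath);
    if (!p.has_parent_path()) {
        return std::nullopt;
    }
    return p.parent_path().string();
}
```

On a POSIX system, `LogDirFromLogPath("/var/log/eloq/mongod.log")` yields `/var/log/eloq`, while `"mongod.log"` and `""` both yield no value, so the caller can leave the logging directory at its default in those cases.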


78-80: No issue found. EloqGlobalOptions::store() is called exactly once during startup via the MONGO_STARTUP_OPTIONS_STORE initialization macro. It is not invoked during config reloads or in any scenario that would cause InitGoogleLogging to be called multiple times.

Comment on lines +82 to +85
DataSubstrate::Instance().Init(FLAGS_data_substrate_config);
serverGlobalParams.bootstrap = DataSubstrate::Instance().GetCoreConfig().bootstrap;
MONGO_LOG(1) << "serverGlobalParams.bootstrap: " << serverGlobalParams.bootstrap;

-    const char* field_core = "core_number";
-    uint32_t core_num = FLAGS_core_number;
-    if (CheckCommandLineFlagIsDefault(field_core)) {
-        if (ds_config_reader.HasValue("local", field_core)) {
-            core_num = ds_config_reader.GetInteger("local", field_core, 0);
-            assert(core_num);
-        } else {
-            if (!NUM_VCPU) {
-                MONGO_LOG(1) << "config is missing: " << field_core;
-                return Status{ErrorCodes::InvalidOptions,
-                              str::stream() << "config is missing: " << field_core};
-            }
-            const uint min = 1;
-            if (enable_data_store) {
-                core_num = std::max(min, (NUM_VCPU * 3) / 5);
-                MONGO_LOG(1) << "give cpus to checkpointer " << core_num;
-            } else {
-                core_num = std::max(min, (NUM_VCPU * 7) / 10);
-            }
-            MONGO_LOG(1) << "config is automatically set: " << field_core << "=" << core_num
-                         << ", vcpu=" << NUM_VCPU;
-        }
-    }
-    serverGlobalParams.reservedThreadNum = core_num;
+    serverGlobalParams.reservedThreadNum = DataSubstrate::Instance().GetCoreConfig().core_num;
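
For reference, the arithmetic of the fallback that appears in the excerpt above (and that this PR takes out of eloq_global_options.cpp) can be isolated as a small function. The name `DefaultCoreNum` is hypothetical, and whether DataSubstrate applies the same heuristic internally is not stated on this page. The original code returned an InvalidOptions error when the vCPU count was zero; this sketch leaves that check to the caller.

```cpp
#include <algorithm>
#include <cstdint>

// Fallback used when core_number was neither passed as a flag nor present
// in the config: 3/5 of the vCPUs when the data store is enabled (leaving
// some for the checkpointer), otherwise 7/10, never less than 1.
// Assumes the caller has already rejected num_vcpu == 0.
std::uint32_t DefaultCoreNum(std::uint32_t num_vcpu, bool enable_data_store) {
    const std::uint32_t min_cores = 1;
    if (enable_data_store) {
        return std::max(min_cores, (num_vcpu * 3) / 5);
    }
    return std::max(min_cores, (num_vcpu * 7) / 10);
}
```

With 16 vCPUs, integer division gives 48 / 5 = 9 cores when the data store is enabled and 112 / 10 = 11 otherwise; with a single vCPU both branches clamp to 1.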

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's examine the file and the specific lines mentioned
cat -n src/mongo/db/modules/eloq/src/eloq_global_options.cpp | head -100 | tail -30

Repository: eloqdata/eloqdoc

Length of output: 1607


🏁 Script executed:

# Search for DataSubstrate class definition and Init method
rg -n "class DataSubstrate" --type cpp

Repository: eloqdata/eloqdoc

Length of output: 42


🏁 Script executed:

# Find where DataSubstrate::Init is defined
rg -n "Init\(" --type cpp -A 5 | grep -A 5 "DataSubstrate"

Repository: eloqdata/eloqdoc

Length of output: 1066


🏁 Script executed:

# Look for DataSubstrate::Instance().Init() usage patterns in the codebase
rg -n "DataSubstrate::Instance\(\)\.Init" --type cpp -B 2 -A 3

Repository: eloqdata/eloqdoc

Length of output: 771


🏁 Script executed:

# Check the complete function that contains lines 82-85
sed -n '70,95p' src/mongo/db/modules/eloq/src/eloq_global_options.cpp

Repository: eloqdata/eloqdoc

Length of output: 1059


🏁 Script executed:

# Find DataSubstrate class definition
rg -n "class DataSubstrate" --type cpp -A 20

Repository: eloqdata/eloqdoc

Length of output: 42


🏁 Script executed:

# Search for Init method definition in DataSubstrate
rg -n "DataSubstrate.*Init\(" --type cpp -B 3 -A 10

Repository: eloqdata/eloqdoc

Length of output: 1446


🏁 Script executed:

# Check for DataSubstrate.h header files
find . -name "*data_substrate*" -type f

Repository: eloqdata/eloqdoc

Length of output: 444


🏁 Script executed:

# Look for Init method with signature
ast-grep --pattern 'Init($$$)'

Repository: eloqdata/eloqdoc

Length of output: 1461


🏁 Script executed:

# Search for files containing DataSubstrate
fd ".*" --type f | xargs rg -l "class DataSubstrate" 2>/dev/null

Repository: eloqdata/eloqdoc

Length of output: 44


🏁 Script executed:

# Search in the eloq directory specifically
fd ".*" src/mongo/db/modules/eloq --type f | xargs rg -l "DataSubstrate" 2>/dev/null | head -20

Repository: eloqdata/eloqdoc

Length of output: 319


🏁 Script executed:

# Look for DataSubstrate header file
find src -name "*.h" -o -name "*.hpp" | xargs rg "class DataSubstrate" 2>/dev/null

Repository: eloqdata/eloqdoc

Length of output: 44


🏁 Script executed:

# Check what files are in the eloq directory
find src/mongo/db/modules/eloq -name "*.h" -o -name "*.cpp" | head -30

Repository: eloqdata/eloqdoc

Length of output: 1433


🏁 Script executed:

# Search all cpp and h files for DataSubstrate class definition
rg "class DataSubstrate" --type cpp --type h -A 30

Repository: eloqdata/eloqdoc

Length of output: 42


🏁 Script executed:

# Search for DataSubstrate::Instance in includes or other patterns
rg "DataSubstrate" --type cpp --type h | head -40

Repository: eloqdata/eloqdoc

Length of output: 1489


🏁 Script executed:

# Check the includes in eloq_global_options.cpp
head -50 src/mongo/db/modules/eloq/src/eloq_global_options.cpp

Repository: eloqdata/eloqdoc

Length of output: 1970


🏁 Script executed:

# Find the data_substrate.h header file
find . -name "data_substrate.h" -type f

Repository: eloqdata/eloqdoc

Length of output: 42


🏁 Script executed:

# Check if it's in a third-party or external directory
find . -path "*data_substrate*" -name "*.h" -type f

Repository: eloqdata/eloqdoc

Length of output: 42


🏁 Script executed:

# Look for the Init method signature in the header
grep -r "Init(" src/mongo/db/modules/eloq/data_substrate --include="*.h" -A 5 2>/dev/null | head -40

Repository: eloqdata/eloqdoc

Length of output: 42


🏁 Script executed:

# Check what's in the data_substrate directory
ls -la src/mongo/db/modules/eloq/data_substrate/ 2>/dev/null || echo "Directory doesn't exist or is empty"

Repository: eloqdata/eloqdoc

Length of output: 168


🏁 Script executed:

# Check if data_substrate is a git submodule or external dependency
cat .gitmodules 2>/dev/null | grep data_substrate || echo "Not found in gitmodules"

Repository: eloqdata/eloqdoc

Length of output: 163


🏁 Script executed:

# Check all includes in eloq_global_options.cpp
grep -n "#include" src/mongo/db/modules/eloq/src/eloq_global_options.cpp

Repository: eloqdata/eloqdoc

Length of output: 663


🏁 Script executed:

# Search for where Init return type might be documented
rg "void.*Init\|Status.*Init\|bool.*Init" src/mongo/db/modules/eloq --type cpp --type h -A 3

Repository: eloqdata/eloqdoc

Length of output: 42


🏁 Script executed:

# Check if there's a return value check pattern elsewhere in similar initialization code
rg "Instance\(\)\.Init" src/mongo/db/modules/eloq --type cpp -B 2 -A 5

Repository: eloqdata/eloqdoc

Length of output: 980


🏁 Script executed:

# Check if there are any comments or documentation about Init() in the eloq files
rg -n "Init" src/mongo/db/modules/eloq/src/eloq_global_options.cpp -B 3 -A 3

Repository: eloqdata/eloqdoc

Length of output: 556


🏁 Script executed:

# Check if the include path gives us any clues
grep -n "data_substrate.h" src/mongo/db/modules/eloq/src/eloq_global_options.cpp

Repository: eloqdata/eloqdoc

Length of output: 140


🏁 Script executed:

# Look at eloq_kv_engine.cpp to see if there's any Init() error handling elsewhere
grep -n "Init\|Shutdown\|EnableEngine" src/mongo/db/modules/eloq/src/eloq_kv_engine.cpp -B 2 -A 3

Repository: eloqdata/eloqdoc

Length of output: 1299


🏁 Script executed:

# Check for Status return patterns in eloq_global_options.cpp
grep -n "Status" src/mongo/db/modules/eloq/src/eloq_global_options.cpp | head -20

Repository: eloqdata/eloqdoc

Length of output: 275


🏁 Script executed:

# Check if there's any documentation or comments about error handling expectations
grep -rn "error\|Error\|fail\|Fail" src/mongo/db/modules/eloq/src/eloq_global_options.cpp

Repository: eloqdata/eloqdoc

Length of output: 201


🏁 Script executed:

# Look at the function signature to understand context
sed -n '55,90p' src/mongo/db/modules/eloq/src/eloq_global_options.cpp

Repository: eloqdata/eloqdoc

Length of output: 1793


🏁 Script executed:

# Check if there are try-catch blocks anywhere in the function
grep -n "try\|catch" src/mongo/db/modules/eloq/src/eloq_global_options.cpp

Repository: eloqdata/eloqdoc

Length of output: 42


Add error handling for DataSubstrate initialization.

The DataSubstrate::Instance().Init() call on line 82 does not check for failure, yet the function returns Status and demonstrates error handling elsewhere (line 68). If initialization fails, the function still returns Status::OK() and subsequent GetCoreConfig() calls may operate on uninitialized state, allowing the server to start with invalid configuration.

Suggested fix: Add error handling
-    DataSubstrate::Instance().Init(FLAGS_data_substrate_config);
+    auto init_status = DataSubstrate::Instance().Init(FLAGS_data_substrate_config);
+    if (!init_status) {  // Adjust based on actual return type
+        return Status{ErrorCodes::InternalError,
+                      str::stream() << "Failed to initialize DataSubstrate"};
+    }

Note: The exact error handling pattern depends on how Init() signals failure. Refer to the DataSubstrate header for the correct return type and error condition check.
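
The check-then-use ordering the review asks for can be sketched in isolation. Everything here is an assumption made for illustration: `SketchSubstrate` stands in for `DataSubstrate`, its `Init` is assumed to return `bool` (the review could not confirm the real return type), and `SketchStatus` stands in for `mongo::Status`.

```cpp
#include <string>

// Hypothetical minimal status type standing in for mongo::Status.
struct SketchStatus {
    bool ok;
    std::string reason;
};

// Hypothetical stand-in for DataSubstrate. Assumption: Init returns bool.
struct SketchSubstrate {
    unsigned core_num = 0;

    bool Init(const std::string& config_path) {
        if (config_path.empty()) {
            return false;  // pretend an empty path is a failed init
        }
        core_num = 4;  // hard-coded instead of parsed
        return true;
    }
};

// Reads core_num only after Init has reported success; on failure the
// output parameter is left untouched and an error is propagated.
SketchStatus StoreWithCheck(SketchSubstrate& ds,
                            const std::string& config_path,
                            unsigned& reserved_thread_num) {
    if (!ds.Init(config_path)) {
        return {false, "Failed to initialize DataSubstrate"};
    }
    reserved_thread_num = ds.core_num;
    return {true, ""};
}
```

If the real `Init` turns out to return `void` and signal failure some other way (for example by aborting), this pattern does not apply as written.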

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
    auto init_status = DataSubstrate::Instance().Init(FLAGS_data_substrate_config);
    if (!init_status) {  // Adjust based on actual return type
        return Status{ErrorCodes::InternalError,
                      str::stream() << "Failed to initialize DataSubstrate"};
    }
    serverGlobalParams.bootstrap = DataSubstrate::Instance().GetCoreConfig().bootstrap;
    MONGO_LOG(1) << "serverGlobalParams.bootstrap: " << serverGlobalParams.bootstrap;
    serverGlobalParams.reservedThreadNum = DataSubstrate::Instance().GetCoreConfig().core_num;

Collaborator

@githubzilla githubzilla left a comment


+1

Eloqdoc should not read directly from data substrate config file.
It should read from data substrate.
@liunyl liunyl merged commit d1a38ad into main Jan 12, 2026
2 of 3 checks passed
@liunyl liunyl deleted the core_num branch January 12, 2026 05:51