Skip to content

Fix slow open rocksdb cloud when db is large#88

Merged
githubzilla merged 4 commits intomainfrom
fix_slow_rocksdb_cloud_open
Sep 19, 2025
Merged

Fix slow open rocksdb cloud when db is large#88
githubzilla merged 4 commits intomainfrom
fix_slow_rocksdb_cloud_open

Conversation

@githubzilla
Copy link
Collaborator

@githubzilla githubzilla commented Sep 18, 2025

Summary by CodeRabbit

  • New Features

    • New command-line option to open the database and exit cleanly.
    • Clearer logging around database open/close, including column family count before opening.
  • Performance Improvements

    • Faster database open via temporary pre-open optimizations and relaxed open-time checks.
    • Automatically restores safe defaults after open to maintain normal runtime behavior.
    • Reduced startup resource usage and improved open stability.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Sep 18, 2025

Walkthrough

Adds pre-open CloudFileSystem options to speed DB open (constant SST file size = 64MB; skip_cloud_files_in_getchildren = true), removes a forced max_open_files override, restores skip flag after open, and adds a new --opendb CLI flag and open-only exit path in the dump tool. No other public signatures changed except the flag and struct field.

Changes

Cohort / File(s) Summary
Pre-open Cloud FS tweaks
eloq_data_store_service/rocksdb_cloud_data_store.cpp, eloq_data_store_service/rocksdb_cloud_dump.cpp
Before DB open set CloudFileSystemOptions: constant_sst_file_size_in_sst_file_manager = 64 * 1024 * 1024 and skip_cloud_files_in_getchildren = true.
Post-open restoration & open options
eloq_data_store_service/rocksdb_cloud_data_store.cpp, eloq_data_store_service/rocksdb_cloud_dump.cpp
After successful DB::Open restore skip_cloud_files_in_getchildren = false. Removed explicit options.max_open_files = 0. In dump tool, also set best_efforts_recovery = false, skip_checking_sst_file_sizes_on_db_open = true, skip_stats_update_on_db_open = true.
Open-only CLI path
eloq_data_store_service/rocksdb_cloud_dump.cpp
Added DEFINE_bool(opendb, false, ...); added bool opendb to CmdLineParams; parse flag, log column-family count, provide early-exit path that opens DB, destroys CF handles, closes/removes DB, cleans cloud env/fs, shuts down AWS/logging, and exit(0).
CLI usage & logs
eloq_data_store_service/rocksdb_cloud_dump.cpp
Updated usage/help and no-action validation to include --opendb; added explicit logs around DB open, CF count, and close/cleanup steps.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant CLI as User/CLI
  participant Dump as rocksdb_cloud_dump
  participant CloudFS as CloudFileSystem
  participant DB as RocksDB

  CLI->>Dump: Start with flags (incl. --opendb?)
  Dump->>CloudFS: Set pre-open options (constant_sst=64MB, skip_getchildren=true)
  Dump->>DB: DB::Open(options)
  DB-->>Dump: Open success/failure

  alt Open success
    Dump->>CloudFS: Restore skip_getchildren=false
    Dump->>Dump: if --opendb then early-exit
    alt --opendb=true
      Dump->>DB: Destroy CF handles, Close DB, Delete DB
      Dump->>CloudFS: Clear cloud_env/cloud_fs
      Dump->>CLI: Shutdown logging/AWS, Exit(0)
    else --opendb=false
      Dump-->>CLI: Continue with dump/get_value workflows
    end
  else Open failure
    Dump-->>CLI: Report error and exit
  end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

I hop through clouds with tiny paws,
Sixty-four megabytes—no pause.
I open, peek, then tidy bright,
Close handles, dim the logging light.
A rabbit's nibble, quick and neat—opendb complete. 🐇✨

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title "Fix slow open rocksdb cloud when db is large" accurately and concisely summarizes the primary intent of the changeset—open-time performance fixes for cloud-backed RocksDB instances (the diff shows pre-open CloudFileSystem tweaks and open-time RocksDB option adjustments to speed DB open for large databases). It is specific, short, and directly related to the main change described in the provided summaries.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch fix_slow_rocksdb_cloud_open

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 86a416b and eb8d41c.

📒 Files selected for processing (1)
  • eloq_data_store_service/rocksdb_cloud_dump.cpp (10 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • eloq_data_store_service/rocksdb_cloud_dump.cpp

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
eloq_data_store_service/rocksdb_cloud_dump.cpp (1)

441-449: Bug: status from ListColumnFamilies is never checked (using stale value).

Capture and check the return of ListColumnFamilies; current check uses an unrelated prior status.

Apply:

-        rocksdb::DBCloud::ListColumnFamilies(
-            options, params.db_path, &column_families);
-        if (!status.ok())
+        status = rocksdb::DBCloud::ListColumnFamilies(
+            options, params.db_path, &column_families);
+        if (!status.ok())
         {
             LOG(ERROR) << "Cannot list column families: " << status.ToString();
             throw std::runtime_error("Cannot list column families: " +
                                      status.ToString());
         }
🧹 Nitpick comments (6)
eloq_data_store_service/rocksdb_cloud_data_store.cpp (2)

327-333: Pre‑open FS tweaks look good; tighten constant literal and document scope.

  • Prefer an unsigned literal to avoid accidental sign/width issues on some platforms.
  • Since this constant is a temporary open-time hack, ensure it has no unintended effect post‑open (there’s no post‑open reset here).

Apply:

-    cfs_options_.constant_sst_file_size_in_sst_file_manager = 64 * 1024 * 1024L;
+    cfs_options_.constant_sst_file_size_in_sst_file_manager = 64ULL * 1024 * 1024;

Please confirm whether this knob influences runtime behavior after DB::Open; if so, consider restoring it post‑open similar to skip_cloud_files_in_getchildren.


621-630: Post‑open resets: check SetDBOptions status, null‑check dynamic_cast, and keep local options in sync.

  • db_->SetDBOptions return value is ignored.
  • dynamic_cast on cloud_fs_ isn’t null‑checked.
  • cfs_options_ remains with skip_cloud_files_in_getchildren=true, diverging from the live FS options.

Apply:

-    // Reset max_open_files to default value of -1 after DB::Open
-    db_->SetDBOptions({{"max_open_files", "-1"}});
+    // Reset max_open_files to default value of -1 after DB::Open
+    {
+      auto s = db_->SetDBOptions({{"max_open_files", "-1"}});
+      if (!s.ok()) {
+        LOG(WARNING) << "SetDBOptions(max_open_files=-1) failed: " << s.ToString();
+      }
+    }

-    rocksdb::CloudFileSystem *cfs =
-        dynamic_cast<rocksdb::CloudFileSystem *>(cloud_fs_.get());
-    auto &cfs_options_ref = cfs->GetMutableCloudFileSystemOptions();
-    cfs_options_ref.skip_cloud_files_in_getchildren = false;
+    if (auto* cfs = dynamic_cast<rocksdb::CloudFileSystem*>(cloud_fs_.get())) {
+      auto& cfs_options_ref = cfs->GetMutableCloudFileSystemOptions();
+      cfs_options_ref.skip_cloud_files_in_getchildren = false;
+      // Keep local copy consistent with the live FS.
+      cfs_options_.skip_cloud_files_in_getchildren = false;
+    } else {
+      LOG(WARNING) << "cloud_fs_ is not a CloudFileSystem; cannot restore skip_cloud_files_in_getchildren";
+    }
eloq_data_store_service/rocksdb_cloud_dump.cpp (4)

386-393: Pre‑open FS tweaks mirror the service — consider same literal tweak and confirm post‑open side‑effects.

  • Use 64ULL * 1024 * 1024 for clarity/safety.
  • You restore skip_cloud_files_in_getchildren later; verify whether constant_sst_file_size_in_sst_file_manager needs restoring.

Apply:

-        cfs_options.constant_sst_file_size_in_sst_file_manager =
-            64 * 1024 * 1024L;
+        cfs_options.constant_sst_file_size_in_sst_file_manager =
+            64ULL * 1024 * 1024;

431-436: Speed up open: avoid opening all files during DB::Open.

Using max_open_files = -1 can force eager file opens on large DBs. Align with the service: open with 0, then optionally flip to -1 post‑open if needed.

Apply:

-        options.max_open_files = -1;              // Allow unlimited open files
+        options.max_open_files = 0;               // Faster DB::Open; avoid eager FDs

Optionally, after a successful open:

@@
         if (!status.ok())
         {
             LOG(ERROR) << "Unable to open RocksDB Cloud database: "
                        << status.ToString();
             throw std::runtime_error("Unable to open RocksDB Cloud database: " +
                                      status.ToString());
         }
+        // Optional: restore unlimited FDs if desired after faster open.
+        // db->SetDBOptions({{"max_open_files", "-1"}});

489-495: Null‑check CloudFileSystem before mutating options.

dynamic_cast may return nullptr depending on FS wiring.

Apply:

-        rocksdb::CloudFileSystem *cloud_fs_ptr =
-            dynamic_cast<rocksdb::CloudFileSystem *>(cloud_fs.get());
-        auto &cfs_options_ref =
-            cloud_fs_ptr->GetMutableCloudFileSystemOptions();
-        // Restore skip_cloud_files_in_getchildren to false
-        cfs_options_ref.skip_cloud_files_in_getchildren = false;
+        if (auto* cloud_fs_ptr =
+                dynamic_cast<rocksdb::CloudFileSystem*>(cloud_fs.get())) {
+          auto& cfs_options_ref =
+              cloud_fs_ptr->GetMutableCloudFileSystemOptions();
+          // Restore skip_cloud_files_in_getchildren to false
+          cfs_options_ref.skip_cloud_files_in_getchildren = false;
+        } else {
+          LOG(WARNING) << "cloud_fs is not a CloudFileSystem; cannot restore skip_cloud_files_in_getchildren";
+        }

496-513: Early‑exit path looks correct; add symmetric flag shutdown.

For completeness, shut down gflags here as well.

Apply:

             Aws::ShutdownAPI(aws_options);
             google::ShutdownGoogleLogging();
+            gflags::ShutDownCommandLineFlags();
             LOG(INFO) << "Database closed successfully.";
             return 0;
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ca901f6 and bd836cd.

📒 Files selected for processing (2)
  • eloq_data_store_service/rocksdb_cloud_data_store.cpp (3 hunks)
  • eloq_data_store_service/rocksdb_cloud_dump.cpp (9 hunks)
🔇 Additional comments (4)
eloq_data_store_service/rocksdb_cloud_data_store.cpp (1)

852-852: LGTM (namespace close).

No issues.

eloq_data_store_service/rocksdb_cloud_dump.cpp (3)

82-83: Usage string updated — looks good.

Matches the new flag.


104-105: Wiring for opendb param — looks good.

Struct field and assignment are correct.

Also applies to: 124-125


474-476: LGTM (open log context).

Helpful diagnostic for large CF sets.

It seem it does not matter to Open large DB, and it can conflict with
skip_cloud_files_in_getchildren when open db
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
eloq_data_store_service/rocksdb_cloud_data_store.cpp (1)

602-613: Restore skip flag on both success and failure; add nullptr guard and keep local copy consistent.

  • If DB open fails, the CloudFS option remains true; restore it before returning.
  • Add a null check on dynamic_cast.
  • Also update cfs_options_ to keep the stored copy in sync.
@@
-    if (!status.ok())
+    if (!status.ok())
     {
+        if (auto *cfs_restore =
+                dynamic_cast<rocksdb::CloudFileSystem *>(cloud_fs_.get()))
+        {
+            auto &opts = cfs_restore->GetMutableCloudFileSystemOptions();
+            opts.skip_cloud_files_in_getchildren = false;
+        }
+        cfs_options_.skip_cloud_files_in_getchildren = false;
         ttl_compaction_filter_ = nullptr;
@@
-    rocksdb::CloudFileSystem *cfs =
-        dynamic_cast<rocksdb::CloudFileSystem *>(cloud_fs_.get());
-    auto &cfs_options_ref = cfs->GetMutableCloudFileSystemOptions();
-    cfs_options_ref.skip_cloud_files_in_getchildren = false;
+    if (auto *cfs = dynamic_cast<rocksdb::CloudFileSystem *>(cloud_fs_.get()))
+    {
+        auto &cfs_options_ref = cfs->GetMutableCloudFileSystemOptions();
+        cfs_options_ref.skip_cloud_files_in_getchildren = false;
+    }
+    else
+    {
+        LOG(WARNING) << "CloudFileSystem cast failed when restoring skip flag";
+    }
+    cfs_options_.skip_cloud_files_in_getchildren = false;

Also applies to: 614-620

eloq_data_store_service/rocksdb_cloud_dump.cpp (1)

441-449: Bug: status not assigned from ListColumnFamilies (always checks stale value).

Assign the return status; otherwise the error check is meaningless.

-        rocksdb::DBCloud::ListColumnFamilies(
-            options, params.db_path, &column_families);
+        status = rocksdb::DBCloud::ListColumnFamilies(
+            options, params.db_path, &column_families);
         if (!status.ok())
♻️ Duplicate comments (2)
eloq_data_store_service/rocksdb_cloud_dump.cpp (2)

54-55: New --opendb flag: looks good.

Flag definition is clear and scoped.


158-163: Validation now admits --opendb‑only runs.

Matches prior feedback.

🧹 Nitpick comments (3)
eloq_data_store_service/rocksdb_cloud_data_store.cpp (1)

327-333: Pre‑open tuning: OK, but avoid signed long literal and clarify comment.

  • Prefer unsigned 64-bit literal to avoid platform‑dependent “long” width.
  • Comment mentions DumpDBSummary; here it’s general open-time behavior. Reword to avoid confusion.
-    cfs_options_.constant_sst_file_size_in_sst_file_manager = 64 * 1024 * 1024L;
-    // Skip listing cloud files in GetChildren when DumpDBSummary to speed
-    // up open db
+    cfs_options_.constant_sst_file_size_in_sst_file_manager = 64ULL * 1024 * 1024;
+    // Temporarily skip listing cloud files in GetChildren to speed up DB open
eloq_data_store_service/rocksdb_cloud_dump.cpp (2)

386-393: Pre‑open CloudFS tweaks: mirror service; minor literal nit.

Same speedups as service; prefer unsigned literal.

-        cfs_options.constant_sst_file_size_in_sst_file_manager =
-            64 * 1024 * 1024L;
+        cfs_options.constant_sst_file_size_in_sst_file_manager =
+            64ULL * 1024 * 1024;

489-495: Guard cast and handle failure when restoring skip flag.

Avoid undefined behavior if the cast ever fails.

-        rocksdb::CloudFileSystem *cloud_fs_ptr =
-            dynamic_cast<rocksdb::CloudFileSystem *>(cloud_fs.get());
-        auto &cfs_options_ref =
-            cloud_fs_ptr->GetMutableCloudFileSystemOptions();
-        // Restore skip_cloud_files_in_getchildren to false
-        cfs_options_ref.skip_cloud_files_in_getchildren = false;
+        if (auto *cloud_fs_ptr =
+                dynamic_cast<rocksdb::CloudFileSystem *>(cloud_fs.get()))
+        {
+            auto &cfs_options_ref =
+                cloud_fs_ptr->GetMutableCloudFileSystemOptions();
+            // Restore skip_cloud_files_in_getchildren to false
+            cfs_options_ref.skip_cloud_files_in_getchildren = false;
+        }
+        else
+        {
+            LOG(WARNING)
+                << "CloudFileSystem cast failed when restoring skip flag";
+        }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bd836cd and 86a416b.

📒 Files selected for processing (2)
  • eloq_data_store_service/rocksdb_cloud_data_store.cpp (3 hunks)
  • eloq_data_store_service/rocksdb_cloud_dump.cpp (10 hunks)
🔇 Additional comments (5)
eloq_data_store_service/rocksdb_cloud_dump.cpp (5)

82-83: Usage text updated correctly.

Includes --opendb with concise description.


104-105: CmdLineParams extended with opendb.

Struct addition aligns with flag.


124-125: Flag wiring OK.

Correctly assigns FLAGS_opendb.


433-436: Open‑time DB options align with faster open path.

best_efforts_recovery=false, skip_*_on_db_open=true are appropriate for speed.

Please confirm the RocksDB Cloud version in use supports these options at the same names to avoid compile‑time mismatches.


474-476: Helpful log.

Pre‑open CF count log is useful for diagnosing slow opens.

@liunyl liunyl linked an issue Sep 19, 2025 that may be closed by this pull request
@githubzilla githubzilla merged commit 7ca9d31 into main Sep 19, 2025
1 check passed
This was referenced Sep 29, 2025
@coderabbitai coderabbitai bot mentioned this pull request Oct 17, 2025
@coderabbitai coderabbitai bot mentioned this pull request Oct 28, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

rockscloud should not download all files from s3 on startup

1 participant