put slices in different ranges into a single batch write #156
Conversation
@CodeRabbit review
✅ Actions performed: Review triggered.
Warning: Rate limit exceeded. @liunyl has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 0 minutes and 55 seconds before requesting another review.

⌛ How to resolve this issue? After the wait time has elapsed, a review can be triggered using the `@CodeRabbit review` command. We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work? CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information.
Walkthrough

DispatchRangeSliceBatches now accepts a vector of RangeSliceBatchPlan objects and merges segments across all plans into batched BatchWriteRecords calls, dispatching when batch size thresholds or finalization conditions are met; callers were updated to pass vectors instead of single plans.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Caller
    participant Dispatch
    participant Accumulator
    participant RPC
    Note over Caller,Dispatch: New flow — multi-plan aggregation
    Caller->>Dispatch: call(plans: vector<RangeSliceBatchPlan>)
    loop for each plan in plans
        Dispatch->>Accumulator: append plan.segments
        alt batch size >= MAX_WRITE_BATCH_SIZE
            Accumulator->>RPC: BatchWriteRecords(merged batch)
            RPC-->>Dispatch: ack / async completion
        end
    end
    alt remaining segments
        Accumulator->>RPC: BatchWriteRecords(final batch)
        RPC-->>Dispatch: ack / async completion
    end
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20–30 minutes
Pre-merge checks and finishing touches: ❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
data_store_service_client.cpp (1)
1693-1707: Use-after-free risk: passing a temporary `std::vector<RangeSliceBatchPlan>` into `DispatchRangeSliceBatches`.

Here:
```cpp
auto slice_plan =
    PrepareRangeSliceBatches(table_name, version, slices, partition_id);
SyncConcurrentRequest *slice_sync_concurrent =
    sync_concurrent_request_pool_.NextObject();
PoolableGuard slice_guard(slice_sync_concurrent);
slice_sync_concurrent->Reset();
DispatchRangeSliceBatches(kv_range_slices_table_name,
                          KvPartitionIdOf(table_name),
                          version,
                          std::vector<RangeSliceBatchPlan>{std::move(slice_plan)},
                          slice_sync_concurrent);
```
`DispatchRangeSliceBatches` builds a `std::vector<std::string_view>` pointing into `RangeSliceBatchPlan::segment_keys`/`segment_records`. Because you construct a temporary `std::vector<RangeSliceBatchPlan>` and pass it by const reference, that temporary (and the `RangeSliceBatchPlan` it contains) is destroyed immediately after `DispatchRangeSliceBatches` returns, while the async `BatchWriteRecords` closures are still in flight. The `string_view`s in those closures then dangle → undefined behavior / use-after-free.
`UpsertRanges` avoids this by keeping `slice_plans` alive until after waiting for `unfinished_request_cnt_ == 0`. `UpdateRangeSlices` needs the same pattern.

Suggested fix: materialize a local `std::vector<RangeSliceBatchPlan>` that outlives the async writes and pass it by reference, without moving it into a temporary:

```diff
@@ bool DataStoreServiceClient::UpdateRangeSlices(
-    // 1- Prepare slice batches
-    auto slice_plan =
-        PrepareRangeSliceBatches(table_name, version, slices, partition_id);
+    // 1- Prepare slice batches
+    auto slice_plan =
+        PrepareRangeSliceBatches(table_name, version, slices, partition_id);
+    // Wrap in a local vector so the plan's owned buffers stay alive
+    // until all async writes complete.
+    std::vector<RangeSliceBatchPlan> slice_plans;
+    slice_plans.emplace_back(std::move(slice_plan));
@@
-    slice_sync_concurrent->Reset();
-    DispatchRangeSliceBatches(kv_range_slices_table_name,
-                              KvPartitionIdOf(table_name),
-                              version,
-                              std::vector<RangeSliceBatchPlan>{std::move(slice_plan)},
-                              slice_sync_concurrent);
+    slice_sync_concurrent->Reset();
+    DispatchRangeSliceBatches(kv_range_slices_table_name,
+                              KvPartitionIdOf(table_name),
+                              version,
+                              slice_plans,
+                              slice_sync_concurrent);
```

This mirrors the safe lifetime pattern used in `UpsertRanges` and ensures all `segment_keys`/`segment_records` storage remains valid until after the wait loop completes.
🧹 Nitpick comments (1)
data_store_service_client.cpp (1)
1374-1509: Cross-plan DispatchRangeSliceBatches batching and concurrency logic look sound.

The function correctly:

- Accumulates total segment count to reserve vector capacity once.
- Uses `write_batch_size` plus an overhead estimate to respect `MAX_WRITE_BATCH_SIZE`.
- Applies the same `SyncConcurrentRequest` pattern used elsewhere: it blocks when `unfinished_request_cnt_` reaches `max_flying_write_count`, increments the counter per BatchWriteRecords call, and leaves coordination to the shared callback + the caller's `all_request_started_` / wait loop.
- Reuses the batch vectors after each dispatch via `clear()` + `reserve(total_segments)`, which is cheap once capacity is established.

Given that it only captures `std::string_view`s into the `RangeSliceBatchPlan` storage, this function assumes callers keep all `plans[i].segment_keys`/`segment_records` alive until all async writes complete; that contract is honored in `UpsertRanges` but currently violated in `UpdateRangeSlices` (see separate comment).
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- data_store_service_client.cpp (4 hunks)
- data_store_service_client.h (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-09-17T11:08:35.322Z
Learnt from: liunyl
Repo: eloqdata/store_handler PR: 83
File: data_store_service_client_closure.cpp:408-448
Timestamp: 2025-09-17T11:08:35.322Z
Learning: In the DataStoreServiceClient batch write system, the global coordinator (SyncPutAllData) manages concurrency at the partition level, not at the individual batch level. The global unfinished_request_cnt_ is only decremented when an entire partition is fully completed or failed, not after each individual batch within a partition.
Applied to files:
data_store_service_client.cpp
🧬 Code graph analysis (1)
data_store_service_client.cpp (5)
- eloq_data_store_service/data_store_service_config.cpp (2): GetShardIdByPartitionId (816-821), GetShardIdByPartitionId (816-817)
- data_store_service_client_closure.cpp (2): SyncConcurrentRequestCallback (484-491), SyncConcurrentRequestCallback (484-487)
- eloq_data_store_service/data_store_service.cpp (4): BatchWriteRecords (900-948), BatchWriteRecords (900-904), BatchWriteRecords (1181-1246), BatchWriteRecords (1181-1194)
- eloq_data_store_service/eloq_store_data_store.cpp (2): BatchWriteRecords (166-229), BatchWriteRecords (166-166)
- eloq_data_store_service/rocksdb_data_store_common.cpp (2): BatchWriteRecords (499-661), BatchWriteRecords (499-500)
🔇 Additional comments (2)
data_store_service_client.h (1)
590-600: Multi-plan DispatchRangeSliceBatches signature looks consistent.

The updated declaration to accept `const std::vector<RangeSliceBatchPlan> &plans` matches the implementation and enables the intended cross-plan batching. Just note that callers must ensure `RangeSliceBatchPlan` instances outlive the async writes (see cpp comments).

data_store_service_client.cpp (1)
1819-1832: UpsertRanges multi-plan dispatch wiring is correct and lifetime-safe.
`slice_plans` is built once, passed by const reference to `DispatchRangeSliceBatches`, and kept alive until after:

```cpp
while (slice_sync_concurrent->unfinished_request_cnt_ != 0)
{
    slice_sync_concurrent->cv_.wait(lk);
}
```

returns, so the `string_view`s inside BatchWriteRecords closures always reference valid `segment_keys`/`segment_records`. Using a shared `SyncConcurrentRequest` for all ranges also correctly enforces the global max-flying-write cap across all slice batches.
Force-pushed 737969f to 79c1b96 (Compare)
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
data_store_service_client.cpp (1)
1693-1707: Avoid reading from moved-from `slice_plan` in UpdateRangeSlices

Here:
```cpp
auto slice_plan =
    PrepareRangeSliceBatches(table_name, version, slices, partition_id);
// ...
std::vector<RangeSliceBatchPlan> slice_plans;
slice_plans.emplace_back(std::move(slice_plan));
// ...
EnqueueRangeMetadataRecord(..., slice_plan.segment_cnt, meta_acc);
```
`slice_plan` has been moved into `slice_plans`, so relying on `slice_plan.segment_cnt` afterward is confusing and (in strict C++ terms) relies on unspecified moved-from state, even though current implementations will effectively copy the `uint32_t`.

Use the moved-to object (or cache the count before moving) instead. For example:
```diff
-    std::vector<RangeSliceBatchPlan> slice_plans;
-    slice_plans.emplace_back(std::move(slice_plan));
+    std::vector<RangeSliceBatchPlan> slice_plans;
+    slice_plans.emplace_back(std::move(slice_plan));
+    const uint32_t segment_cnt = slice_plans.back().segment_cnt;
     DispatchRangeSliceBatches(kv_range_slices_table_name,
                               KvPartitionIdOf(table_name),
                               version,
                               slice_plans,
                               slice_sync_concurrent);
     // ...
     EnqueueRangeMetadataRecord(catalog_factory,
                                table_name,
                                range_start_key,
                                partition_id,
                                range_version,
                                version,
-                               slice_plan.segment_cnt,
+                               segment_cnt,
                                meta_acc);
```

This keeps the intent clear and avoids depending on moved-from object state.
Also applies to: 1728-1737
🧹 Nitpick comments (2)
data_store_service_client.h (1)
596-600: Private API change to accept multiple RangeSliceBatchPlan objects looks consistent

The switch to `const std::vector<RangeSliceBatchPlan> &plans` matches the new multi-plan batching in `UpdateRangeSlices` and `UpsertRanges`, and since this is a private helper there's no external ABI risk. You might optionally add a brief note in the comment (in the .cpp) that callers must ensure all plans target the same KV table/partition/version, as currently enforced only by convention, not assertions.

data_store_service_client.cpp (1)
1374-1510: DispatchRangeSliceBatches batching and concurrency look correct; consider a couple of small cleanups

The new implementation correctly:

- Aggregates segments from all `plans` into batches bounded by `MAX_WRITE_BATCH_SIZE` (based on key+record size plus per-record overhead).
- Uses `SyncConcurrentRequest`'s `unfinished_request_cnt_` and condition variable in the same pattern as `PutArchivesAll` / `DispatchRangeMetadataBatches`, so global in-flight write limits remain honored.
- Holds `plans` (and their backing `std::string`s) alive in the caller until all RPCs complete, so the `std::string_view` keys/records passed to `BatchWriteRecords` remain valid.

Two minor polish points you might consider:

- Instead of the magic `constexpr size_t overhead_per_segment = 20;`, compute it from the actual field sizes for clarity and future safety, e.g.:

```cpp
constexpr size_t overhead_per_segment =
    sizeof(uint64_t)       // records_ts
    + sizeof(uint64_t)     // records_ttl
    + sizeof(WriteOpType);
```

- After each dispatch you `clear()` and then `reserve(total_segments)` again. Since capacity is already at least `total_segments` after the first reserve, the subsequent `reserve` calls are effectively no-ops; you could drop them for readability, or (if you ever want tighter bounds) reserve based on an estimated max segments per batch instead. This is cosmetic, not correctness-related.

Overall, the logic and concurrency behavior look solid.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- data_store_service_client.cpp (4 hunks)
- data_store_service_client.h (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-09-17T11:08:35.322Z
Learnt from: liunyl
Repo: eloqdata/store_handler PR: 83
File: data_store_service_client_closure.cpp:408-448
Timestamp: 2025-09-17T11:08:35.322Z
Learning: In the DataStoreServiceClient batch write system, the global coordinator (SyncPutAllData) manages concurrency at the partition level, not at the individual batch level. The global unfinished_request_cnt_ is only decremented when an entire partition is fully completed or failed, not after each individual batch within a partition.
Applied to files:
data_store_service_client.cpp
🧬 Code graph analysis (1)
data_store_service_client.cpp (4)
- data_store_service_client_closure.cpp (2): SyncConcurrentRequestCallback (484-491), SyncConcurrentRequestCallback (484-487)
- eloq_data_store_service/data_store_service.cpp (4): BatchWriteRecords (900-948), BatchWriteRecords (900-904), BatchWriteRecords (1181-1246), BatchWriteRecords (1181-1194)
- eloq_data_store_service/eloq_store_data_store.cpp (2): BatchWriteRecords (166-229), BatchWriteRecords (166-166)
- data_store_service_client.h (1): KvPartitionIdOf (700-705)
🔇 Additional comments (1)
data_store_service_client.cpp (1)
1828-1833: Multi-plan DispatchRangeSliceBatches call in UpsertRanges is consistent and improves batching

The new single call:

```cpp
DispatchRangeSliceBatches(kv_range_slices_table_name,
                          kv_partition_id,
                          version,
                          slice_plans,
                          slice_sync_concurrent);
```

correctly reuses the shared `slice_sync_concurrent`, respects the table-wide `kv_partition_id`, and allows segments from different ranges to be merged into fewer `BatchWriteRecords` RPCs without changing the existing wait/finish pattern below. This matches the intended PR goal of cross-range batching.
Force-pushed 79c1b96 to 5f1b729 (Compare)
Summary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings.