Walkthrough

Adds per-table and per-archive row counting and runtime flush/write metrics in PutAll/PutArchivesAll, and moves FetchRecord timing capture to just before the actual read path for non-snapshot/non-archives-only fetches. No public APIs changed.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Caller
    participant Client as DataStoreServiceClient
    participant Storage
    rect #F2F7FF
    note right of Client: PutAll / PutArchivesAll flow (per-table/archive)
    Caller->>Client: PutAll(table)
    Client->>Client: init records_count = 0
    loop per batch
        Client->>Storage: write batch
        Storage-->>Client: ack
        Client->>Client: records_count += batch.size()
        Client-->>Client: emit per-batch runtime metric
    end
    Client-->>Caller: all partitions done
    Client->>Client: emit NAME_KV_FLUSH_ROWS_TOTAL (records_count)
    end
```
```mermaid
sequenceDiagram
    participant Caller
    participant Client as DataStoreServiceClient
    participant Storage
    rect #FFF9F2
    note right of Client: FetchRecord timing moved
    Caller->>Client: FetchRecord(request)
    alt snapshot or archives-only
        Client->>Storage: perform snapshot/archive read
    else normal fetch
        Client->>Client: record start_ (just before read)
        Client->>Storage: perform read
    end
    Storage-->>Client: response
    Client-->>Caller: return record
    end
```
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
Actionable comments posted: 0
🧹 Nitpick comments (2)
data_store_service_client.cpp (2)
2324-2329: Consider aggregating archive metrics across partitions for consistency.

Currently, NAME_KV_FLUSH_ROWS_TOTAL with scope "archive" is emitted once per partition (inside the loop at line 2152). This differs from PutAll, which accumulates counts across all partitions and emits once per table (lines 465-469). If you process 10 partitions, this emits 10 separate metric data points instead of 1 aggregated point. Consider:

- Aggregate approach (like PutAll): move recs_cnt accumulation outside the partition loop and emit once per PutArchivesAll call for consistency.
- Document the intent: if per-partition granularity is preferred for archive observability, add a comment explaining why this differs from the base-table pattern.
Example aggregation:
```diff
+    size_t total_archive_recs_cnt = 0;
+
     // Send the batch request
     for (auto &[partition_id, archive_ptrs] : partitions_map)
     {
+        total_archive_recs_cnt += archive_ptrs.size();
         ...
-        if (metrics::enable_kv_metrics)
-        {
-            metrics::kv_meter->Collect(
-                metrics::NAME_KV_FLUSH_ROWS_TOTAL, recs_cnt, "archive");
-        }
     }
+
+    if (metrics::enable_kv_metrics)
+    {
+        metrics::kv_meter->Collect(
+            metrics::NAME_KV_FLUSH_ROWS_TOTAL, total_archive_recs_cnt, "archive");
+    }
```
3299-3302: Timing capture inconsistency across fetch paths.

The timing is now captured just before the Read call for non-snapshot, non-archives-only fetches, which is more accurate. However:

- FetchSnapshot inconsistency: FetchSnapshot (lines 3414-3417) still captures timing at function entry, not right before the actual fetch operation. Consider moving it closer to the read for consistency.
- Missing archives-only timing: when only_fetch_archives_ is true (lines 3294-3297), FetchArchives is called but no timing is captured. If archive fetch performance is important, consider capturing timing in the FetchArchives path as well.

If these timing differences are intentional due to different performance characteristics of snapshot vs. base vs. archive reads, please add comments explaining the rationale.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
data_store_service_client.cpp (5 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-09-17T11:08:35.322Z
Learnt from: liunyl
Repo: eloqdata/store_handler PR: 83
File: data_store_service_client_closure.cpp:408-448
Timestamp: 2025-09-17T11:08:35.322Z
Learning: In the DataStoreServiceClient batch write system, the global coordinator (SyncPutAllData) manages concurrency at the partition level, not at the individual batch level. The global unfinished_request_cnt_ is only decremented when an entire partition is fully completed or failed, not after each individual batch within a partition.
Applied to files:
data_store_service_client.cpp
🔇 Additional comments (1)
data_store_service_client.cpp (1)
294-294: Metrics collection for base table flush looks good.

The per-table record count accumulation and metric emission is correctly placed after error checking and cleanup. The scope "base" appropriately distinguishes base table flushes from archive flushes.

Minor suggestion: consider whether emitting metrics with records_count = 0 (when a table has no records to flush) adds observability value or just metric noise.

Also applies to: 310-310, 464-469
Summary by CodeRabbit
Chores
Bug Fixes