Add UserValueChecksum interface for end-to-end value integrity checks #14449
xingbowang wants to merge 1 commit into facebook:main
✅ clang-tidy: No findings on changed lines. Completed in 336.2s.
Introduce a pluggable UserValueChecksum interface that allows applications to attach and verify checksums on user values during flush and compaction output verification. This catches silent data corruption (e.g., from buggy merge operators or compaction filters) before it is persisted to SST files.

Key components:
- UserValueChecksum abstract class with Validate/ValidateWideColumns methods
- New options: user_value_checksum, verify_user_value_checksum_on_flush, verify_user_value_checksum_on_compaction (dynamically mutable)
- Statistics counters: USER_VALUE_CHECKSUM_COMPUTE_COUNT and USER_VALUE_CHECKSUM_MISMATCH_COUNT
- Validation in flush (builder.cc) and compaction output verification (compaction_job.cc), integrated into existing iteration loops
- Remote compaction support via VerifyUserValueChecksumsOnOutputFiles(), which opens output files directly from the remote output path
- Helper (user_value_checksum_helper.h) handles all value types: plain values, TimedPut (unpacks trailing seqno), wide columns, merge operands; skips deletions, blob indices, and empty values
- db_stress support with CRC32c-based checksum validator
- Comprehensive test coverage including corrupt/valid flush and compaction, dynamic toggle, paranoid_file_checks integration, standalone user checksum loop, TimedPut/WideColumn/merge/SingleDelete/DeleteRange/blob, ingest external file, recovery flush, randomized sizes, and statistics counter assertions to verify validation actually runs
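To make the shape of such a validator concrete, here is a minimal standalone sketch (C++17). It does not use RocksDB headers, and the real signatures in this PR are not shown in the thread, so every name below (`UserValueChecksumSketch`, `TrailerChecksum`, `WithTrailer`, `VerifyValues`, `ChecksumStats`) is hypothetical. It also assumes the application stores a 4-byte little-endian trailer on each value, and it uses FNV-1a instead of CRC32c only to stay dependency-free.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical stand-in for the PR's abstract class; not the real signature.
class UserValueChecksumSketch {
 public:
  virtual ~UserValueChecksumSketch() = default;
  // Returns true if `value` passes the application's integrity check.
  virtual bool Validate(std::string_view key, std::string_view value) const = 0;
};

// 32-bit FNV-1a hash, used here in place of CRC32c.
inline uint32_t Fnv1a32(std::string_view data) {
  uint32_t h = 2166136261u;
  for (unsigned char c : data) {
    h ^= c;
    h *= 16777619u;
  }
  return h;
}

// Application-side encoding (assumed): payload followed by a 4-byte
// little-endian hash of the payload.
inline std::string WithTrailer(std::string_view payload) {
  std::string out(payload);
  const uint32_t h = Fnv1a32(payload);
  for (int i = 0; i < 4; ++i) {
    out.push_back(static_cast<char>((h >> (8 * i)) & 0xff));
  }
  return out;
}

// Validator that recomputes the hash and compares it with the trailer.
class TrailerChecksum : public UserValueChecksumSketch {
 public:
  bool Validate(std::string_view /*key*/, std::string_view value) const override {
    if (value.size() < 4) return false;  // too short to carry a trailer
    const std::string_view payload = value.substr(0, value.size() - 4);
    uint32_t stored = 0;
    for (int i = 0; i < 4; ++i) {
      const auto byte = static_cast<unsigned char>(value[payload.size() + i]);
      stored |= static_cast<uint32_t>(byte) << (8 * i);
    }
    return stored == Fnv1a32(payload);
  }
};

// Counters loosely mirroring the compute/mismatch statistics named above.
struct ChecksumStats {
  uint64_t computed = 0;
  uint64_t mismatched = 0;
};

// Checks every non-empty value and returns true only if all of them pass.
inline bool VerifyValues(
    const std::vector<std::pair<std::string, std::string>>& kvs,
    const UserValueChecksumSketch& checker, ChecksumStats* stats) {
  assert(stats != nullptr);
  bool all_ok = true;
  for (const auto& [key, value] : kvs) {
    if (value.empty()) continue;  // empty values are skipped, as in the PR text
    ++stats->computed;
    if (!checker.Validate(key, value)) {
      ++stats->mismatched;
      all_ok = false;
    }
  }
  return all_ok;
}
```

Counting both computed and mismatched checks is what lets a test confirm that validation actually ran, rather than passing because nothing was checked.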
@xingbowang has imported this pull request. If you are a Meta employee, you can view this in D98243050.
/claude-review
pdillinger left a comment
This would execute in one of the inner-most loops of RocksDB operation, along with CompactionFilter. A custom interface like this has CPU overheads. I think it's likely that there will be significant overlap between users of CompactionFilter and this feature. And those users will pay the overhead of both customization interfaces, perhaps even double-deserialization of custom value formats.
I would rather see CompactionFilter modified to support reporting corruption, in support of things like custom checksums. (Note that DB flush is run through CompactionFilter at the discretion of the CompactionFilter.) Besides, it seems odd to essentially endorse custom per-key-value checksums when we have our own. (What is the deficiency of our own? Just adapting/leveraging what users are already doing?)
And for configuring whether to re-verify values after compaction or flush completes, this is closely related to VerifyOutputFlags::kVerifyIteration. I don't think we should have distinct options unrelated to VerifyOutputFlags that essentially control whether / how we do the extra verification of compaction and flush outputs. I recommend having a way for the CompactionFilter to say that it wants to be executed during "verify iteration" runs (just checking for corruption, not dropping or changing values) IF/WHEN kVerifyIteration is set.
IMHO this provides a much more unified/integrated experience leveraging existing overheads, customization interfaces, and options.
I agree the capability belongs under CompactionFilter and VerifyOutputFlags, not as a separate checksum interface and separate booleans. The use case is application-level validation that RocksDB's own SST checksums do not cover. The goal is to close the gap between the point where application-level code hands off the buffer and the point where our own checksum is computed. I will rework it as additive compaction-filter verification hooks, implement flush support in VerifyOutputFlags, and preserve the current merge/wide-column/TimedPut coverage.
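As a rough illustration of the direction discussed above, the following standalone sketch (C++17) shows a filter that opts in to a verify-only pass, gated on a verify-iteration flag. None of this is RocksDB's actual API: `FilterSketch`, `WantsVerifyIteration`, `VerifyValue`, `VerifyFlagsSketch`, and `VerifyOutput` are hypothetical names, and the flag value is invented for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical flags modeled on the idea of VerifyOutputFlags; values are made up.
enum VerifyFlagsSketch : uint32_t {
  kNoneSketch = 0,
  kVerifyIterationSketch = 1u << 0,
};

// Hypothetical filter interface; this is not RocksDB's CompactionFilter.
class FilterSketch {
 public:
  virtual ~FilterSketch() = default;
  // Opt-in: the filter asks to be run during verify-iteration passes.
  virtual bool WantsVerifyIteration() const { return false; }
  // Verify-only hook: reports corruption, never drops or changes the value.
  virtual bool VerifyValue(std::string_view /*key*/,
                           std::string_view /*value*/) const {
    return true;
  }
};

struct VerifyResult {
  bool ok = true;
  std::string bad_key;      // first key that failed verification, if any
  std::size_t checked = 0;  // number of entries passed to the hook
};

// Re-reads output entries and calls the hook only when both the flag is set
// and the filter has opted in; stops at the first reported corruption.
inline VerifyResult VerifyOutput(
    const std::vector<std::pair<std::string, std::string>>& entries,
    const FilterSketch& filter, uint32_t flags) {
  VerifyResult result;
  if ((flags & kVerifyIterationSketch) == 0 || !filter.WantsVerifyIteration()) {
    return result;  // nothing to do: no extra per-entry overhead
  }
  for (const auto& [key, value] : entries) {
    ++result.checked;
    if (!filter.VerifyValue(key, value)) {
      result.ok = false;
      result.bad_key = key;
      break;
    }
  }
  return result;
}

// Example filter: treats any value lacking the "v1:" prefix as corrupt.
class MagicPrefixFilter : public FilterSketch {
 public:
  bool WantsVerifyIteration() const override { return true; }
  bool VerifyValue(std::string_view /*key*/,
                   std::string_view value) const override {
    return value.substr(0, 3) == "v1:";
  }
};
```

Keeping the hook behind both the flag and the filter's opt-in means users who enable neither pay nothing extra in the verification loop, which is the overhead concern raised in the review.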