Add UserValueChecksum interface for end-to-end value integrity checks #14449

Open
xingbowang wants to merge 1 commit into facebook:main from xingbowang:2026_02_28_interlock_checksum

@xingbowang xingbowang commented Mar 10, 2026

Add UserValueChecksum interface for end-to-end value integrity checks

Introduce a pluggable UserValueChecksum interface that allows applications
to attach and verify checksums on user values during flush and compaction
output verification. This catches silent data corruption (e.g., from buggy
merge operators or compaction filters) before it is persisted to SST files.

Key components:

  • UserValueChecksum abstract class with Validate/ValidateWideColumns methods
  • New options: user_value_checksum, verify_user_value_checksum_on_flush,
    verify_user_value_checksum_on_compaction (dynamically mutable)
  • Statistics counters: USER_VALUE_CHECKSUM_COMPUTE_COUNT and
    USER_VALUE_CHECKSUM_MISMATCH_COUNT
  • Validation in flush (builder.cc) and compaction output verification
    (compaction_job.cc), integrated into existing iteration loops
  • Remote compaction support via VerifyUserValueChecksumsOnOutputFiles()
    which opens output files directly from the remote output path
  • Helper (user_value_checksum_helper.h) handles all value types: plain
    values, TimedPut (unpacks trailing seqno), wide columns, merge operands;
    skips deletions, blob indices, and empty values
  • db_stress support with CRC32c-based checksum validator
  • Comprehensive test coverage including corrupt/valid flush and compaction,
    dynamic toggle, paranoid_file_checks integration, standalone user
    checksum loop, TimedPut/WideColumn/merge/SingleDelete/DeleteRange/blob,
    ingest external file, recovery flush, randomized sizes, and statistics
    counter assertions to verify validation actually runs
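
As a rough illustration of the kind of validator an application might plug in, here is a minimal sketch. It assumes a value layout of payload followed by a 4-byte little-endian CRC32C trailer. The names `ValueChecksumSketch`, `Crc32cTrailerValidator`, and `AppendChecksum`, as well as the `Validate` signature, are hypothetical stand-ins and are not the signatures from this PR. Treating empty values as passing is also an assumption, mirroring the "skips empty values" behavior listed above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>

// Bitwise CRC32C (Castagnoli), reflected polynomial 0x82F63B78.
// Slow but dependency-free; production code would use a table or hardware CRC.
inline uint32_t Crc32c(std::string_view data) {
  uint32_t crc = 0xFFFFFFFFu;
  for (unsigned char byte : data) {
    crc ^= byte;
    for (int i = 0; i < 8; ++i) {
      crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
  }
  return ~crc;
}

// Hypothetical interface for illustration only (not the PR's actual class).
class ValueChecksumSketch {
 public:
  virtual ~ValueChecksumSketch() = default;
  // Returns true when the value passes the integrity check.
  virtual bool Validate(std::string_view key,
                        std::string_view value) const = 0;
};

constexpr std::size_t kTrailerSize = 4;

// Writer side: append CRC32C(payload) as a little-endian trailer.
inline std::string AppendChecksum(std::string_view payload) {
  std::string out(payload);
  const uint32_t crc = Crc32c(payload);
  for (int i = 0; i < 4; ++i) {
    out.push_back(static_cast<char>((crc >> (8 * i)) & 0xFFu));
  }
  return out;
}

// Verifier side: recompute the CRC over the payload and compare to the trailer.
class Crc32cTrailerValidator : public ValueChecksumSketch {
 public:
  bool Validate(std::string_view /*key*/,
                std::string_view value) const override {
    if (value.empty()) return true;  // assumption: empty values are skipped
    if (value.size() < kTrailerSize) return false;
    const std::string_view payload =
        value.substr(0, value.size() - kTrailerSize);
    const std::string_view trailer = value.substr(payload.size());
    uint32_t stored = 0;
    for (int i = 0; i < 4; ++i) {
      stored |= static_cast<uint32_t>(static_cast<unsigned char>(trailer[i]))
                << (8 * i);
    }
    return stored == Crc32c(payload);
  }
};
```

With this layout, the application calls `AppendChecksum` before writing, and a buggy merge operator or compaction filter that rewrites the payload without refreshing the trailer would fail `Validate`.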

@meta-cla meta-cla Bot added the CLA Signed label Mar 10, 2026

github-actions Bot commented Mar 10, 2026

✅ clang-tidy: No findings on changed lines

Completed in 336.2s.

@xingbowang xingbowang force-pushed the 2026_02_28_interlock_checksum branch 5 times, most recently from 6b9e79a to 486255f Compare March 12, 2026 19:55
@xingbowang xingbowang marked this pull request as ready for review March 12, 2026 21:16
@xingbowang xingbowang requested a review from pdillinger March 12, 2026 21:16
@xingbowang xingbowang force-pushed the 2026_02_28_interlock_checksum branch from 486255f to 6a77c51 Compare March 26, 2026 01:04

meta-codesync Bot commented Mar 26, 2026

@xingbowang has imported this pull request. If you are a Meta employee, you can view this in D98243050.

@xingbowang
Contributor Author

/claude-review

@pdillinger pdillinger left a comment

This would execute in one of the inner-most loops of RocksDB operation, along with CompactionFilter. A custom interface like this has CPU overheads. I think it's likely that there will be significant overlap between users of CompactionFilter and this feature. And those users will pay the overhead of both customization interfaces, perhaps even double-deserialization of custom value formats.

I would rather see CompactionFilter modified to support reporting corruption, in support of things like custom checksums. (Note that DB flush is run through CompactionFilter at the discretion of the CompactionFilter.) Besides, it seems odd to essentially endorse custom per-key-value checksums when we have our own. (What is the deficiency of our own? Just adapting/leveraging what users are already doing?)

And for configuring whether to re-verify values after compaction or flush completes, this is closely related to VerifyOutputFlags::kVerifyIteration. I don't think we should have distinct options unrelated to VerifyOutputFlags that essentially control whether / how we do the extra verification of compaction and flush outputs. I recommend having a way for the CompactionFilter to say that it wants to be executed during "verify iteration" runs (just checking for corruption, not dropping or changing values) IF/WHEN kVerifyIteration is set.

IMHO this provides a much more unified/integrated experience leveraging existing overheads, customization interfaces, and options.

@xingbowang
Contributor Author

I agree the capability belongs under CompactionFilter and VerifyOutputFlags, not as a separate checksum interface with separate booleans. The use case is application-level validation that RocksDB's own SST checksums do not cover. The goal is to close the gap between the point where application-level code hands off the buffer and the point where our own checksum is computed. I will rework it as additive compaction-filter verification hooks, implement flush support in VerifyOutputFlags, and preserve the current merge/wide-column/TimedPut coverage.
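
To make the direction concrete, here is a minimal toy sketch of a filter that opts in to a read-only verification pass gated by a verify-iteration flag. Everything in it is hypothetical: `ToyFilter`, `WantsVerifyIteration`, `VerifyValue`, `ToyVerifyFlags`, and `VerifyOutput` are illustrative names, not RocksDB's CompactionFilter or VerifyOutputFlags API, and the "magic byte" rule stands in for a real application check.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Toy stand-in for a bit flag in the spirit of kVerifyIteration (hypothetical).
enum ToyVerifyFlags : uint32_t {
  kToyVerifyNone = 0,
  kToyVerifyIteration = 1u << 0,
};

// Toy filter interface with additive, read-only verification hooks.
class ToyFilter {
 public:
  virtual ~ToyFilter() = default;
  // Opt in to being invoked during verify-iteration passes.
  virtual bool WantsVerifyIteration() const { return false; }
  // Read-only check; returning false reports corruption for this entry.
  virtual bool VerifyValue(std::string_view /*key*/,
                           std::string_view /*value*/) const {
    return true;
  }
};

// Example rule (assumption): valid values start with the byte 'V'.
class MagicByteFilter : public ToyFilter {
 public:
  bool WantsVerifyIteration() const override { return true; }
  bool VerifyValue(std::string_view /*key*/,
                   std::string_view value) const override {
    return !value.empty() && value.front() == 'V';
  }
};

// Counters loosely analogous to compute/mismatch statistics.
struct VerifyReport {
  std::size_t checked = 0;
  std::size_t mismatched = 0;
};

// Runs the filter's check over output entries only when the flag is set
// and the filter has opted in; never drops or changes values.
inline VerifyReport VerifyOutput(
    const std::vector<std::pair<std::string, std::string>>& entries,
    const ToyFilter& filter, uint32_t flags) {
  VerifyReport report;
  if ((flags & kToyVerifyIteration) == 0 || !filter.WantsVerifyIteration()) {
    return report;
  }
  for (const auto& [key, value] : entries) {
    ++report.checked;
    if (!filter.VerifyValue(key, value)) ++report.mismatched;
  }
  return report;
}
```

In this shape, users who do not opt in pay nothing extra, and the same deserialization logic can serve both filtering and verification.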
