legacy/v0.1: fix truncating U32 cast in offset check; drop bogus pointer-as-int check #4680

Open
evilgensec wants to merge 1 commit into facebook:dev from evilgensec:fix-zstd-v01-offset-check

@evilgensec

Summary

ZSTD_execSequence in lib/legacy/zstd_v01.c has two related correctness issues in its sequence-offset validation.

1. Truncating U32 cast at lib/legacy/zstd_v01.c:1735

if (sequence.offset > (U32)(oLitEnd - base)) return ERROR(corruption_detected);

The cast forces the ptrdiff_t operand into 32-bit before the comparison. On 64-bit builds with a destination buffer larger than UINT32_MAX, the cast truncates silently and the bound becomes the low-32-bit remainder of the actual distance. A v0.1 legacy frame whose sequence.offset exceeds the truncated value but is still ≤ oLitEnd - base passes the check and reaches the match-copy step with an oversized offset.

The runtime catch at line 1758 (if (match < base) return ERROR(corruption_detected);) still fires for the resulting underflow, so this is not a directly exploitable arbitrary read; but the design intent of the line-1735 check -- catching the corruption before pointer arithmetic -- is defeated whenever the destination buffer is multi-GiB.
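To make the truncation concrete, here is a minimal sketch of the two bounds. It is illustrative only: `bound_u32` and `bound_full` are hypothetical helpers, not zstd functions, and `int64_t` stands in for a 64-bit `ptrdiff_t` so the arithmetic is the same on any host.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t U32;

/* Bound as formed by the original check: the distance narrowed to 32 bits.
 * (Hypothetical helper; int64_t models a 64-bit ptrdiff_t.) */
static uint64_t bound_u32(int64_t distance)
{
    assert(distance >= 0);            /* oLitEnd is expected to be at or after base */
    return (uint64_t)(U32)distance;   /* keeps only the low 32 bits */
}

/* Bound as formed with a full-width cast. */
static uint64_t bound_full(int64_t distance)
{
    assert(distance >= 0);
    return (uint64_t)distance;
}
```

For a distance of 2^32 + 16 bytes, `bound_u32` yields 16 while `bound_full` yields 4294967312, so the two forms of the check disagree for every offset between those values. For distances up to `UINT32_MAX` the two bounds are identical.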

2. Bogus pointer-as-integer comparison at lib/legacy/zstd_v01.c:1759

if (sequence.offset > (size_t)base) return ERROR(corruption_detected);

This check compares the offset against the numeric address of the base pointer. On any 64-bit system, base is a virtual address far higher than any 32-bit offset can reach, so the comparison never returns true. On 32-bit systems it fires only when base is allocated at a low address, which is unrelated to whether op - sequence.offset underflows.

The actual underflow detection is the if (match < base) line just above, which uses the post-subtraction pointer and is correct.
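The difference between the two conditions can be sketched with plain integers. This is an assumption-laden model, not zstd code: addresses are represented as `uint64_t` values so no out-of-range pointer is ever formed, the helper names are hypothetical, and the example addresses in the note below are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 when the removed check (offset > address of base) would report corruption. */
static int address_check_fires(uint64_t base_addr, uint64_t offset)
{
    return offset > base_addr;
}

/* Returns 1 when op - offset would land before base, i.e. a real underflow. */
static int underflows(uint64_t base_addr, uint64_t op_addr, uint64_t offset)
{
    assert(op_addr >= base_addr);          /* op always points inside the output */
    return offset > op_addr - base_addr;   /* compare against the distance, not the address */
}
```

With a high base address such as 0x7f0000000000 and only 100 bytes written, an offset of 200 underflows but the address comparison stays silent. With a low base address such as 0x1000 and 64 KiB written, an offset of 0x2000 is valid yet the address comparison fires.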

Fix

  • Replace the (U32) cast with (size_t) so the comparison uses full pointer precision.
  • Remove the no-op pointer-as-int check; the adjacent if (match < base) line is the real guard.

Diff

 lib/legacy/zstd_v01.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

Verification

  • Built locally on macOS/arm64: make -C lib libzstd.a completes cleanly, with no new warnings.
  • Legitimate v0.1 frames continue to decode (the (size_t) comparison is mathematically a strict superset of the previous (U32) comparison for valid inputs; it can only refuse more malformed inputs, never reject a previously-accepted valid frame).
  • The legacy v0.1 path is reachable from ZSTD_decompress via the auto-detection in ZSTD_decompressLegacy; ZSTD_LEGACY_SUPPORT is 8 by default (lib/legacy/zstd_legacy.h:25).

Refs

This is in the legacy/ tree, which is opt-in but enabled by default and reachable from ZSTD_decompress via magic auto-detection.

Copilot AI review requested due to automatic review settings May 27, 2026 17:51
@meta-cla

meta-cla Bot commented May 27, 2026

Hi @evilgensec!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks!

Copilot AI left a comment

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Fixes offset validation in legacy Zstandard v0.1 decompression to avoid truncation and meaningless pointer-address comparisons, improving correctness for large buffers and 64-bit builds.

Changes:

  • Replace a truncating (U32) cast with full-width arithmetic for sequence.offset validation.
  • Remove a logically incorrect sequence.offset > (size_t)base check and document why it was invalid.

Comment thread lib/legacy/zstd_v01.c
Comment on lines +1735 to +1739
/* Use full-precision size_t arithmetic; the previous `(U32)` cast on the
* `ptrdiff_t` truncated silently when `oLitEnd - base` exceeded
* `UINT32_MAX`, letting larger `sequence.offset` values bypass the
* check on 64-bit builds with multi-GiB destination buffers. */
if (sequence.offset > (size_t)(oLitEnd - base)) return ERROR(corruption_detected);
Author

Good catch. Added an explicit if (oLitEnd < base) return ERROR(corruption_detected); before the unsigned cast so a negative ptrdiff_t cannot wrap. Kept the comparison in size_t for consistency with the rest of the file (the project does not use ptrdiff_t arithmetic in legacy decoders). Force-pushed.
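The guard-before-cast pattern described in this reply might look roughly like the sketch below. It is not the code from the PR: `unguarded_bound` and `check_offset` are hypothetical helpers, and the return codes 0 and -1 stand in for success and `ERROR(corruption_detected)`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef unsigned char BYTE;

/* What an unguarded cast produces: a negative distance wraps to a huge bound. */
static size_t unguarded_bound(ptrdiff_t distance)
{
    return (size_t)distance;
}

/* Returns 0 when the offset is in range, -1 for corruption.
 * Both pointers must point into (or one past) the same buffer. */
static int check_offset(const BYTE* base, const BYTE* oLitEnd, size_t offset)
{
    assert(base != NULL && oLitEnd != NULL);
    if (oLitEnd < base) return -1;                       /* guard: distance would be negative */
    if (offset > (size_t)(oLitEnd - base)) return -1;    /* full-width bound */
    return 0;
}
```

Without the guard, a distance of -8 would become `SIZE_MAX - 7` after the cast, and almost any offset would then compare as in range. Rejecting `oLitEnd < base` first keeps the unsigned comparison meaningful.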

Comment thread lib/legacy/zstd_v01.c Outdated
Comment on lines +1761 to +1770
/* check */
if (match < base) return ERROR(corruption_detected);
if (sequence.offset > (size_t)base) return ERROR(corruption_detected);
/* The previous `sequence.offset > (size_t)base` check compared the
* offset against the numeric address of the `base` pointer, which
* is meaningless: on 64-bit systems `base` is a high virtual
* address that no 32-bit offset can exceed; on 32-bit systems the
* check fires only on accidental low-address allocations rather
* than on any underflow condition. The real underflow detection
* is the `match < base` line above, which uses the actual
* post-subtraction pointer. */
Author

The pointer-arithmetic UB concern is closed by the upstream check at line 1738 (sequence.offset > (size_t)(oLitEnd - base)): op == oLitEnd at line 1756 because op += litLength already ran at line 1747, so the upstream check provably guarantees op - sequence.offset >= base before match = op - sequence.offset forms. I made this explicit in the comment block above the kept if (match < base) defense-in-depth check. Force-pushed.
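The ordering argument in this reply can be illustrated with a small sketch. It assumes, as the reply states, that `op` equals `oLitEnd` by the time the match pointer is formed. `find_match` is a hypothetical helper, not a zstd function, and `NULL` stands in for the corruption error.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned char BYTE;

/* Returns the match pointer, or NULL when the offset is out of range.
 * op plays the role of oLitEnd; both pointers must be in the same buffer. */
static const BYTE* find_match(const BYTE* base, const BYTE* op, size_t offset)
{
    const BYTE* match;
    if (op < base) return NULL;                       /* distance must be non-negative */
    if (offset > (size_t)(op - base)) return NULL;    /* bound checked before any subtraction */
    match = op - offset;                              /* stays within [base, op] */
    assert(match >= base);                            /* defense in depth, cannot fail here */
    return match;
}
```

Because the bound is tested first, `op - offset` is only evaluated when the result is inside the buffer, so the later `match >= base` test is a redundant safety net rather than the primary guard.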

…pointer-as-int check

`ZSTD_execSequence` in `lib/legacy/zstd_v01.c` had two related
correctness issues in its sequence-offset validation:

1. The check at line 1735

       if (sequence.offset > (U32)(oLitEnd - base)) return ERROR(...);

   casts the `ptrdiff_t` operand to `U32` before comparing.  On 64-bit
   builds with destination buffers larger than `UINT32_MAX`, the cast
   truncates silently and the comparison is taken against the
   low-32-bit remainder of the actual distance.  An attacker that
   crafts a v0.1 legacy frame whose `sequence.offset` exceeds the
   truncated value but is still within `oLitEnd - base` passes the
   check and reaches the match-copy step with an oversized offset.

2. The check at line 1759

       if (sequence.offset > (size_t)base) return ERROR(...);

   compares the offset against the numeric address of the `base`
   pointer.  On any 64-bit system the address is far higher than any
   32-bit offset can reach, so the check never fires; on 32-bit
   systems it fires only when `base` happens to be allocated at a
   low address, which is unrelated to whether `op - sequence.offset`
   underflows `base`.

The real underflow detection is the `if (match < base)` line that
runs in between -- which is correct, and is left untouched.

Replace the U32 cast with a `(size_t)` cast so the comparison uses
full pointer precision; remove the no-op pointer-as-int check.
v0.1 legacy frames remain decodable; the legitimate parser path
through `ZSTDv01_decompressDCtx` does not regress.

Built locally: `make -C lib libzstd.a` clean.

* lib/legacy/zstd_v01.c (ZSTD_execSequence): fix offset bound and
  remove bogus pointer comparison.
@evilgensec evilgensec force-pushed the fix-zstd-v01-offset-check branch from 9c16d8f to 82b4fb7 Compare May 28, 2026 01:56
@meta-cla meta-cla Bot added the CLA Signed label May 28, 2026