
[WIP] Add span-based Deflate, ZLib and GZip encoder/decoder APIs#123145

Draft
iremyux wants to merge 62 commits into dotnet:main from iremyux:62113-zlib-encoder-decoder

iremyux commented Jan 13, 2026

This PR introduces new span-based, streamless compression and decompression APIs for Deflate, ZLib, and GZip formats, matching the existing BrotliEncoder/BrotliDecoder pattern.

New APIs

  • DeflateEncoder / DeflateDecoder
  • ZLibEncoder / ZLibDecoder
  • GZipEncoder / GZipDecoder

These classes provide:

  • Instance-based API for streaming/chunked compression with Compress(), Decompress(), and Flush()
  • Static one-shot API via TryCompress() and TryDecompress() for simple scenarios
  • GetMaxCompressedLength() to calculate buffer sizes
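For readers unfamiliar with the pattern being matched, here is a minimal sketch of the one-shot shape using the existing `BrotliEncoder`/`BrotliDecoder` types, since the Deflate/ZLib/GZip types in this PR are still a draft. The class name `BrotliPatternSketch` and the `RoundTrip` helper are hypothetical; the new encoders are only assumed to follow the same call sequence.

```csharp
using System;
using System.IO.Compression;

// Hypothetical helper illustrating the existing Brotli one-shot pattern.
static class BrotliPatternSketch
{
    public static byte[] RoundTrip(byte[] input)
    {
        // Size the destination from the worst-case bound.
        byte[] compressed = new byte[BrotliEncoder.GetMaxCompressedLength(input.Length)];
        if (!BrotliEncoder.TryCompress(input, compressed, out int compressedLength))
        {
            throw new InvalidOperationException("Destination buffer was too small.");
        }

        // The decompressed size is known here, so the output buffer can be exact.
        byte[] output = new byte[input.Length];
        if (!BrotliDecoder.TryDecompress(compressed.AsSpan(0, compressedLength), output, out int written))
        {
            throw new InvalidOperationException("Decompression failed.");
        }

        return output.AsSpan(0, written).ToArray();
    }
}
```

Sizing the destination from `GetMaxCompressedLength` is what lets `TryCompress` succeed in a single call without a resize loop.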

Closes #62113
Closes #39327
Closes #44793

/// <returns>One of the enumeration values that describes the status with which the operation finished.</returns>
public OperationStatus Flush(Span<byte> destination, out int bytesWritten)
{
return Compress(ReadOnlySpan<byte>.Empty, destination, out _, out bytesWritten, isFinalBlock: false);
Member

Does this force writing output (if available)? I think this should pass FlushCode.SyncFlush to the native API.
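As a point of comparison for the expected `Flush` semantics, the sketch below uses the existing `BrotliEncoder`: after a non-final `Compress` followed by `Flush`, a decoder should be able to recover everything written so far. Whether `DeflateEncoder.Flush` in this PR behaves the same way is exactly the open question above; the class and method names here are hypothetical.

```csharp
using System;
using System.Buffers;
using System.IO.Compression;

// Hypothetical helper: compress one chunk without finishing the stream,
// flush, and return what a decoder can recover from the bytes emitted so far.
static class BrotliFlushSketch
{
    public static byte[] DecodeAfterFlush(byte[] chunk)
    {
        byte[] compressed = new byte[BrotliEncoder.GetMaxCompressedLength(chunk.Length)];
        int total = 0;

        var encoder = new BrotliEncoder(5, 22); // quality 5, window 22
        try
        {
            OperationStatus status = encoder.Compress(
                chunk, compressed, out _, out int written, isFinalBlock: false);
            if (status != OperationStatus.Done) throw new InvalidOperationException(status.ToString());
            total += written;

            // Flush asks the encoder to emit whatever it is still buffering.
            status = encoder.Flush(compressed.AsSpan(total), out written);
            if (status != OperationStatus.Done) throw new InvalidOperationException(status.ToString());
            total += written;
        }
        finally
        {
            encoder.Dispose();
        }

        byte[] decoded = new byte[chunk.Length];
        var decoder = new BrotliDecoder();
        try
        {
            // The stream is not finished, so only the decoded bytes are inspected here.
            decoder.Decompress(compressed.AsSpan(0, total), decoded, out _, out int decodedLength);
            return decoded.AsSpan(0, decodedLength).ToArray();
        }
        finally
        {
            decoder.Dispose();
        }
    }
}
```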

/// <param name="source">A read-only span of bytes containing the source data to compress.</param>
/// <param name="destination">When this method returns, a span of bytes where the compressed data is stored.</param>
/// <param name="bytesWritten">When this method returns, the total number of bytes that were written to <paramref name="destination"/>.</param>
/// <param name="compressionLevel">A number representing compression level. -1 is default, 0 is no compression, 1 is best speed, 9 is best compression.</param>
Member

We should be clearer about which default we mean.

Suggested change
- /// <param name="compressionLevel">A number representing compression level. -1 is default, 0 is no compression, 1 is best speed, 9 is best compression.</param>
+ /// <param name="compressionLevel">A number representing compression level. -1 means implementation default, 0 is no compression, 1 is best speed, 9 is best compression.</param>

iremyux changed the title from "[WIP] Add span-based ZlibEncoder and ZlibDecoder APIs" to "[WIP] Add span-based Deflate, ZLib and GZip encoder/decoder APIs" on Jan 19, 2026
CompressionLevel.Fastest => ZLibNative.CompressionLevel.BestSpeed,
CompressionLevel.NoCompression => ZLibNative.CompressionLevel.NoCompression,
CompressionLevel.SmallestSize => ZLibNative.CompressionLevel.BestCompression,
_ => throw new ArgumentOutOfRangeException(nameof(compressionLevel)),
Member

This would fail on valid native compression levels not covered by the CompressionLevel enum. I think it should instead throw an out-of-range exception only when the value is < -1 or > 9.

Member

AraHaan commented Feb 4, 2026

To add to the above: callers who want a native compression level whose numeric value happens to equal a value in the CompressionLevel enum would not be able to use that level either. One possible solution is to expose one constructor overload that takes a CompressionLevel and another that takes an int, which is cast to ZLibNative.CompressionLevel after a range check.
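The two-overload idea could look roughly like the sketch below. `ZLibLevelSketch`, `FromEnum`, and `FromInt` are hypothetical names, plain `int` stands in for the internal `ZLibNative.CompressionLevel`, and the mapping of `CompressionLevel.Optimal` to -1 is an assumption that does not appear in the quoted diff.

```csharp
using System;
using System.IO.Compression;

// Hypothetical helper, not part of the PR: resolves a zlib level (-1..9)
// from either the CompressionLevel enum or a raw int.
static class ZLibLevelSketch
{
    public static int FromEnum(CompressionLevel level) => level switch
    {
        CompressionLevel.Optimal => -1,        // assumption: defer to the native default
        CompressionLevel.Fastest => 1,         // best speed
        CompressionLevel.NoCompression => 0,   // no compression
        CompressionLevel.SmallestSize => 9,    // best compression
        _ => throw new ArgumentOutOfRangeException(nameof(level)),
    };

    public static int FromInt(int level)
    {
        // Accept every native level, including those with no enum equivalent.
        ArgumentOutOfRangeException.ThrowIfLessThan(level, -1);
        ArgumentOutOfRangeException.ThrowIfGreaterThan(level, 9);
        return level;
    }
}
```

With separate entry points the ambiguity goes away: `FromInt(2)` yields native level 2, while `FromEnum(CompressionLevel.NoCompression)`, whose numeric value is also 2, yields native level 0.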

Copilot AI review requested due to automatic review settings February 26, 2026 11:47
Copilot AI left a comment

Pull request overview

Copilot reviewed 26 out of 26 changed files in this pull request and generated 3 comments.


/// <summary>
/// Defines the compression algorithms that can be used for <see cref="DeflateStream"/>, <see cref="GZipStream"/> or <see cref="ZLibStream"/>.
/// Defines the compression algorithms that can be used for <see cref="DeflateStream"/>, <see cref="GZipStream"/> or <see cref="ZLibStream"/>.
Copilot AI Feb 26, 2026

The sentence should have "Defines" capitalized consistently with the rest of the documentation, and there's an extra space after "Defines". Should be "Defines the compression algorithms..."

Copilot uses AI. Check for mistakes.
Copilot AI review requested due to automatic review settings March 4, 2026 09:48
Copilot AI left a comment

Pull request overview

Copilot reviewed 26 out of 26 changed files in this pull request and generated 4 comments.

ArgumentOutOfRangeException.ThrowIfNegative(inputLength);
ArgumentOutOfRangeException.ThrowIfGreaterThan(inputLength, uint.MaxValue);

return (long)Interop.ZLib.compressBound((uint)inputLength);
Copilot AI Mar 4, 2026

DeflateEncoder.GetMaxCompressedLength relies on Interop.ZLib.compressBound returning a 32-bit value. For inputLength close to uint.MaxValue, zlib's compressBound calculation can exceed 4GiB, so truncating to uint can produce an undersized bound (and the method currently permits inputLength == uint.MaxValue). Consider changing the native/interop contract to return a 64-bit size (e.g., ulong/nuint) and adding an overflow check (throw) when the bound can't be represented safely.

Suggested change
- return (long)Interop.ZLib.compressBound((uint)inputLength);
+ // Compute the same upper bound as zlib's compressBound macro using 64-bit arithmetic:
+ // bound = sourceLen + (sourceLen >> 12) + (sourceLen >> 14) + (sourceLen >> 25) + 13
+ ulong length = (ulong)inputLength;
+ ulong bound = length
+     + (length >> 12)
+     + (length >> 14)
+     + (length >> 25)
+     + 13;
+ if (bound > long.MaxValue)
+ {
+     throw new ArgumentOutOfRangeException(nameof(inputLength));
+ }
+ return (long)bound;
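To make the truncation concern concrete, the sketch below evaluates the formula quoted in the suggestion in 64-bit arithmetic and then narrows the result to 32 bits. It assumes the native library uses exactly that formula; `CompressBoundSketch` and its members are hypothetical names.

```csharp
// Hypothetical helper: evaluates the bound formula quoted in the suggestion.
static class CompressBoundSketch
{
    // 64-bit evaluation: cannot wrap for any input that fits in a uint.
    public static ulong Bound64(ulong sourceLen) =>
        sourceLen + (sourceLen >> 12) + (sourceLen >> 14) + (sourceLen >> 25) + 13;

    // What a caller observes if the same value is narrowed to 32 bits.
    public static uint Bound32Truncated(ulong sourceLen) =>
        unchecked((uint)Bound64(sourceLen));
}
```

For `sourceLen = uint.MaxValue` (4,294,967,295), the 64-bit bound is 4,294,967,295 + 1,048,575 + 262,143 + 127 + 13 = 4,296,278,153. Narrowed to 32 bits, that becomes 1,310,857, which is far smaller than the input itself.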

Comment on lines +145 to +151
Calculates and returns an upper bound on the compressed size after deflate compressing sourceLen bytes.
This is a worst-case estimate that accounts for incompressible data and zlib wrapper overhead.
The actual compressed size will typically be smaller.

Returns the maximum number of bytes the compressed output could require.
*/
FUNCTIONEXPORT uint32_t FUNCTIONCALLINGCONVENTION CompressionNative_CompressBound(uint32_t sourceLen);
Copilot AI Mar 4, 2026

CompressionNative_CompressBound is declared as returning uint32_t, but compressBound can compute values larger than 4GiB for large inputs. Since the managed API exposes GetMaxCompressedLength(long), consider returning a 64-bit size (uint64_t/size_t) from the native export (and updating interop) or tightening the accepted input range so the bound never overflows.

Comment on lines +88 to +92
_state = ZLibNative.ZLibStreamHandle.CreateForDeflate(
(ZLibNative.CompressionLevel)quality,
windowBits,
ZLibNative.Deflate_DefaultMemLevel,
ZLibNative.CompressionStrategy.DefaultStrategy);
Copilot AI Mar 4, 2026

DeflateEncoder always passes ZLibNative.Deflate_DefaultMemLevel to CreateForDeflate, even when quality/options specify no compression (level 0). DeflateStream adjusts memLevel for NoCompression (uses Deflate_NoCompressionMemLevel=7), so aligning DeflateEncoder with that behavior would reduce memory usage and keep the span-based API consistent with the stream implementation.
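The adjustment described above amounts to a one-line selection, sketched here with local constants standing in for the internal `ZLibNative` ones. The value 7 is taken from the comment; the value 8 for the default memLevel is an assumption, and `MemLevelSketch` is a hypothetical name.

```csharp
// Hypothetical helper mirroring the memLevel choice attributed to DeflateStream.
static class MemLevelSketch
{
    private const int DefaultMemLevel = 8;        // assumption: value of ZLibNative.Deflate_DefaultMemLevel
    private const int NoCompressionMemLevel = 7;  // value quoted in the review comment

    // Level 0 means "no compression", so the smaller memLevel is used.
    public static int Select(int compressionLevel) =>
        compressionLevel == 0 ? NoCompressionMemLevel : DefaultMemLevel;
}
```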

Comment on lines +63 to +83
/// <summary>
/// Gets or sets the base-2 logarithm of the window size for a compression stream.
/// </summary>
/// <exception cref="ArgumentOutOfRangeException">The value is less than -1 or greater than 15, or between 0 and 7.</exception>
/// <remarks>
/// Can accept -1 or any value between 8 and 15 (inclusive). Larger values result in better compression at the expense of memory usage.
/// -1 requests the default window log which is currently equivalent to 15 (32KB window). The default value is -1.
/// </remarks>
public int WindowLog
{
    get => _windowLog;
    set
    {
        if (value != -1)
        {
            ArgumentOutOfRangeException.ThrowIfLessThan(value, ZLibNative.MinWindowLog);
            ArgumentOutOfRangeException.ThrowIfGreaterThan(value, ZLibNative.MaxWindowLog);
        }

        _windowLog = value;
    }
Copilot AI Mar 4, 2026

ZLibCompressionOptions gained the WindowLog property (including range validation and new Min/Max/Default window log APIs), but the existing ZLibCompressionOptionsUnitTests don't cover default value, valid assignments, or out-of-range cases for WindowLog. Adding targeted tests would help prevent regressions in the new option surface area.
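The cases such tests would need to cover can be sketched against a stand-in type. `WindowLogOptionsSketch` below is hypothetical and only mirrors the setter quoted above, with the 8 and 15 limits taken from its doc comment; real tests would target `ZLibCompressionOptions` and follow the repository's test conventions.

```csharp
using System;

// Hypothetical stand-in mirroring the quoted setter; not the real ZLibCompressionOptions.
sealed class WindowLogOptionsSketch
{
    private const int MinWindowLog = 8;   // from the doc comment: valid range is 8..15
    private const int MaxWindowLog = 15;
    private int _windowLog = -1;          // -1 requests the default window log

    public int WindowLog
    {
        get => _windowLog;
        set
        {
            if (value != -1)
            {
                ArgumentOutOfRangeException.ThrowIfLessThan(value, MinWindowLog);
                ArgumentOutOfRangeException.ThrowIfGreaterThan(value, MaxWindowLog);
            }

            _windowLog = value;
        }
    }
}

static class WindowLogChecks
{
    // Returns true when assigning the value is rejected as out of range.
    public static bool Rejects(int value)
    {
        try
        {
            _ = new WindowLogOptionsSketch { WindowLog = value };
            return false;
        }
        catch (ArgumentOutOfRangeException)
        {
            return true;
        }
    }
}
```

The interesting boundaries are the default (-1), both ends of the valid range (8 and 15), and the values just outside it (-2, 0, 7, 16).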

Copilot AI review requested due to automatic review settings March 4, 2026 10:55
Copilot AI left a comment

Pull request overview

Copilot reviewed 30 out of 30 changed files in this pull request and generated 2 comments.

Comment on lines +156 to +164
/// <exception cref="ArgumentOutOfRangeException"><paramref name="inputLength"/> is negative or exceeds <see cref="uint.MaxValue"/>.</exception>
public static long GetMaxCompressedLength(long inputLength)
{
ArgumentOutOfRangeException.ThrowIfNegative(inputLength);
ArgumentOutOfRangeException.ThrowIfGreaterThan(inputLength, uint.MaxValue);

// compressBound() returns the upper bound for zlib-wrapped deflate output.
// For raw deflate (no header/trailer) this slightly overestimates, which is safe.
return (long)Interop.ZLib.compressBound((uint)inputLength);
Copilot AI Mar 4, 2026

The new encoders expose GetMaxCompressedLength(long), but these APIs are described as matching the BrotliEncoder pattern and are span-based (Span length is int). Consider aligning with BrotliEncoder by taking an int input size and throwing when the computed bound would exceed int.MaxValue; otherwise callers can receive bounds they cannot use to allocate a Span-backed buffer.

Suggested change
- /// <exception cref="ArgumentOutOfRangeException"><paramref name="inputLength"/> is negative or exceeds <see cref="uint.MaxValue"/>.</exception>
- public static long GetMaxCompressedLength(long inputLength)
- {
-     ArgumentOutOfRangeException.ThrowIfNegative(inputLength);
-     ArgumentOutOfRangeException.ThrowIfGreaterThan(inputLength, uint.MaxValue);
-     // compressBound() returns the upper bound for zlib-wrapped deflate output.
-     // For raw deflate (no header/trailer) this slightly overestimates, which is safe.
-     return (long)Interop.ZLib.compressBound((uint)inputLength);
+ /// <exception cref="ArgumentOutOfRangeException">
+ /// <paramref name="inputLength"/> is negative or the computed maximum compressed length exceeds <see cref="int.MaxValue"/>.
+ /// </exception>
+ public static int GetMaxCompressedLength(int inputLength)
+ {
+     ArgumentOutOfRangeException.ThrowIfNegative(inputLength);
+     // compressBound() returns the upper bound for zlib-wrapped deflate output.
+     // For raw deflate (no header/trailer) this slightly overestimates, which is safe.
+     uint bound = Interop.ZLib.compressBound((uint)inputLength);
+     if (bound > int.MaxValue)
+     {
+         throw new ArgumentOutOfRangeException(nameof(inputLength));
+     }
+     return (int)bound;

public System.Buffers.OperationStatus Compress(System.ReadOnlySpan<byte> source, System.Span<byte> destination, out int bytesConsumed, out int bytesWritten, bool isFinalBlock) { throw null; }
public void Dispose() { }
public System.Buffers.OperationStatus Flush(System.Span<byte> destination, out int bytesWritten) { throw null; }
public static long GetMaxCompressedLength(long inputLength) { throw null; }
Copilot AI Mar 4, 2026

GetMaxCompressedLength is part of the public surface area here and currently uses a long input/return type. If the intent is to follow the existing BrotliEncoder shape, consider using an int input size and throwing when the bound would exceed int.MaxValue (matching BrotliEncoder.GetMaxCompressedLength). This keeps the API consistent with Span-based limits and avoids returning sizes that can't back a Span.

Suggested change
- public static long GetMaxCompressedLength(long inputLength) { throw null; }
+ public static int GetMaxCompressedLength(int inputLength) { throw null; }

Development

Successfully merging this pull request may close these issues.

  • [API Proposal]: Add Deflate, ZLib and GZip encoder/decoder APIs
  • Add static compression helper methods
  • Span-based (non-stream) compression APIs

4 participants