
LZ4.AI.Sharp

⚠️ AI-Generated Code — This entire library was migrated from C to C# and performance-tuned using GitHub Copilot.


A high-performance C#/.NET implementation of the LZ4 compression algorithm, featuring both fast and high-compression modes. This library was created as an AI-assisted port of the LZ4 reference implementation to demonstrate modern AI-powered code migration and optimization.

Features

  • Fast Compression (LZ4Codec) — Optimized for speed with competitive compression ratios
  • High Compression (LZ4HC) — Better compression ratios at slower speeds
  • Dictionary Compression — Pre-trained dictionaries for superior small-message compression
  • Frame Format (LZ4Frame) — Stream-oriented API with checksums and metadata
  • XXHash — Fast non-cryptographic hash function
  • Pure Managed C# — No external native library dependencies, runs on any .NET platform (uses unsafe code for performance)
  • Zero Allocations — Work directly with byte arrays and spans
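To make the XXHash feature above concrete, here is a minimal, self-contained XXH32 sketch of the reference algorithm. This is illustrative only — it is not the library's tuned implementation, and the class name `XXHash32Sketch` is invented for this example.

```csharp
// Minimal XXH32 sketch (reference algorithm, safe C#) — for illustration,
// not the library's optimized implementation.
static class XXHash32Sketch
{
    const uint P1 = 2654435761, P2 = 2246822519, P3 = 3266489917, P4 = 668265263, P5 = 374761393;

    static uint Rotl(uint x, int r) => (x << r) | (x >> (32 - r));
    static uint Read32(byte[] d, int i) =>
        (uint)(d[i] | d[i + 1] << 8 | d[i + 2] << 16 | d[i + 3] << 24);

    public static uint Hash(byte[] data, uint seed = 0)
    {
        int i = 0, len = data.Length;
        uint h;
        if (len >= 16)
        {
            // Four independent accumulators each consume 4 bytes per 16-byte stripe.
            uint v1 = seed + P1 + P2, v2 = seed + P2, v3 = seed, v4 = seed - P1;
            for (; i <= len - 16; i += 16)
            {
                v1 = Rotl(v1 + Read32(data, i) * P2, 13) * P1;
                v2 = Rotl(v2 + Read32(data, i + 4) * P2, 13) * P1;
                v3 = Rotl(v3 + Read32(data, i + 8) * P2, 13) * P1;
                v4 = Rotl(v4 + Read32(data, i + 12) * P2, 13) * P1;
            }
            h = Rotl(v1, 1) + Rotl(v2, 7) + Rotl(v3, 12) + Rotl(v4, 18);
        }
        else h = seed + P5;

        h += (uint)len;
        // Consume the remaining 4-byte words, then trailing bytes.
        for (; i <= len - 4; i += 4) h = Rotl(h + Read32(data, i) * P3, 17) * P4;
        for (; i < len; i++) h = Rotl(h + (uint)data[i] * P5, 11) * P1;

        // Final avalanche: spread remaining entropy across all output bits.
        h ^= h >> 15; h *= P2;
        h ^= h >> 13; h *= P3;
        h ^= h >> 16;
        return h;
    }
}
```

The empty-input hash with seed 0 is `0x02CC5D05`, the canonical XXH32 test vector, which makes a handy sanity check for any port.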

Installation

```shell
dotnet add package LZ4.AI.Sharp
```

Quick Start

Fast Compression

```csharp
using LZ4Sharp;

byte[] input = GetYourData();
byte[] compressed = new byte[LZ4Codec.CompressBound(input.Length)];
// acceleration: 1 (default, best ratio) up to 65537 (fastest, lowest ratio)
int compressedSize = LZ4Codec.CompressFast(input, compressed, input.Length, compressed.Length, acceleration: 1);

byte[] decompressed = new byte[input.Length];
LZ4Codec.DecompressSafe(compressed, decompressed, compressedSize, decompressed.Length);
```
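The destination buffer above is sized with CompressBound because incompressible input can expand slightly rather than shrink. In the C reference implementation, LZ4_compressBound is a simple closed-form expression; a sketch of that formula (this library's exact bound may differ):

```csharp
// Worst-case compressed size for an n-byte input, mirroring the
// LZ4_compressBound macro from the C reference: n + n/255 + 16.
static int CompressBound(int inputSize) =>
    inputSize + inputSize / 255 + 16;
```

The `n/255` term accounts for the per-block literal-length encoding overhead, and the constant covers the final literal run.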

High Compression

```csharp
using LZ4Sharp;

byte[] input = GetYourData();
byte[] compressed = new byte[LZ4HC.CompressBound(input.Length)];
int compressedSize = LZ4HC.CompressHC(input, compressed, input.Length, compressed.Length);
```

Dictionary Compression

Dictionary compression dramatically improves ratios for small, similar messages (e.g., JSON API responses, log lines, protocol buffers) by pre-training on representative data.

```csharp
using LZ4Sharp;

// Build a dictionary from representative samples
byte[] dictionary = BuildDictionaryFromSamples();

// data is the message to compress; the decompressor must learn
// the uncompressed length (originalSize) out of band
byte[] data = GetYourMessage();
int originalSize = data.Length;

// Compress with dictionary (fast mode)
byte[] compressed = new byte[LZ4Codec.CompressBound(data.Length)];
int compressedSize = LZ4Codec.CompressWithDict(data, compressed, data.Length, compressed.Length, dictionary);

// Decompress with the SAME dictionary
byte[] decompressed = new byte[originalSize];
int size = LZ4Codec.DecompressWithDict(compressed, decompressed, compressedSize, decompressed.Length, dictionary);

// HC dictionary compression for better ratios
int hcSize = LZ4HC.CompressHCWithDict(data.AsSpan(), compressed, dictionary.AsSpan(), compressionLevel: 9);
```

Note: Both compressor and decompressor must use the identical dictionary. Only the last 64 KB of the dictionary is used. Span overloads are also available for zero-copy scenarios. See DICTIONARY-GUIDE.md for a complete guide on building and using dictionaries.
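The `BuildDictionaryFromSamples` call in the example above is left to the caller. One naive approach — purely illustrative, since a real dictionary trainer (such as zstd's dictionary builder) produces far better dictionaries — is to concatenate representative samples and keep only the final 64 KB, since LZ4 ignores anything earlier:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative only: concatenate samples and keep the 64 KB tail.
// A trained dictionary will compress considerably better than
// raw concatenation of samples.
static byte[] BuildDictionaryFromSamples(IEnumerable<byte[]> samples, int maxSize = 64 * 1024)
{
    byte[] all = samples.SelectMany(s => s).ToArray();
    // Keep the tail: only the last 64 KB of a dictionary is used.
    return all.Length <= maxSize ? all : all[^maxSize..];
}
```

Putting the most representative samples last matters with this approach, because the tail is what survives truncation.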

Performance

See BENCHMARKS.md for detailed performance results on JSON and Silesia corpus datasets.

Highlights (Intel N100, .NET 10):

  • JSON dataset: compression time 0.29x–0.77x that of K4os (i.e., 1.3x–3.4x faster)
  • Silesia corpus (accel=1): compression time 0.09x–1.00x that of K4os (up to 11.1x faster)
  • Silesia: faster on 11 of 12 files at accel=1
  • Compression ratios match the reference implementation

Testing

The test suite includes:

  • Unit Tests — Comprehensive algorithm correctness tests
  • Compatibility Tests — Cross-validation with K4os.Compression.LZ4 to ensure interoperability
  • Round-trip Tests — Compression/decompression verification

All tests verify that LZ4Sharp produces output compatible with other LZ4 implementations.

Development

```shell
# Build
dotnet build src/LZ4Sharp.sln -c Release

# Test
dotnet test src/LZ4Sharp.sln -c Release

# Run benchmarks
dotnet run --project src/LZ4Sharp.Benchmarks -c Release

# Package
dotnet pack src/LZ4Sharp/LZ4Sharp.csproj -c Release
```

AI Provenance

This library is 100% AI-generated code:

  1. Migration — Ported from C reference implementation using GitHub Copilot
  2. Optimization — Performance-tuned through iterative AI-assisted refinement
  3. Testing — Test cases and benchmarks created with AI assistance

The project demonstrates AI capabilities in code translation, optimization, and maintaining algorithmic correctness while adapting to different language paradigms.

Attribution

License

MIT License — See LICENSE for details.
