[CELEBORN-2218] Bump lz4-java version from 1.8.0 to 1.10.4 to resolve CVE-2025-12183 and CVE-2025-66566 #3555

Closed
SteNicholas wants to merge 5 commits into apache:main from SteNicholas:CELEBORN-2218

Conversation

@SteNicholas

SteNicholas commented Dec 3, 2025

Member

What changes were proposed in this pull request?

  • Bump lz4-java version from 1.8.0 to 1.10.4 to resolve CVE-2025-12183 and CVE-2025-66566.
  • Lz4Decompressor follows the suggestion to move from fastDecompressor to safeDecompressor to mitigate the performance impact of the fix.

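Backport:

  • apache/spark#53327
  • apache/spark#53347
  • apache/spark#53971
  • apache/spark#53454
  • apache/spark#54585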

Why are the changes needed?

  • CVE-2025-12183: Various lz4-java compression and decompression implementations do not guard against out-of-bounds memory access. Untrusted input may lead to denial of service and information disclosure. Vulnerable Maven coordinates: org.lz4:lz4-java up to and including 1.8.0.

  • CVE-2025-66566: Insufficient clearing of the output buffer in Java-based decompressor implementations in lz4-java 1.10.0 and earlier allows remote attackers to read previous buffer contents via crafted compressed input. In applications where the output buffer is reused without being cleared, this may lead to disclosure of sensitive data. JNI-based implementations are not affected.

Therefore, lz4-java should be upgraded to 1.10.4.
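
The buffer-reuse condition described for CVE-2025-66566 can be illustrated with a small stand-alone sketch. Nothing here is lz4-java or Celeborn code: `ReusedOutputBuffer` and its `Decoder` interface are hypothetical names, and the "misbehaving decoder" is a toy stand-in for a decompressor fed crafted input. It only shows why zeroing a reused output buffer, and returning no more than the reported length, keeps stale bytes from a previous call out of the result.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration; not part of lz4-java or Celeborn. */
final class ReusedOutputBuffer {
    /** Stand-in for any decompressor: writes into dest and returns the number of bytes produced. */
    interface Decoder {
        int decode(byte[] src, byte[] dest);
    }

    private final byte[] buffer;

    ReusedOutputBuffer(int capacity) {
        this.buffer = new byte[capacity];
    }

    /** Zeroes the shared buffer, decodes into it, and returns only the reported bytes. */
    byte[] decodeInto(Decoder decoder, byte[] src) {
        // Drop whatever the previous call left behind before the decoder touches the buffer.
        Arrays.fill(buffer, (byte) 0);
        int written = decoder.decode(src, buffer);
        if (written < 0 || written > buffer.length) {
            throw new IllegalStateException("decoder reported an impossible length: " + written);
        }
        // Never expose bytes beyond the length the decoder reported.
        return Arrays.copyOf(buffer, written);
    }

    public static void main(String[] args) {
        ReusedOutputBuffer out = new ReusedOutputBuffer(8);

        // First call fills the whole buffer with data that should stay private.
        out.decodeInto((src, dest) -> {
            System.arraycopy(src, 0, dest, 0, src.length);
            return src.length;
        }, new byte[] {1, 2, 3, 4, 5, 6, 7, 8});

        // Toy misbehaving decoder: claims 4 bytes but writes only 1.
        byte[] result = out.decodeInto((src, dest) -> {
            dest[0] = src[0];
            return 4;
        }, new byte[] {9});

        System.out.println(Arrays.toString(result)); // prints [9, 0, 0, 0]
    }
}
```

Without the `Arrays.fill` call, the second result would carry bytes 2, 3 and 4 from the first call. Upgrading the library remains the fix proposed in this PR; the sketch only makes the failure mode concrete.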

Does this PR resolve a correctness bug?

No.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

CI.

@yawkat

yawkat commented Dec 3, 2025


I recommend you stick with fastestInstance. It is secure as long as you are on 1.8.1+. It will be much slower than in previous versions, but that can be mitigated by moving to fastestInstance.safeDecompressor like you did in this PR, which is much faster.

@SteNicholas

Member Author

@yawkat, thanks for the review. I have stuck with fastestInstance, since with the 1.8.1 patch applied the workarounds are not necessary. It is still recommended to move from fastDecompressor to safeDecompressor to mitigate the performance impact of the fix, however. PTAL.

@SteNicholas SteNicholas force-pushed the CELEBORN-2218 branch 2 times, most recently from c113f4f to 202ba06 Compare December 4, 2025 13:49
Comment thread client/src/main/java/org/apache/celeborn/client/compress/Lz4Decompressor.java Outdated
@SteNicholas SteNicholas force-pushed the CELEBORN-2218 branch 2 times, most recently from 95d3ed2 to 7ec6523 Compare December 11, 2025 05:56
@SteNicholas SteNicholas force-pushed the CELEBORN-2218 branch 2 times, most recently from da9e0bc to 8043b4a Compare December 11, 2025 06:24
@yawkat

yawkat commented Dec 11, 2025


Also, FYI, there was another CVE (CVE-2025-66566) that needs a newer version.

@SteNicholas SteNicholas changed the title from "[CELEBORN-2218] Bump lz4-java version from 1.8.0 to 1.8.1 to resolve CVE-2025-12183" to "[CELEBORN-2218] Bump lz4-java version from 1.8.0 to 1.10.0 to resolve CVE-2025-12183 and CVE-2025-66566" Dec 12, 2025
@Marcono1234


CVE-2025-66566 affects versions less than or equal to 1.10.0. You should upgrade to 1.10.1.

@SteNicholas SteNicholas changed the title from "[CELEBORN-2218] Bump lz4-java version from 1.8.0 to 1.10.0 to resolve CVE-2025-12183 and CVE-2025-66566" to "[CELEBORN-2218] Bump lz4-java version from 1.8.0 to 1.10.1 to resolve CVE-2025-12183 and CVE-2025-66566" Dec 15, 2025
@SteNicholas

Member Author

Ping @pan3793, @yawkat, @Marcono1234.

Comment thread client-tez/tez-shaded/pom.xml Outdated
@pan3793

pan3793 commented Dec 16, 2025

Member

lz4 is famous for its ultra-fast speed, and the upgrade is not free: my test shows it has a perf impact - apache/spark#53453

I understand that security takes precedence over performance, so I'm fine with this change.

As for the suggestion of 'moving to fastestInstance.safeDecompressor', I think we can NOT do that blindly. Celeborn Spark/Flink clients use the lz4-java libs provided by the engine. Since we support a wide range of Spark/Flink versions, it's possible that the engine still ships an old lz4-java jar, so we may need to dynamically check and bind fastDecompressor or safeDecompressor based on the runtime version of lz4-java.
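
If a runtime check of this kind were wanted, it might look roughly like the sketch below. The class `Lz4RuntimeVersion` and its methods are hypothetical and are not taken from this PR. The sketch assumes the lz4-java jar records `Implementation-Version` in its manifest, which may not hold for every build, so `detect()` can come back empty. The threshold in `isPatched` follows the ranges quoted in this thread (CVE-2025-12183 up to and including 1.8.0, CVE-2025-66566 up to and including 1.10.0).

```java
import java.util.Optional;

/** Hypothetical helper for illustration; names are not from Celeborn or lz4-java. */
final class Lz4RuntimeVersion {
    /** Numeric value of one dotted part; missing parts and non-numeric suffixes count as 0. */
    private static int part(String[] parts, int i) {
        if (i >= parts.length) {
            return 0;
        }
        // Keep only the leading digits, so "4-SNAPSHOT" is read as 4.
        String digits = parts[i].replaceAll("[^0-9].*$", "");
        return digits.isEmpty() ? 0 : Integer.parseInt(digits);
    }

    /** Compares dotted versions such as "1.8.0" and "1.10.4" part by part. */
    static int compare(String a, String b) {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        for (int i = 0; i < Math.max(x.length, y.length); i++) {
            int c = Integer.compare(part(x, i), part(y, i));
            if (c != 0) {
                return c;
            }
        }
        return 0;
    }

    /** True when the version is newer than 1.10.0, i.e. outside both ranges quoted in this thread. */
    static boolean isPatched(String version) {
        return compare(version, "1.10.0") > 0;
    }

    /** Best-effort lookup of the manifest version of whatever lz4-java jar is on the classpath. */
    static Optional<String> detect() {
        try {
            // initialize=false: only locate the class, do not run its static initializers.
            Class<?> factory = Class.forName(
                "net.jpountz.lz4.LZ4Factory", false, Lz4RuntimeVersion.class.getClassLoader());
            Package pkg = factory.getPackage();
            return Optional.ofNullable(pkg == null ? null : pkg.getImplementationVersion());
        } catch (ClassNotFoundException | LinkageError e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        String status = detect()
            .map(v -> v + (isPatched(v) ? " (patched)" : " (in an affected range)"))
            .orElse("lz4-java version unknown");
        System.out.println(status);
    }
}
```

An unknown version would have to be treated as unpatched, which is one reason a check like this is fragile compared with binding a single decompressor unconditionally.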

@yawkat

yawkat commented Dec 16, 2025


@pan3793 safeDecompressor should work just fine on old versions, and even on those old versions, it should be slightly faster than fastDecompressor. In fact, using safeDecompressor gets rid of most (but not all) of the security impact of the CVEs on old versions.

Comment thread client/benchmarks/LZ4TPCDSDataBenchmark-jdk17-results.txt Outdated
@pan3793

pan3793 commented Mar 2, 2026

Member

@SteNicholas Code change looks fine, but let's wait for a while to collect feedback from other reviewers, about the performance drop.

@SteNicholas

SteNicholas commented Mar 2, 2026

Member Author

@yawkat, could you please take a look at the performance drop of safeDecompressor? See the benchmark report in LZ4TPCDSDataBenchmark-jdk17-results.txt.

@yawkat

yawkat commented Mar 2, 2026


@SteNicholas I don't have time for deep benchmarking, but I just threw an AI agent at it, and it figured out that some build flag changes that distros do can improve performance by 10%. Could you test with the 1.10.4 I just released?

@pan3793

pan3793 commented Mar 3, 2026

Member

@yawkat thanks! it indeed solves the performance regression. our benchmark shows 1.10.4 is much faster than 1.10.3, and even faster than 1.8.0! and I got a similar result in Spark apache/spark#54585

@RexXiong

RexXiong commented Mar 3, 2026

Contributor

> @yawkat thanks! it indeed solves the performance regression. our benchmark shows 1.10.4 is much faster than 1.10.3, and even faster than 1.8.0! and I got a similar result in Spark apache/spark#54585

Sounds great. I think we can keep LZ4 as the default.

@SteNicholas SteNicholas changed the title from "[CELEBORN-2218] Bump lz4-java version from 1.8.0 to 1.10.3 to resolve CVE-2025-12183 and CVE-2025-66566" to "[CELEBORN-2218] Bump lz4-java version from 1.8.0 to 1.10.4 to resolve CVE-2025-12183 and CVE-2025-66566" Mar 3, 2026
SteNicholas added a commit that referenced this pull request Mar 3, 2026
… CVE‐2025‐12183 and CVE-2025-66566

What changes were proposed in this pull request?

- Bump lz4-java version from 1.8.0 to 1.10.4 to resolve CVE-2025-12183 and CVE-2025-66566.
- `Lz4Decompressor` follows the [suggestion](apache/spark#53290 (comment)) to move from `fastDecompressor` to `safeDecompressor` to mitigate the performance impact of the fix.

Backport:

- apache/spark#53327
- apache/spark#53347
- apache/spark#53971
- apache/spark#53454
- apache/spark#54585

Why are the changes needed?

- [CVE-2025-12183](https://sites.google.com/sonatype.com/vulnerabilities/cve-2025-12183): Various lz4-java compression and decompression implementations do not guard against out-of-bounds memory access. Untrusted input may lead to denial of service and information disclosure. Vulnerable Maven coordinates: org.lz4:lz4-java up to and including 1.8.0.

- [CVE-2025-66566](GHSA-cmp6-m4wj-q63q): Insufficient clearing of the output buffer in Java-based decompressor implementations in lz4-java 1.10.0 and earlier allows remote attackers to read previous buffer contents via crafted compressed input. In applications where the output buffer is reused without being cleared, this may lead to disclosure of sensitive data. JNI-based implementations are not affected.

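Therefore, lz4-java should be upgraded to 1.10.4.

Does this PR resolve a correctness bug?

No.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

CI.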

Closes #3555 from SteNicholas/CELEBORN-2218.

Lead-authored-by: SteNicholas <programgeek@163.com>
Co-authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: SteNicholas <programgeek@163.com>
(cherry picked from commit dca3749)
Signed-off-by: SteNicholas <programgeek@163.com>
@SteNicholas

Member Author

Thanks, all. Merged to main (v0.7.0) and branch-0.6 (v0.6.3).

@yawkat

yawkat commented Mar 3, 2026


Given that zstd is from the same authors as lz4 but newer, it may still be a good idea to move to zstd as the default long-term.

pan3793 added a commit to pan3793/iceberg that referenced this pull request Mar 5, 2026
Iceberg switched to the `at.yawk.lz4:lz4-java` group for security reasons, but it unintentionally introduced a performance regression.

https://github.com/yawkat/lz4-java/releases/tag/v1.10.4

> These changes attempt to fix the native performance regression in 1.9+. They should have no functional or security impact.

See the benchmark reports in Celeborn and Spark projects

- CELEBORN-2218 / apache/celeborn#3555
- SPARK-55803 / apache/spark#54585
pan3793 added a commit to pan3793/trino that referenced this pull request Mar 5, 2026
Trino switched to the `at.yawk.lz4:lz4-java` group for security reasons, but it unintentionally introduced a performance regression.

https://github.com/yawkat/lz4-java/releases/tag/v1.10.4

> These changes attempt to fix the native performance regression in 1.9+. They should have no functional or security impact.

See the benchmark reports in Celeborn and Spark projects

- CELEBORN-2218 / apache/celeborn#3555
- SPARK-55803 / apache/spark#54585
electrum pushed a commit to trinodb/trino that referenced this pull request Mar 5, 2026
Trino switched to the `at.yawk.lz4:lz4-java` group for security reasons, but it unintentionally introduced a performance regression.

https://github.com/yawkat/lz4-java/releases/tag/v1.10.4

> These changes attempt to fix the native performance regression in 1.9+. They should have no functional or security impact.

See the benchmark reports in Celeborn and Spark projects

- CELEBORN-2218 / apache/celeborn#3555
- SPARK-55803 / apache/spark#54585
pan3793 added a commit to pan3793/clickhouse-java that referenced this pull request Mar 5, 2026
ClickHouse Java Client switched to `at.yawk.lz4:lz4-java` for security reasons, but it unintentionally introduced a performance regression.

https://github.com/yawkat/lz4-java/releases/tag/v1.10.4

> These changes attempt to fix the native performance regression in 1.9+. They should have no functional or security impact.

See the benchmark reports in Apache Celeborn and Apache Spark projects

- CELEBORN-2218 / apache/celeborn#3555
- SPARK-55803 / apache/spark#54585
huaxingao pushed a commit to apache/iceberg that referenced this pull request Mar 6, 2026
Iceberg switched to the `at.yawk.lz4:lz4-java` group for security reasons, but it unintentionally introduced a performance regression.

https://github.com/yawkat/lz4-java/releases/tag/v1.10.4

> These changes attempt to fix the native performance regression in 1.9+. They should have no functional or security impact.

See the benchmark reports in Celeborn and Spark projects

- CELEBORN-2218 / apache/celeborn#3555
- SPARK-55803 / apache/spark#54585
RjLi13 pushed a commit to RjLi13/iceberg that referenced this pull request Mar 12, 2026
Iceberg switched to the `at.yawk.lz4:lz4-java` group for security reasons, but it unintentionally introduced a performance regression.

https://github.com/yawkat/lz4-java/releases/tag/v1.10.4

> These changes attempt to fix the native performance regression in 1.9+. They should have no functional or security impact.

See the benchmark reports in Celeborn and Spark projects

- CELEBORN-2218 / apache/celeborn#3555
- SPARK-55803 / apache/spark#54585