fix: OTLP wire format — raw zstd + correct resource attribute field tag by strawgate · Pull Request #887 · strawgate/fastforward

strawgate · 2026-04-04T02:53:08Z

Summary

Raw zstd compression (OTLP sink sends logfwd custom wire format with Content-Encoding: zstd instead of raw zstd #455): Replace ChunkCompressor (which prepends a 16-byte logfwd-internal header) with zstd::bulk::Compressor so OTLP receivers get standard zstd frames per HTTP Content-Encoding spec.
Correct resource attribute field tag (P0: OTLP resource attributes encoded with wrong protobuf field tag #642): Add field_number parameter to encode_key_value_{string,int,double,bool}. Resource attributes now use field 1 (Resource.attributes), LogRecord attributes use field 6 (LogRecord.attributes).

Fixes #455, Fixes #642
Part of work-unit #869

Test plan

cargo test -p logfwd-output — 140 tests pass (including OTLP encoding tests)
cargo test -p logfwd-core — 66 tests pass
cargo clippy -p logfwd-output -- -D warnings — clean
cargo fmt — clean
Manual: send to an OTLP collector and verify resource attributes decode correctly
Manual: verify zstd-compressed payloads are accepted without the logfwd header

🤖 Generated with Claude Code

Two P0 fixes in the OTLP encoding path: 1. Use raw zstd compression instead of ChunkCompressor (#455). ChunkCompressor prepends a 16-byte logfwd-internal header that OTLP receivers cannot parse. Now uses zstd::bulk::Compressor directly to produce standard zstd frames per HTTP Content-Encoding spec. 2. Parameterize protobuf field tag in encode_key_value_* functions (#642). Resource.attributes is field 1, LogRecord.attributes is field 6. Previously all call sites hardcoded field 6, causing resource attributes to be misplaced in the protobuf output. Fixes #455, Fixes #642 Part of work-unit #869 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-04-04T02:53:23Z

Walkthrough

The PR replaces the OTLP sink's zstd compression mechanism from a custom logfwd wire format to raw zstd frames by adding the zstd crate dependency and switching from logfwd_io::compress::ChunkCompressor to zstd::bulk::Compressor. Additionally, it corrects protobuf field tag encoding for Resource attributes by parameterizing helper functions with field numbers to emit Resource attributes on field 1 and LogRecord attributes on field 6 per OTLP schema specifications.

Possibly related PRs

fix: gRPC framing correctness and robustness follow-up #650: Modifies the same file's zstd compressor type and compressed payload handling in the OTLP sink.
fix: CLI UX, config, OTLP HTTP conformance, and metrics double-count bugs #744: Updates the OTLP sink's send path compression and buffer management alongside other sink behavior.
refactor: decommission shadow JSON parser in otlp.rs (#357) #370: Refactored OTLP encoding consolidation that directly overlaps with the protobuf field encoding changes in this PR.

Caution

Pre-merge checks failed

Please resolve all errors before merging. Addressing warnings is optional.

Ignore

❌ Failed checks (3 errors, 3 warnings)

Check name	Status	Explanation	Resolution
High-Quality Rust Practices	❌ Error	The `pub fn new()` method on `OtlpSink` lacks a `///` doc comment, violating the requirement for all public items to have documentation.	Add a doc comment to `pub fn new()` explaining its purpose and documenting each parameter (endpoint, protocol, compression, headers).
Formal Verification Coverage	❌ Error	PR parameterizes encode_key_value_* functions correctly but lacks all required formal verification: no Kani proofs, proptest roundtrip tests, or VERIFICATION.md updates specified in issue `#642`.	Add four Kani proofs to otlp_sink.rs, proptest roundtrip test with 10K iterations, and update VERIFICATION.md per issue `#642` requirements before merging.
Crate Boundary And Dependency Integrity	❌ Error	zstd dependency declared directly in logfwd-output/Cargo.toml instead of in workspace.dependencies section.	Move zstd = "0.13" to root workspace.dependencies and reference via { workspace = true } in logfwd-output/Cargo.toml.
Linked Issues check	⚠️ Warning	PR partially addresses `#455` and `#642` but omits critical requirements: missing Kani proofs, proptest roundtrip, fuzz targets, and field_number parameter for all encode_key_value_* functions.	Add missing verification: (1) Kani proofs for encode_key_value_string/int/double/bool verifying correct field tags; (2) proptest roundtrip test for resource attributes; (3) fuzz target for encode_batch; (4) ensure all four encode_key_value_* helpers accept field_number parameter.
Documentation Thoroughly Updated	⚠️ Warning	PR implements parameterized field_number in otlp_sink.rs with doc comments but omits required documentation: Kani proofs, proptest roundtrip tests, fuzz target, VERIFICATION.md update, DESIGN.md ADR, DEVELOPING.md lesson, and PHASES.md status mark.	Add #[cfg(kani)] mod with 4 proofs; add proptest roundtrip test; add fuzz_targets/encode_batch.rs; update VERIFICATION.md, DESIGN.md ADR, DEVELOPING.md lesson, PHASES.md.
Maintainer Fitness	⚠️ Warning	PR implements core logic for issue `#642` but omits mandatory verification work including Kani proofs, proptest roundtrip test, fuzz target, and documentation.	Implement all acceptance criteria: add cfg(kani) verification block with 4 proof harnesses, proptest_resource_attrs_roundtrip test, encode_batch fuzz target, manual testing validation, and risk documentation.

✅ Passed checks (1 passed)

Check name	Status	Explanation
Out of Scope Changes check	✅ Passed	All changes directly address linked issues `#455` (zstd compression format) and `#642` (protobuf field tags). No unrelated modifications detected.

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

crates/logfwd-output/src/otlp_sink.rs (2)

641-650: ⚠️ Potential issue | 🟡 Minor

expect() on payload size could panic on oversized batches.

If payload.len() > u32::MAX, this panics. While 4 GiB payloads are unlikely, a defensive check returning io::Error would be safer in production.

🛡️ Defensive alternative

 fn write_grpc_frame(buf: &mut Vec<u8>, payload: &[u8], compressed: bool) {
+    let len = u32::try_from(payload.len())
+        .map_err(|_| io::Error::new(io::ErrorKind::InvalidInput, "gRPC payload exceeds 4 GiB"))?;
     buf.clear();
     buf.push(u8::from(compressed));
-    buf.extend_from_slice(
-        &u32::try_from(payload.len())
-            .expect("gRPC message payload must be < 4 GiB")
-            .to_be_bytes(),
-    );
+    buf.extend_from_slice(&len.to_be_bytes());
     buf.extend_from_slice(payload);
+    Ok(())
 }

This requires updating the signature to return io::Result<()> and propagating in send_batch.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@crates/logfwd-output/src/otlp_sink.rs` around lines 641 - 650, The helper
write_grpc_frame currently calls expect(...) and can panic for payloads >
u32::MAX; change write_grpc_frame to return std::io::Result<()> instead of
panicking, validate payload.len() <= u32::MAX, and return
Err(std::io::Error::new(io::ErrorKind::InvalidData, "gRPC message payload too
large")) when it exceeds; update callers (notably send_batch) to propagate the
error (using ?), and adapt any call sites to handle or convert the io::Error as
needed.

201-215: ⚠️ Potential issue | 🟠 Major

Hot-path allocation: compress() allocates a new Vec on every batch.

compressor.compress() creates a fresh buffer per send call, violating the "no per-record allocations" rule. The correct fix reuses compress_buf with compress_to_buffer() via a Cursor, as shown in crates/logfwd-io/src/compress.rs.

🔧 Correct fix: reuse buffer with Cursor

             Compression::Zstd => {
                 if let Some(ref mut compressor) = self.compressor {
                     // Produce raw zstd frames — no logfwd-internal header.
                     // OTLP receivers expect standard zstd per HTTP Content-Encoding.
-                    self.compress_buf.clear();
-                    let compressed = compressor
-                        .compress(&self.encoder_buf)
-                        .map_err(io::Error::other)?;
-                    self.compress_buf = compressed;
+                    // Resize to worst-case bound to avoid reallocation.
+                    let max_compressed = zstd::zstd_safe::compress_bound(self.encoder_buf.len());
+                    self.compress_buf.clear();
+                    self.compress_buf.resize(max_compressed, 0);
+                    let mut cursor = std::io::Cursor::new(&mut self.compress_buf);
+                    let compressed_len = compressor
+                        .compress_to_buffer(&self.encoder_buf, &mut cursor)
+                        .map_err(io::Error::other)?;
+                    self.compress_buf.truncate(compressed_len);
                     &self.compress_buf

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@crates/logfwd-output/src/otlp_sink.rs` around lines 201 - 215, The current
hot path calls compressor.compress(&self.encoder_buf) which allocates a new Vec
each send; change it to reuse self.compress_buf by calling the non-allocating
API (compress_to_buffer) with a Cursor over self.compress_buf: clear or truncate
self.compress_buf, create a Cursor (std::io::Cursor) around it, call
compressor.compress_to_buffer(&mut cursor, &self.encoder_buf) (as implemented in
compress.rs), then use &self.compress_buf as the payload; keep the branch and
symbol names (Compression::Zstd, self.compressor, self.compress_buf,
self.encoder_buf) so only the allocation call is replaced with the Cursor-based
compress_to_buffer approach.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@crates/logfwd-output/src/otlp_sink.rs`:
- Around line 641-650: The helper write_grpc_frame currently calls expect(...)
and can panic for payloads > u32::MAX; change write_grpc_frame to return
std::io::Result<()> instead of panicking, validate payload.len() <= u32::MAX,
and return Err(std::io::Error::new(io::ErrorKind::InvalidData, "gRPC message
payload too large")) when it exceeds; update callers (notably send_batch) to
propagate the error (using ?), and adapt any call sites to handle or convert the
io::Error as needed.
- Around line 201-215: The current hot path calls
compressor.compress(&self.encoder_buf) which allocates a new Vec each send;
change it to reuse self.compress_buf by calling the non-allocating API
(compress_to_buffer) with a Cursor over self.compress_buf: clear or truncate
self.compress_buf, create a Cursor (std::io::Cursor) around it, call
compressor.compress_to_buffer(&mut cursor, &self.encoder_buf) (as implemented in
compress.rs), then use &self.compress_buf as the payload; keep the branch and
symbol names (Compression::Zstd, self.compressor, self.compress_buf,
self.encoder_buf) so only the allocation call is replaced with the Cursor-based
compress_to_buffer approach.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository YAML (base), Organization UI (inherited)

Review profile: ASSERTIVE

Plan: Pro

Run ID: b9fab4b7-891f-4a68-8f6f-0abb144a5237

📥 Commits

Reviewing files that changed from the base of the PR and between 26cee6e and 50f043c.

⛔ Files ignored due to path filters (1)

Cargo.lock is excluded by !**/*.lock

📒 Files selected for processing (2)

crates/logfwd-output/Cargo.toml
crates/logfwd-output/src/otlp_sink.rs

… panic Reuse self.compress_buf via compress_to_buffer() instead of allocating a new Vec on every batch in the hot path. Replace expect() panic in write_grpc_frame with io::Result propagation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

… panic (#888) Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai Bot reviewed Apr 4, 2026

View reviewed changes

strawgate merged commit d3cea6d into master Apr 4, 2026
7 checks passed

strawgate deleted the fix/otlp-wire-format-869 branch April 4, 2026 03:26

strawgate mentioned this pull request Apr 4, 2026

fix: address CodeRabbit feedback on #887 — reuse compress buf, remove panic #888

Merged

3 tasks

strawgate added a commit that referenced this pull request Apr 4, 2026

fix: address CodeRabbit feedback on #887 — reuse compress buf, remove…

5a32147

… panic (#888) Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

This was referenced Apr 4, 2026

Roadmap: logfwd implementation phases #889

Closed

work-unit: logfwd-output/otlp — P0 OTLP wire format fix #869

Closed

This was referenced Apr 4, 2026

feat: add verification coverage for OTLP protobuf encoding #907

Merged

Fix OTLP sink error handling and init path panics #1164

Merged

fix(logfwd-output): OTLP sink retry and gRPC Content-Encoding fixes #1242

Closed

github-actions Bot mentioned this pull request Apr 6, 2026

OTLP receiver refactor — axum, streaming decompression, unified receivers #1311

Open

7 tasks

This was referenced Apr 8, 2026

fix(otlp): resource column grouping contract (#971) #1573

Merged

[codex] Preserve OTLP resource grouping for null-safe resource tuples (#971) #1588

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: OTLP wire format — raw zstd + correct resource attribute field tag#887

fix: OTLP wire format — raw zstd + correct resource attribute field tag#887
strawgate merged 1 commit into
masterfrom
fix/otlp-wire-format-869

strawgate commented Apr 4, 2026

Uh oh!

coderabbitai Bot commented Apr 4, 2026 •

edited

Loading

Pre-merge checks failed

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

strawgate commented Apr 4, 2026

Summary

Test plan

Uh oh!

coderabbitai Bot commented Apr 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Possibly related PRs

Pre-merge checks failed

❌ Failed checks (3 errors, 3 warnings)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

coderabbitai Bot commented Apr 4, 2026 •

edited

Loading