Skip to content

Add fuzz testing harness (3 targets, found scanner crash)#57

Merged
strawgate merged 1 commit into
masterfrom
fuzz-harness
Mar 29, 2026
Merged

Add fuzz testing harness (3 targets, found scanner crash)#57
strawgate merged 1 commit into
masterfrom
fuzz-harness

Conversation

@strawgate

Copy link
Copy Markdown
Owner

Summary

Adds cargo-fuzz / libfuzzer fuzz testing for logfwd-core. Three fuzz targets that test the boundaries where untrusted data enters the system.

Fuzz targets

Target What it tests 30s results
fuzz_format_parser JsonParser, RawParser, CriParser with arbitrary bytes (binary, partial, huge) 852K runs, no crashes
fuzz_scanner JSON scanner → Arrow RecordBatch with malformed input CRASH FOUND
fuzz_pipe_event PipeEvent struct deserialization from corrupted ring buffer data 57.9M runs, no crashes

Bug found: scanner crash on binary keys (#56)

The scanner panics when processing malformed JSON with binary (non-UTF-8) key bytes. The batch builder produces columns with mismatched lengths. Minimized to 18 bytes.

Likely resolved by the upcoming SIMD parser rewrite (#23).

Usage

cd crates/logfwd-core
cargo +nightly fuzz run fuzz_format_parser   # run until stopped
cargo +nightly fuzz run fuzz_scanner -- -max_total_time=60  # 60 seconds

Test plan

  • cargo test --workspace — 102 tests pass (fuzz crate is separate workspace)
  • cargo clippy --workspace -- -D warnings — clean
  • All fuzz targets build with cargo +nightly fuzz build
  • 30-second fuzz runs complete without issues (except the known scanner bug)

🤖 Generated with Claude Code

@coderabbitai

coderabbitai Bot commented Mar 29, 2026

Copy link
Copy Markdown

Walkthrough

A new fuzzing test suite is added under crates/logfwd-core/fuzz. A Cargo.toml manifest defines a non-published fuzz crate with three binaries. New fuzz targets added: format_parser (exercises JsonParser, RawParser, CriParser with reset/state and output-size bounds), scanner (runs two scan configurations exercising Scanner and validates column lengths), and pipe_event (parses a fixed-layout PipeEvent and performs bounds/UTF‑8 checks). A .gitignore was added to exclude corpus/ and target/.


Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/logfwd-core/fuzz/fuzz_targets/format_parser.rs`:
- Around line 52-60: The output-size invariant is only checked after the final
parser run because `out` is reused and cleared; move or duplicate the size
assertion to run immediately after each parser invocation (i.e., after each time
`out` is populated following calls that exercise `JsonParser`, `RawParser`, and
the CRI invocation) so that amplification in any individual parser is caught;
reference the `out` buffer, the existing assert that compares `out.len()` to
`data.len() * 20 + 4096`, and the places where `out.clear()` is called and each
parser is invoked, and perform the same length check right after each parser
produces output.

In `@crates/logfwd-core/fuzz/fuzz_targets/pipe_event.rs`:
- Line 37: The code unsafely creates a reference to a PipeEvent from
data.as_ptr() which can be misaligned and causes undefined behavior; instead
copy the struct out using std::ptr::read_unaligned (or
std::ptr::read_unaligned::<PipeEvent>(data.as_ptr() as *const PipeEvent)) into a
local PipeEvent value and then use that value (replace the unsafe reference
creation to event with a read_unaligned-based copy), ensuring you no longer
dereference a potentially misaligned pointer.

In `@crates/logfwd-core/fuzz/fuzz_targets/scanner.rs`:
- Around line 29-65: Extend the same RecordBatch invariants to the pushdown
path: after creating scanner2 and computing batch2, assert batch2.num_columns()
and batch2.num_rows() are non-negative and, if batch2.num_rows() > 0, iterate
0..batch2.num_columns() and assert batch2.column(i).len() == batch2.num_rows()
(use schema = batch2.schema() and include schema.field(i).name() in the
assertion message) so the field-pushdown path (Scanner::new, ScanConfig,
FieldSpec, scanner2, batch2) enforces the same column-length consistency as the
original batch checks.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: e74a1e91-36f4-4533-b1bf-c27b8468349f

📥 Commits

Reviewing files that changed from the base of the PR and between 42e5629 and 525b860.

⛔ Files ignored due to path filters (1)
  • crates/logfwd-core/fuzz/Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (14)
  • crates/logfwd-core/fuzz/.gitignore
  • crates/logfwd-core/fuzz/Cargo.toml
  • crates/logfwd-core/fuzz/artifacts/fuzz_scanner/crash-47eef4b87ea82d35dc7da973c3391a3f15800a30
  • crates/logfwd-core/fuzz/artifacts/fuzz_scanner/minimized-from-2d5b3aeca551c891189e4856971b929d4fefab9e
  • crates/logfwd-core/fuzz/artifacts/fuzz_scanner/minimized-from-47eef4b87ea82d35dc7da973c3391a3f15800a30
  • crates/logfwd-core/fuzz/artifacts/fuzz_scanner/minimized-from-5258ff094acbab2d15a3be0c8f89be128454d4bc
  • crates/logfwd-core/fuzz/artifacts/fuzz_scanner/minimized-from-878968522d930a45a393a40ae4b00fe2a2b95e1c
  • crates/logfwd-core/fuzz/artifacts/fuzz_scanner/minimized-from-ab479c7a0300377f0cdfa22164d1db0a0a326df2
  • crates/logfwd-core/fuzz/artifacts/fuzz_scanner/minimized-from-aed4337a5c6da1ffddc9b3acc7c42a86377e2420
  • crates/logfwd-core/fuzz/artifacts/fuzz_scanner/minimized-from-cb529b0dba4299132befe93a913113cb667acaa2
  • crates/logfwd-core/fuzz/artifacts/fuzz_scanner/minimized-from-e7f9636f4b6c13489c3188335bd2e0a3d33f2a04
  • crates/logfwd-core/fuzz/fuzz_targets/format_parser.rs
  • crates/logfwd-core/fuzz/fuzz_targets/pipe_event.rs
  • crates/logfwd-core/fuzz/fuzz_targets/scanner.rs

Comment thread crates/logfwd-core/fuzz/fuzz_targets/format_parser.rs Outdated
}

// Simulate what userspace does: cast raw bytes to PipeEvent.
let event = unsafe { &*(data.as_ptr() as *const PipeEvent) };

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "Checking for direct '&*(data.as_ptr() as *const T)' patterns in fuzz targets..."
rg -nP --type rust '&\*\(\s*data\.as_ptr\(\)\s+as\s+\*const\s+[A-Za-z0-9_:]+' crates/logfwd-core/fuzz/fuzz_targets -C2 || true

echo
echo "Checking for any '&*(...as_ptr() as *const T)' patterns in fuzz targets..."
rg -nP --type rust '&\*\(\s*[^)]*as_ptr\(\)\s+as\s+\*const\s+[A-Za-z0-9_:]+' crates/logfwd-core/fuzz/fuzz_targets -C2 || true

Repository: strawgate/memagent

Length of output: 1131


🏁 Script executed:

cat crates/logfwd-core/fuzz/fuzz_targets/pipe_event.rs

Repository: strawgate/memagent

Length of output: 1999


🏁 Script executed:

# Search for PipeEvent struct definition
fd -t f '\.rs$' crates/logfwd-core | xargs grep -l 'struct PipeEvent' | head -5

Repository: strawgate/memagent

Length of output: 112


🏁 Script executed:

# Check PipeEvent definition and alignment
rg -n 'struct PipeEvent' crates/logfwd-core -A 10 -B 2

Repository: strawgate/memagent

Length of output: 1011


Fix undefined behavior in raw-byte struct cast.

Line 37 creates a reference from data.as_ptr() without alignment guarantee. PipeEvent has #[repr(C)] with a u64 field (8-byte alignment), but libfuzzer's input is an arbitrary byte slice (1-byte aligned). Dereferencing a misaligned pointer is undefined behavior.

Use std::ptr::read_unaligned() to safely copy the value:

Fix
-    let event = unsafe { &*(data.as_ptr() as *const PipeEvent) };
+    let event = unsafe { std::ptr::read_unaligned(data.as_ptr() as *const PipeEvent) };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/logfwd-core/fuzz/fuzz_targets/pipe_event.rs` at line 37, The code
unsafely creates a reference to a PipeEvent from data.as_ptr() which can be
misaligned and causes undefined behavior; instead copy the struct out using
std::ptr::read_unaligned (or std::ptr::read_unaligned::<PipeEvent>(data.as_ptr()
as *const PipeEvent)) into a local PipeEvent value and then use that value
(replace the unsafe reference creation to event with a read_unaligned-based
copy), ensuring you no longer dereference a potentially misaligned pointer.

Comment thread crates/logfwd-core/fuzz/fuzz_targets/scanner.rs Outdated
Fuzz targets (cargo-fuzz / libfuzzer):
- fuzz_format_parser: all FormatParser impls, output size checked per run
- fuzz_scanner: JSON scanner with column-length validation on both modes
- fuzz_pipe_event: PipeEvent deserialization with safe aligned copy

Scanner crash found: malformed JSON with binary keys causes column
length mismatch panic (#56, likely fixed by SIMD parser rewrite #23).

Run: cd crates/logfwd-core && cargo +nightly fuzz run <target>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
crates/logfwd-core/fuzz/fuzz_targets/pipe_event.rs (1)

35-40: ⚠️ Potential issue | 🔴 Critical

Fix UB: Vec<u8> pointer cast is still potentially misaligned

Line 39 dereferences a *const PipeEvent from buf.as_ptr(); Vec<u8> has byte alignment, so this can be misaligned and undefined behavior.

🔧 Proposed fix
-    // Safe deserialization: copy into aligned buffer instead of raw pointer cast
-    // (fuzz input may not be aligned to PipeEvent's alignment requirement).
-    let mut buf = vec![0u8; event_size];
-    buf.copy_from_slice(&data[..event_size]);
-    let event = unsafe { &*(buf.as_ptr() as *const PipeEvent) };
+    // Safe deserialization from potentially unaligned fuzz input.
+    let event = unsafe { std::ptr::read_unaligned(data.as_ptr() as *const PipeEvent) };

Use this read-only check to confirm the risky pattern is removed:

#!/bin/bash
set -euo pipefail

echo "Looking for reference casts from u8 buffers to PipeEvent:"
rg -nP --type rust '&\*\(\s*(buf|data)\.as_ptr\(\)\s+as\s+\*const\s+PipeEvent\s*\)' -C2 crates/logfwd-core/fuzz/fuzz_targets

echo
echo "Looking for unaligned reads to PipeEvent:"
rg -nP --type rust 'read_unaligned\s*\(\s*(buf|data)\.as_ptr\(\)\s+as\s+\*const\s+PipeEvent\s*\)' -C2 crates/logfwd-core/fuzz/fuzz_targets/pipe_event.rs
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/logfwd-core/fuzz/fuzz_targets/pipe_event.rs` around lines 35 - 40, The
current unsafe cast from buf.as_ptr() to *const PipeEvent can still be
misaligned; replace the direct reference creation (the unsafe line creating
event from buf.as_ptr() as *const PipeEvent) with a safe unaligned read:
allocate a properly aligned local PipeEvent value (or use
std::ptr::read_unaligned) and copy bytes from buf into that aligned instance
before taking a reference, or use a safe byte-to-struct conversion helper (e.g.
bytemuck::try_from_bytes or core::slice::align_to) to produce a &PipeEvent;
update the code around buf, event_size and the created event so no pointer-cast
from Vec<u8>::as_ptr() to *const PipeEvent remains.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@crates/logfwd-core/fuzz/fuzz_targets/pipe_event.rs`:
- Around line 35-40: The current unsafe cast from buf.as_ptr() to *const
PipeEvent can still be misaligned; replace the direct reference creation (the
unsafe line creating event from buf.as_ptr() as *const PipeEvent) with a safe
unaligned read: allocate a properly aligned local PipeEvent value (or use
std::ptr::read_unaligned) and copy bytes from buf into that aligned instance
before taking a reference, or use a safe byte-to-struct conversion helper (e.g.
bytemuck::try_from_bytes or core::slice::align_to) to produce a &PipeEvent;
update the code around buf, event_size and the created event so no pointer-cast
from Vec<u8>::as_ptr() to *const PipeEvent remains.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 35af5d7e-43fd-4cdd-9739-237dc28c72e6

📥 Commits

Reviewing files that changed from the base of the PR and between 525b860 and 67c5722.

⛔ Files ignored due to path filters (1)
  • crates/logfwd-core/fuzz/Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (14)
  • crates/logfwd-core/fuzz/.gitignore
  • crates/logfwd-core/fuzz/Cargo.toml
  • crates/logfwd-core/fuzz/artifacts/fuzz_scanner/crash-47eef4b87ea82d35dc7da973c3391a3f15800a30
  • crates/logfwd-core/fuzz/artifacts/fuzz_scanner/minimized-from-2d5b3aeca551c891189e4856971b929d4fefab9e
  • crates/logfwd-core/fuzz/artifacts/fuzz_scanner/minimized-from-47eef4b87ea82d35dc7da973c3391a3f15800a30
  • crates/logfwd-core/fuzz/artifacts/fuzz_scanner/minimized-from-5258ff094acbab2d15a3be0c8f89be128454d4bc
  • crates/logfwd-core/fuzz/artifacts/fuzz_scanner/minimized-from-878968522d930a45a393a40ae4b00fe2a2b95e1c
  • crates/logfwd-core/fuzz/artifacts/fuzz_scanner/minimized-from-ab479c7a0300377f0cdfa22164d1db0a0a326df2
  • crates/logfwd-core/fuzz/artifacts/fuzz_scanner/minimized-from-aed4337a5c6da1ffddc9b3acc7c42a86377e2420
  • crates/logfwd-core/fuzz/artifacts/fuzz_scanner/minimized-from-cb529b0dba4299132befe93a913113cb667acaa2
  • crates/logfwd-core/fuzz/artifacts/fuzz_scanner/minimized-from-e7f9636f4b6c13489c3188335bd2e0a3d33f2a04
  • crates/logfwd-core/fuzz/fuzz_targets/format_parser.rs
  • crates/logfwd-core/fuzz/fuzz_targets/pipe_event.rs
  • crates/logfwd-core/fuzz/fuzz_targets/scanner.rs

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant