audit: fix scan_error metrics, write_row_json bugs, strengthen OTLP proofs by strawgate · Pull Request #416 · strawgate/fastforward

strawgate · 2026-03-31T16:37:54Z

Fixes found during codebase audit

scan_error metrics lost (#339)

inc_scan_error() and inc_dropped_batch() missing from async flush_batch — lost when sync run() was deleted.

write_row_json RFC 8259 violations (#405, #406)

Control characters (0x00-0x1F) now escaped as \uXXXX
f64 infinity/NaN now emit null instead of invalid inf/nan
9 unit tests added covering every code path

OTLP encoder proofs strengthened (#365)

encode_fixed64 proof verifies tag bytes, not just size
encode_varint_field proof verifies both tag and value bytes

🤖 Generated with Claude Code

The async flush_batch path was missing inc_scan_error() and inc_dropped_batch() on scan failure — the old sync run() had them but they were lost when we deleted it. Found during dead code audit (inc_scan_error appeared unused but should have been called). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-03-31T16:38:08Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 823566fc-d269-4838-9c22-a80ada66d44e

📥 Commits

Reviewing files that changed from the base of the PR and between d9f9df2 and 663d40c.

📒 Files selected for processing (1)

crates/logfwd/src/pipeline.rs

Walkthrough

Pipeline flush behavior: Pipeline::flush_batch now increments scan_error and dropped_batch when scanner scan(...) fails inside block_in_place, and it also increments dropped_batch when output.send_batch(...) fails (in addition to existing output error metrics). crates/logfwd-output: adds serde_json as a dev-dependency and changes JSON rendering so non-finite floats emit null, string escaping is expanded to RFC 8259-compliant escapes for control characters, and a new test module verifies JSON rendering/escaping. crates/logfwd-core: Kani verification tests now validate exact encoded tag and varint bytes.

Possibly related PRs

strawgate/memagent PR 208: Modifies crates/logfwd/src/pipeline.rs scanner error handling and metrics updates in flush_batch, directly overlapping the same error paths.
strawgate/memagent PR 221: Adjusts async flush_batch behavior and error handling for scan/transform/send failures, intersecting with dropped-batch and metric changes.
strawgate/memagent PR 399: Refactors batching/flush logic in pipeline.rs, touching the same flush and batching code paths modified here.

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

Fix two RFC 8259 violations in write_row_json: - Control characters (0x00-0x1F) now escaped as \uXXXX instead of emitted raw (#405) - f64 infinity and NaN now emit null instead of 'inf'/'-inf'/'nan' which are not valid JSON (#406) Add 9 unit tests covering every write_row_json code path: - basic string, integer, float fields - null values skipped - string escaping: quotes, backslash, newline/tab/cr, control chars - float infinity/NaN → null - multiple fields produce valid JSON Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

github-actions · 2026-03-31T16:45:42Z

No actionable review feedback was found for this PR.

Unresolved review threads: 0
Outdated unresolved threads requiring action: 0
Code changes made: none
Thread replies posted: none

Validation

Attempted just ci, but just is not available in this environment (command not found).
Since no code changes were required, no further test execution was needed.

Note

🔒 Integrity filtering filtered 1 item

Integrity filtering activated and filtered the following item during workflow execution.
This happens when a tool call accesses a resource that does not meet the required integrity or secrecy level of the workflow.

pr:audit: fix scan_error metrics, write_row_json bugs, strengthen OTLP proofs #416 (pull_request_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".)

What is this? | From workflow: AI: Address PR Review Feedback

Give us feedback! React with 🚀 if perfect, 👍 if helpful, 👎 if not.

…ze (#365) encode_fixed64 proof now verifies: - Size = tag_len + 8 (was already verified) - Tag bytes match encode_varint(field_number << 3 | wire_type) (NEW) - Payload bytes decode to original value (was already verified) encode_varint_field proof now verifies: - Size = tag_len + val_len (was already verified) - Tag bytes match expected encoding (NEW) - Value bytes match expected encoding (NEW) Closes the "length-only" gap identified in #365. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

crates/logfwd-output/src/lib.rs (2)
172-174: ⚠️ Potential issue | 🟠 Major

Escape JSON object keys as well as values.

Values are now RFC-compliant, but keys are still emitted unescaped (field_name.as_bytes()), which can still generate invalid JSON for special characters in column names. Reuse the same escaping path for both key and string value emission.

Also applies to: 199-210
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/logfwd-output/src/lib.rs` around lines 172 - 174, The code currently
writes column keys with out.extend_from_slice(col.field_name.as_bytes()) which
emits unescaped JSON keys; update the serializer to run col.field_name through
the same escaping path used for string values (i.e., call the existing
string-escaping/emit routine used elsewhere for values instead of directly using
as_bytes()), and replace the analogous occurrences (including the other block
referencing col.field_name) so both object keys and values use the identical
RFC-compliant escaping routine.
177-193: ⚠️ Potential issue | 🔴 Critical

Prevent panic on suffix/type mismatch in write_row_json.

as_primitive::<Int64Type/Float64Type>() panics when suffix-based dispatch from field names (e.g., status_int) disagrees with actual Arrow type. Enrichment creates all JSON key fields as Utf8, so names like status_int encode the desired type in the string, not the schema. When write_row_json dispatches on "int" but the array is Utf8, the downcast panics. This violates: "Public APIs must return Result, never panic on user input".

Add a runtime type check before downcasting:
-        match col.type_suffix.as_str() {
-            "int" => {
+        match (col.type_suffix.as_str(), arr.data_type()) {
+            ("int", DataType::Int64) => {
                 let arr = arr.as_primitive::<arrow::datatypes::Int64Type>();
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/logfwd-output/src/lib.rs` around lines 177 - 193, In write_row_json,
avoid panics from `arr.as_primitive::<...>()` by checking the runtime Arrow type
before downcasting: when handling the "int" arm of `col.type_suffix` verify
`arr.data_type()` is Int64 (or downcast via
`as_any().downcast_ref::<PrimitiveArray<Int64Type>>()`) and only then call
`as_primitive::<arrow::datatypes::Int64Type>()` and write the numeric value; if
the runtime type mismatches (e.g. Utf8 produced by enrichment) fall back to a
safe path such as treating the cell as a string/nullable value (downcast to
`StringArray` or emit JSON null when invalid) instead of unconditionally
downcasting and panicking; do the analogous runtime-type check for the "float"
arm (Float64) and preserve the existing is_finite logic when the type matches.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@crates/logfwd-output/src/lib.rs`:
- Around line 172-174: The code currently writes column keys with
out.extend_from_slice(col.field_name.as_bytes()) which emits unescaped JSON
keys; update the serializer to run col.field_name through the same escaping path
used for string values (i.e., call the existing string-escaping/emit routine
used elsewhere for values instead of directly using as_bytes()), and replace the
analogous occurrences (including the other block referencing col.field_name) so
both object keys and values use the identical RFC-compliant escaping routine.
- Around line 177-193: In write_row_json, avoid panics from
`arr.as_primitive::<...>()` by checking the runtime Arrow type before
downcasting: when handling the "int" arm of `col.type_suffix` verify
`arr.data_type()` is Int64 (or downcast via
`as_any().downcast_ref::<PrimitiveArray<Int64Type>>()`) and only then call
`as_primitive::<arrow::datatypes::Int64Type>()` and write the numeric value; if
the runtime type mismatches (e.g. Utf8 produced by enrichment) fall back to a
safe path such as treating the cell as a string/nullable value (downcast to
`StringArray` or emit JSON null when invalid) instead of unconditionally
downcasting and panicking; do the analogous runtime-type check for the "float"
arm (Float64) and preserve the existing is_finite logic when the type matches.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 330ec3f9-457d-44fa-9ef4-dea9a45b9ca0

📥 Commits

Reviewing files that changed from the base of the PR and between a42e06b and c19fd84.

⛔ Files ignored due to path filters (1)

Cargo.lock is excluded by !**/*.lock

📒 Files selected for processing (2)

crates/logfwd-output/Cargo.toml
crates/logfwd-output/src/lib.rs

Same bug pattern as scan_error — output.send_batch() failure called output_error() but not inc_dropped_batch(). Found during robustness audit. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/logfwd-core/src/otlp.rs`:
- Around line 755-756: Rename the abbreviated identifiers to conform to naming
rules: change tag_val to tag_value and val_buf to value_buf (and any other
occurrences of *_val or val_buf in this module, e.g., around the code near the
other mention at the 799-800 region). Update all usages in the surrounding
functions/blocks (e.g., where tag_val is computed and passed to varint_len and
where val_buf is created/used) to the new names so compilation still succeeds
and retain existing behavior; ensure imports or pattern matches referencing
these identifiers are also updated.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 84bd45fb-842a-40c2-80ce-e8f6e5e2b1fa

📥 Commits

Reviewing files that changed from the base of the PR and between c19fd84 and d9f9df2.

📒 Files selected for processing (1)

crates/logfwd-core/src/otlp.rs

coderabbitai · 2026-03-31T16:58:21Z

+        let tag_val = ((field_number as u64) << 3) | 1;
+        let tag_len = varint_len(tag_val);


🧹 Nitpick | 🔵 Trivial

Rename *_val identifiers to satisfy repository naming rules.

Please replace tag_val/val_buf with non-abbreviated names (for example, tag_value/value_buf).

As per coding guidelines: "Allow only specific abbreviations in names: buf, pos, len, idx, cfg, ctx. No other abbreviations."

Also applies to: 799-800

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@crates/logfwd-core/src/otlp.rs` around lines 755 - 756, Rename the abbreviated identifiers to conform to naming rules: change tag_val to tag_value and val_buf to value_buf (and any other occurrences of *_val or val_buf in this module, e.g., around the code near the other mention at the 799-800 region). Update all usages in the surrounding functions/blocks (e.g., where tag_val is computed and passed to varint_len and where val_buf is created/used) to the new names so compilation still succeeds and retain existing behavior; ensure imports or pattern matches referencing these identifiers are also updated.

As per PR feedback, this issue is already addressed in the audit branch (PR #416). Restoring files to their original state and closing this effort. Co-authored-by: strawgate <6384545+strawgate@users.noreply.github.com>

github-actions

Requesting changes due to one correctness issue that can emit invalid JSON output.

Note

🔒 Integrity filtering filtered 1 item

Integrity filtering activated and filtered the following item during workflow execution.
This happens when a tool call accesses a resource that does not meet the required integrity or secrecy level of the workflow.

pr:#416 (pull_request_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".)

What is this? | From workflow: AI: PR Review

Give us feedback! React with 🚀 if perfect, 👍 if helpful, 👎 if not.

strawgate mentioned this pull request Mar 31, 2026

[audit:dead-code] Find and remove unused code across workspace #339

Closed

4 tasks

coderabbitai Bot approved these changes Mar 31, 2026

View reviewed changes

strawgate changed the title ~~fix: restore scan_error metrics lost during sync→async migration~~ audit: fix scan_error metrics, write_row_json bugs, strengthen OTLP proofs Mar 31, 2026

coderabbitai Bot reviewed Mar 31, 2026

View reviewed changes

fix: output error must increment dropped_batches counter (#346)

663d40c

Same bug pattern as scan_error — output.send_batch() failure called output_error() but not inc_dropped_batch(). Found during robustness audit. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai Bot requested changes Mar 31, 2026

View reviewed changes

This was referenced Mar 31, 2026

fix: emit 'null' for non-finite f64 values in JSON output #417

Closed

fix: escape all JSON control characters (RFC 8259 compliance) #419

Closed

strawgate merged commit c42b4cf into master Mar 31, 2026
13 of 14 checks passed

strawgate deleted the audit/dead-code-cleanup branch March 31, 2026 17:05

github-actions Bot requested changes Mar 31, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

audit: fix scan_error metrics, write_row_json bugs, strengthen OTLP proofs#416

audit: fix scan_error metrics, write_row_json bugs, strengthen OTLP proofs#416
strawgate merged 4 commits into
masterfrom
audit/dead-code-cleanup

strawgate commented Mar 31, 2026 •

edited

Loading

Uh oh!

coderabbitai Bot commented Mar 31, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented Mar 31, 2026

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot Mar 31, 2026

Uh oh!

Uh oh!

github-actions Bot left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

		let tag_val = ((field_number as u64) << 3) \| 1;
		let tag_len = varint_len(tag_val);

Conversation

strawgate commented Mar 31, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Fixes found during codebase audit

scan_error metrics lost (#339)

write_row_json RFC 8259 violations (#405, #406)

OTLP encoder proofs strengthened (#365)

Uh oh!

coderabbitai Bot commented Mar 31, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Possibly related PRs

Uh oh!

github-actions Bot commented Mar 31, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Mar 31, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

strawgate commented Mar 31, 2026 •

edited

Loading

coderabbitai Bot commented Mar 31, 2026 •

edited

Loading