Skip to content

fix: CLI UX, config, OTLP HTTP conformance, and metrics double-count bugs#744

Merged
strawgate merged 5 commits into
masterfrom
fix/cli-ux-config-diagnostics-bugs
Apr 3, 2026
Merged

fix: CLI UX, config, OTLP HTTP conformance, and metrics double-count bugs#744
strawgate merged 5 commits into
masterfrom
fix/cli-ux-config-diagnostics-bugs

Conversation

@strawgate
Copy link
Copy Markdown
Owner

Summary

Nine bugs fixed in one non-conflicting PR (safe from PRs #684 and #701):

Also fixes a pre-existing resource_attrs: vec![] type mismatch in elasticsearch tests and bench (would fail cargo check --all-targets).

Test plan

🤖 Generated with Claude Code

@strawgate strawgate force-pushed the fix/cli-ux-config-diagnostics-bugs branch from 3b93199 to c226da4 Compare April 2, 2026 23:30
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 2, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Replaces derived OutputType deserialization with a custom Serde visitor that treats YAML null (and the string "null") as OutputType::Null and rejects unknown strings; adds validate_bind_addr() and config-time validation for server.diagnostics (skipping values with ${...}) with tests; tightens OTLP HTTP handling to return 405+Allow: POST for non-POST /v1/logs and 400 when OTLP JSON lacks a valid resourceLogs array with tests; moves line-counting out of sinks so sinks now update byte counters; adds normalize_args() and installs SIGHUP handling that warns but does not shutdown.

Possibly related PRs


Caution

Pre-merge checks failed

Please resolve all errors before merging. Addressing warnings is optional.

  • Ignore

❌ Failed checks (2 errors, 2 warnings)

Check name Status Explanation Resolution
High-Quality Rust Practices ❌ Error PR violates critical error-handling and metrics correctness rules in two places: normalize_args() silently returns original args instead of explicit config error; OtlpSink::send_batch counts wrong buffer size. Fix normalize_args() to return Err(CliError::Config) for missing --config path; change OtlpSink line 248 to count payload.len() after compression/framing.
Formal Verification Coverage ❌ Error PR failed to correct stale Rule 5 cross-reference in .coderabbit.yaml line 59-60 that misdirects reviewers to wrong policy document. Remove or correct the stale Rule 5 reference at line 59-60; either delete it or add direct pointer to correct CRATE_RULES.md section documenting Kani exemption categories.
Documentation Thoroughly Updated ⚠️ Warning PR introduces SIGHUP signal handling and null output type support but omits required documentation updates in CONFIG_REFERENCE.md and DEVELOPING.md. Add null output type entry to CONFIG_REFERENCE.md's Output types table; add SIGHUP signal registration lesson to DEVELOPING.md's Things that will bite you section documenting crash-prevention requirements.
Maintainer Fitness ⚠️ Warning Three unresolved flagged issues remain in the PR: (1) normalize_args() returns misleading error for missing config path; (2) .coderabbit.yaml references non-existent Rule 5 in CRATE_RULES.md; (3) otlp_sink counts uncompressed encoder_buf instead of actual transmitted payload bytes. Fix: (1) have normalize_args() return explicit config error when --config lacks value; (2) correct/remove Rule 5 cross-reference; (3) change inc_bytes(self.encoder_buf.len()) to inc_bytes(payload.len()).
✅ Passed checks (1 passed)
Check name Status Explanation
Crate Boundary And Dependency Integrity ✅ Passed PR maintains all crate boundary rules: no new crates, logfwd-core untouched, dependency graph valid, no unauthorized dependencies, proper separation of concerns.

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/logfwd-config/src/lib.rs`:
- Around line 1343-1350: The test normalize_args_canonical_form_unchanged in
this library crate is a no-op (it only references Config::load_str) because
normalize_args is defined in the binary crate; remove this test from
crates/logfwd-config/src/lib.rs (or relocate it into the binary crate) and, if
you want coverage for normalize_args, add real tests in
crates/logfwd/src/main.rs that call normalize_args with example arg vectors and
assert the expected rearranged vectors.

In `@crates/logfwd-io/src/otlp_receiver.rs`:
- Around line 983-1003: The test wrong_http_method_returns_405 currently only
asserts the 405 status; update it to also assert the Allow header equals "POST".
When iterating over the ureq calls in wrong_http_method_returns_405
(method/result loop), capture the response object for both Ok(resp) and the Err
variant that carries the response (so you can read headers), then call
resp.header("Allow") and assert it equals Some("POST") (or "POST" as
appropriate). Use the existing result match in the test to preserve status
extraction but also keep the response object to check the Allow header for each
request to /v1/logs on the OtlpReceiverInput server.
- Around line 63-64: The request acceptance check currently uses prefix matching
on the variable `url` (`starts_with("/v1/logs")`) and should instead require an
exact path match; update the conditional that inspects `url` in the request
handler (the `if !url.starts_with("/v1/logs")` branch) to only accept the exact
endpoint (e.g., compare `url` for equality to "/v1/logs") so paths like
"/v1/logsfoo" or "/v1/logs/extra" return 404.

In `@crates/logfwd-output/src/otlp_sink.rs`:
- Around line 246-249: The change removed tracking of output line counts, so
restore line metrics by either calling self.stats.inc_lines(...) where the
success branch currently only calls self.stats.inc_bytes(self.encoder_buf.len()
as u64) (so reinstate increment of lines in the same success path), or implement
deferred line counting in the pipeline ack handler referenced by apply_pool_ack
(crates::logfwd::pipeline::apply_pool_ack) and ensure the pipeline calls
inc_output_success with the row count; update the code paths around
self.stats.inc_bytes, self.stats.inc_lines, and the pipeline's
apply_pool_ack/inc_output_success to consistently increment line counts so
metrics are not lost.

In `@crates/logfwd-output/src/stdout.rs`:
- Around line 31-34: The `stats: Arc<ComponentStats>` field in the stdout sink
is unused; either remove the field or update the sink to update stats like other
sinks. Locate the stdout sink struct that declares `stats: Arc<ComponentStats>`
and either (A) delete that field and any construction/assignment related to it,
or (B) call `self.stats.inc_lines(1)` and `self.stats.inc_bytes(n)` in the
sink's write/emit method (use the actual byte count written as `n`) so it
mirrors TCP/UDP/OTLP sinks; also apply the same change for the duplicate
instance at the other mentioned location (lines 253-255).

In `@crates/logfwd-output/src/tcp_sink.rs`:
- Around line 111-112: Elasticsearch sink still increments line counts while TCP
sink removed that; remove the calls to self.stats.inc_lines(...) in the
Elasticsearch sink (the invocations at the two locations that currently call
inc_lines, e.g., within the Elasticsearch sink's send/batch/flush logic) so only
byte accounting via self.stats.inc_bytes(...) remains, and update any
surrounding comments to reflect that lines are not tracked; ensure references to
inc_bytes remain unchanged.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 4d60fa13-959a-4c82-88d3-d8ffe208762f

📥 Commits

Reviewing files that changed from the base of the PR and between b5e229e and c226da4.

📒 Files selected for processing (8)
  • crates/logfwd-config/src/lib.rs
  • crates/logfwd-io/src/otlp_receiver.rs
  • crates/logfwd-output/src/null.rs
  • crates/logfwd-output/src/otlp_sink.rs
  • crates/logfwd-output/src/stdout.rs
  • crates/logfwd-output/src/tcp_sink.rs
  • crates/logfwd-output/src/udp_sink.rs
  • crates/logfwd/src/main.rs

Comment thread crates/logfwd-config/src/lib.rs
Comment thread crates/logfwd-io/src/otlp_receiver.rs Outdated
Comment on lines +983 to +1003
#[test]
fn wrong_http_method_returns_405() {
let receiver = OtlpReceiverInput::new_with_capacity("test", "127.0.0.1:0", 16).unwrap();
let port = receiver.local_addr().port();
let url = format!("http://127.0.0.1:{port}/v1/logs");

for (method, result) in [
("GET", ureq::get(&url).call()),
("DELETE", ureq::delete(&url).call()),
] {
let status: u16 = match result {
Ok(resp) => resp.status().as_u16(),
Err(ureq::Error::StatusCode(code)) => code,
Err(e) => panic!("unexpected error for {method}: {e}"),
};
assert_eq!(
status, 405,
"{method} /v1/logs should return 405 Method Not Allowed, got {status}"
);
}
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Add assertion for Allow: POST in the 405 regression test.

The new test validates status code 405, but it does not verify the required Allow header. Please assert Allow: POST as well so the HTTP conformance fix is fully protected.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/logfwd-io/src/otlp_receiver.rs` around lines 983 - 1003, The test
wrong_http_method_returns_405 currently only asserts the 405 status; update it
to also assert the Allow header equals "POST". When iterating over the ureq
calls in wrong_http_method_returns_405 (method/result loop), capture the
response object for both Ok(resp) and the Err variant that carries the response
(so you can read headers), then call resp.header("Allow") and assert it equals
Some("POST") (or "POST" as appropriate). Use the existing result match in the
test to preserve status extraction but also keep the response object to check
the Allow header for each request to /v1/logs on the OtlpReceiverInput server.

Comment on lines +31 to 34
// Stored for API consistency with other sinks; byte tracking on stdout is
// not currently wired up.
#[allow(dead_code)]
stats: Arc<ComponentStats>,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

stats field is completely unused, unlike other sinks that still call inc_bytes.

TCP, UDP, and OTLP sinks track bytes via stats.inc_bytes(). Stdout sink now does neither inc_lines nor inc_bytes, making the field dead code. Consider either:

  1. Removing the stats field entirely, or
  2. Adding inc_bytes tracking for consistency with other sinks

Also applies to: 253-255

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/logfwd-output/src/stdout.rs` around lines 31 - 34, The `stats:
Arc<ComponentStats>` field in the stdout sink is unused; either remove the field
or update the sink to update stats like other sinks. Locate the stdout sink
struct that declares `stats: Arc<ComponentStats>` and either (A) delete that
field and any construction/assignment related to it, or (B) call
`self.stats.inc_lines(1)` and `self.stats.inc_bytes(n)` in the sink's write/emit
method (use the actual byte count written as `n`) so it mirrors TCP/UDP/OTLP
sinks; also apply the same change for the duplicate instance at the other
mentioned location (lines 253-255).

@strawgate strawgate force-pushed the fix/cli-ux-config-diagnostics-bugs branch from aea6002 to e72d70f Compare April 2, 2026 23:43
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

♻️ Duplicate comments (6)
crates/logfwd-config/src/lib.rs (1)

1343-1350: 🧹 Nitpick | 🔵 Trivial

No-op test should be removed or relocated.

normalize_args lives in the binary crate (main.rs), not this library. The test body does nothing. Add real tests in main.rs if coverage is needed.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/logfwd-config/src/lib.rs` around lines 1343 - 1350, Remove the no-op
unit test function normalize_args_canonical_form_unchanged from the library (it
does nothing and tests binary-only logic); either delete the test or move an
equivalent integration/test into the binary crate (main.rs) where normalize_args
lives, and if you need to assert behavior via Config parsing keep only
meaningful assertions using Config::load_str in the appropriate crate; ensure no
dead imports remain after removal.
crates/logfwd-output/src/otlp_sink.rs (1)

247-249: ⚠️ Potential issue | 🔴 Critical

Line metrics are lost—pipeline does not actually call inc_output_success.

The comment claims the pipeline counts lines, but apply_pool_ack (pipeline.rs:632-635) explicitly states: "Row count is not tracked here." Removing inc_lines without implementing pipeline-side counting means lines_total is permanently zero for this output.

Either restore self.stats.inc_lines(batch.num_rows() as u64) here, or implement the deferred counting in the pipeline's ack handler.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/logfwd-output/src/otlp_sink.rs` around lines 247 - 249, The metrics
reporting currently only increments bytes which drops line counts because the
pipeline does not track rows (see apply_pool_ack in pipeline.rs stating "Row
count is not tracked here"); restore line metrics by calling
self.stats.inc_lines(batch.num_rows() as u64) in the OTLP sink where bytes are
incremented (refer to self.stats.inc_lines and self.stats.inc_bytes in
otlp_sink.rs) before returning Ok(()), or alternatively implement the deferred
row counting inside the pipeline ack handler (apply_pool_ack) so lines_total
gets incremented there — pick one approach and ensure the row count is
incremented in the same code path that records bytes.
crates/logfwd-output/src/stdout.rs (1)

31-34: 🧹 Nitpick | 🔵 Trivial

stats field is dead code—no metrics recorded.

Unlike TCP/UDP/OTLP sinks that still call inc_bytes, stdout calls neither inc_lines nor inc_bytes. Either remove the field entirely or add byte tracking for consistency.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/logfwd-output/src/stdout.rs` around lines 31 - 34, The stats field
(stats: Arc<ComponentStats>) is dead — stdout sink never updates metrics — so
either remove the field from the stdout struct to avoid dead code, or wire up
metrics updates by calling ComponentStats::inc_bytes and
ComponentStats::inc_lines from the stdout emission path (e.g., inside the stdout
sink's write/emit/handle method that writes to stdout); keep the field name
stats and use the existing ComponentStats methods inc_bytes/inc_lines to record
bytes and lines for parity with TCP/UDP/OTLP sinks.
crates/logfwd-io/src/otlp_receiver.rs (2)

63-64: ⚠️ Potential issue | 🟡 Minor

Prefix matching accepts invalid paths like /v1/logsfoo.

starts_with("/v1/logs") matches /v1/logs/extra and /v1/logsfoo. Use exact path matching:

-                    if !url.starts_with("/v1/logs") {
+                    let path = url.split_once('?').map_or(url.as_str(), |(p, _)| p);
+                    if path != "/v1/logs" {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/logfwd-io/src/otlp_receiver.rs` around lines 63 - 64, The path check
using url.starts_with("/v1/logs") accepts invalid paths like "/v1/logsfoo";
change the conditional around the request handler (the block that currently
tests the url variable) to only accept the exact OTLP endpoint by matching url
== "/v1/logs" or url == "/v1/logs/" (or allow "/v1/logs/" and "/v1/logs"
explicitly), or use a starts_with check that also verifies the next character is
'/' when present; update the if !url.starts_with("/v1/logs") condition in
otlp_receiver.rs to one of these exact-match approaches so only valid "/v1/logs"
(and optionally "/v1/logs/") are accepted.

983-1003: ⚠️ Potential issue | 🟡 Minor

Test should assert Allow: POST header, not just status 405.

The test validates status code but doesn't verify the Allow header required by RFC 7231. Extract the response and check resp.header("Allow") == Some("POST").

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/logfwd-io/src/otlp_receiver.rs` around lines 983 - 1003, Update the
test wrong_http_method_returns_405 in crates/logfwd-io/src/otlp_receiver.rs to
also assert the Allow header equals "POST": when calling the endpoint created by
OtlpReceiverInput::new_with_capacity for methods "GET" and "DELETE", capture the
HTTP response object (not only the status code) from both success and error
variants so you can read resp.header("Allow") and assert it is Some("POST") in
addition to asserting status == 405 for the /v1/logs requests.
crates/logfwd-output/src/tcp_sink.rs (1)

111-112: ⚠️ Potential issue | 🔴 Critical

Line metrics lost; inconsistent with Elasticsearch sink.

Same issue as other sinks—the pipeline doesn't actually call inc_output_success. Additionally, a past review noted Elasticsearch sink (lines 293, 545) still calls inc_lines, creating metric inconsistency across outputs.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/logfwd-output/src/tcp_sink.rs` around lines 111 - 112, The TCP sink
currently only calls self.stats.inc_bytes(self.buf.len() as u64) which loses
line/record metrics and doesn't mark successful outputs; update tcp_sink.rs to
also call self.stats.inc_output_success() when a send succeeds and increment the
line/record metric with self.stats.inc_lines(record_count) (use the actual
record count variable used when building the buffer, not buf.len() which is
bytes) so metrics match other sinks; also align Elasticsearch sink usage by
removing any direct/duplicate inc_lines calls there or switching it to rely on
inc_output_success semantics so all outputs use the same inc_output_success +
inc_lines pattern.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/logfwd-output/src/null.rs`:
- Around line 138-145: The test currently asserts that
stats.lines_total.load(Ordering::Relaxed) == 0 (in the null.rs test) which
treats a known-buggy behavior as correct; remove or relax that assertion and add
a TODO comment referencing that pipeline-side counting should own lines_total so
this test does not enforce the wrong contract. Specifically, delete or comment
out the assert_eq! around stats.lines_total.load(Ordering::Relaxed) (or replace
it with a non-failing placeholder check), keep the valid row-level check for
sink.rows_discarded(), and add a clear TODO mentioning pipeline-side counting /
an issue number so the assertion can be restored to the intended behavior later.
- Around line 55-58: The comment is incorrect:
PipelineMetrics::inc_output_success is never called, so sinks must update the
line counter themselves; update the send_batch path (where Ok(()) is returned)
to call PipelineMetrics::inc_output_success with the number of lines processed
so stats.lines_total is incremented (reference
PipelineMetrics::inc_output_success and stats.lines_total), or if the design is
to centralize counting, adjust the pipeline to invoke
PipelineMetrics::inc_output_success after send_batch instead of the sink—remove
or correct the misleading comment in null.rs accordingly.

In `@crates/logfwd-output/src/stdout.rs`:
- Around line 253-255: The comment claiming "Line counting is done once by the
pipeline (PipelineMetrics::inc_output_success)" is incorrect because
PipelineMetrics::inc_output_success is not called by the pipeline (see
apply_pool_ack), leaving this sink without observability; update the stdout sink
in stdout.rs to increment the pipeline metric itself (call
PipelineMetrics::inc_output_success at the appropriate success path in the
sink's processing/writing function) or, if intended to be pipeline-owned, remove
the misleading comment and add a TODO to implement the metric increment in
apply_pool_ack — reference the PipelineMetrics::inc_output_success symbol and
the apply_pool_ack flow so the change is applied in the correct success path.

In `@crates/logfwd-output/src/udp_sink.rs`:
- Around line 111-112: The comment in udp_sink.rs is misleading because
apply_pool_ack does not call inc_output_success, so line metrics remain zero;
update the sink to actually increment line counts where acknowledgements are
handled (either call inc_output_success from apply_pool_ack or explicitly call
stats.inc_lines alongside stats.inc_bytes in the same acknowledgement path) and
adjust the comment to reflect that bytes are tracked here but lines must be
incremented in apply_pool_ack (refer to apply_pool_ack, inc_output_success,
stats.inc_lines, and stats.inc_bytes).

---

Duplicate comments:
In `@crates/logfwd-config/src/lib.rs`:
- Around line 1343-1350: Remove the no-op unit test function
normalize_args_canonical_form_unchanged from the library (it does nothing and
tests binary-only logic); either delete the test or move an equivalent
integration/test into the binary crate (main.rs) where normalize_args lives, and
if you need to assert behavior via Config parsing keep only meaningful
assertions using Config::load_str in the appropriate crate; ensure no dead
imports remain after removal.

In `@crates/logfwd-io/src/otlp_receiver.rs`:
- Around line 63-64: The path check using url.starts_with("/v1/logs") accepts
invalid paths like "/v1/logsfoo"; change the conditional around the request
handler (the block that currently tests the url variable) to only accept the
exact OTLP endpoint by matching url == "/v1/logs" or url == "/v1/logs/" (or
allow "/v1/logs/" and "/v1/logs" explicitly), or use a starts_with check that
also verifies the next character is '/' when present; update the if
!url.starts_with("/v1/logs") condition in otlp_receiver.rs to one of these
exact-match approaches so only valid "/v1/logs" (and optionally "/v1/logs/") are
accepted.
- Around line 983-1003: Update the test wrong_http_method_returns_405 in
crates/logfwd-io/src/otlp_receiver.rs to also assert the Allow header equals
"POST": when calling the endpoint created by
OtlpReceiverInput::new_with_capacity for methods "GET" and "DELETE", capture the
HTTP response object (not only the status code) from both success and error
variants so you can read resp.header("Allow") and assert it is Some("POST") in
addition to asserting status == 405 for the /v1/logs requests.

In `@crates/logfwd-output/src/otlp_sink.rs`:
- Around line 247-249: The metrics reporting currently only increments bytes
which drops line counts because the pipeline does not track rows (see
apply_pool_ack in pipeline.rs stating "Row count is not tracked here"); restore
line metrics by calling self.stats.inc_lines(batch.num_rows() as u64) in the
OTLP sink where bytes are incremented (refer to self.stats.inc_lines and
self.stats.inc_bytes in otlp_sink.rs) before returning Ok(()), or alternatively
implement the deferred row counting inside the pipeline ack handler
(apply_pool_ack) so lines_total gets incremented there — pick one approach and
ensure the row count is incremented in the same code path that records bytes.

In `@crates/logfwd-output/src/stdout.rs`:
- Around line 31-34: The stats field (stats: Arc<ComponentStats>) is dead —
stdout sink never updates metrics — so either remove the field from the stdout
struct to avoid dead code, or wire up metrics updates by calling
ComponentStats::inc_bytes and ComponentStats::inc_lines from the stdout emission
path (e.g., inside the stdout sink's write/emit/handle method that writes to
stdout); keep the field name stats and use the existing ComponentStats methods
inc_bytes/inc_lines to record bytes and lines for parity with TCP/UDP/OTLP
sinks.

In `@crates/logfwd-output/src/tcp_sink.rs`:
- Around line 111-112: The TCP sink currently only calls
self.stats.inc_bytes(self.buf.len() as u64) which loses line/record metrics and
doesn't mark successful outputs; update tcp_sink.rs to also call
self.stats.inc_output_success() when a send succeeds and increment the
line/record metric with self.stats.inc_lines(record_count) (use the actual
record count variable used when building the buffer, not buf.len() which is
bytes) so metrics match other sinks; also align Elasticsearch sink usage by
removing any direct/duplicate inc_lines calls there or switching it to rely on
inc_output_success semantics so all outputs use the same inc_output_success +
inc_lines pattern.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Organization UI (inherited)

Review profile: ASSERTIVE

Plan: Pro

Run ID: 9dc35ee0-7b40-4298-be98-1893ec0727fb

📥 Commits

Reviewing files that changed from the base of the PR and between aea6002 and e72d70f.

📒 Files selected for processing (9)
  • .coderabbit.yaml
  • crates/logfwd-config/src/lib.rs
  • crates/logfwd-io/src/otlp_receiver.rs
  • crates/logfwd-output/src/null.rs
  • crates/logfwd-output/src/otlp_sink.rs
  • crates/logfwd-output/src/stdout.rs
  • crates/logfwd-output/src/tcp_sink.rs
  • crates/logfwd-output/src/udp_sink.rs
  • crates/logfwd/src/main.rs

Comment thread crates/logfwd-output/src/null.rs
Comment thread crates/logfwd-output/src/null.rs
Comment thread crates/logfwd-output/src/stdout.rs
Comment on lines +111 to 112
// inc_lines is counted by the pipeline; only track bytes here.
self.stats.inc_bytes(total_bytes);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Same issue: line metrics lost because pipeline doesn't track them.

The comment is misleading—apply_pool_ack does not call inc_output_success. This sink will report zero lines forever. See the parallel concern raised on otlp_sink.rs.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/logfwd-output/src/udp_sink.rs` around lines 111 - 112, The comment in
udp_sink.rs is misleading because apply_pool_ack does not call
inc_output_success, so line metrics remain zero; update the sink to actually
increment line counts where acknowledgements are handled (either call
inc_output_success from apply_pool_ack or explicitly call stats.inc_lines
alongside stats.inc_bytes in the same acknowledgement path) and adjust the
comment to reflect that bytes are tracked here but lines must be incremented in
apply_pool_ack (refer to apply_pool_ack, inc_output_success, stats.inc_lines,
and stats.inc_bytes).

@strawgate strawgate force-pushed the fix/cli-ux-config-diagnostics-bugs branch from e72d70f to 55624fb Compare April 3, 2026 00:31
strawgate and others added 3 commits April 2, 2026 19:37
Fixes bugs #702, #703, #707, #709, #712, #722, #723, #725, #735.

- main.rs: --help and --validate now print to stdout (not stderr)
- main.rs: normalize_args() reorders args so --config can appear anywhere
- main.rs: SIGHUP no longer crashes the process (logs warning, continues)
- logfwd-config: type: null in YAML no longer silently parses as wrong
  variant (custom Deserialize handles YAML null scalar via visit_unit)
- logfwd-config: server.diagnostics address validated at --validate time
- otlp_receiver: wrong HTTP method returns 405 with Allow: POST header
- otlp_receiver: missing resourceLogs in JSON body returns 400 not 200
- output sinks: removed double inc_lines calls (pipeline.rs already counts)
- elasticsearch tests/bench: fix pre-existing resource_attrs type mismatch

New tests for every fix.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
CodeRabbit rejected the config: instructions field for check[1] was
10,651 chars, exceeding the 10,000 char limit. Condensed TLA+ property
lists (removed verbose parenthetical descriptions) and shortened the
kani::cover!() rule. All semantic content preserved; 9,924 chars after.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@strawgate strawgate force-pushed the fix/cli-ux-config-diagnostics-bugs branch from 83ee390 to 48d3c5e Compare April 3, 2026 00:39
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

♻️ Duplicate comments (1)
crates/logfwd-config/src/lib.rs (1)

1343-1350: ⚠️ Potential issue | 🟡 Minor

Remove this no-op test or move real coverage to main.rs.

This test does not execute normalize_args and only references Config::load_str, so it provides no behavior coverage for bug #709.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/logfwd-config/src/lib.rs` around lines 1343 - 1350, The test
normalize_args_canonical_form_unchanged is a no-op (it only references
Config::load_str) and should be removed or replaced with real coverage that
exercises normalize_args; either delete this test from lib.rs, or move/implement
an integration test in the binary crate (main.rs or tests in the binary) that
directly calls normalize_args with argv cases (including when "--config" is
already at position 1) and asserts the returned args are unchanged, ensuring the
actual function normalize_args is exercised rather than only referencing
Config::load_str.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/logfwd-config/src/lib.rs`:
- Around line 566-573: validate_bind_addr currently uses
addr.parse::<std::net::SocketAddr>() which accepts only IP literals and rejects
hostnames like "localhost:8080" even though
tiny_http::Server::http(&self.bind_addr) resolves hostnames at runtime; change
the validation to perform the same resolution by using std::net::ToSocketAddrs
(call addr.to_socket_addrs()) instead of parse so DNS-resolvable hostnames
succeed, and preserve the existing error formatting (update the error map to
include the original addr and the resolution error); refer to validate_bind_addr
and the runtime call in diagnostics.rs (tiny_http::Server::http) when making the
change.

In `@crates/logfwd-output/src/tcp_sink.rs`:
- Around line 111-112: The comment is wrong: the pipeline does not count lines,
so you should restore line accounting here; in tcp_sink.rs, alongside the
existing self.stats.inc_bytes(self.buf.len() as u64) call, also call
self.stats.inc_lines(...) using the number of rows in self.buf (e.g., count
newlines/rows in self.buf) and update the comment to reflect that both bytes and
lines are tracked locally (or alternatively implement line counting in pipeline
and add an issue link if you choose that path). Ensure you reference
apply_pool_ack's note (pipeline.rs) when updating the comment to avoid
confusion.

In `@crates/logfwd/src/main.rs`:
- Around line 142-173: The printed usage and validation success messages
currently use use_color(), which checks STDERR and causes ANSI codes to leak
into redirected stdout; update print_usage() to call use_color_stdout() instead
of use_color(), and change the success/output paths in validate_pipelines() (the
branches that print success/usage-style messages to stdout) to also call
use_color_stdout() so color detection is based on STDOUT TTY state; leave any
error-path logs that should use stderr unchanged.
- Around line 435-437: Replace the .expect() call used when registering the
SIGTERM handler the same way SIGHUP is handled: instead of panicking on failure,
use a conditional pattern (e.g., if let Ok(sigterm) =
signal(SignalKind::terminate()) ) to obtain the sigterm stream or log a warning
and skip registering that handler; do the same for the existing sighup
registration (signal(SignalKind::hangup())) so both signal registrations
(sigterm and sighup) fail gracefully and do not call .expect().

---

Duplicate comments:
In `@crates/logfwd-config/src/lib.rs`:
- Around line 1343-1350: The test normalize_args_canonical_form_unchanged is a
no-op (it only references Config::load_str) and should be removed or replaced
with real coverage that exercises normalize_args; either delete this test from
lib.rs, or move/implement an integration test in the binary crate (main.rs or
tests in the binary) that directly calls normalize_args with argv cases
(including when "--config" is already at position 1) and asserts the returned
args are unchanged, ensuring the actual function normalize_args is exercised
rather than only referencing Config::load_str.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Organization UI (inherited)

Review profile: ASSERTIVE

Plan: Pro

Run ID: a1bf4cf3-eed7-49d5-97b0-17fdd8104249

📥 Commits

Reviewing files that changed from the base of the PR and between e72d70f and 55624fb.

📒 Files selected for processing (9)
  • .coderabbit.yaml
  • crates/logfwd-config/src/lib.rs
  • crates/logfwd-io/src/otlp_receiver.rs
  • crates/logfwd-output/src/null.rs
  • crates/logfwd-output/src/otlp_sink.rs
  • crates/logfwd-output/src/stdout.rs
  • crates/logfwd-output/src/tcp_sink.rs
  • crates/logfwd-output/src/udp_sink.rs
  • crates/logfwd/src/main.rs

Comment on lines +566 to +573
fn validate_bind_addr(addr: &str) -> Result<(), String> {
if addr.contains("${") {
return Ok(());
}
addr.parse::<std::net::SocketAddr>()
.map(|_| ())
.map_err(|e| format!("'{addr}' is not a valid host:port address: {e}"))
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify parser/runtime mismatch evidence in current tree
rg -n "fn validate_bind_addr|parse::<std::net::SocketAddr>|Server::http\\(" -C2 crates/logfwd-config/src/lib.rs crates/logfwd-io/src/diagnostics.rs

Repository: strawgate/memagent

Length of output: 1178


🏁 Script executed:

# Check tiny_http dependency and search for any hostname usage examples
rg -n "bind_addr|Server::http" crates/logfwd-config/src/lib.rs crates/logfwd-io/src/ -A3 -B1 | head -50

Repository: strawgate/memagent

Length of output: 2964


🏁 Script executed:

# Look for tests that might use hostname patterns
fd -e rs -x grep -l "localhost\|bind_addr" {} + 2>/dev/null | head -10

Repository: strawgate/memagent

Length of output: 441


🌐 Web query:

tiny_http Server::http hostname support DNS resolution

💡 Result:

Yes, the tiny_http crate's Server::http method supports hostnames and performs DNS resolution. It accepts any type A implementing std::net::ToSocketAddrs, such as &str. When passed a hostname like "example.com:8080", ToSocketAddrs resolves it to SocketAddr(s) via the system's getaddrinfo (or equivalent), handling DNS lookup. The library then binds the TcpListener to one of the resolved addresses using ConfigListenAddr::from_socket_addrs(addr). Examples in docs use "0.0.0.0:0", but hostname strings work identically due to the trait.

Citations:


🏁 Script executed:

# Look for test patterns with hostnames or bind_addr
rg -n "localhost|127.0.0.1|bind_addr.*:" crates/logfwd-io/tests/ crates/logfwd-config/tests/ -A1 -B1 2>/dev/null | head -40

Repository: strawgate/memagent

Length of output: 2940


🏁 Script executed:

# Check if there are IPv6 address patterns that need special handling
rg -n "\\[|ipv6|::" crates/logfwd-config/src/lib.rs crates/logfwd-io/src/diagnostics.rs 2>/dev/null | head -20

Repository: strawgate/memagent

Length of output: 1738


🏁 Script executed:

# Check for tests of validate_bind_addr function
rg -n "validate_bind_addr" crates/logfwd-config/src/ crates/logfwd-config/tests/ -B2 -A5

Repository: strawgate/memagent

Length of output: 1326


🏁 Script executed:

# Verify IPv6 bracket patterns would be caught by first parse attempt
cat > /tmp/test_socketaddr.rs << 'EOF'
fn main() {
    // Test what SocketAddr::parse accepts
    let cases = vec![
        "127.0.0.1:8080",
        "localhost:8080",
        "[::1]:8080",
        "[2001:db8::1]:9090",
    ];
    
    for addr in cases {
        match addr.parse::<std::net::SocketAddr>() {
            Ok(_) => println!("{} -> SocketAddr::parse OK", addr),
            Err(_) => println!("{} -> SocketAddr::parse FAIL", addr),
        }
    }
}
EOF
rustc /tmp/test_socketaddr.rs -o /tmp/test_socketaddr && /tmp/test_socketaddr

Repository: strawgate/memagent

Length of output: 108


validate_bind_addr() rejects hostnames that tiny_http::Server::http() accepts.

Line 570 uses addr.parse::<std::net::SocketAddr>(), which only accepts IP literals (e.g., 127.0.0.1:8080, [::1]:8080). Hostnames like localhost:8080 are rejected, but runtime binding at diagnostics.rs:408 passes the same string to tiny_http::Server::http(&self.bind_addr), which resolves hostnames via DNS and succeeds. This creates a validate-time false negative.

🔧 Proposed fix
 fn validate_bind_addr(addr: &str) -> Result<(), String> {
     if addr.contains("${") {
         return Ok(());
     }
-    addr.parse::<std::net::SocketAddr>()
-        .map(|_| ())
-        .map_err(|e| format!("'{addr}' is not a valid host:port address: {e}"))
+    // Accept SocketAddr literals first (IPv4/IPv6 with port).
+    if addr.parse::<std::net::SocketAddr>().is_ok() {
+        return Ok(());
+    }
+
+    // Accept hostname:port without forcing DNS resolution at validate time.
+    let Some((host, port)) = addr.rsplit_once(':') else {
+        return Err(format!("'{addr}' is not a valid host:port address"));
+    };
+    if host.is_empty() || port.parse::<u16>().is_err() {
+        return Err(format!("'{addr}' is not a valid host:port address"));
+    }
+    Ok(())
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/logfwd-config/src/lib.rs` around lines 566 - 573, validate_bind_addr
currently uses addr.parse::<std::net::SocketAddr>() which accepts only IP
literals and rejects hostnames like "localhost:8080" even though
tiny_http::Server::http(&self.bind_addr) resolves hostnames at runtime; change
the validation to perform the same resolution by using std::net::ToSocketAddrs
(call addr.to_socket_addrs()) instead of parse so DNS-resolvable hostnames
succeed, and preserve the existing error formatting (update the error map to
include the original addr and the resolution error); refer to validate_bind_addr
and the runtime call in diagnostics.rs (tiny_http::Server::http) when making the
change.

Comment thread crates/logfwd-output/src/tcp_sink.rs
Comment thread crates/logfwd/src/main.rs
Comment on lines +142 to 173
println!(
"{}logfwd{} {}v{VERSION}{} -- fast log forwarder with SQL transforms",
bold(),
reset(),
dim(),
reset(),
);
eprintln!();
eprintln!("{}USAGE:{}", bold(), reset());
eprintln!(" logfwd --config <config.yaml> [--validate] [--dry-run]");
eprintln!(" logfwd --blackhole [bind_addr]");
eprintln!(" logfwd --generate-json <num_lines> <output_file>");
eprintln!();
eprintln!("{}OPTIONS:{}", bold(), reset());
eprintln!(" -c, --config <path> Run pipeline from YAML config");
eprintln!(" --validate Validate config and exit (alias: --check)");
eprintln!(" --dry-run Build pipelines without running");
eprintln!(" --blackhole [addr] OTLP blackhole receiver (default: 127.0.0.1:4318)");
eprintln!(" --generate-json Generate synthetic JSON log file");
eprintln!(" -h, --help Show this help");
eprintln!(" -V, --version Show version");
eprintln!();
eprintln!("{}EXIT CODES:{}", bold(), reset());
eprintln!(" 0 Success");
eprintln!(" 1 Configuration error");
eprintln!(" 2 Runtime error");
eprintln!();
eprintln!(
println!();
println!("{}USAGE:{}", bold(), reset());
println!(" logfwd --config <config.yaml> [--validate] [--dry-run]");
println!(" logfwd --blackhole [bind_addr]");
println!(" logfwd --generate-json <num_lines> <output_file>");
println!();
println!("{}OPTIONS:{}", bold(), reset());
println!(" -c, --config <path> Run pipeline from YAML config");
println!(" --validate Validate config and exit (alias: --check)");
println!(" --dry-run Build pipelines without running");
println!(" --blackhole [addr] OTLP blackhole receiver (default: 127.0.0.1:4318)");
println!(" --generate-json Generate synthetic JSON log file");
println!(" -h, --help Show this help");
println!(" -V, --version Show version");
println!();
println!("{}EXIT CODES:{}", bold(), reset());
println!(" 0 Success");
println!(" 1 Configuration error");
println!(" 2 Runtime error");
println!();
println!(
"{}Respects NO_COLOR (https://no-color.org){}",
dim(),
reset(),
);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify stream mismatch: stdout println! paths + stderr-based TTY check
rg -n "fn use_color|STDERR_FILENO|fn print_usage|validate_pipelines|println!\\(" -C2 crates/logfwd/src/main.rs

Repository: strawgate/memagent

Length of output: 8016


Stdout color detection uses stderr TTY check, breaking machine-readable output when stdout is redirected.

The use_color() function (line 60) checks STDERR_FILENO, but print_usage() (lines 142–173) and validate_pipelines() (lines 335–358) output to stdout. If stdout is redirected to a file while stderr remains a TTY, ANSI escape codes leak into stdout and corrupt script parsing.

🔧 Fix
+use std::io::IsTerminal;
+
-fn use_color() -> bool {
-    // SAFETY: isatty is a simple query on a well-known fd; no invariants to uphold.
-    env::var_os("NO_COLOR").is_none() && unsafe { libc::isatty(libc::STDERR_FILENO) != 0 }
+fn use_color_stderr() -> bool {
+    env::var_os("NO_COLOR").is_none() && std::io::stderr().is_terminal()
+}
+
+fn use_color_stdout() -> bool {
+    env::var_os("NO_COLOR").is_none() && std::io::stdout().is_terminal()
 }

Update print_usage() and success paths in validate_pipelines() to call use_color_stdout() instead of use_color().

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/logfwd/src/main.rs` around lines 142 - 173, The printed usage and
validation success messages currently use use_color(), which checks STDERR and
causes ANSI codes to leak into redirected stdout; update print_usage() to call
use_color_stdout() instead of use_color(), and change the success/output paths
in validate_pipelines() (the branches that print success/usage-style messages to
stdout) to also call use_color_stdout() so color detection is based on STDOUT
TTY state; leave any error-path logs that should use stderr unchanged.

Comment thread crates/logfwd/src/main.rs
Comment on lines +435 to +437
let mut sighup =
signal(SignalKind::hangup()).expect("failed to register SIGHUP handler");
loop {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify production-path expect usage around signal registration
rg -nP "signal\\(SignalKind::hangup\\)\\)|\\.expect\\(" -C3 crates/logfwd/src/main.rs

Repository: strawgate/memagent

Length of output: 698


Fix .expect() in both SIGTERM and SIGHUP signal handlers.

Lines 432 and 436 both use .expect() in the production signal registration path. Line 432 (SIGTERM) also needs the same fix—do not let signal handler registration failures panic the runtime. Handle gracefully using if let or log a warning and continue without that handler, as shown for SIGHUP.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/logfwd/src/main.rs` around lines 435 - 437, Replace the .expect() call
used when registering the SIGTERM handler the same way SIGHUP is handled:
instead of panicking on failure, use a conditional pattern (e.g., if let
Ok(sigterm) = signal(SignalKind::terminate()) ) to obtain the sigterm stream or
log a warning and skip registering that handler; do the same for the existing
sighup registration (signal(SignalKind::hangup())) so both signal registrations
(sigterm and sighup) fail gracefully and do not call .expect().

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/logfwd-io/src/otlp_receiver.rs (1)

283-298: ⚠️ Potential issue | 🟠 Major

Reject empty JSON bodies as 400 too.

Line 285 still returns Ok(Vec::new()) for an empty body, so Content-Type: application/json with no payload still gets a 200 and bypasses this new validation entirely. That leaves one malformed-request path silently accepted.

Suggested fix
 fn decode_otlp_logs_json(body: &[u8]) -> Result<Vec<u8>, String> {
     if body.is_empty() {
-        return Ok(Vec::new());
+        return Err("empty OTLP JSON payload".to_string());
     }

Please also add the empty-body case to missing_resource_logs_returns_400.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/logfwd-io/src/otlp_receiver.rs` around lines 283 - 298, The function
decode_otlp_logs_json currently returns Ok(Vec::new()) for empty bodies which
lets an empty JSON payload with Content-Type: application/json be treated as
200; update decode_otlp_logs_json to treat an empty body as a bad request by
returning Err with a clear message (e.g., "empty request body" or similar)
instead of Ok, and adjust any call sites/tests accordingly—specifically update
the test named missing_resource_logs_returns_400 to include the empty-body case
to ensure an empty JSON body yields a 400 response; reference function
decode_otlp_logs_json and the test missing_resource_logs_returns_400 when making
changes.
♻️ Duplicate comments (5)
crates/logfwd-output/src/stdout.rs (1)

31-34: ⚠️ Potential issue | 🟡 Minor

Stdout output now records no sink bytes at all.

After removing inc_lines, this sink never updates stats, so diagnostics for stdout outputs stay flat even while data is being written. Either accumulate written bytes and call inc_bytes, or drop the field until stdout metering is supported.

Also applies to: 253-255

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/logfwd-output/src/stdout.rs` around lines 31 - 34, The stdout sink
still keeps the stats field (stats: Arc<ComponentStats>) but, since inc_lines
was removed, it never updates stats and so shows zero bytes; either remove the
unused stats field or update the stdout write path to call the meter method
(e.g., ComponentStats::inc_bytes) with the number of bytes written whenever data
is written to stdout (referencing the stats field in the stdout write/flush
method), ensuring written bytes are accumulated; if you choose to remove
metering, delete the stats field and any references to it to avoid dead code.
crates/logfwd-config/src/lib.rs (2)

1343-1350: ⚠️ Potential issue | 🟡 Minor

This test still doesn't cover normalize_args.

normalize_args lives in crates/logfwd/src/main.rs, and this body only references Config::load_str, so #709 still has no executable regression coverage here. Move the assertions into main.rs (or an integration test) and exercise real argv permutations there.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/logfwd-config/src/lib.rs` around lines 1343 - 1350, The current unit
test in lib.rs does not exercise normalize_args (it only references
Config::load_str), so add real coverage by moving assertions into the binary
crate where normalize_args is defined (crates/logfwd/src/main.rs) or by creating
an integration test that runs the compiled binary with different argv
permutations; specifically, either (1) expose normalize_args (e.g., make it
pub(crate) or pub) and add #[cfg(test)] tests in src/main.rs that call
normalize_args with several argv vectors and assert canonical output, or (2) add
an integration test in tests/ that invokes the binary (using Command/assert_cmd)
with varied command-line orders (including --config at position 1 and not) and
asserts that the observed behavior matches the expected normalized argv and
Config parsing, keeping references to normalize_args and Config::load_str to
locate the logic.

563-572: ⚠️ Potential issue | 🟠 Major

Don't reject hostname-based diagnostics addresses.

SocketAddr::parse() only accepts IP literals. If diagnostics still binds by passing the raw string into tiny_http::Server::http, values like localhost:9090 will fail --validate even though runtime binding succeeds. Validate host:port syntax (or resolve via ToSocketAddrs) instead of requiring an IP literal.

Does `tiny_http::Server::http` accept hostname:port strings via `ToSocketAddrs`, or only IP-literal `SocketAddr` values?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/logfwd-config/src/lib.rs` around lines 563 - 572, The current
validate_bind_addr function rejects hostname:port values because it uses
SocketAddr::parse; update validate_bind_addr to allow hostnames by attempting
resolution via ToSocketAddrs instead: keep the early check for "${" unchanged,
and replace the addr.parse::<SocketAddr>() call with
addr.to_socket_addrs().map(|_|()).map_err(|e| format!("'{addr}' is not a valid
host:port address or cannot be resolved: {e}")) so hostnames like
"localhost:9090" validate the same way tiny_http would accept them; reference
the validate_bind_addr function when making this change.
crates/logfwd/src/main.rs (2)

433-436: ⚠️ Potential issue | 🟠 Major

Don't panic on signal handler registration.

The new SIGHUP path repeats the existing SIGTERM expect() behavior in the runtime path. A registration failure should warn and continue without that handler, not abort startup.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/logfwd/src/main.rs` around lines 433 - 436, The SIGHUP handler
registration currently calls signal(SignalKind::hangup()).expect(...), which
will abort on failure; change it to handle errors gracefully by matching the
Result, logging a warning (e.g., using the same logger used for SIGTERM) and
continuing without the SIGHUP handler. Specifically, replace the expect call
around signal(SignalKind::hangup()) and make the variable (sighup) an Option or
otherwise handle None so the rest of the setup can proceed when registration
fails; reference the existing sighup binding and the
signal(SignalKind::hangup()) invocation to find and update the code.

142-173: ⚠️ Potential issue | 🟠 Major

Use stdout TTY detection for stdout-only output.

These paths now print to stdout, but the styling helpers still read STDERR_FILENO. If stdout is redirected while stderr stays interactive, help and validation output will still contain ANSI escapes.

Also applies to: 335-358

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/logfwd/src/main.rs` around lines 142 - 173, The help text prints to
stdout but the styling helpers (bold(), dim(), reset()) still detect TTY against
stderr; update the TTY detection to use stdout so ANSI escapes are only emitted
when stdout is a TTY. Concretely, change the implementation used by
bold()/dim()/reset() to check stdout (e.g., use atty::is(Stream::Stdout) or
isatty(STDOUT_FILENO)) or add parameterized variants (e.g., bold_for_stdout) and
call those from the help/usage printing sites where println! writes to stdout
(the blocks using bold(), dim(), reset()); apply the same change to the other
help/usage block referenced later so both stdout-only outputs suppress ANSI when
stdout is redirected.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.coderabbit.yaml:
- Around line 59-60: The cross-reference to "Rule 5" in the sentence mentioning
CRATE_RULES.md is stale and points to the new-crate checklist instead of the
Kani exemption list; update that sentence (the line that reads "These exemptions
mirror the ones in CRATE_RULES.md and are the same exemptions referenced by the
Crate Boundary check's Rule 5.") to either point to the actual document/section
that lists Kani exemptions (replace "Rule 5" with the correct rule/section or
filename/anchor), or remove the "Rule 5" clause entirely so reviewers aren't
directed to the wrong policy.

In `@crates/logfwd-output/src/otlp_sink.rs`:
- Around line 246-249: The metric currently uses self.encoder_buf.len(), which
measures the uncompressed protobuf body and is wrong when compression or gRPC
framing are applied; change the increment to reflect the actual on-the-wire size
by using the length of the final framed/compressed buffer or the number of bytes
returned from the write/send call instead of self.encoder_buf.len(). Locate the
send path around the call to self.stats.inc_bytes(...) (referencing
self.stats.inc_bytes and self.encoder_buf) and replace the argument with the
length of the buffer that is actually written (e.g., compressed_buf.len() or
framed_payload.len()), or capture the byte-count returned by the socket/HTTP
write/send and pass that to self.stats.inc_bytes.

In `@crates/logfwd/src/main.rs`:
- Around line 202-205: The current branch in main.rs that handles parsing the
config flag (the code checking `if pos + 1 >= args.len()`) silently returns the
original `args`, losing the real error; change this to surface a
missing-config-path error instead: when `--config`/`-c` is the last arg, return
an explicit error/indicator (instead of `args`) that `main_inner()` can detect
and report a "missing config path" (preserving the same UX as invoking `logfwd
--config`); update the code around the `pos`/`args` handling so the missing-path
case is propagated to `main_inner()` rather than falling back to the default
dispatch.

---

Outside diff comments:
In `@crates/logfwd-io/src/otlp_receiver.rs`:
- Around line 283-298: The function decode_otlp_logs_json currently returns
Ok(Vec::new()) for empty bodies which lets an empty JSON payload with
Content-Type: application/json be treated as 200; update decode_otlp_logs_json
to treat an empty body as a bad request by returning Err with a clear message
(e.g., "empty request body" or similar) instead of Ok, and adjust any call
sites/tests accordingly—specifically update the test named
missing_resource_logs_returns_400 to include the empty-body case to ensure an
empty JSON body yields a 400 response; reference function decode_otlp_logs_json
and the test missing_resource_logs_returns_400 when making changes.

---

Duplicate comments:
In `@crates/logfwd-config/src/lib.rs`:
- Around line 1343-1350: The current unit test in lib.rs does not exercise
normalize_args (it only references Config::load_str), so add real coverage by
moving assertions into the binary crate where normalize_args is defined
(crates/logfwd/src/main.rs) or by creating an integration test that runs the
compiled binary with different argv permutations; specifically, either (1)
expose normalize_args (e.g., make it pub(crate) or pub) and add #[cfg(test)]
tests in src/main.rs that call normalize_args with several argv vectors and
assert canonical output, or (2) add an integration test in tests/ that invokes
the binary (using Command/assert_cmd) with varied command-line orders (including
--config at position 1 and not) and asserts that the observed behavior matches
the expected normalized argv and Config parsing, keeping references to
normalize_args and Config::load_str to locate the logic.
- Around line 563-572: The current validate_bind_addr function rejects
hostname:port values because it uses SocketAddr::parse; update
validate_bind_addr to allow hostnames by attempting resolution via ToSocketAddrs
instead: keep the early check for "${" unchanged, and replace the
addr.parse::<SocketAddr>() call with
addr.to_socket_addrs().map(|_|()).map_err(|e| format!("'{addr}' is not a valid
host:port address or cannot be resolved: {e}")) so hostnames like
"localhost:9090" validate the same way tiny_http would accept them; reference
the validate_bind_addr function when making this change.

In `@crates/logfwd-output/src/stdout.rs`:
- Around line 31-34: The stdout sink still keeps the stats field (stats:
Arc<ComponentStats>) but, since inc_lines was removed, it never updates stats
and so shows zero bytes; either remove the unused stats field or update the
stdout write path to call the meter method (e.g., ComponentStats::inc_bytes)
with the number of bytes written whenever data is written to stdout (referencing
the stats field in the stdout write/flush method), ensuring written bytes are
accumulated; if you choose to remove metering, delete the stats field and any
references to it to avoid dead code.

In `@crates/logfwd/src/main.rs`:
- Around line 433-436: The SIGHUP handler registration currently calls
signal(SignalKind::hangup()).expect(...), which will abort on failure; change it
to handle errors gracefully by matching the Result, logging a warning (e.g.,
using the same logger used for SIGTERM) and continuing without the SIGHUP
handler. Specifically, replace the expect call around
signal(SignalKind::hangup()) and make the variable (sighup) an Option or
otherwise handle None so the rest of the setup can proceed when registration
fails; reference the existing sighup binding and the
signal(SignalKind::hangup()) invocation to find and update the code.
- Around line 142-173: The help text prints to stdout but the styling helpers
(bold(), dim(), reset()) still detect TTY against stderr; update the TTY
detection to use stdout so ANSI escapes are only emitted when stdout is a TTY.
Concretely, change the implementation used by bold()/dim()/reset() to check
stdout (e.g., use atty::is(Stream::Stdout) or isatty(STDOUT_FILENO)) or add
parameterized variants (e.g., bold_for_stdout) and call those from the
help/usage printing sites where println! writes to stdout (the blocks using
bold(), dim(), reset()); apply the same change to the other help/usage block
referenced later so both stdout-only outputs suppress ANSI when stdout is
redirected.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Organization UI (inherited)

Review profile: ASSERTIVE

Plan: Pro

Run ID: 228b5427-2d92-4bbc-b411-21714ae79b0f

📥 Commits

Reviewing files that changed from the base of the PR and between 55624fb and 48d3c5e.

📒 Files selected for processing (9)
  • .coderabbit.yaml
  • crates/logfwd-config/src/lib.rs
  • crates/logfwd-io/src/otlp_receiver.rs
  • crates/logfwd-output/src/null.rs
  • crates/logfwd-output/src/otlp_sink.rs
  • crates/logfwd-output/src/stdout.rs
  • crates/logfwd-output/src/tcp_sink.rs
  • crates/logfwd-output/src/udp_sink.rs
  • crates/logfwd/src/main.rs

Comment thread .coderabbit.yaml
Comment on lines 246 to 249
Ok(_) => {
self.stats.inc_lines(batch.num_rows() as u64);
// inc_lines is counted by the pipeline; only track bytes here.
self.stats.inc_bytes(self.encoder_buf.len() as u64);
return Ok(());
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Count the bytes actually sent on the wire.

encoder_buf.len() is the uncompressed protobuf body. With Zstd or gRPC framing enabled, it diverges from the transmitted payload, so this new byte metric is wrong exactly in the configurations that care most about egress size.

Suggested fix
-                    self.stats.inc_bytes(self.encoder_buf.len() as u64);
+                    self.stats.inc_bytes(payload.len() as u64);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
Ok(_) => {
self.stats.inc_lines(batch.num_rows() as u64);
// inc_lines is counted by the pipeline; only track bytes here.
self.stats.inc_bytes(self.encoder_buf.len() as u64);
return Ok(());
Ok(_) => {
// inc_lines is counted by the pipeline; only track bytes here.
self.stats.inc_bytes(payload.len() as u64);
return Ok(());
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/logfwd-output/src/otlp_sink.rs` around lines 246 - 249, The metric
currently uses self.encoder_buf.len(), which measures the uncompressed protobuf
body and is wrong when compression or gRPC framing are applied; change the
increment to reflect the actual on-the-wire size by using the length of the
final framed/compressed buffer or the number of bytes returned from the
write/send call instead of self.encoder_buf.len(). Locate the send path around
the call to self.stats.inc_bytes(...) (referencing self.stats.inc_bytes and
self.encoder_buf) and replace the argument with the length of the buffer that is
actually written (e.g., compressed_buf.len() or framed_payload.len()), or
capture the byte-count returned by the socket/HTTP write/send and pass that to
self.stats.inc_bytes.

Comment thread crates/logfwd/src/main.rs
Comment on lines +202 to +205
// config_path is the value after --config/-c.
if pos + 1 >= args.len() {
return args;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Preserve the missing config path error when --config is last.

When the user runs logfwd --validate --config, lines 203-205 return the original argv, so main_inner() dispatches on --validate and reports unknown command instead of the real problem. Return an explicit config error here so the non-canonical form keeps the same UX as logfwd --config.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/logfwd/src/main.rs` around lines 202 - 205, The current branch in
main.rs that handles parsing the config flag (the code checking `if pos + 1 >=
args.len()`) silently returns the original `args`, losing the real error; change
this to surface a missing-config-path error instead: when `--config`/`-c` is the
last arg, return an explicit error/indicator (instead of `args`) that
`main_inner()` can detect and report a "missing config path" (preserving the
same UX as invoking `logfwd --config`); update the code around the `pos`/`args`
handling so the missing-path case is propagated to `main_inner()` rather than
falling back to the default dispatch.

strawgate and others added 2 commits April 2, 2026 20:04
Replace Vec::new() with Vec::with_capacity(N) in 4 varint/tag proofs
and a fixed [u8;256] array slice in verify_bytes_field_size. Varint max
is 10 bytes so pre-sizing to 10 eliminates all realloc exploration paths.
bytes_field_size output buf pre-sized to 276 (10+10+256 max).

These harnesses were generating tens of thousands of "aborting path on
assume(false)" from alloc::raw_vec internals. Coverage is identical —
those paths were already pruned by the stdlib's own overflow guard.

Also codifies the rule in .coderabbit.yaml so future harnesses are
caught in review before hitting CI.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant