feat: logfwd-core is now #![no_std] + #![forbid(unsafe_code)] by strawgate · Pull Request #375 · strawgate/fastforward

strawgate · 2026-03-31T07:32:14Z

Summary

The proven kernel. logfwd-core compiles with no_std and forbids all unsafe code. The compiler enforces: no IO, no filesystem, no threads, no unsafe — structurally impossible, not just linted.

What moved

Impure modules → new logfwd-io crate:

diagnostics.rs — HTTP server, metrics, threads
checkpoint.rs — filesystem persistence
enrichment.rs — CSV/JSON file loading, Arrow tables
tail.rs — file watching, inotify
input.rs — file input polling
compress.rs — zstd compression (std::io)
format.rs — old format parser
filter_hints.rs — glob-based filtering

What remains in logfwd-core

Pure logic only, 52 Kani proofs:

structural.rs — SIMD detection via wide (safe)
scanner.rs — JSON field extraction
framer.rs — newline framing (Kani-proven)
aggregator.rs — CRI P/F reassembly (Kani-proven)
otlp.rs — protobuf encoding helpers
cri.rs — CRI format parsing
scan_config.rs — field selection
byte_search.rs — proven byte search

Dependencies: memchr + wide (both no_std compatible).

Test plan

439 workspace tests pass
cargo check -p logfwd-core compiles with no_std
#![forbid(unsafe_code)] — compiler rejects any unsafe in core
All downstream crates updated to import from logfwd-io
Clippy clean

🤖 Generated with Claude Code

The proven kernel. logfwd-core compiles with no_std and forbids all unsafe code. The compiler enforces: no IO, no filesystem, no threads, no network, no unsafe — structurally impossible, not just linted. Impure modules moved to new logfwd-io crate: - diagnostics, checkpoint, enrichment, tail, input (IO/threads) - compress (std::io), format (old parser), filter_hints logfwd-core now contains only proven pure logic: - structural.rs — SIMD detection via wide (safe) - scanner.rs — JSON field extraction - framer.rs — newline framing (Kani-proven) - aggregator.rs — CRI P/F reassembly (Kani-proven) - otlp.rs — protobuf encoding helpers - cri.rs — CRI format parsing - scan_config.rs — field selection - byte_search.rs — proven byte search Dependencies: only memchr + wide (both no_std compatible). 439 workspace tests pass. Zero compilation warnings. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-03-31T07:33:38Z

Warning

Rate limit exceeded

@strawgate has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 3 minutes and 17 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 3 minutes and 17 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: febb2623-2f20-49b7-95fb-0e8f00b2d658

📥 Commits

Reviewing files that changed from the base of the PR and between a6da672 and 096975c.

📒 Files selected for processing (8)

crates/logfwd-core/fuzz/Cargo.toml
crates/logfwd-core/fuzz/fuzz_targets/format_parser.rs
crates/logfwd-core/src/aggregator.rs
crates/logfwd-core/src/cri.rs
crates/logfwd-core/src/lib.rs
crates/logfwd-core/src/otlp.rs
crates/logfwd-core/src/scan_config.rs
crates/logfwd-core/src/structural.rs

Walkthrough

A new crate, logfwd-io, is added and published into the workspace. logfwd-io exposes public modules for checkpointing, compression, diagnostics, enrichment, filter hints, format parsing, input abstractions, and tailing. logfwd-core is converted to no_std/alloc, unsafe usage forbidden, many IO/impure modules are removed from its public surface, and alloc imports/doc comments were added. Downstream crates (logfwd, logfwd-transform, logfwd-output, logfwd-bench, fuzz targets, and fuzz/fuzz targets) update dependencies and imports to use logfwd-io for the moved functionality; new public types/traits (e.g., CheckpointStore, ChunkCompressor, EnrichmentTable, FilterHints, FileTailer) are introduced in logfwd-io. Workspace members updated to include crates/logfwd-io.

Possibly related PRs

feat: logfwd-core is now #![no_std] + #![forbid(unsafe_code)] #375: Extracts IO/impure modules into a new crate and converts core to no_std/forbid unsafe — mirrors the workspace-level refactor and module moves present here.
Add enrichment table framework: K8s path parser, host info, static labels #38: Adds the enrichment framework (EnrichmentTable, StaticTable, K8sPathTable, CSV/JSONL) — directly corresponds to enrichment types moved into logfwd-io.
fix: Run cargo fmt + fix clippy on master #185: Changes compression and OTLP sink surfaces — relates to moving ChunkCompressor/compress APIs into logfwd-io and updating consumers.

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

With no_std + forbid(unsafe_code) in place, add missing_docs warning to enforce documentation on all public items in the kernel. Fixed 30 missing doc items: - Crate-level doc comment explaining the kernel's guarantees - Module-level docs on all 8 pub mod declarations - RawBlockMasks: all 10 field docs - Severity enum: all 7 variant docs - FieldSpec: struct + field docs - CriLine, CriAggregator, CriReassembler::new: struct/fn docs - StreamingClassifier::new: constructor doc - Fixed misplaced `use alloc` statements breaking doc comment chains logfwd-core now enforces: no_std, forbid(unsafe_code), warn(missing_docs). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

strawgate · 2026-03-31T07:40:22Z

@coderabbitai this is a very important PR please be very thorough and accurate

coderabbitai · 2026-03-31T07:43:03Z

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

@strawgate, understood — I'll make sure this gets a thorough treatment. Let me gather the full picture of the changes first.

🏁 Script executed:

#!/bin/bash
# Get PR details and file list
gh pr view 375 --repo strawgate/memagent --json files,additions,deletions,changedFiles | jq '.'

# List changed files
gh pr diff 375 --repo strawgate/memagent --name-only

# Get the PR diff (truncated to avoid overload)
gh pr diff 375 --repo strawgate/memagent | head -2000

Length of output: 79483

Let me get the rest of the diff and key files.

🏁 Script executed:

#!/bin/bash
# Get the remaining diff
gh pr diff 375 --repo strawgate/memagent | tail -n +2001 | head -2000

# Check logfwd-io lib.rs
gh api repos/strawgate/memagent/contents/crates/logfwd-io/src/lib.rs \
  --jq '.content' | base64 -d

# Check if debug_bitmask feature is referenced anywhere in the codebase
# (it was removed from logfwd-core/Cargo.toml)
gh pr diff 375 --repo strawgate/memagent | grep -n "debug_bitmask" || true

# Check logfwd-core/src/framer.rs for no_std compliance (not in diff but must be updated)
gh api repos/strawgate/memagent/contents/crates/logfwd-core/src/framer.rs \
  --jq '.content' | base64 -d | head -40

# Check structural.rs for no_std compliance and wide crate usage
gh api repos/strawgate/memagent/contents/crates/logfwd-core/src/structural.rs \
  --jq '.content' | base64 -d | head -60

# Verify wide and memchr are actually no_std compatible (check their Cargo.toml features)
gh api repos/strawgate/memagent/git/refs/heads \
  --jq '.[].ref' | head -5

Length of output: 75733

@strawgate: I've done a thorough read of all 36 changed files. Here's my full analysis before the automated review runs:

Analysis Summary

✅ What's correct and well-executed

logfwd-core no_std transformation:

#![no_std] + #![forbid(unsafe_code)] + extern crate alloc are all correct in lib.rs
memchr = { version = "2", default-features = false } and wide = { version = "1.2.0", default-features = false } — both correctly disable std features for no_std compatibility
All std::ptr::eq → core::ptr::eq, std::str::from_utf8 → core::str::from_utf8, Vec/String imports from alloc:: — all correct
framer.rs was already no_std-clean (stack-only fixed array, no heap), no changes needed
Import migration in all downstream crates (logfwd-bench, logfwd-transform, logfwd-output, logfwd, tests) is complete and consistent
debug_bitmask feature removal is safe — no downstream #[cfg(feature = "debug_bitmask")] usage found

logfwd-io new crate:

Proper Cargo.toml with all std-dependent deps correctly placed here
Good test coverage throughout (checkpoint, compress, diagnostics, enrichment, format)
FileCheckpointStore::flush correctly uses write-to-tmp → fsync → rename — crash-safe
K8sPathTable::update_from_paths correctly sorts before dedup_by so consecutive-duplicate semantics are satisfied
DiagnosticsServer correctly reads atomic counters lock-free on the hot path

⚠️ Issues to review

1. diagnostics.rs — cpu_user_ms is not CPU time (Medium)

In process_metrics():

let uptime_ms = proc.run_time() * 1000;  // wall-clock uptime, NOT cpu time
(rss, uptime_ms, 0)  // returned as (rss, cpu_user_ms, cpu_sys_ms)

sysinfo::Process::run_time() returns wall-clock seconds since process launch, not CPU user time. The field is exposed in /api/stats as cpu_user_ms, and cpu_sys_ms is always 0. The comment in the code acknowledges this but the field name is misleading to API consumers. Consider renaming to uptime_ms / removing cpu_sys_ms, or adding a note in the response.

2. tail.rs — let-chains require Rust 1.88+ (Medium, if MSRV < 1.88)

if let Some(parent) = path.parent()
    && watched_dirs.insert(parent.to_path_buf())

Let-chains (if let ... && expr) were stabilized in Rust 1.88 (May 2025). Edition 2024 requires only Rust ≥ 1.85. If your CI/CD or any user targets Rust 1.85–1.87 this will fail to compile. Worth verifying the pinned toolchain version covers this.

3. logfwd-io/Cargo.toml — xxhash-rust version inconsistency (Low)

logfwd-io pins xxhash-rust = { version = "0.8.15", ... } (exact version), while the rest of the workspace uses "0.8" (semver range). These will resolve to the same crate in Cargo.lock, but the inconsistency is a maintenance smell — should be "0.8" or a workspace dep.

4. scan_config.rs — misplaced doc comment (Low)

+/// Specification for a single field to extract.
+use alloc::{string::String, vec, vec::Vec};
 /// Specification for a single field to extract.
 pub struct FieldSpec {

The first /// attaches to the use statement, not the struct. The duplicate comment on the use line should be removed.

5. checkpoint.rs — libc_getuid on macOS always returns 1000 (Low)

On non-Linux unix (macOS), the function unconditionally returns 1000, so root processes on macOS will always get the $HOME/.logfwd data dir instead of /var/lib/logfwd. This is documented in comments but worth a code comment clarifying it's intentional for the macOS case specifically.

6. enrichment.rs — parse_cri_log_path first-match semantics (Low)

path.find("/pods/") returns the first occurrence of /pods/ in the path. A contrived path like /pods/default_app_aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee/pods/something/0.log would match the wrong segment. CRI paths are standard enough that this is extremely unlikely in practice, but using rfind or anchoring to /var/log/pods/ would be more robust.

Let me now trigger the full automated review.

[full_review]

🧠 Learnings used

Learnt from: CR
Repo: strawgate/memagent PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-03-31T06:21:14.412Z
Learning: Applies to src/logfwd-core/**/*.rs : Implement Kani proofs for pure logic in logfwd-core; three-layer verification approach: Kani (scalar) → proptest (SIMD ≡ scalar) → Kani (consumers)

Learnt from: CR
Repo: strawgate/memagent PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-03-31T06:21:14.412Z
Learning: Applies to src/logfwd-core/**/*.rs : No per-line heap allocations in hot paths (scanner, format parser, CRI parser); verify with `just ci`

Learnt from: CR
Repo: strawgate/memagent PR: 0
File: CODE_STYLE.md:0-0
Timestamp: 2026-03-31T06:20:47.516Z
Learning: Applies to **/*.rs : Use Kani proofs for pure logic in logfwd-core (see `dev-docs/DECISIONS.md`)

Learnt from: CR
Repo: strawgate/memagent PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-03-31T06:21:14.412Z
Learning: Applies to src/logfwd-output/**/*.rs : No per-line heap allocations in hot paths (OTLP encoder, compress); verify with `just ci`

Learnt from: CR
Repo: strawgate/memagent PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-03-31T06:21:14.412Z
Learning: Applies to src/logfwd-arrow/**/*.rs : No per-line heap allocations in hot paths (builders, Arrow construction); verify with `just ci`

Learnt from: CR
Repo: strawgate/memagent PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-03-31T06:20:58.181Z
Learning: Applies to **/*.rs : Use Rust stable toolchain with no hand-written unsafe code — use SIMD via wide crate for performance-critical operations

The old files (diagnostics, checkpoint, enrichment, tail, input, compress, format, filter_hints, dashboard.html) were copied to logfwd-io but not deleted from logfwd-core/src/. They weren't compiled (not in lib.rs) but were dead files in the directory. Also: fix fuzz target format_parser.rs to import from logfwd-io. -4,340 lines of dead code removed. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

strawgate · 2026-03-31T07:48:50Z

@coderabbitai this is a very important PR please be very thorough and accurate

coderabbitai

Actionable comments posted: 14

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/logfwd-core/src/lib.rs`:
- Around line 6-14: Remove the stale moved modules left in logfwd-core/src so
there's only one source of truth: delete the old files compress.rs,
diagnostics.rs, and enrichment.rs from logfwd-core/src (and any other files that
were moved to logfwd-io) and ensure lib.rs has no remaining pub mod lines
referencing them; after removal, run a build/test (cargo build / cargo test) to
confirm no references remain.

In `@crates/logfwd-core/src/scan_config.rs`:
- Around line 7-8: Remove the stray doc comment that precedes the import
statement "use alloc::{string::String, vec, vec::Vec};" (do not document use
statements); either delete that doc comment or move the import above the doc
comment for the "FieldSpec" struct so the struct retains its doc comment and the
use line has no documentation attached.

In `@crates/logfwd-io/src/checkpoint.rs`:
- Around line 186-217: libc_getuid() does file I/O on every call (reading
/proc/self/status), so change it to compute the UID once and cache it or call
the libc syscall directly; specifically, update libc_getuid() (and any callers
like default_data_dir()) to use a static one-time cache (e.g.,
once_cell::sync::OnceCell or std::sync::Once) that stores the parsed u32 on
first invocation and returns the cached value thereafter, or replace the parsing
branch with a direct libc::getuid() call under cfg(unix) if you prefer adding
the libc dependency—ensure the function still returns a fallback (1000) when
detection fails.
- Around line 123-146: The flush method currently writes to a tmp file and
renames it but misses syncing the parent directory; after std::fs::rename(&tmp,
&final_path)? call, open the parent directory of final_path and call sync_all()
on that directory File to ensure rename is durable (use
final_path.parent().unwrap_or(Path::new(".")) to locate the dir), propagate any
io errors as io::Result so flush still returns Err on failure; reference the
flush function and the tmp_path() / checkpoints_path() names when making the
change.

In `@crates/logfwd-io/src/dashboard.html`:
- Around line 118-134: The poll() loop can overlap requests and allow
out-of-order responses to make rate() return negative values and leave the UI
showing "live" on errors; fix by making poll() non-reentrant (use an inFlight
flag or start the next poll from xhr.onloadend instead of setInterval) so a new
XHR is not issued until the prior request completes, update xhr.onload to treat
non-200 responses as disconnected (set the "status" element to "disconnected"
and skip render), move the JSON.parse try/catch to set status to "disconnected"
on parse errors (inside render call handling), and harden rate() to guard dt > 0
and clamp negative deltas to zero before computing (or return 0 on invalid data)
to avoid negative rates.

In `@crates/logfwd-io/src/diagnostics.rs`:
- Around line 647-660: esc() currently only handles a few characters and fails
to escape control codepoints (e.g., U+0000, U+0008, U+000C), which can produce
invalid JSON; update the esc function to, after matching '"', '\\', '\n', '\r',
'\t', check the character's Unicode scalar value (as u32) and if it's less than
0x20 emit a \u00XX style escape (four hex digits) for that codepoint, otherwise
fall back to pushing the character; keep the existing explicit escapes for
quotes/backslashes/newlines/r/ t and ensure the \u escape uses zero-padded
two-digit hex in the lower byte (formatted as \u00NN) so all control characters
are safely encoded.

In `@crates/logfwd-io/src/enrichment.rs`:
- Around line 356-365: The current code reads CSV headers into let headers:
Vec<String> using csv_reader.headers() and only checks headers.is_empty();
update this block to also reject empty header names and duplicate header names
up front: after building headers, check for any header.trim().is_empty() and
return Err("CSV has empty column name".to_string()) if found, and check for
duplicates by inserting header strings into a HashSet (or using
headers.iter().collect::<HashSet<_>>() length comparison) and return Err("CSV
has duplicate column names".to_string()) if duplicates exist; keep using
csv_reader.headers() and the same headers variable so callers and the downstream
RecordBatch building logic remain unchanged.
- Around line 369-404: The CSV padding currently writes empty strings into
columns, collapsing empty vs missing; change the column storage and array
construction to preserve nulls: make columns a Vec<Vec<Option<String>>> (instead
of Vec<Vec<String>>), push Some(field.to_string()) for present fields and push
None when padding short rows (in the loop over csv_reader.records and the
padding loop), then when creating Arrow arrays convert each Option<String> to
Option<&str> (e.g., via as_deref()) and collect into a StringArray so missing
fields become nulls; keep the existing Field::new(..., DataType::Utf8, true) and
RecordBatch::try_new usage.

In `@crates/logfwd-io/src/format.rs`:
- Around line 107-125: The current guard `if !line.is_empty()` drops empty raw
lines, losing explicit blank records; remove that condition so the code always
emits a JSON record for every `line` (including empty ones) by writing out the
prefix `{"_raw":"`, running the existing escaping loop (which is a no-op for
empty slices), appending the closing `"}\n"`, and incrementing `count` for every
line; update the block around `line`, `out`, and `count` in format.rs
accordingly so blank `b"\n"` lines produce `{"_raw":""}\n` instead of being
skipped.
- Around line 68-69: The partial buffer growth must be bounded: wherever you
call self.partial.extend_from_slice(&bytes[start..]) (seen in the JsonParser,
RawParser and CriParser parsing paths), check and enforce max_line_size before
extending — compute new_len = self.partial.len() + slice.len(), and if new_len >
self.max_line_size only extend up to the remaining allowed bytes (or stop
extending and mark/truncate the line) instead of unboundedly appending; ensure
CriParser uses its outer max_line_size when extending its outer partial as well
and propagate a truncation/overflow flag or behavior consistently so a single
newline-free input cannot grow memory without bound.

In `@crates/logfwd-io/src/tail.rs`:
- Around line 406-408: Avoid allocating a new Vec each poll by not calling
self.files.keys().cloned().collect(); instead iterate the key iterator or
maintain a persistent order vector. Replace the current allocation with a loop
like `for path in self.files.keys().cloned()` (no collect) or add a persistent
`read_order: Vec<PathBuf>` field and iterate that, updating it when files
change; update the loop that uses `paths` (the current `paths: Vec<PathBuf>`
variable) to use the new iteration approach and ensure borrows of `self.files`
inside the loop are handled correctly.
- Around line 502-511: The loop reads into result incrementally causing
reallocations; before the loop allocate result with Vec::with_capacity by
computing the expected remaining bytes (e.g., let remaining =
(file_size.saturating_sub(tailed.offset)) as usize) using the tailed file's
metadata and tailed.offset, then call Vec::with_capacity(remaining) so the
subsequent loop (tailed.file.read into self.read_buf, extending result, and
updating tailed.offset) won't reallocate; ensure you clamp or check remaining to
a sane max (and handle metadata errors) to avoid panic or huge allocations.
- Around line 345-347: The code currently clones the entire Vec<PathBuf> each
poll by calling watch_paths.clone(); change the loop to iterate over a reference
to the existing collection to avoid the allocation: stop cloning and use for
path in &self.watch_paths (or self.watch_paths.iter()) inside the polling code
where watch_paths is defined, ensuring any mutations that previously required an
owned Vec are adjusted (e.g., collect indices or clone only when absolutely
needed).

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: da038166-9ab5-4490-a1c2-16edbc9c62cf

📥 Commits

Reviewing files that changed from the base of the PR and between 6ab7864 and a6da672.

⛔ Files ignored due to path filters (1)

Cargo.lock is excluded by !**/*.lock

📒 Files selected for processing (35)

Cargo.toml
crates/logfwd-bench/Cargo.toml
crates/logfwd-bench/benches/pipeline.rs
crates/logfwd-bench/src/e2e_profile.rs
crates/logfwd-core/Cargo.toml
crates/logfwd-core/src/aggregator.rs
crates/logfwd-core/src/byte_search.rs
crates/logfwd-core/src/cri.rs
crates/logfwd-core/src/lib.rs
crates/logfwd-core/src/otlp.rs
crates/logfwd-core/src/scan_config.rs
crates/logfwd-core/src/scanner.rs
crates/logfwd-core/src/structural.rs
crates/logfwd-core/tests/compliance_data.rs
crates/logfwd-io/Cargo.toml
crates/logfwd-io/src/checkpoint.rs
crates/logfwd-io/src/compress.rs
crates/logfwd-io/src/dashboard.html
crates/logfwd-io/src/diagnostics.rs
crates/logfwd-io/src/enrichment.rs
crates/logfwd-io/src/filter_hints.rs
crates/logfwd-io/src/format.rs
crates/logfwd-io/src/input.rs
crates/logfwd-io/src/lib.rs
crates/logfwd-io/src/tail.rs
crates/logfwd-output/Cargo.toml
crates/logfwd-output/src/otlp_sink.rs
crates/logfwd-transform/Cargo.toml
crates/logfwd-transform/src/lib.rs
crates/logfwd-transform/src/udf/geo_lookup.rs
crates/logfwd-transform/tests/scanner_datafusion_boundary.rs
crates/logfwd/Cargo.toml
crates/logfwd/src/main.rs
crates/logfwd/src/pipeline.rs
crates/logfwd/tests/integration.rs

coderabbitai · 2026-03-31T07:49:05Z

+    fn flush(&mut self) -> io::Result<()> {
+        use std::io::Write as _;
+
+        let list: Vec<&SourceCheckpoint> = self.checkpoints.values().collect();
+        let json = serde_json::to_vec_pretty(&list)
+            .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))?;
+
+        let tmp = self.tmp_path();
+        let final_path = self.checkpoints_path();
+
+        {
+            let mut file = std::fs::OpenOptions::new()
+                .write(true)
+                .create(true)
+                .truncate(true)
+                .open(&tmp)?;
+            file.write_all(&json)?;
+            file.sync_all()?;
+        }
+
+        std::fs::rename(&tmp, &final_path)?;
+
+        Ok(())
+    }


⚠️ Potential issue | 🟡 Minor

Missing directory fsync after rename for crash-safe durability.

The atomic write pattern (tmp → fsync → rename) is correct, but on Linux/ext4, rename durability requires an additional fsync on the parent directory. Without it, a crash after rename but before journal commit can lose the file.

🛡️ Proposed fix to add directory fsync

std::fs::rename(&tmp, &final_path)?; + // Sync the directory to ensure the rename is durable. + if let Ok(dir) = std::fs::File::open(&self.data_dir) { + let _ = dir.sync_all(); + } + Ok(()) }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@crates/logfwd-io/src/checkpoint.rs` around lines 123 - 146, The flush method currently writes to a tmp file and renames it but misses syncing the parent directory; after std::fs::rename(&tmp, &final_path)? call, open the parent directory of final_path and call sync_all() on that directory File to ensure rename is durable (use final_path.parent().unwrap_or(Path::new(".")) to locate the dir), propagate any io errors as io::Result so flush still returns Err on failure; reference the flush function and the tmp_path() / checkpoints_path() names when making the change.

coderabbitai · 2026-03-31T07:49:05Z

+#[cfg(unix)]
+fn libc_getuid() -> u32 {
+    // Use the raw syscall via std rather than pulling in libc.
+    // std::os::unix doesn't expose getuid directly, so we use a cfg-guarded
+    // approach: on non-root systems HOME is always set, so the root check is
+    // mostly a documentation hint. We fall back to HOME-based path if unsure.
+    //
+    // Using nix or libc would be cleaner but they aren't dependencies.
+    // Instead we read /proc/self/status (Linux) or skip the check (macOS).
+    #[cfg(target_os = "linux")]
+    {
+        if let Ok(status) = std::fs::read_to_string("/proc/self/status") {
+            for line in status.lines() {
+                #[allow(clippy::collapsible_if)]
+                if let Some(rest) = line.strip_prefix("Uid:") {
+                    if let Some(uid_str) = rest.split_whitespace().next() {
+                        if let Ok(uid) = uid_str.parse::<u32>() {
+                            return uid;
+                        }
+                    }
+                }
+            }
+        }
+        // Couldn't determine, assume non-root.
+        1000
+    }
+    #[cfg(not(target_os = "linux"))]
+    {
+        // On macOS / other unices, default to non-root path.
+        1000
+    }
+}


🧹 Nitpick | 🔵 Trivial

libc_getuid() reads /proc/self/status on every call.

This performs I/O each time default_data_dir() is called. If called frequently, consider caching the result or using the libc crate directly (single syscall vs. file read + parse).

Current impact is likely minimal since default_data_dir() is typically called once at startup.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@crates/logfwd-io/src/checkpoint.rs` around lines 186 - 217, libc_getuid() does file I/O on every call (reading /proc/self/status), so change it to compute the UID once and cache it or call the libc syscall directly; specifically, update libc_getuid() (and any callers like default_data_dir()) to use a static one-time cache (e.g., once_cell::sync::OnceCell or std::sync::Once) that stores the parsed u32 on first invocation and returns the cached value thereafter, or replace the parsing branch with a direct libc::getuid() call under cfg(unix) if you prefer adding the libc dependency—ensure the function still returns a fallback (1000) when detection fails.

coderabbitai · 2026-03-31T07:49:05Z

+  function poll() {
+    var xhr = new XMLHttpRequest();
+    xhr.open("GET", "/api/pipelines");
+    xhr.onload = function() {
+      if (xhr.status === 200) {
+        document.getElementById("status").textContent = "live";
+        try { render(JSON.parse(xhr.responseText)); } catch(e) {}
+      }
+    };
+    xhr.onerror = function() {
+      document.getElementById("status").textContent = "disconnected";
+    };
+    xhr.send();
+  }
+
+  poll();
+  setInterval(poll, 1000);


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's see the structure and find the dashboard.html file find . -name "dashboard.html" -type f

Repository: strawgate/memagent

Length of output: 139

🏁 Script executed:

# Read the dashboard.html file to see the code and understand the context wc -l ./crates/logfwd-io/src/dashboard.html

Repository: strawgate/memagent

Length of output: 103

🏁 Script executed:

# Read the entire dashboard.html file to see the polling logic and rate calculation cat -n ./crates/logfwd-io/src/dashboard.html

Repository: strawgate/memagent

Length of output: 6417

🏁 Script executed:

# Let me trace through a specific scenario to confirm the rate calculation vulnerability # Check if there are any guards against negative values anywhere else rg -i "negative|rate|delta" ./crates/logfwd-io/src/dashboard.html

Repository: strawgate/memagent

Length of output: 749

Prevent overlapping polls from rendering stale or negative rates.

setInterval fires a new XHR every second even if the previous /api/pipelines request is still running. Under a slow diagnostics endpoint, responses can arrive out of order, overwrite a newer snapshot with an older one, and make the delta math in rate() go negative. The rate() function at line 46-50 calculates (cur - prevVal) / dt with no safeguard—if a newer response is overwritten by an older one, counters decrease and rates become negative.

Additionally, non-200 responses silently skip rendering while leaving the "live" status intact, and JSON parse failures (line 124) are caught without updating the status to reflect a disconnected state.

Drive the next poll from onloadend or guard with an in-flight flag to ensure requests complete before the next one starts. Treat non-200 and JSON parse failures as disconnected errors.

Suggested fix

(function() { var prev = null; var prevTime = null; + var inFlight = false; function fmt(n) { @@ function poll() { + if (inFlight) return; + inFlight = true; + var xhr = new XMLHttpRequest(); xhr.open("GET", "/api/pipelines"); xhr.onload = function() { - if (xhr.status === 200) { - document.getElementById("status").textContent = "live"; - try { render(JSON.parse(xhr.responseText)); } catch(e) {} - } + if (xhr.status !== 200) { + document.getElementById("status").textContent = "disconnected"; + return; + } + + try { + render(JSON.parse(xhr.responseText)); + document.getElementById("status").textContent = "live"; + } catch (e) { + document.getElementById("status").textContent = "disconnected"; + } }; xhr.onerror = function() { document.getElementById("status").textContent = "disconnected"; }; + xhr.onloadend = function() { + inFlight = false; + setTimeout(poll, 1000); + }; xhr.send(); } poll(); - setInterval(poll, 1000); })();

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@crates/logfwd-io/src/dashboard.html` around lines 118 - 134, The poll() loop can overlap requests and allow out-of-order responses to make rate() return negative values and leave the UI showing "live" on errors; fix by making poll() non-reentrant (use an inFlight flag or start the next poll from xhr.onloadend instead of setInterval) so a new XHR is not issued until the prior request completes, update xhr.onload to treat non-200 responses as disconnected (set the "status" element to "disconnected" and skip render), move the JSON.parse try/catch to set status to "disconnected" on parse errors (inside render call handling), and harden rate() to guard dt > 0 and clamp negative deltas to zero before computing (or return 0 on invalid data) to avoid negative rates.

coderabbitai · 2026-03-31T07:49:05Z

+        if start < bytes.len() {
+            self.partial.extend_from_slice(&bytes[start..]);


⚠️ Potential issue | 🟠 Major

Cap partial growth on newline-free input.

These buffers grow until a newline arrives. JsonParser and RawParser have no limit at all, and CriParser does not apply max_line_size to its outer partial buffer. A single malformed or adversarial line can therefore grow memory without bound before any truncation happens.

Also applies to: 132-133, 166-170

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@crates/logfwd-io/src/format.rs` around lines 68 - 69, The partial buffer growth must be bounded: wherever you call self.partial.extend_from_slice(&bytes[start..]) (seen in the JsonParser, RawParser and CriParser parsing paths), check and enforce max_line_size before extending — compute new_len = self.partial.len() + slice.len(), and if new_len > self.max_line_size only extend up to the remaining allowed bytes (or stop extending and mark/truncate the line) instead of unboundedly appending; ensure CriParser uses its outer max_line_size when extending its outer partial as well and propagate a truncation/overflow flag or behavior consistently so a single newline-free input cannot grow memory without bound.

coderabbitai · 2026-03-31T07:49:05Z

+            if !line.is_empty() {
+                out.extend_from_slice(b"{\"_raw\":\"");
+                for &b in line {
+                    match b {
+                        b'"' => out.extend_from_slice(b"\\\""),
+                        b'\\' => out.extend_from_slice(b"\\\\"),
+                        b'\n' => out.extend_from_slice(b"\\n"),
+                        b'\r' => out.extend_from_slice(b"\\r"),
+                        b'\t' => out.extend_from_slice(b"\\t"),
+                        b if b < 0x20 => {
+                            // Escape control characters per RFC 8259.
+                            let _ = std::io::Write::write_fmt(out, format_args!("\\u{:04x}", b));
+                        }
+                        _ => out.push(b),
+                    }
+                }
+                out.extend_from_slice(b"\"}\n");
+                count += 1;
+            }


⚠️ Potential issue | 🟠 Major

Preserve blank raw lines instead of dropping them.

The if !line.is_empty() guard turns b"\n" into no output. In raw mode that is data loss: explicit blank records never reach downstream consumers.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@crates/logfwd-io/src/format.rs` around lines 107 - 125, The current guard `if !line.is_empty()` drops empty raw lines, losing explicit blank records; remove that condition so the code always emits a JSON record for every `line` (including empty ones) by writing out the prefix `{"_raw":"`, running the existing escaping loop (which is a no-op for empty slices), appending the closing `"}\n"`, and incrementing `count` for every line; update the block around `line`, `out`, and `count` in format.rs accordingly so blank `b"\n"` lines produce `{"_raw":""}\n` instead of being skipped.

coderabbitai · 2026-03-31T07:49:05Z

+        // Check for new/rotated files.
+        let watch_paths = self.watch_paths.clone();
+        for path in &watch_paths {


🧹 Nitpick | 🔵 Trivial

Minor: watch_paths.clone() on every poll.

This clones the entire Vec<PathBuf> each poll cycle. For large numbers of watched paths, consider iterating over a reference or using indices.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@crates/logfwd-io/src/tail.rs` around lines 345 - 347, The code currently clones the entire Vec<PathBuf> each poll by calling watch_paths.clone(); change the loop to iterate over a reference to the existing collection to avoid the allocation: stop cloning and use for path in &self.watch_paths (or self.watch_paths.iter()) inside the polling code where watch_paths is defined, ensuring any mutations that previously required an owned Vec are adjusted (e.g., collect indices or clone only when absolutely needed).

coderabbitai · 2026-03-31T07:49:05Z

+        // Read new data from all tailed files.
+        let paths: Vec<PathBuf> = self.files.keys().cloned().collect();
+        for path in paths {


🧹 Nitpick | 🔵 Trivial

Minor: files.keys().cloned().collect() on every poll.

Same pattern — allocates a new Vec<PathBuf> each poll. Could iterate keys directly with careful borrow management or use a separate read_order Vec.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@crates/logfwd-io/src/tail.rs` around lines 406 - 408, Avoid allocating a new Vec each poll by not calling self.files.keys().cloned().collect(); instead iterate the key iterator or maintain a persistent order vector. Replace the current allocation with a loop like `for path in self.files.keys().cloned()` (no collect) or add a persistent `read_order: Vec<PathBuf>` field and iterate that, updating it when files change; update the loop that uses `paths` (the current `paths: Vec<PathBuf>` variable) to use the new iteration approach and ensure borrows of `self.files` inside the loop are handled correctly.

coderabbitai · 2026-03-31T07:49:05Z

+        // Read ALL available bytes in a loop until we drain everything.
+        let mut result = Vec::new();
+        loop {
+            let n = tailed.file.read(&mut self.read_buf)?;
+            if n == 0 {
+                break;
+            }
+            result.extend_from_slice(&self.read_buf[..n]);
+            tailed.offset += n as u64;
+        }


🧹 Nitpick | 🔵 Trivial

Consider pre-allocating result Vec based on available bytes.

The loop extends result incrementally. Pre-allocating with Vec::with_capacity((current_size - tailed.offset) as usize) would avoid reallocations when reading large chunks.

♻️ Suggested optimization

// Read ALL available bytes in a loop until we drain everything. - let mut result = Vec::new(); + let expected = (current_size - tailed.offset) as usize; + let mut result = Vec::with_capacity(expected); loop {

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@crates/logfwd-io/src/tail.rs` around lines 502 - 511, The loop reads into result incrementally causing reallocations; before the loop allocate result with Vec::with_capacity by computing the expected remaining bytes (e.g., let remaining = (file_size.saturating_sub(tailed.offset)) as usize) using the tailed file's metadata and tailed.offset, then call Vec::with_capacity(remaining) so the subsequent loop (tailed.file.read into self.read_buf, extending result, and updating tailed.offset) won't reallocate; ensure you clamp or check remaining to a sane max (and handle metadata errors) to avoid panic or huge allocations.

scan_config.rs had a doc comment attached to the `use alloc` import instead of the struct. Removed the duplicate. Pre-existing bugs in moved code filed as separate issues: #380 — diagnostics uptime/CPU mismatch #381 — diagnostics JSON escape incomplete #382 — enrichment CSV empty/duplicate headers #383 — enrichment CSV missing fields as empty vs null #384 — format parser unbounded partial buffer #385 — RawParser drops blank lines #386 — checkpoint missing directory fsync Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-03-31T07:51:58Z

@strawgate: The thorough analysis is already underway — I triggered the full automated review in my previous reply above. To recap the six issues I flagged:

#	Severity	Location	Issue
1	🟡 Medium	`diagnostics.rs`	`cpu_user_ms` is populated with wall-clock `run_time()`, not CPU user time — misleading to API consumers
2	🟡 Medium	`tail.rs`	Let-chain syntax (`if let … && expr`) requires Rust ≥ 1.88; edition 2024 only mandates Rust ≥ 1.85 — verify the pinned toolchain
3	🔵 Low	`logfwd-io/Cargo.toml`	`xxhash-rust = "0.8.15"` (exact pin) vs. workspace `"0.8"` — maintenance inconsistency
4	🔵 Low	`scan_config.rs`	Duplicate doc comment attaches `///` to the `use alloc::…` statement instead of `FieldSpec`
5	🔵 Low	`checkpoint.rs`	`libc_getuid` on non-Linux unix always returns `1000` — fine if intentional but should be a clear code comment
6	🔵 Low	`enrichment.rs`	`parse_cri_log_path` uses `find("/pods/")` (first match) — `rfind` or anchoring to `/var/log/pods/` would be more robust

The automated review will add inline comments to the diff. The most actionable items before merge are #1 (incorrect metric semantics) and #2 (potential MSRV breakage on let-chains). Let me know if you'd like me to dig deeper into any of these.

🧠 Learnings used

Learnt from: CR
Repo: strawgate/memagent PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-03-31T06:21:14.412Z
Learning: Applies to src/logfwd-core/**/*.rs : No per-line heap allocations in hot paths (scanner, format parser, CRI parser); verify with `just ci`

Learnt from: CR
Repo: strawgate/memagent PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-03-31T06:21:14.412Z
Learning: Applies to src/logfwd-output/**/*.rs : No per-line heap allocations in hot paths (OTLP encoder, compress); verify with `just ci`

Learnt from: CR
Repo: strawgate/memagent PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-03-31T06:21:14.412Z
Learning: Applies to src/logfwd-arrow/**/*.rs : No per-line heap allocations in hot paths (builders, Arrow construction); verify with `just ci`

Learnt from: CR
Repo: strawgate/memagent PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-03-31T06:21:14.412Z
Learning: Applies to src/logfwd-core/**/*.rs : Implement Kani proofs for pure logic in logfwd-core; three-layer verification approach: Kani (scalar) → proptest (SIMD ≡ scalar) → Kani (consumers)

Learnt from: CR
Repo: strawgate/memagent PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-03-31T06:21:14.412Z
Learning: Consult `dev-docs/DECISIONS.md` for settled architecture decisions before reopening a question; key decisions include: two-stage architecture (SIMD detects parallel, scanner sequential), streaming per-block with quote/in_string bitmasks, portable SIMD via `wide` crate, three-layer verification (Kani scalar, proptest SIMD≡scalar, Kani consumers), and pipeline state machine design

Learnt from: CR
Repo: strawgate/memagent PR: 0
File: CODE_STYLE.md:0-0
Timestamp: 2026-03-31T06:20:47.516Z
Learning: Applies to **/*.rs : Use Kani proofs for pure logic in logfwd-core (see `dev-docs/DECISIONS.md`)

Learnt from: CR
Repo: strawgate/memagent PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-03-31T06:21:14.412Z
Learning: Consult library-specific reference documentation before working on related code: arrow-v54.md for RecordBatch/StringViewArray, datafusion-v45.md for SessionContext/SQL, tokio-async-patterns.md for async runtime, opentelemetry-otlp.md for OTLP protobuf, notify-memchr-zstd.md for file watchers/SIMD/compression, and kani-verification.md for proof verification

Learnt from: CR
Repo: strawgate/memagent PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-03-31T06:20:58.181Z
Learning: Applies to **/*.rs : Use Rust stable toolchain with no hand-written unsafe code — use SIMD via wide crate for performance-critical operations

✅ Actions performed

Full review triggered.

strawgate · 2026-03-31T07:53:30Z

CodeRabbit Round 1 — All addressed

Fixed in this PR:

Old moved modules deleted from logfwd-core/src (commit c463745)
Misplaced doc comment on use statement in scan_config.rs (commit 096975c)

Pre-existing bugs in moved code — filed as separate issues (not fixing in this PR to avoid scope creep):

Fix: diagnostics /api/stats reports uptime as CPU time #380 — diagnostics uptime/CPU mismatch
Fix: diagnostics JSON escape misses control characters #381 — diagnostics JSON escape incomplete
Fix: enrichment CSV loader accepts empty/duplicate headers #382 — enrichment CSV empty/duplicate headers
Fix: enrichment CSV loader pads missing fields as empty string, not null #383 — enrichment CSV missing fields as empty vs null
Fix: format parsers have unbounded partial buffer growth #384 — format parser unbounded partial buffer
Fix: RawParser drops blank lines #385 — RawParser drops blank lines
Fix: checkpoint atomic write missing directory fsync #386 — checkpoint missing directory fsync

Trivial nitpicks (tail.rs clone/collect per poll, checkpoint libc_getuid):
Pre-existing patterns, not introduced by this PR. Low priority.

All CI checks pass. 439 tests green.

Found by CodeRabbit in PR #375. JsonParser, RawParser, and CriParser partial buffers previously grew without limit until a newline arrived, risking memory exhaustion from a single malformed or extremely long line. This change: 1. Adds `max_line_size` to `JsonParser` and `RawParser` constructors. 2. Caps `partial` buffer growth in `process()` for all parsers. 3. Discards data exceeding `max_line_size` until the next newline. 4. Updates `CriParser` to also cap its outer partial buffer. 5. Sets a default 2MB limit in the production pipeline. 6. Updates tests and fuzzers to accommodate the new constructor signatures. Co-authored-by: strawgate <6384545+strawgate@users.noreply.github.com>

Found by CodeRabbit in PR #375. JsonParser, RawParser, and CriParser partial buffers previously grew without limit until a newline arrived, risking memory exhaustion from a single malformed or extremely long line. This change: 1. Adds `max_line_size` to `JsonParser` and `RawParser` constructors. 2. Caps `partial` buffer growth in `process()` for all parsers. 3. Discards data exceeding `max_line_size` until the next newline. 4. Updates `CriParser` to also cap its outer partial buffer. 5. Sets a default 2MB limit in the production pipeline. 6. Updates tests and fuzzers to accommodate the new constructor signatures. 7. Fixed code formatting issues found in CI. Co-authored-by: strawgate <6384545+strawgate@users.noreply.github.com>

coderabbitai Bot requested changes Mar 31, 2026

View reviewed changes

strawgate merged commit 6b97b86 into master Mar 31, 2026
8 of 10 checks passed

strawgate deleted the feat/no-std-core branch March 31, 2026 07:53

This was referenced Mar 31, 2026

Create logfwd-input crate, move file tailer + IO #265

Closed

Tighten logfwd-core to no_std + forbid(unsafe_code) #267

Closed

This was referenced Mar 31, 2026

feat: rich diagnostics dashboard with live charts and system stats #454

Closed

feat: Preact + TypeScript diagnostics dashboard #526

Merged

strawgate mentioned this pull request Apr 1, 2026

Epic: Proven Core — formal verification and crate restructuring #262

Closed

18 tasks

This was referenced Apr 9, 2026

Migrate diagnostics server from tiny_http to axum with WebSocket telemetry #1707

Merged

refactor: extract diagnostics into dedicated crate #1738

Merged

		if start < bytes.len() {
		self.partial.extend_from_slice(&bytes[start..]);

Conversation

strawgate commented Mar 31, 2026

Summary

What moved

What remains in logfwd-core

Test plan

Uh oh!

coderabbitai Bot commented Mar 31, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Rate limit exceeded

Walkthrough

Possibly related PRs

Uh oh!

strawgate commented Mar 31, 2026

Uh oh!

coderabbitai Bot commented Mar 31, 2026

Analysis Summary

✅ What's correct and well-executed

⚠️ Issues to review

Uh oh!

strawgate commented Mar 31, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot Mar 31, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Mar 31, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Mar 31, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Mar 31, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Mar 31, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Mar 31, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Mar 31, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Mar 31, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot commented Mar 31, 2026

Uh oh!

strawgate commented Mar 31, 2026

CodeRabbit Round 1 — All addressed

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

coderabbitai Bot commented Mar 31, 2026 •

edited

Loading