sensor: use ebpf/es input names and Arrow-native path#1733
Conversation
|
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard. |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughAdds end-to-end beta platform sensor support and OS capability probes: three new input types ( Possibly related PRs
Caution Pre-merge checks failedPlease resolve all errors before merging. Addressing warnings is optional.
❌ Failed checks (1 error, 2 warnings, 1 inconclusive)
✅ Passed checks (1 passed)
Comment |
ApprovabilityVerdict: Needs human review This PR introduces significant new runtime behavior by replacing simple JSON heartbeat emission with a 1100+ line Arrow-native sensor implementation featuring a state machine, runtime control plane reload, and per-platform signal families. While backward-compatible config aliases are preserved, the fundamental change in what sensor inputs emit and how they behave at runtime warrants human review. You can customize Macroscope's approvability policy. Learn more. |
There was a problem hiding this comment.
Pull request overview
This PR adds a dedicated, cross-platform CI workflow to probe/record platform-specific sensor capabilities (Linux eBPF, macOS EndpointSecurity/OpenBSM, Windows eBPF-for-Windows surfaces) and introduces new *_sensor_beta inputs with lightweight heartbeat/control-event emission plus corresponding config validation and documentation.
Changes:
- Add
eBPF CapabilitiesGitHub Actions workflow with explicit Linux/macOS/Windows lanes and artifacted capability snapshots. - Introduce
linux_sensor_beta/macos_sensor_beta/windows_sensor_betainputs end-to-end (config parsing/validation, pipeline wiring, and a newlogfwd-iobeta input source). - Update user docs (README + mdBook config reference) and add a dependency-structure audit research note.
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| scripts/ci-ebpf-capabilities.sh | Adds a Linux runner eBPF capability probe producing TSV/env/markdown summaries + raw outputs. |
| .github/workflows/ebpf-capabilities.yml | Adds a scheduled/PR workflow with per-OS lanes and uploaded capability artifacts. |
| crates/logfwd-io/src/platform_sensor_beta.rs | New beta input source emitting startup/heartbeat control events with platform gating. |
| crates/logfwd-io/src/lib.rs | Exposes the new platform_sensor_beta module publicly. |
| crates/logfwd-config/src/lib.rs | Adds new InputType variants and sensor_beta config + validation/tests. |
| crates/logfwd/src/pipeline.rs | Wires new *_sensor_beta inputs into pipeline input creation and format validation. |
| crates/logfwd/src/main.rs | Updates read-only format validation and input labeling for new input types. |
| book/src/config/reference.md | Documents new beta inputs and supported fields. |
| README.md | Adds a user-facing overview and example config for platform sensor betas. |
| dev-docs/research/dependency-structure-audit-2026-04-08.md | Adds internal research notes on dependency/tooling/doc drift. |
There was a problem hiding this comment.
Actionable comments posted: 6
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/ebpf-capabilities.yml:
- Around line 188-190: The "Add macOS probe summary" step currently uses if:
always() and unconditionally cats artifacts/macos-sensor/summary.md which fails
when the probe step was skipped; update that step to first check for the file's
existence (e.g., test -f artifacts/macos-sensor/summary.md on macOS/Linux and
Test-Path on Windows) and only append the file to $GITHUB_STEP_SUMMARY if
present, otherwise append a small fallback message; apply the same
existence-check + fallback logic to the corresponding Windows/other summary
steps referenced around lines 284-287 so they don't error when the probe didn't
run.
- Around line 79-83: The workflow's tests only run crate-level tests for
logfwd-io and logfwd-config and therefore miss the full runtime lane implemented
in crates/logfwd/src/pipeline.rs; add a logfwd-level smoke per OS so the
pipeline and OS-gating are exercised end-to-end by either adding a targeted
cargo invocation such as "cargo test -p logfwd <specific_smoke_test_or_module>
-- --nocapture" for each platform job or by running a validate/effective-config
command against a platform-sensor config fixture (e.g., invoke the logfwd binary
or test helper that loads the fixture and runs the pipeline) so the workflow
covers the full runtime lane alongside logfwd-io and logfwd-config (apply the
same change to the other two similar job blocks mentioned).
- Around line 6-20: The workflow path filters currently omit runtime wiring
files that can affect sensor format checks; update both the push and
pull_request "paths" arrays in .github/workflows/ebpf-capabilities.yml to also
include crates/logfwd/src/main.rs and crates/logfwd/src/pipeline.rs so changes
to PlatformSensorBetaInput construction and the sensor format checks in
main.rs/pipeline.rs will trigger this workflow.
In `@crates/logfwd-config/src/lib.rs`:
- Around line 984-1026: The validation branch for InputType::LinuxSensorBeta |
InputType::MacosSensorBeta | InputType::WindowsSensorBeta must additionally
reject non-JSON formats up front so Config::load_str() fails fast: inside the
same branch (where you already check
generator/listen/path/max_open_files/glob_rescan_interval_ms and
sensor_beta.poll_interval_ms) add a check for input.format (e.g., ensure
input.format is Some(Format::Json) or the enum variant your code uses) and
return ConfigError::Validation with a clear message like "pipeline '{name}'
input '{label}': only JSON format is supported for *_sensor_beta inputs" if it's
not JSON; then add a regression test next to
sensor_beta_rejects_zero_poll_interval that constructs a *_sensor_beta input
with format: raw and asserts Config::load_str() returns a validation error.
In `@crates/logfwd-io/src/platform_sensor_beta.rs`:
- Around line 183-218: poll() takes self.machine with take(), so if
Init::start() returns Err the machine is lost and name() later panics; fix by
preserving/restoring state: before calling init.start() save init.common (or
clone needed fields) and call start() on a non-consuming borrow or, simpler,
don't take the machine—match on self.machine.as_mut() or temporarily replace it
then on Err put back PlatformSensorMachine::Init(init) into self.machine; update
PlatformSensorBetaInput::poll to restore self.machine on any Err from
init.start() or use as_mut() to avoid consuming Init so name() cannot observe
None.
In `@crates/logfwd/src/pipeline.rs`:
- Around line 1654-1663: The function build_platform_sensor_beta_config silently
clamps poll_interval_ms to 1 via poll_interval_ms.max(1); change it to detect
when clamping happens and emit a warning (e.g., using tracing::warn!) including
the original configured value and the clamped value, then use
Duration::from_millis(clamped); specifically, read cfg.and_then(|c|
c.poll_interval_ms) into an Option/variable original_poll, compute let clamped =
original_poll.unwrap_or(10_000).max(1), if original_poll.is_some() &&
original_poll.unwrap() == 0 { tracing::warn!(original = original_poll.unwrap(),
clamped = clamped, "poll_interval_ms was below minimum and was clamped to 1ms");
} and then construct PlatformSensorBetaConfig with poll_interval:
Duration::from_millis(clamped) (or alternatively add equivalent documentation to
CONFIG_REFERENCE.md if logging is not desired).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Organization UI (inherited)
Review profile: ASSERTIVE
Plan: Pro
Run ID: ef8ee97f-a998-41ea-986e-d3719d4bc372
📒 Files selected for processing (10)
.github/workflows/ebpf-capabilities.ymlREADME.mdbook/src/config/reference.mdcrates/logfwd-config/src/lib.rscrates/logfwd-io/src/lib.rscrates/logfwd-io/src/platform_sensor_beta.rscrates/logfwd/src/main.rscrates/logfwd/src/pipeline.rsdev-docs/research/dependency-structure-audit-2026-04-08.mdscripts/ci-ebpf-capabilities.sh
|
Pushed follow-up commit What landed:
Validation run locally after push:
|
There was a problem hiding this comment.
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/logfwd/src/pipeline.rs (1)
1632-1717: 🛠️ Refactor suggestion | 🟠 MajorMove the new sensor-beta runtime wiring out of the binary crate.
This branch adds more long-lived runtime orchestration to
crates/logfwd, which is exactly the layer the repo wants kept to CLI/bootstrap only. Please push the sensor-beta builder/config-translation logic intologfwd-runtime(or the IO/runtime layer) and keep this file as thin wiring. As per coding guidelines,crates/logfwd/**/*.rs:Binary crate (logfwd) must contain only CLI/bootstrap logic — no long-lived runtime orchestration logic. Move orchestration to logfwd-runtime.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/logfwd/src/pipeline.rs` around lines 1632 - 1717, This change wires sensor-beta runtime orchestration into the binary crate; move the builder/translation logic for sensor-beta into the runtime/io crate and keep pipeline.rs as thin wiring: extract build_platform_sensor_beta_config (and any validation like is_sensor_beta_input_type usage and the creation of PlatformSensorBetaInput::new) into a new public builder/util in logfwd-runtime (or logfwd-io) that returns a ready-to-use PlatformSensorBetaConfig and an initialized source (or a factory function that returns Box<dyn InputSource>, format and buf_cap), then in pipeline.rs call that runtime API to receive raw_source/format/buf_cap and only perform the minimal wiring to wrap with make_format/FramedInput and return InputState; update imports and error mapping in pipeline.rs to delegate initialization errors to the runtime helper.
♻️ Duplicate comments (2)
crates/logfwd-config/src/lib.rs (1)
994-1078:⚠️ Potential issue | 🟠 MajorReject unsupported
*_sensor_betaformats here too.This branch validates sensor-beta fields, but it still lets
format: cri,auto,text, etc. through because the only global input-format rejection islogfmt/syslog. That meansConfig::load_str()can still accept configs that thelogfwdruntime later refuses, even though the sensor beta lane now only supportsraworjson.Suggested fix
InputType::LinuxSensorBeta | InputType::MacosSensorBeta | InputType::WindowsSensorBeta => { + if let Some(fmt) = &input.format + && !matches!(fmt, Format::Raw | Format::Json) + { + return Err(ConfigError::Validation(format!( + "pipeline '{name}' input '{label}': {} inputs only support format raw or json", + input.input_type + ))); + } if input.generator.is_some() { return Err(ConfigError::Validation(format!( "pipeline '{name}' input '{label}': 'generator' settings are only supported for generator inputs" ))); }Also applies to: 1089-1094
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/logfwd-config/src/lib.rs` around lines 994 - 1078, In the InputType::LinuxSensorBeta | MacosSensorBeta | WindowsSensorBeta validation branch (around the sensor_beta checks), also reject inputs whose format is not "raw" or "json": inspect input.format (or input.format.as_deref()) and if Some(fmt) and fmt != "raw" && fmt != "json" return ConfigError::Validation with a message like "pipeline '{name}' input '{label}': sensor_beta inputs only support 'raw' or 'json' format". Apply the same check in the other sensor_beta branch referenced (the one at lines ~1089-1094) so both sensor_beta input types consistently validate supported formats; keep references to InputType::LinuxSensorBeta, InputType::MacosSensorBeta, InputType::WindowsSensorBeta, sensor_beta, validate_sensor_beta_family_name and sensor_beta_supported_families_csv to locate the code..github/workflows/ebpf-capabilities.yml (1)
83-87:⚠️ Potential issue | 🟠 MajorThe
logfwdsmoke still stops at format validation.
cargo test -p logfwd sensor_beta_formats_allow_raw_and_json_onlyonly exercisesvalidate_input_format()incrates/logfwd/src/pipeline.rs:1767-1780; it never builds the sensor input path. Regressions in control-path resolution, defaultraw, or the framing bypass can still leave all three lanes green. Add one fixture-drivenlogfwdsmoke per OS that reaches pipeline/input construction.Also applies to: 136-138, 236-238
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/ebpf-capabilities.yml around lines 83 - 87, The current CI step only runs the unit test sensor_beta_formats_allow_raw_and_json_only which exercises validate_input_format() in crates/logfwd/src/pipeline.rs but does not construct the pipeline input path; add (per-OS) a fixture-driven smoke test invocation that exercises pipeline/input construction (e.g., a test that calls the pipeline input builder or the higher-level integration test that creates a sensor input) so regressions in control-path resolution, default raw framing, or framing bypass are caught; update the .github/workflows/ebpf-capabilities.yml job to run that new/alternative logfwd test (in addition to sensor_beta_formats_allow_raw_and_json_only) for each OS instead of relying solely on sensor_beta_formats_allow_raw_and_json_only.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/ebpf-capabilities.yml:
- Around line 6-13: Update the workflow's push.paths and pull_request.paths
lists to include the logfwd-io lib file that exports pub mod
platform_sensor_beta so changes to that module re-trigger the ebpf-capabilities
workflow; specifically, add an entry for the logfwd-io lib (the file that
contains "pub mod platform_sensor_beta") to both push.paths and
pull_request.paths under the ebpf-capabilities workflow (the keys to edit are
push.paths and pull_request.paths).
In `@crates/logfwd-config/src/lib.rs`:
- Around line 1379-1381: Rename the boolean helper
validate_sensor_beta_family_name to a predicate-style name (e.g.
is_sensor_beta_family_supported) and update all call sites and any
documentation/comments to use the new name; change the function
signature/function identifier from validate_sensor_beta_family_name to
is_sensor_beta_family_supported while keeping the implementation (calling
sensor_beta_supported_families(input_type).contains(&name)) unchanged to
preserve behavior.
- Around line 1359-1377: The platform-specific family lists in
sensor_beta_supported_families and sensor_beta_supported_families_csv are
duplicated from the runtime matrix and must be unified: introduce a single
shared source (e.g., a const map/array per InputType like SENSOR_BETA_FAMILIES
or a function get_sensor_beta_families) and update
sensor_beta_supported_families to return references into that shared constant,
and update sensor_beta_supported_families_csv to derive the CSV string by
joining the same shared data rather than hardcoding strings; reference
InputType, sensor_beta_supported_families, and
sensor_beta_supported_families_csv when making the change so both functions use
the single definition.
In `@crates/logfwd-io/src/platform_sensor_beta.rs`:
- Around line 29-37: Add a triple-slash doc comment for the public method
PlatformSensorTarget::as_str describing that it returns a stable, lowercase
string identifier ("linux", "macos", or "windows") for the target platform and
instruct callers to use it when a stable textual platform key is required (e.g.,
for config keys, logging, or interop with external APIs), including that the
returned &'static str is stable and should not be parsed for other semantics;
follow existing doc style and place the comment directly above the pub const fn
as_str declaration.
In `@crates/logfwd/src/pipeline.rs`:
- Around line 340-353: The code currently calls std::fs::canonicalize(...) on
sensor_cfg.control_path which resolves symlinks and pins the watcher to the
symlink target; stop resolving symlinks and instead make the path absolute by
joining with base when relative (keep the joined PathBuf as-is) and assign
sensor_cfg.control_path = Some(path.to_string_lossy().into_owned()) directly; in
other words remove the std::fs::canonicalize(...) branch and simply set
control_path from the computed PathBuf (after the relative -> base.join(...)
adjustment) so symlinked control files are preserved (referencing
resolved_cfg.sensor_beta, sensor_cfg.control_path and the std::fs::canonicalize
call to locate the change).
---
Outside diff comments:
In `@crates/logfwd/src/pipeline.rs`:
- Around line 1632-1717: This change wires sensor-beta runtime orchestration
into the binary crate; move the builder/translation logic for sensor-beta into
the runtime/io crate and keep pipeline.rs as thin wiring: extract
build_platform_sensor_beta_config (and any validation like
is_sensor_beta_input_type usage and the creation of
PlatformSensorBetaInput::new) into a new public builder/util in logfwd-runtime
(or logfwd-io) that returns a ready-to-use PlatformSensorBetaConfig and an
initialized source (or a factory function that returns Box<dyn InputSource>,
format and buf_cap), then in pipeline.rs call that runtime API to receive
raw_source/format/buf_cap and only perform the minimal wiring to wrap with
make_format/FramedInput and return InputState; update imports and error mapping
in pipeline.rs to delegate initialization errors to the runtime helper.
---
Duplicate comments:
In @.github/workflows/ebpf-capabilities.yml:
- Around line 83-87: The current CI step only runs the unit test
sensor_beta_formats_allow_raw_and_json_only which exercises
validate_input_format() in crates/logfwd/src/pipeline.rs but does not construct
the pipeline input path; add (per-OS) a fixture-driven smoke test invocation
that exercises pipeline/input construction (e.g., a test that calls the pipeline
input builder or the higher-level integration test that creates a sensor input)
so regressions in control-path resolution, default raw framing, or framing
bypass are caught; update the .github/workflows/ebpf-capabilities.yml job to run
that new/alternative logfwd test (in addition to
sensor_beta_formats_allow_raw_and_json_only) for each OS instead of relying
solely on sensor_beta_formats_allow_raw_and_json_only.
In `@crates/logfwd-config/src/lib.rs`:
- Around line 994-1078: In the InputType::LinuxSensorBeta | MacosSensorBeta |
WindowsSensorBeta validation branch (around the sensor_beta checks), also reject
inputs whose format is not "raw" or "json": inspect input.format (or
input.format.as_deref()) and if Some(fmt) and fmt != "raw" && fmt != "json"
return ConfigError::Validation with a message like "pipeline '{name}' input
'{label}': sensor_beta inputs only support 'raw' or 'json' format". Apply the
same check in the other sensor_beta branch referenced (the one at lines
~1089-1094) so both sensor_beta input types consistently validate supported
formats; keep references to InputType::LinuxSensorBeta,
InputType::MacosSensorBeta, InputType::WindowsSensorBeta, sensor_beta,
validate_sensor_beta_family_name and sensor_beta_supported_families_csv to
locate the code.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Organization UI (inherited)
Review profile: ASSERTIVE
Plan: Pro
Run ID: 2779166c-f2d7-490a-b66d-bdd7b5587210
📒 Files selected for processing (5)
.github/workflows/ebpf-capabilities.ymlcrates/logfwd-config/src/lib.rscrates/logfwd-io/src/platform_sensor_beta.rscrates/logfwd/src/pipeline.rsscripts/ci-ebpf-capabilities.sh
|
Did a deeper branch-wide audit pass and pushed follow-up commit What I found and fixed:
Verification run after this commit:
|
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@book/src/config/reference.md`:
- Around line 204-222: Change the heading paragraph for the windows_sensor_beta
input so its opening sentence matches the other sensor sections: replace "This
input only supports `format: raw` (default) or `format: json`." with the
consistent phrasing "This input supports `format: raw` (default) or `format:
json`." in the windows_sensor_beta input block to align wording with the other
sensor types.
In `@crates/logfwd/src/main.rs`:
- Around line 1959-1974: Add additional unit tests alongside
validate_pipelines_read_only_accepts_sensor_beta_raw to cover the full
sensor-beta platform/format matrix: create separate tests that call
validate_pipelines_read_only with yaml inputs for macos_sensor_beta+raw,
windows_sensor_beta+raw, and at least one sensor-beta (linux/macos/windows) with
format json to assert Ok. Use one test per behavior with descriptive test names
(e.g., validate_pipelines_read_only_accepts_macos_sensor_beta_raw,
validate_pipelines_read_only_accepts_windows_sensor_beta_raw,
validate_pipelines_read_only_accepts_sensor_beta_json) and reuse the same call
pattern to validate_pipelines_read_only(&config, None, |_name| {}, |_err| {}).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Organization UI (inherited)
Review profile: ASSERTIVE
Plan: Pro
Run ID: d7a33c44-f73d-407c-8f72-b9b502ee7b01
📒 Files selected for processing (3)
README.mdbook/src/config/reference.mdcrates/logfwd/src/main.rs
|
Why do we have a raw and a json mode? Let's also remove beta from the names and we'll need to figure out how to test in CI |
There was a problem hiding this comment.
Actionable comments posted: 4
♻️ Duplicate comments (2)
crates/logfwd/src/main.rs (1)
1959-1982: 🧹 Nitpick | 🔵 TrivialSplit this matrix test and add a
jsoncase.This still leaves
validate_pipelines_read_only(..., format: json)untested, and the loop makes failures less localizable than separate scenario tests.As per coding guidelines: "Write one test per behavior with descriptive names describing the scenario (e.g.,
empty_input_returns_none). Test names should make the expected behavior clear without reading the test body."🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/logfwd/src/main.rs` around lines 1959 - 1982, Split the existing matrix test validate_pipelines_read_only_accepts_sensor_beta_raw into individual tests for each sensor_beta platform and add a separate test for the json format; specifically create one test per behavior (e.g., validate_pipelines_read_only_accepts_linux_sensor_beta_raw, validate_pipelines_read_only_accepts_macos_sensor_beta_raw, validate_pipelines_read_only_accepts_windows_sensor_beta_raw) that each build the YAML with format: raw and call logfwd_config::Config::load_str(...) and validate_pipelines_read_only(&config, None, |_name| {}, |_err| {}) asserting is_ok(), and add another test (e.g., validate_pipelines_read_only_accepts_sensor_beta_json) that uses format: json and asserts the same; this makes failures localizable and ensures the json case is covered.crates/logfwd-config/src/lib.rs (1)
994-1077:⚠️ Potential issue | 🟠 MajorReject non-
raw/jsonformats for*_sensor_betaduring config load.This branch still lets
cri,auto,text, andconsolethrough because the only later format gate rejectslogfmt/syslog. That meansConfig::load_str()can accept sensor-beta configs the runtime is documented to refuse.Suggested fix
InputType::LinuxSensorBeta | InputType::MacosSensorBeta | InputType::WindowsSensorBeta => { + if let Some(fmt) = &input.format + && !matches!(fmt, Format::Raw | Format::Json) + { + return Err(ConfigError::Validation(format!( + "pipeline '{name}' input '{label}': {} inputs only support format raw or json", + input.input_type + ))); + } if input.generator.is_some() { return Err(ConfigError::Validation(format!( "pipeline '{name}' input '{label}': 'generator' settings are only supported for generator inputs" ))); }Also applies to: 1088-1093
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/logfwd-config/src/lib.rs` around lines 994 - 1077, The validation for InputType::LinuxSensorBeta|MacosSensorBeta|WindowsSensorBeta currently doesn't reject unsupported output formats early; update the sensor-beta branch in the validator (the match arm handling those InputType variants and the sensor_beta field) to check input.format (or however format is stored) and return ConfigError::Validation if the format is not "raw" or "json" (i.e., disallow "cri", "auto", "text", "console", etc.) so Config::load_str() fails for sensor-beta inputs with non-raw/json formats; apply the same check in the sibling branch around the later sensor-beta validation (the second occurrence mentioned for lines 1088-1093).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@book/src/config/reference.md`:
- Around line 167-174: Update the documentation for sensor_beta.enabled_families
to explicitly state that an empty array is a valid override that disables all
signal families (i.e., [] means “no families enabled” rather than falling back
to defaults); modify the existing entry for sensor_beta.enabled_families and the
other duplicated occurrences of this field so they each mention this behavior
and the possible values (e.g., process,file,network,dns,authz) and that omitting
the field uses the default set while providing [] disables everything.
In `@crates/logfwd-io/src/platform_sensor_beta.rs`:
- Around line 582-586: The constructor new() currently calls
read_control_file(cfg.control_path) and returns Err on parse/IO failures, so a
malformed or partial control file prevents the input from starting; change new()
to treat failures from read_control_file(path) as non-fatal: when
cfg.control_path.is_some() and read_control_file(...) returns an Err, log or
record a control_reload_failed-style event but keep using the static control
value (do not replace control or return Err), and schedule/allow subsequent
reload attempts to surface the parsing error via the existing reload/error path
(i.e., leave control unchanged and only apply apply_control_file(control,
file_cfg, target) when read_control_file succeeds). Ensure the code references
cfg.control_path, read_control_file, apply_control_file and the new()
initialization path so the startup no longer fails on bad control files.
- Around line 983-1010: The test
control_file_reload_updates_generation_and_families uses raw tempfile and
std::fs::write to create and modify control files; replace those direct
filesystem calls by using the crate's shared tempfiles test helpers (the
tempfiles module) to create the control file and write its contents so tests go
through the standardized helpers; update the creation of control_path and the
write step (used with PlatformSensorBetaInput::new, control_path, and the
subsequent write/sleep) to call the appropriate tempfile helper functions
instead of tempfile::tempdir and std::fs::write, and ensure any helper cleanup
or sync semantics the helper provides are used in place of the explicit sleep.
In `@crates/logfwd/src/pipeline.rs`:
- Around line 1774-1787: Split the single test
sensor_beta_formats_allow_raw_and_json_only into three focused tests, each
asserting one behavior: (1) sensor_beta_accepts_raw should call
validate_input_format("sensor", InputType::LinuxSensorBeta, &Format::Raw) and
assert is_ok(), (2) sensor_beta_accepts_json should call
validate_input_format("sensor", InputType::WindowsSensorBeta, &Format::Json) and
assert is_ok(), and (3) sensor_beta_rejects_cri should call
validate_input_format("sensor", InputType::MacosSensorBeta, &Format::Cri) and
assert expect_err with a message containing "expected raw or json"; keep the
same validate_input_format and InputType variants but give each test a
descriptive name so failures point to the exact behavior.
---
Duplicate comments:
In `@crates/logfwd-config/src/lib.rs`:
- Around line 994-1077: The validation for
InputType::LinuxSensorBeta|MacosSensorBeta|WindowsSensorBeta currently doesn't
reject unsupported output formats early; update the sensor-beta branch in the
validator (the match arm handling those InputType variants and the sensor_beta
field) to check input.format (or however format is stored) and return
ConfigError::Validation if the format is not "raw" or "json" (i.e., disallow
"cri", "auto", "text", "console", etc.) so Config::load_str() fails for
sensor-beta inputs with non-raw/json formats; apply the same check in the
sibling branch around the later sensor-beta validation (the second occurrence
mentioned for lines 1088-1093).
In `@crates/logfwd/src/main.rs`:
- Around line 1959-1982: Split the existing matrix test
validate_pipelines_read_only_accepts_sensor_beta_raw into individual tests for
each sensor_beta platform and add a separate test for the json format;
specifically create one test per behavior (e.g.,
validate_pipelines_read_only_accepts_linux_sensor_beta_raw,
validate_pipelines_read_only_accepts_macos_sensor_beta_raw,
validate_pipelines_read_only_accepts_windows_sensor_beta_raw) that each build
the YAML with format: raw and call logfwd_config::Config::load_str(...) and
validate_pipelines_read_only(&config, None, |_name| {}, |_err| {}) asserting
is_ok(), and add another test (e.g.,
validate_pipelines_read_only_accepts_sensor_beta_json) that uses format: json
and asserts the same; this makes failures localizable and ensures the json case
is covered.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Organization UI (inherited)
Review profile: ASSERTIVE
Plan: Pro
Run ID: 3565b579-30cc-421e-8a3b-f51442c3420e
📒 Files selected for processing (6)
.github/workflows/ebpf-capabilities.ymlbook/src/config/reference.mdcrates/logfwd-config/src/lib.rscrates/logfwd-io/src/platform_sensor_beta.rscrates/logfwd/src/main.rscrates/logfwd/src/pipeline.rs
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (2)
crates/logfwd-io/src/platform_sensor_beta.rs (2)
582-586:⚠️ Potential issue | 🟠 MajorDon't make an optional control file a startup dependency.
Lines 582-586 still fail
new()on malformed or partially-written control JSON. That makes the optional control plane able to prevent the input from starting, even though the reload path already models the same condition ascontrol_reload_failed. Fall back to the static config at startup and surface the file error on reload instead.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/logfwd-io/src/platform_sensor_beta.rs` around lines 582 - 586, The startup path in new() currently propagates errors from read_control_file(path) and can abort startup; change the logic so cfg.control_path.as_ref() && read_control_file only applies when read_control_file returns Ok(file_cfg) (i.e., do not ?-propagate read_control_file errors from new()). Instead, on startup fall back to the static control value and defer surfacing file parsing/IO errors to the reload path (emit a control_reload_failed/log the error or set the same error state used by reload). Update the conditional around cfg.control_path, read_control_file, and apply_control_file so malformed/partial control JSON does not cause new() to fail but still gets reported via the existing reload/error handling.
987-1014: 🛠️ Refactor suggestion | 🟠 MajorUse the shared
tempfileshelpers in these file-I/O tests.These tests still create temp dirs/files with
tempfile::tempdirandstd::fs::writedirectly. Inlogfwd-io, file-I/O tests are supposed to go through the crate’stempfileshelpers instead. As per coding guidelines,In logfwd-io, tests must use tempfiles for file I/O testing — no raw payload injection. See#1615.Also applies to: 1046-1073
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/logfwd-io/src/platform_sensor_beta.rs` around lines 987 - 1014, The test control_file_reload_updates_generation_and_families uses tempfile::tempdir() and std::fs::write for creating the control file; replace those raw file-I/O calls with the crate's shared tempfiles helpers (use the crate's temp dir/file creation and write helpers) so the control file used by PlatformSensorBetaInput is created via the tempfiles utilities instead of tempfile::tempdir and std::fs::write; update the control_path initialization and the write step in control_file_reload_updates_generation_and_families (and the similar block around the other lines noted) to call the tempfiles helper APIs that create a temp directory and write the JSON payload into control_path.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/logfwd-io/src/platform_sensor_beta.rs`:
- Around line 279-293: The periodic signal-sample emission is currently gated by
self.state.control.emit_heartbeat; change the logic in the emission block so
that signal samples (self.common.signal_sample_rows call) are evaluated and
emitted independently when emit_signal_rows is enabled and last_emit.elapsed()
>= self.common.cfg.poll_interval, while heartbeat rows (self.common.control_row)
remain gated by emit_heartbeat; update self.state.last_emit when either path
emits to avoid silence, and add a regression test covering emit_heartbeat =
false with emit_signal_rows = true to verify samples continue after startup.
---
Duplicate comments:
In `@crates/logfwd-io/src/platform_sensor_beta.rs`:
- Around line 582-586: The startup path in new() currently propagates errors
from read_control_file(path) and can abort startup; change the logic so
cfg.control_path.as_ref() && read_control_file only applies when
read_control_file returns Ok(file_cfg) (i.e., do not ?-propagate
read_control_file errors from new()). Instead, on startup fall back to the
static control value and defer surfacing file parsing/IO errors to the reload
path (emit a control_reload_failed/log the error or set the same error state
used by reload). Update the conditional around cfg.control_path,
read_control_file, and apply_control_file so malformed/partial control JSON does
not cause new() to fail but still gets reported via the existing reload/error
handling.
- Around line 987-1014: The test
control_file_reload_updates_generation_and_families uses tempfile::tempdir() and
std::fs::write for creating the control file; replace those raw file-I/O calls
with the crate's shared tempfiles helpers (use the crate's temp dir/file
creation and write helpers) so the control file used by PlatformSensorBetaInput
is created via the tempfiles utilities instead of tempfile::tempdir and
std::fs::write; update the control_path initialization and the write step in
control_file_reload_updates_generation_and_families (and the similar block
around the other lines noted) to call the tempfiles helper APIs that create a
temp directory and write the JSON payload into control_path.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Organization UI (inherited)
Review profile: ASSERTIVE
Plan: Pro
Run ID: dac1abb1-3077-4268-9744-221b80bb4906
📒 Files selected for processing (1)
crates/logfwd-io/src/platform_sensor_beta.rs
…-beta-3os-ci-lanes # Conflicts: # .github/workflows/ebpf-capabilities.yml # crates/logfwd-config/src/lib.rs # crates/logfwd-config/src/validate.rs # crates/logfwd-runtime/src/pipeline/input_build.rs # scripts/ci-ebpf-capabilities.sh
Auto-dismissed because every review thread opened from this change request is now resolved. If additional changes are still required, please leave a new review.
…-beta-3os-ci-lanes # Conflicts: # book/src/config/reference.md
Summary
linux_ebpf_sensorwindows_ebpf_sensormacos_es_sensorformat(noraw/jsonmode)FramedInputformat decodingemit_heartbeatis accepted only as a no-op compatibility fieldplatform_sensor_beta->platform_sensorCompatibility
linux_sensor_beta,macos_sensor_beta,windows_sensor_betasensor_beta:(alias ofsensor:)Validation
cargo check -p logfwdcargo test -p logfwd-io platform_sensor::tests -- --nocapturecargo test -p logfwd-config sensor -- --nocapturecargo test -p logfwd-runtime input_build::tests -- --nocapturecargo test -p logfwd validate_pipelines_read_only_rejects_sensor_format -- --nocapturecargo test -p logfwd validate_pipelines_read_only_accepts_arrow_native_sensor -- --nocaptureNote
Replace beta sensor inputs with Arrow-native
linux_ebpf_sensor,macos_es_sensor, andwindows_ebpf_sensorlinux_sensor_beta,macos_sensor_beta, andwindows_sensor_betainput types tolinux_ebpf_sensor,macos_es_sensor, andwindows_ebpf_sensorin config, runtime, and docs; legacy names remain accepted as serde aliases for backward compatibility.PlatformSensorBetaInput(JSON heartbeat emitter) with a newPlatformSensorInputthat emits ArrowRecordBatchrows directly, including control, capability, and signal sample rows on startup and at each poll interval.control_path,control_reload_interval_ms,enabled_families, andemit_signal_rows; validation rejects unknown families, zero intervals, empty paths, and anyformatsetting (Arrow-native inputs do not supportformat).libc::isattycalls inlogfwdandlogfwd-output, replacing them withstd::io::IsTerminal.formatfield on a sensor input will now fail validation with an explicit error.Macroscope summarized ffe2beb.