feat: add pprof-rs CPU profiling and fix dhat profiler flushing#449
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughAdds an optional Cargo feature Possibly related PRs
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/logfwd/Cargo.toml`:
- Line 9: The Cargo.toml declares an unused feature "cpu-profiling" and an
optional dependency "pprof"; remove the "cpu-profiling" feature entry and the
optional dependency entry for "pprof" from the crate's Cargo.toml so the
manifest no longer lists an unwired feature/dependency (search for the feature
name cpu-profiling and the dependency pprof to locate the exact keys to delete),
ensuring no other code references #[cfg(feature = "cpu-profiling")] or pprof
symbols remain before committing.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 1a568058-18f2-4676-a699-c9e6c9e3cc73
📒 Files selected for processing (1)
crates/logfwd/Cargo.toml
|
|
||
| [features] | ||
| dhat-heap = ["dhat"] | ||
| cpu-profiling = ["pprof"] |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify whether `cpu-profiling` is actually wired in Rust code.
set -euo pipefail
echo "== Feature references =="
rg -n --type=toml 'cpu-profiling|pprof'
echo
echo "== cfg gates for cpu-profiling =="
rg -n --type=rust '#\[cfg\(.*cpu-profiling.*\)\]'
echo
echo "== pprof symbol usage =="
rg -n --type=rust '\bpprof\b|ProfilerGuard|flamegraph'Repository: strawgate/memagent
Length of output: 275
Remove unwired cpu-profiling feature and pprof dependency.
The feature and optional dependency are declared in Cargo.toml (lines 9, 18) but have no implementation: no #[cfg(feature = "cpu-profiling")] guards exist, and pprof is never referenced in code. Adding dependencies without justification violates project guidelines.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/logfwd/Cargo.toml` at line 9, The Cargo.toml declares an unused
feature "cpu-profiling" and an optional dependency "pprof"; remove the
"cpu-profiling" feature entry and the optional dependency entry for "pprof" from
the crate's Cargo.toml so the manifest no longer lists an unwired
feature/dependency (search for the feature name cpu-profiling and the dependency
pprof to locate the exact keys to delete), ensuring no other code references
#[cfg(feature = "cpu-profiling")] or pprof symbols remain before committing.
There was a problem hiding this comment.
Thanks for the catch. This is now wired in code: crates/logfwd/src/main.rs contains #[cfg(feature = "cpu-profiling")] guarded pprof::ProfilerGuardBuilder initialization plus flamegraph generation during shutdown. I kept the cpu-profiling feature and pprof dependency because they are now actively used.
Note
🔒 Integrity filtering filtered 1 item
Integrity filtering activated and filtered the following item during workflow execution.
This happens when a tool call accesses a resource that does not meet the required integrity or secrecy level of the workflow.
- pr:feat: add pprof-rs CPU profiling and fix dhat profiler flushing #449 (
pull_request_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".)
What is this? | From workflow: AI: Address PR Review Feedback
Give us feedback! React with 🚀 if perfect, 👍 if helpful, 👎 if not.
There was a problem hiding this comment.
Skipped: comment is from another GitHub bot.
3f9c94a to
26c5874
Compare
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/bench.yml:
- Around line 315-318: The "Build logfwd with profiling" step is missing Rust
toolchain and cache setup; before the cargo build step add the same pre-steps
used in the build job: run mozilla-actions/sccache-action@v0.0.7,
dtolnay/rust-toolchain@stable, and Swatinem/rust-cache@v2 (with
cache-on-failure: true) so the profile job uses a consistent Rust toolchain and
benefits from sccache and dependency caching.
- Around line 320-329: The workflow invokes bin/logfwd-competitive-bench with
--dhat-binary but never enables profiling; add the missing --profile flag so the
profiling branch (guarded by args.profile_dir in main.rs) runs and the
--dhat-binary check is reached — update the Run profiling pass step to pass
--profile ./profiles (or appropriate profile directory) alongside the existing
flags so the code path that uses --dhat-binary is executed.
- Around line 335-337: The workflow currently passes --dhat-binary
./bin/logfwd-prof but omits the required --profile DIR flag, so the profiling
block in main.rs never runs and run_agent_dhat() never emits artifacts; update
the benchmark invocation to include --profile /tmp/profiles (or another
writeable directory) alongside --dhat-binary, and then change the artifact
upload paths from dhat-heap.json and flamegraph.svg to the actual files produced
(e.g., /tmp/profiles/dhat-heap.json and /tmp/profiles/flamegraph.svg or use
/tmp/profiles/** to capture both) so the profiler output created by
run_agent_dhat() is uploaded.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: cd6684dc-7a1f-4b8b-984f-4c8f5bfb9118
📒 Files selected for processing (1)
.github/workflows/bench.yml
| path: | | ||
| dhat-heap.json | ||
| flamegraph.svg |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check where flamegraph.svg is written
rg -n 'flamegraph\.svg' crates/logfwd-competitive-bench/src/Repository: strawgate/memagent
Length of output: 440
🏁 Script executed:
#!/bin/bash
# Verify dhat-heap.json output path logic
rg -nC3 'dhat-heap\.json' crates/logfwd-competitive-bench/src/runner.rsRepository: strawgate/memagent
Length of output: 1045
🏁 Script executed:
#!/bin/bash
# Find output_dir definition and its value
rg -n 'output_dir\s*=' crates/logfwd-competitive-bench/src/main.rs | head -20Repository: strawgate/memagent
Length of output: 44
🏁 Script executed:
#!/bin/bash
# Check bench.yml to see if it defines or references output_dir
rg -n 'output_dir\|dhat\|flamegraph' .github/workflows/bench.ymlRepository: strawgate/memagent
Length of output: 44
🏁 Script executed:
#!/bin/bash
# Read bench.yml around the artifact upload section with full context
sed -n '320,350p' .github/workflows/bench.ymlRepository: strawgate/memagent
Length of output: 1029
🏁 Script executed:
#!/bin/bash
# Find and examine main.rs for benchmark output directory
fd -t f main.rs crates/logfwd-competitive-bench/src/ | xargs cat -n | head -100Repository: strawgate/memagent
Length of output: 4388
🏁 Script executed:
#!/bin/bash
# Find where output_dir is used and defined for profile/dhat operations
rg -n 'output_dir|profile' crates/logfwd-competitive-bench/src/main.rs | head -30Repository: strawgate/memagent
Length of output: 1054
🏁 Script executed:
#!/bin/bash
# Look at the full context of profile argument and how it's handled
sed -n '100,200p' crates/logfwd-competitive-bench/src/main.rsRepository: strawgate/memagent
Length of output: 3318
🏁 Script executed:
#!/bin/bash
# Check if run_agent_dhat is called and under what conditions
rg -nB5 'run_agent_dhat' crates/logfwd-competitive-bench/src/main.rsRepository: strawgate/memagent
Length of output: 414
🏁 Script executed:
#!/bin/bash
# Check the full profiling section in main.rs around lines 333-385
sed -n '330,390p' crates/logfwd-competitive-bench/src/main.rsRepository: strawgate/memagent
Length of output: 2734
Add --profile DIR argument to enable dhat profiling in workflow.
The workflow passes --dhat-binary ./bin/logfwd-prof but omits --profile DIR. Without --profile DIR, the profiling block in main.rs (lines 333–390) never executes, so run_agent_dhat() is never called and the artifact files are never generated. The artifact upload will find nothing.
To fix: add --profile /tmp/profiles (or similar) to the benchmark invocation, then update the upload path to reference the actual location where files are written.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.github/workflows/bench.yml around lines 335 - 337, The workflow currently
passes --dhat-binary ./bin/logfwd-prof but omits the required --profile DIR
flag, so the profiling block in main.rs never runs and run_agent_dhat() never
emits artifacts; update the benchmark invocation to include --profile
/tmp/profiles (or another writeable directory) alongside --dhat-binary, and then
change the artifact upload paths from dhat-heap.json and flamegraph.svg to the
actual files produced (e.g., /tmp/profiles/dhat-heap.json and
/tmp/profiles/flamegraph.svg or use /tmp/profiles/** to capture both) so the
profiler output created by run_agent_dhat() is uploaded.
There was a problem hiding this comment.
Agreed on enabling --profile and aligning artifact paths. I wasn’t able to apply this directly because workflow files are restricted in this run. Leaving this open for a follow-up workflow edit.
Note
🔒 Integrity filtering filtered 1 item
Integrity filtering activated and filtered the following item during workflow execution.
This happens when a tool call accesses a resource that does not meet the required integrity or secrecy level of the workflow.
- pr:feat: add pprof-rs CPU profiling and fix dhat profiler flushing #449 (
pull_request_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".)
What is this? | From workflow: AI: Address PR Review Feedback
Give us feedback! React with 🚀 if perfect, 👍 if helpful, 👎 if not.
There was a problem hiding this comment.
Skipped: comment is from another GitHub bot.
There was a problem hiding this comment.
Requesting changes: the profiling workflow currently disables profiling execution by omitting --profile, so the expected profiling artifacts are not generated.
Note
🔒 Integrity filtering filtered 1 item
Integrity filtering activated and filtered the following item during workflow execution.
This happens when a tool call accesses a resource that does not meet the required integrity or secrecy level of the workflow.
- pr:#449 (
pull_request_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".)
What is this? | From workflow: AI: PR Review
Give us feedback! React with 🚀 if perfect, 👍 if helpful, 👎 if not.
| --mode binary \ | ||
| --profile ./profiles \ | ||
| --dhat-binary ./bin/logfwd-dhat | ||
| --dhat-binary ./bin/logfwd-prof |
There was a problem hiding this comment.
logfwd-competitive-bench only runs the profiling block when --profile DIR is set, so this invocation currently skips both perf/flamegraph and dhat generation even though --dhat-binary is provided.
In crates/logfwd-competitive-bench/src/main.rs, profiling is gated by if let Some(ref profile_dir) = args.profile_dir (around line 333). With the current command, that condition is false and no dhat-heap.json/flamegraph.svg is produced, so artifact upload has nothing to collect.
Please restore --profile (for example --profile ./profiles) and align the upload paths to that directory.
There was a problem hiding this comment.
Confirmed. I could not update .github/workflows/bench.yml in this run due workflow-file edit restrictions, so I’m leaving this thread open for a follow-up workflow change with the required permission.
Note
🔒 Integrity filtering filtered 1 item
Integrity filtering activated and filtered the following item during workflow execution.
This happens when a tool call accesses a resource that does not meet the required integrity or secrecy level of the workflow.
- pr:feat: add pprof-rs CPU profiling and fix dhat profiler flushing #449 (
pull_request_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".)
What is this? | From workflow: AI: Address PR Review Feedback
Give us feedback! React with 🚀 if perfect, 👍 if helpful, 👎 if not.
e2c8142 to
d9b5da1
Compare
There was a problem hiding this comment.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/logfwd/src/main.rs (1)
293-327:⚠️ Potential issue | 🔴 CriticalDon't
exit(0)from the signal task.The signal path never cancels the shared shutdown token; it writes the profiles and terminates the process immediately. That skips the existing drain/join logic below,
meter_provider.shutdown(), and any in-flight pipeline flushes, so SIGINT/SIGTERM becomes abrupt shutdown again. Cancel the token here and letrun_pipelines()own the final teardown after workers exit.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/logfwd/src/main.rs` around lines 293 - 327, The signal-handling task must not call std::process::exit(0); instead of terminating the process, replace that call with signalling the shared shutdown mechanism (e.g., call the shutdown token's cancel() method or send on the shutdown_tx used by run_pipelines()) so run_pipelines() can perform the drain/join and meter_provider.shutdown(); keep the profiling drops/report logic around _pprof_to_drop and _profiler_to_drop, then return from the tokio::spawn task normally so workers can exit and final teardown runs.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/bench.yml:
- Around line 315-318: The profiled build step that creates logfwd-prof uses
plain cargo build, causing codegen to differ from the benchmarked binaries;
update the "Build logfwd with profiling" step (the cargo build --release
--features dhat-heap,cpu-profiling -p logfwd invocation) to set the same
RUSTFLAGS used elsewhere (e.g. -Ctarget-cpu=x86-64-v3) so the profiled binary
matches the benchmarked CPU target; you can do this by exporting RUSTFLAGS or
prefixing the cargo command with RUSTFLAGS="<same flags>" before running cargo
build and then cp target/release/logfwd bin/logfwd-prof as before.
In `@crates/logfwd/src/main.rs`:
- Around line 98-103: The current catch-all Err branch in main.rs turns every
error into EXIT_RUNTIME; update error handling so errors from cmd_config(),
cmd_generate_json(), validate_pipelines(), and pipeline construction are mapped
to a configuration exit code (EXIT_CONFIG) while other runtime/io errors map to
EXIT_RUNTIME: create or use a typed CLI error enum (e.g., CliError) or
inspect/propagate error kinds from
cmd_config()/cmd_generate_json()/validate_pipelines() and match on those
variants, returning EXIT_CONFIG for configuration/usage/validation failures and
EXIT_RUNTIME for other errors, and keep the eprintln printing the error text
while returning the proper exit constant.
- Line 224: Run rustfmt (cargo fmt) on the file to fix the formatting errors;
specifically reformat the block that returns
Err(io::Error::new(io::ErrorKind::Other, format!("invalid num_lines: {e}"))) and
the other similar error/format blocks flagged (the other occurrences around the
error messages and multi-line expressions). Ensure the code compiles after
formatting and that expressions like format!(...) and multi-line returns are
properly aligned per rustfmt so cargo fmt --check passes.
- Around line 286-290: The pprof profiler initialization currently calls
ProfilerGuardBuilder::default().build().unwrap() which can panic; change this to
handle errors by replacing .unwrap() with either the ? operator (propagate the
error from main/startup) or use .expect("failed to initialize pprof profiler:
<reason>") to provide a clear message, ensuring the variable pprof_guard is
created only on success and matches the existing shutdown handling for
pprof::ProfilerGuard (refer to pprof_guard and ProfilerGuardBuilder in the
diff).
---
Outside diff comments:
In `@crates/logfwd/src/main.rs`:
- Around line 293-327: The signal-handling task must not call
std::process::exit(0); instead of terminating the process, replace that call
with signalling the shared shutdown mechanism (e.g., call the shutdown token's
cancel() method or send on the shutdown_tx used by run_pipelines()) so
run_pipelines() can perform the drain/join and meter_provider.shutdown(); keep
the profiling drops/report logic around _pprof_to_drop and _profiler_to_drop,
then return from the tokio::spawn task normally so workers can exit and final
teardown runs.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 87d27808-0203-42c5-9f8e-7d292afc77c3
📒 Files selected for processing (3)
.github/workflows/bench.ymlcrates/logfwd/Cargo.tomlcrates/logfwd/src/main.rs
There was a problem hiding this comment.
Requesting changes.
Note
🔒 Integrity filtering filtered 1 item
Integrity filtering activated and filtered the following item during workflow execution.
This happens when a tool call accesses a resource that does not meet the required integrity or secrecy level of the workflow.
- pr:#449 (
pull_request_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".)
What is this? | From workflow: AI: PR Review
Give us feedback! React with 🚀 if perfect, 👍 if helpful, 👎 if not.
| .frequency(999) | ||
| .blocklist(&["libc", "libgcc", "pthread", "vdso"]) | ||
| .build() | ||
| .unwrap(); |
There was a problem hiding this comment.
build().unwrap() introduces a hard panic in a production code path when cpu-profiling is enabled.
pprof::ProfilerGuardBuilder::build() can fail in restricted environments (e.g. missing perf permissions on CI/container hosts). In that case, this unwrap() aborts process startup before pipelines run.
Please propagate or handle this error explicitly (consistent with repo guidance to avoid .unwrap() in production paths), e.g. map it to io::Error and return it from run_pipelines.
There was a problem hiding this comment.
Fixed in ce5f790. pprof::ProfilerGuardBuilder::build() now maps failure into CliError::Runtime(io::Error::other(...)) and returns cleanly instead of panicking.
Note
🔒 Integrity filtering filtered 1 item
Integrity filtering activated and filtered the following item during workflow execution.
This happens when a tool call accesses a resource that does not meet the required integrity or secrecy level of the workflow.
- pr:feat: add pprof-rs CPU profiling and fix dhat profiler flushing #449 (
pull_request_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".)
What is this? | From workflow: AI: Address PR Review Feedback
Give us feedback! React with 🚀 if perfect, 👍 if helpful, 👎 if not.
| #[cfg(feature = "dhat-heap")] | ||
| drop(_profiler_to_drop); | ||
|
|
||
| std::process::exit(0); |
There was a problem hiding this comment.
This now force-exits from the signal task and bypasses the normal shutdown path.
Before this change, signal handling called shutdown_for_signal.cancel(), allowing run_async tasks to stop, sibling joins to complete, and post-run cleanup (like meter-provider shutdown) to run. With std::process::exit(0) here, SIGINT/SIGTERM can terminate immediately and skip that graceful path.
Please keep the profiler flush/drop logic, but then signal cancellation (shutdown_for_signal.cancel()) and let run_pipelines unwind naturally.
There was a problem hiding this comment.
Fixed in ce5f790. I removed the force-exit and now call shutdown_for_signal.cancel() after profiler flush/drop, preserving graceful shutdown and cleanup paths.
Note
🔒 Integrity filtering filtered 1 item
Integrity filtering activated and filtered the following item during workflow execution.
This happens when a tool call accesses a resource that does not meet the required integrity or secrecy level of the workflow.
- pr:feat: add pprof-rs CPU profiling and fix dhat profiler flushing #449 (
pull_request_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".)
What is this? | From workflow: AI: Address PR Review Feedback
Give us feedback! React with 🚀 if perfect, 👍 if helpful, 👎 if not.
| eprintln!("{}error{}: --config requires a path", red(), reset(),); | ||
| eprintln!(" logfwd --config <config.yaml> [--validate] [--dry-run]"); | ||
| std::process::exit(EXIT_CONFIG); | ||
| return Err(io::Error::new(io::ErrorKind::Other, "missing config path")); |
There was a problem hiding this comment.
This refactor changes user-facing exit-code semantics for config/argument errors.
print_usage() documents exit code 1 for configuration errors, but branches like missing config path now return Err(...), and main_inner maps all Err to EXIT_RUNTIME (2).
Concrete regression: logfwd --config (without a path) now exits 2 instead of 1.
Please preserve config-vs-runtime distinction (e.g., typed error/exit enum or explicit mapping) so config/CLI validation failures still return EXIT_CONFIG.
There was a problem hiding this comment.
Fixed in ce5f790 by restoring config/runtime distinction with typed CLI errors. --config usage/config validation paths now map back to EXIT_CONFIG (1).
Note
🔒 Integrity filtering filtered 1 item
Integrity filtering activated and filtered the following item during workflow execution.
This happens when a tool call accesses a resource that does not meet the required integrity or secrecy level of the workflow.
- pr:feat: add pprof-rs CPU profiling and fix dhat profiler flushing #449 (
pull_request_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".)
What is this? | From workflow: AI: Address PR Review Feedback
Give us feedback! React with 🚀 if perfect, 👍 if helpful, 👎 if not.
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/logfwd/src/main.rs (1)
317-358:⚠️ Potential issue | 🟠 MajorMove CPU profile emission and metrics shutdown into a shared tail block to ensure both occur on all exit paths.
Currently,
pprof_guardis moved into the signal task, so flamegraph.svg is only written on signal interrupts (lines 352-359). Meanwhile,meter_provider.shutdown()at line 458 is only called when all pipelines succeed—it's skipped on error returns at lines 378, 390, 430, and 432. This causes telemetry loss on non-graceful-shutdown paths and loses CPU profiling on programmatic exits. Move profile emission and meter shutdown to execute once after pipelines complete, regardless of success or failure.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/logfwd/src/main.rs` around lines 317 - 358, pprof_guard (and the local _pprof_to_drop) and the meter_provider.shutdown() must be executed on all exit paths instead of being consumed only by the signal task or only on success; move creation of pprof_guard out so it is not moved into tokio::spawn and remove profile emission from inside the signal handler, then after the pipeline run/completion block (the code paths that currently return at lines with early returns), add a single shared tail that (1) attempts to emit the CPU profile using the pprof_guard.report().build().flamegraph(...) logic (using the same file name) and (2) calls meter_provider.shutdown() unconditionally (handling any Result/Errors), ensuring profiler (profiler variable) and pprof_guard are kept in scope until that tail runs so both profiling and telemetry shutdown run on success and error paths.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/logfwd/src/main.rs`:
- Line 176: The helper functions use the disallowed "cmd" abbreviation; rename
the functions and all their call sites to full verb_noun names: change
cmd_config -> configure_config, cmd_blackhole -> blackhole_sink, and
cmd_generate_json -> generate_json_output (also update the occurrences
referenced at lines ~237-242). Update function definitions, any references,
imports/exports, and tests to use the new names so the code compiles and follows
the allowed-abbreviation guideline.
- Around line 128-133: The branches currently print the CLI error directly
(eprintln!("{}error{}: {e}", ...)) before returning, causing duplicate messages;
instead remove the direct eprintln calls in the match handling of result and
return the appropriate CliError variant (e.g., wrap/return CliError::Config(...)
or propagate the existing CliError) so that main_inner()'s top-level handler
prints the error exactly once; apply the same change to the other similar match
branches handling errors (the ones around the other result matches mentioned) so
they return the contextual CliError instead of printing.
---
Outside diff comments:
In `@crates/logfwd/src/main.rs`:
- Around line 317-358: pprof_guard (and the local _pprof_to_drop) and the
meter_provider.shutdown() must be executed on all exit paths instead of being
consumed only by the signal task or only on success; move creation of
pprof_guard out so it is not moved into tokio::spawn and remove profile emission
from inside the signal handler, then after the pipeline run/completion block
(the code paths that currently return at lines with early returns), add a single
shared tail that (1) attempts to emit the CPU profile using the
pprof_guard.report().build().flamegraph(...) logic (using the same file name)
and (2) calls meter_provider.shutdown() unconditionally (handling any
Result/Errors), ensuring profiler (profiler variable) and pprof_guard are kept
in scope until that tail runs so both profiling and telemetry shutdown run on
success and error paths.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 1cac0613-c8ae-459d-bcc4-84be5e3e7621
📒 Files selected for processing (1)
crates/logfwd/src/main.rs
| match result { | ||
| Ok(_) => EXIT_OK, | ||
| Err(e) => { | ||
| eprintln!("{}error{}: {e}", red(), reset()); | ||
| e.exit_code() | ||
| } |
There was a problem hiding this comment.
Stop double-printing CLI errors.
main_inner() already renders every CliError at Lines 128-133. These branches still print before returning, so --config/--generate-json now emit duplicate errors with different wording. Return one contextual CliError::Config(...) here and let the top-level handler print it once.
Also applies to: 176-180, 193-195, 243-249
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/logfwd/src/main.rs` around lines 128 - 133, The branches currently
print the CLI error directly (eprintln!("{}error{}: {e}", ...)) before
returning, causing duplicate messages; instead remove the direct eprintln calls
in the match handling of result and return the appropriate CliError variant
(e.g., wrap/return CliError::Config(...) or propagate the existing CliError) so
that main_inner()'s top-level handler prints the error exactly once; apply the
same change to the other similar match branches handling errors (the ones around
the other result matches mentioned) so they return the contextual CliError
instead of printing.
| // --------------------------------------------------------------------------- | ||
|
|
||
| async fn cmd_config(args: &[String]) -> io::Result<()> { | ||
| async fn cmd_config(args: &[String]) -> Result<(), CliError> { |
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major
Rename the cmd_* helpers.
cmd_config, cmd_blackhole, and cmd_generate_json still use the disallowed cmd abbreviation. Prefer full verb_noun names before more call sites depend on them.
As per coding guidelines, "Allow only specific abbreviations in names: buf, pos, len, idx, cfg, ctx. No other abbreviations."
Also applies to: 237-242
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/logfwd/src/main.rs` at line 176, The helper functions use the
disallowed "cmd" abbreviation; rename the functions and all their call sites to
full verb_noun names: change cmd_config -> configure_config, cmd_blackhole ->
blackhole_sink, and cmd_generate_json -> generate_json_output (also update the
occurrences referenced at lines ~237-242). Update function definitions, any
references, imports/exports, and tests to use the new names so the code compiles
and follows the allowed-abbreviation guideline.
|
Reviewed all currently unresolved threads and verified the branch state at
Validation run
No new code changes were required based on the current review feedback and branch contents. Note 🔒 Integrity filtering filtered 1 itemIntegrity filtering activated and filtered the following item during workflow execution.
What is this? | From workflow: AI: Address PR Review Feedback Give us feedback! React with 🚀 if perfect, 👍 if helpful, 👎 if not. |
- Add cpu-profiling feature with pprof-rs dependency - Ensure dhat and pprof profilers are explicitly dropped during graceful shutdown - Fix process exit logic to allow destructors to run before termination - Update nightly bench workflow to use pprof-rs and upload flamegraphs
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The profiling step in bench.yml was missing --profile, so the competitive bench tool's profiling code path was never entered — neither dhat-heap nor perf/flamegraph artifacts were produced. Changes: - Add --profile . to enable the profiling code path - Set LOGFWD to logfwd-prof binary so pprof CPU profiling activates Addresses review feedback on #449. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
9416e1a to
6b5dc4e
Compare
|
|
||
| - name: Build logfwd with profiling | ||
| run: | | ||
| cargo build --release --features dhat-heap,cpu-profiling -p logfwd |
There was a problem hiding this comment.
logfwd-prof is built with both dhat-heap and cpu-profiling, then used for the dhat pass. That means the SIGTERM shutdown path in crates/logfwd/src/main.rs also writes flamegraph.svg during the dhat run, which overwrites the perf/inferno flamegraph generated earlier in the same --profile . directory.
Concrete failure mode: artifact upload picks up only the last flamegraph.svg, so the expected perf flamegraph is lost/replaced.
Consider separating outputs (different filenames/dirs) or using separate binaries for perf and dhat profiling.
Use distinct profiling binaries and a dedicated profiles directory so the dhat run no longer overwrites the perf flamegraph. Align the profiling build job with the main benchmark toolchain and lock in the new pprof dependency set. Made-with: Cursor
This PR introduces significant improvements to the project's profiling capabilities, enabling better performance analysis in CI and local development.
Key Changes
pprof-rsunder a newcpu-profilingfeature. This allows for generating CPU flamegraphs in environments whereperfis restricted (like standard GitHub Actions runners).dhatandpprofprofilers are explicitly dropped when aSIGINTorSIGTERMis received. This ensures profiling data is correctly written to disk.