feat: Benchmark reorg — fix failures, dual mode, github-action-benchmark#66
Conversation
…ction-benchmark
Major fixes from first nightly run failures:
Blackhole: protocol-aware line counting — ES bulk (newlines/2), OTLP
JSON (count "body": keys), NDJSON (count newlines). Detects protocol
from URL path (/_bulk, /v1/logs).
Agent config fixes:
- logfwd: use type-suffixed columns (level_str, duration_ms_int)
- otelcol: encoding: json + larger batch (5000/1s) for reliable delivery
- vlagent: use blackhole host IP in kubeconfig for Docker rewriting
- runner: rewrite bare IPs (not just host:port) for Docker mode
Dual execution mode:
- --mode binary|docker|both (default: binary, --docker sets docker)
- Resolves both binaries and Docker images when mode=both
- BenchResult includes mode field; markdown/text tables show mode column
Dashboard: replace custom pages.yml + index.html with
github-action-benchmark (used by OTel Collector). Single action handles
gh-pages deployment, Chart.js charts, and regression detection.
New --gh-bench-file flag outputs customBiggerIsBetter format:
[{"name": "passthrough/logfwd (docker)", "unit": "lines/sec", "value": 454000}]
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Caution Review failedPull request was closed or merged during review WalkthroughThis PR removes the Pages-based benchmark dashboard (deleted Possibly related PRs
Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/logfwd-competitive-bench/src/blackhole.rs`:
- Around line 120-138: The manual byte-by-byte loop in count_otlp_body_keys
should be replaced with memchr::memmem search to simplify and likely speed up
matching; import memchr::memmem (or memchr::memmem::Finder) and use its
find_iter (or Finder::new(...).find_iter(body)) over the needle b"\"body\":" to
count occurrences, then return that count instead of managing pos/offset
manually—update the function count_otlp_body_keys to use memmem-based iteration
and remove the manual memchr loop.
In `@crates/logfwd-competitive-bench/src/main.rs`:
- Around line 232-246: Add an explicit warning when the code falls back from
Docker to binary so users know the requested Docker mode wasn't honored: inside
the branch that checks if run_docker && !run_binary and finds Some(binary) in
resolved.binary, call the logging facility (e.g., warn or eprintln!) to report
that "--mode docker" was requested but no Docker image was found and the binary
will be used instead, then proceed to call run_one as before; update the message
to include the agent or resolved.binary identifier to make the warning
actionable.
In `@crates/logfwd-competitive-bench/src/runner.rs`:
- Around line 113-127: The bare-IP replace (`host_ip` → `docker_ip`) can
over-match anywhere in `rewritten`; change it to a more targeted replacement
using a regex (via the regex crate) that only replaces standalone/host
occurrences of `host_ip` (e.g. assert non-digit/colon boundaries with
lookarounds such as (?<![:\d.])HOST(?![:\d.])) instead of a blind `.replace`;
update the code around `host_ip`, `docker_ip`, `ctx.blackhole_addr`,
`ctx.docker_blackhole_addr`, and `rewritten` to build and apply that regex (use
`Regex::new` and `re.replace_all`) so only true bare-host instances are swapped
while preserving other occurrences in the config.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 65eec1c6-c4f2-4d49-b61a-e41653c45efd
📒 Files selected for processing (9)
.github/workflows/bench.yml.github/workflows/pages.ymlbench/site/index.htmlcrates/logfwd-competitive-bench/src/agents/logfwd.rscrates/logfwd-competitive-bench/src/agents/otelcol.rscrates/logfwd-competitive-bench/src/agents/vlagent.rscrates/logfwd-competitive-bench/src/blackhole.rscrates/logfwd-competitive-bench/src/main.rscrates/logfwd-competitive-bench/src/runner.rs
💤 Files with no reviewable changes (2)
- bench/site/index.html
- .github/workflows/pages.yml
| // Rewrite config file: host paths → /bench, host addresses → Docker-reachable. | ||
| // Replace the full host:port first, then bare IP for other ports (e.g., K8s API). | ||
| if let Ok(content) = std::fs::read_to_string(&config_path) { | ||
| let host_ip = ctx.blackhole_addr.split(':').next().unwrap_or("127.0.0.1"); | ||
| let docker_ip = ctx | ||
| .docker_blackhole_addr | ||
| .split(':') | ||
| .next() | ||
| .unwrap_or(host_ip); | ||
| let rewritten = content | ||
| .replace(&ctx.bench_dir.to_string_lossy().to_string(), "/bench") | ||
| .replace(&ctx.blackhole_addr, &ctx.docker_blackhole_addr); | ||
| .replace(&ctx.blackhole_addr, &ctx.docker_blackhole_addr) | ||
| .replace(host_ip, docker_ip); | ||
| let _ = std::fs::write(&config_path, rewritten); | ||
| } |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Bare IP replacement may over-match in edge cases.
The chained .replace(host_ip, docker_ip) replaces all occurrences of the bare IP anywhere in the config. This could cause issues if the IP appears in unintended contexts (e.g., embedded in a different port combination). For benchmark configs this is likely fine, but worth noting.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/logfwd-competitive-bench/src/runner.rs` around lines 113 - 127, The
bare-IP replace (`host_ip` → `docker_ip`) can over-match anywhere in
`rewritten`; change it to a more targeted replacement using a regex (via the
regex crate) that only replaces standalone/host occurrences of `host_ip` (e.g.
assert non-digit/colon boundaries with lookarounds such as
(?<![:\d.])HOST(?![:\d.])) instead of a blind `.replace`; update the code around
`host_ip`, `docker_ip`, `ctx.blackhole_addr`, `ctx.docker_blackhole_addr`, and
`rewritten` to build and apply that regex (use `Regex::new` and
`re.replace_all`) so only true bare-host instances are swapped while preserving
other occurrences in the config.
- Replace manual byte search with memchr::memmem::find_iter (cleaner, faster) - Add warning when --mode docker falls back to binary for an agent Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
Follow-up to #61. First nightly run exposed failures in filebeat (0 lines), otelcol (partial), vlagent (0 lines in Docker), logfwd (wrong SQL columns in scenarios), and fluent-bit (config issues). This PR fixes all of them and simplifies the architecture.
Protocol-aware blackhole — detects protocol from URL path:
/_bulk(filebeat): ES bulk format, count newlines / 2/v1/logs(otelcol): OTLP JSON, count"body":keys per log recordAgent config fixes:
level_str,duration_ms_int)encoding: json+ batch 5000/1s for complete deliveryDual execution mode —
--mode binary|docker|both:both(binary pass + Docker pass for each agent)Dashboard — replaced custom
pages.yml+bench/site/index.html(-472 lines) withgithub-action-benchmarkaction. Handles gh-pages deployment, Chart.js charts, regression detection automatically.Net: -279 lines.
Test plan
cargo clippy -- -D warningscleancargo fmt --checkcleanactionlintcleanhttps://strawgate.github.io/memagent/dev/bench/🤖 Generated with Claude Code