Add low-and-slow rate-ingest nightly benchmark (1–100K eps) #293
…K eps Agent-Logs-Url: https://github.com/strawgate/memagent/sessions/3a95d8b0-8172-4460-9b30-fe4b5e92073e Co-authored-by: strawgate <6384545+strawgate@users.noreply.github.com>
Avoids ~100K unnecessary flush syscalls/sec at high EPS while still ensuring logfwd sees data promptly at low rates (1–100 eps). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
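The trade-off in this commit message can be illustrated with a minimal sketch. The writer code from the PR is not shown here, so everything below is an assumption: the `write_batch` and `flush_every_event` names, and the 100 eps cut-off (taken from the "1–100 eps" range in the message), are hypothetical.

```rust
use std::io::{self, Write};

/// Hypothetical threshold: at or below this rate, flush after every event
/// so a tailing reader sees each line promptly.
const PER_EVENT_FLUSH_MAX_EPS: u64 = 100;

/// Assumed policy: per-event flushing only makes sense at low rates.
fn flush_every_event(target_eps: u64) -> bool {
    target_eps <= PER_EVENT_FLUSH_MAX_EPS
}

/// Write one batch of lines and return how many flush calls were issued.
fn write_batch<W: Write>(out: &mut W, lines: &[&str], target_eps: u64) -> io::Result<usize> {
    let per_event = flush_every_event(target_eps);
    let mut flushes = 0;
    for line in lines {
        out.write_all(line.as_bytes())?;
        out.write_all(b"\n")?;
        if per_event {
            out.flush()?;
            flushes += 1;
        }
    }
    if !per_event {
        // One flush per batch keeps the flush count independent of EPS.
        out.flush()?;
        flushes += 1;
    }
    Ok(flushes)
}
```

Whether each flush maps to a syscall depends on the writer type (for example a `BufWriter<File>`); the sketch only counts flush calls.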
Requesting changes due to correctness issues in rate-bench failure handling: current behavior can emit success-shaped but invalid/partial benchmark data when logfwd exits early or when some EPS levels fail.
What is this? | From workflow: AI: PR Review
    let _ = std::fs::remove_file(&data_file);
    let _ = std::fs::remove_file(&cfg_path);

    Ok(RateBenchResult {
If logfwd exits during warmup/measurement, this function still returns Ok with fallback averages (rss_kb=0, cpu_percent=0.0) because /proc reads are optional and empty sample vectors average to zero. That produces a success-shaped benchmark result instead of failing the run.
A concrete failure case is startup bind/config failure (for example diagnostics port conflict): writer continues appending, actual_eps remains non-zero, and CI can publish misleading data.
Please fail this EPS run when the child exits early (e.g. try_wait) and/or when no RSS/CPU samples were collected.
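One possible shape for the requested checks, as a sketch rather than the PR's actual code: `RateRunError`, `ensure_running`, and `mean_or_fail` are hypothetical names. `std::process::Child::try_wait` is the real standard-library call; it returns `Ok(None)` while the child is still running.

```rust
use std::process::Child;

/// Hypothetical error type for a single EPS level.
#[derive(Debug, PartialEq)]
enum RateRunError {
    /// The child process exited (or could not be polled) mid-run.
    ChildExited(String),
    /// No RSS/CPU samples were collected during the measurement window.
    NoSamples,
}

/// Fail if the child has already exited. `try_wait` does not block, so this
/// can be called on every sampling tick during warmup and measurement.
fn ensure_running(child: &mut Child) -> Result<(), RateRunError> {
    match child.try_wait() {
        Ok(None) => Ok(()),
        Ok(Some(status)) => Err(RateRunError::ChildExited(status.to_string())),
        Err(e) => Err(RateRunError::ChildExited(format!("try_wait failed: {e}"))),
    }
}

/// Average the samples, refusing to turn an empty vector into a zero.
fn mean_or_fail(samples: &[f64]) -> Result<f64, RateRunError> {
    if samples.is_empty() {
        return Err(RateRunError::NoSamples);
    }
    Ok(samples.iter().sum::<f64>() / samples.len() as f64)
}
```

With both checks in place, a startup failure such as a port conflict would surface as an `Err` for that EPS level instead of `rss_kb=0, cpu_percent=0.0`.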
    );
    results.push(r);
    }
    Err(e) => eprintln!(" ERROR: {e}"),
This swallows per-rate failures and continues, so the overall rate-bench can succeed with missing EPS points.
In run_rate_bench_main, the process exits non-zero only when results.is_empty(), which means a partial result set (e.g. 5/6 rates) is still uploaded/published as if complete.
Please make incomplete runs fail (or explicitly encode failed EPS levels and fail CI when any level failed) so dashboards/issues don’t silently reflect partial data.
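A sketch of the second option (record failed EPS levels and fail when any level failed). The `RateBenchSummary`, `run_all_levels`, and `exit_code` names are hypothetical, and the per-level runner is passed in as a closure because the real `run_rate_bench_main` body is not shown here.

```rust
/// Results for the levels that succeeded, plus each failed level with its error text.
struct RateBenchSummary<T> {
    results: Vec<T>,
    failed: Vec<(u64, String)>,
}

/// Run every EPS level, recording failures instead of only logging them.
fn run_all_levels<T, F>(levels: &[u64], mut run_one: F) -> RateBenchSummary<T>
where
    F: FnMut(u64) -> Result<T, String>,
{
    let mut summary = RateBenchSummary { results: Vec::new(), failed: Vec::new() };
    for &eps in levels {
        match run_one(eps) {
            Ok(r) => summary.results.push(r),
            Err(e) => {
                eprintln!("  ERROR at {eps} eps: {e}");
                summary.failed.push((eps, e));
            }
        }
    }
    summary
}

/// Exit code for the process: non-zero when any level failed,
/// not only when every level failed.
fn exit_code<T>(summary: &RateBenchSummary<T>) -> i32 {
    if summary.failed.is_empty() && !summary.results.is_empty() {
        0
    } else {
        1
    }
}
```

Keeping the failed levels in the summary also allows them to be written into the result JSON, so a dashboard can distinguish a missing point from a level that was never run.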
Expands nightly benchmarks with a non-competitive, logfwd-only benchmark that measures steady-state RSS memory and CPU utilisation at six controlled ingest rates: 1, 10, 100, 1 000, 10 000, and 100 000 eps.
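On Linux, RSS and CPU time for a process can be read from `/proc/<pid>/status` and `/proc/<pid>/stat`. The sketch below shows one way to parse those two files; it is not the code from `rate_bench.rs`. The function names are hypothetical, and `ticks_per_sec` is assumed to come from `sysconf(_SC_CLK_TCK)` (commonly 100).

```rust
/// Extract the resident set size in kB from the text of /proc/<pid>/status.
fn parse_rss_kb(status: &str) -> Option<u64> {
    status
        .lines()
        .find_map(|l| l.strip_prefix("VmRSS:"))
        .and_then(|rest| rest.split_whitespace().next())
        .and_then(|n| n.parse().ok())
}

/// Extract utime + stime (in clock ticks) from the text of /proc/<pid>/stat.
/// The comm field may contain spaces or parentheses, so parsing starts
/// after the last ')'.
fn parse_cpu_ticks(stat: &str) -> Option<u64> {
    let after_comm = &stat[stat.rfind(')')? + 1..];
    let mut fields = after_comm.split_whitespace();
    // After comm, field 3 (state) is index 0, so utime (field 14) is
    // index 11 and stime (field 15) is index 12.
    let utime: u64 = fields.nth(11)?.parse().ok()?;
    let stime: u64 = fields.next()?.parse().ok()?;
    Some(utime + stime)
}

/// Convert a tick delta between two samples into percent of one CPU core.
fn cpu_percent(delta_ticks: u64, ticks_per_sec: u64, elapsed_secs: f64) -> f64 {
    (delta_ticks as f64 / ticks_per_sec as f64) / elapsed_secs * 100.0
}
```

Returning `Option` rather than defaulting to zero lets the caller tell "no sample" apart from a real zero reading.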
New: rate_bench.rs module
- /proc/{pid}/status (RSS) and /proc/{pid}/stat (CPU ticks) sampled every 500 ms over a 20 s measurement window after a 5 s warmup
- github-action-benchmark customSmallerIsBetter JSON for dashboard regression tracking

CLI (--rate-bench flag)
Added to logfwd-competitive-bench. When set, bypasses the competitive benchmark flow entirely and runs the rate bench against the LOGFWD binary:

Nightly workflow additions
- rate-bench-result.json + rate-gh-bench.json (90-day retention)
- benchmark-action dashboard step at dev/bench-rate (customSmallerIsBetter; lower RSS/CPU is better)

Example output: