fix: benchmark reporting — detect timeouts, fix comparisons by strawgate · Pull Request #331 · strawgate/fastforward

strawgate · 2026-03-31T04:28:15Z

Summary

Timeout detection: BenchResult.timed_out flag set by wait_blackhole_done. Agents hitting the 120s timeout show TIMEOUT instead of "0 lines/sec".
Exclude timed-out agents from comparisons: No more "filebeat is 1.2x faster than otelcol" when otelcol didn't finish.
Runs column clarity: Shows 4203ms not 4203, and 120027ms(TIMEOUT) for timed-out runs.

Before (issue #322)

| otelcol | binary | 120027ms | 1ms | 0 lines/sec | 120027|120028|120027 |
> filebeat is 1.2x faster than otelcol

After

| otelcol | binary | **TIMEOUT** (120027ms) | 1ms | 0 lines (timed out) | 120027ms(TIMEOUT) | ... |

Comparisons only include agents that completed.

Test plan

cargo clippy clean
cargo fmt clean
CI passes

Fixes #322

🤖 Generated with Claude Code

- BenchResult tracks `timed_out` flag from wait_blackhole_done - Timed-out agents show "TIMEOUT" instead of misleading "0 lines/sec" - Timed-out agents excluded from speed comparisons - Runs column shows units and timeout markers (e.g. "4203ms", "120027ms(TIMEOUT)") Fixes #322 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-03-31T04:28:37Z

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: af7309d0-2b5e-47e0-86f1-ae5b0a28d62e

📥 Commits

Reviewing files that changed from the base of the PR and between 885922a and d830e9a.

📒 Files selected for processing (3)

crates/logfwd-competitive-bench/src/main.rs
crates/logfwd-competitive-bench/src/runner.rs
crates/logfwd-competitive-bench/src/summarize.rs

Walkthrough

This PR adds timeout detection and tracking to the competitive benchmarking framework. A new timed_out boolean field is added to BenchResult to indicate when a benchmark run exceeds its configured timeout. The timeout status is determined by wait_blackhole_done() which now returns a tuple containing both lines processed and a timeout flag. All runner functions are updated to destructure and propagate this value to the result struct. Summarization logic then uses this information to annotate timed-out runs in output and exclude them from performance comparisons.

Possibly related PRs

Benchmark dashboard v2: matrix CI, runtime sampling, three-section site #301: Extended BenchResult with additional run metadata (iteration/samples and sampling logic) in the same benchmarking infrastructure
fix: Include bench dashboard in docs site deployment #310: Modified runner and summarize logic including BenchResult field additions and summarize output formatting
feat: Benchmark reorg — fix failures, dual mode, github-action-benchmark #66: Added mode field to BenchResult struct, establishing the pattern for extending benchmark result tracking

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

github-actions · 2026-03-31T04:35:10Z

    let mut stable_count = 0u32;

    loop {
        if start.elapsed() > timeout {


The timeout check runs before checking whether the expected line count has been reached, which can misclassify successful runs as timed out near the boundary.

Concrete case: if lines reaches expected just before 120s, the next loop iteration may start slightly after 120s and return (lines, true) immediately, even though the run actually completed.

Consider re-checking lines >= expected in the timeout branch (or reading stats first each loop) before returning timed_out = true.

strawgate merged commit ede6500 into master Mar 31, 2026
8 of 11 checks passed

strawgate deleted the fix/bench-reporting-v2 branch March 31, 2026 04:30

github-actions Bot reviewed Mar 31, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: benchmark reporting — detect timeouts, fix comparisons#331

fix: benchmark reporting — detect timeouts, fix comparisons#331
strawgate merged 1 commit into
masterfrom
fix/bench-reporting-v2

strawgate commented Mar 31, 2026

Uh oh!

coderabbitai Bot commented Mar 31, 2026 •

edited

Loading

Review failed

Uh oh!

Uh oh!

github-actions Bot Mar 31, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

strawgate commented Mar 31, 2026

Summary

Before (issue #322)

After

Test plan

Uh oh!

coderabbitai Bot commented Mar 31, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review failed

Walkthrough

Possibly related PRs

Uh oh!

Uh oh!

github-actions Bot Mar 31, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

coderabbitai Bot commented Mar 31, 2026 •

edited

Loading