fix: Parse criterion output in bench workflow#17

Merged
strawgate merged 4 commits into master from worktree-grok-regex-udfs on Mar 29, 2026

@strawgate
Owner

Summary

  • Criterion 0.5 doesn't support --output-format bencher — that's a libtest flag
  • Rewrites the parser to handle criterion's native output format (bench name on own line + indented time/thrpt)
  • Captures stderr too since criterion writes benchmark progress there
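
A minimal Python sketch of the parsing step described above, assuming criterion's native layout as this PR describes it: the benchmark name on its own line, followed by indented `time:` and `thrpt:` lines that each carry three bracketed values. The names `parse_criterion` and `to_markdown`, the sample text, and the table columns are hypothetical and not taken from the actual workflow.

```python
import re

# Matches an indented "time:" or "thrpt:" line and captures the middle of the
# three bracketed values, e.g. "1.2400 ms" from "[1.2345 ms 1.2400 ms 1.2456 ms]".
METRIC_RE = re.compile(
    r"^\s+(time|thrpt):\s+\[\S+\s+\S+\s+(\S+\s+\S+)\s+\S+\s+\S+\]"
)

# Hypothetical sample of combined stdout/stderr, for illustration only.
SAMPLE = """\
Benchmarking scanner/all_fields: Analyzing
scanner/all_fields
                        time:   [1.2345 ms 1.2400 ms 1.2456 ms]
                        thrpt:  [80.100 MiB/s 80.500 MiB/s 80.900 MiB/s]
Found 3 outliers among 100 measurements (3.00%)
transform/grok
                        time:   [2.0000 ms 2.1000 ms 2.2000 ms]
"""


def parse_criterion(text):
    """Return one dict per benchmark with its name, time, and throughput."""
    rows = []
    candidate = None  # most recent unindented line, treated as the bench name
    for line in text.splitlines():
        m = METRIC_RE.match(line)
        if m:
            kind, value = m.group(1), m.group(2)
            if kind == "time" and candidate is not None:
                rows.append({"name": candidate, "time": value, "thrpt": ""})
                candidate = None
            elif kind == "thrpt" and rows:
                rows[-1]["thrpt"] = value
        elif line and not line[0].isspace() and not line.startswith("Benchmarking"):
            # Any other unindented line may be a benchmark name; it only
            # becomes a row if a "time:" line follows it.
            candidate = line.strip()
    return rows


def to_markdown(rows):
    """Render parsed rows as a markdown table."""
    lines = ["| Benchmark | Time | Throughput |", "|---|---|---|"]
    for r in rows:
        lines.append(f"| {r['name']} | {r['time']} | {r['thrpt'] or '-'} |")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_markdown(parse_criterion(SAMPLE)))
```

Tracking the last unindented line as a name candidate means progress and outlier messages are ignored unless a `time:` line directly follows them. This sketch does not handle short benchmark names that criterion prints on the same line as `time:`.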

Verified locally

The parser correctly produces a markdown table from criterion output.

🤖 Generated with Claude Code

strawgate and others added 4 commits March 29, 2026 02:41
Add two new DataFusion scalar UDFs for structured log parsing:

- regexp_extract(string, pattern, group_index): Spark-compatible regex
  extraction returning capture group at given index (0=full match, 1+=groups)
- grok(string, pattern): Logstash-style grok pattern parsing returning a
  Struct with one field per named capture (%{PATTERN:name} syntax)

Grok includes 25+ built-in patterns (IP, WORD, NUMBER, TIMESTAMP_ISO8601,
LOGLEVEL, etc). Both UDFs are registered automatically in SqlTransform.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
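
To illustrate the semantics described in the commit message above, here is a rough Python sketch of the two functions. It is not the DataFusion/Rust implementation. The pattern table is a tiny hypothetical subset with simplified regexes, and returning an empty string on no match is an assumption borrowed from Spark's `regexp_extract`, not something this PR confirms.

```python
import re

# Hypothetical, simplified subset of built-in grok patterns (illustration only).
GROK_PATTERNS = {
    "WORD": r"\w+",
    "NUMBER": r"[+-]?\d+(?:\.\d+)?",
    "IP": r"\d{1,3}(?:\.\d{1,3}){3}",
    "LOGLEVEL": r"DEBUG|INFO|WARN|ERROR|FATAL",
}

# Matches %{PATTERN} or %{PATTERN:name}.
GROK_REF = re.compile(r"%\{(\w+)(?::(\w+))?\}")


def grok_to_regex(pattern):
    """Expand %{PATTERN:name} references into named regex capture groups."""
    def expand(m):
        body = GROK_PATTERNS[m.group(1)]
        name = m.group(2)
        # Named references become named groups; unnamed ones do not capture.
        return f"(?P<{name}>{body})" if name else f"(?:{body})"
    return GROK_REF.sub(expand, pattern)


def grok(text, pattern):
    """Return a dict with one entry per named capture, or None on no match."""
    m = re.search(grok_to_regex(pattern), text)
    return m.groupdict() if m else None


def regexp_extract(text, pattern, group_index):
    """Return the capture group at group_index (0 = full match).

    Assumption: an empty string is returned when nothing matches.
    """
    m = re.search(pattern, text)
    if m is None:
        return ""
    return m.group(group_index) or ""


if __name__ == "__main__":
    print(grok("192.168.1.1 ERROR disk full", "%{IP:client} %{LOGLEVEL:level}"))
    print(regexp_extract("user=alice id=42", r"user=(\w+) id=(\d+)", 2))
```

The dict returned by `grok` plays the role of the Struct mentioned in the commit message, with one field per named capture.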
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
New `logfwd-bench` crate with criterion benchmarks covering the full
pipeline: scanner (all fields + pushdown), CRI parse/reassemble,
DataFusion transforms (passthrough, filter, projection, regexp_extract,
grok), zstd compression, output sinks, and end-to-end pipelines.

Nightly GitHub Actions workflow runs benchmarks on master, parses
results into a markdown table, and posts as a GitHub issue (closing
the previous one to avoid clutter).

Also fixes clippy warnings in UDF code (Default impls, collapsible ifs).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Criterion doesn't support --output-format bencher. Parse its native
format instead (bench name on own line, followed by time/thrpt lines).
Also capture stderr since criterion writes there.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
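
The stderr capture mentioned in the commit above could look roughly like this in Python. This is only a sketch: the helper name `run_and_capture` is hypothetical, and a stand-in child process is used instead of the real benchmark command, which is not shown in this PR.

```python
import subprocess
import sys


def run_and_capture(cmd):
    """Run cmd and return its stdout and stderr merged into one string."""
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # fold stderr into the same pipe as stdout
        text=True,
        check=False,
    )
    return proc.stdout


# Stand-in child process (hypothetical) that writes to both streams; the real
# workflow would presumably run the benchmark command here instead.
demo = [
    sys.executable,
    "-c",
    "import sys; print('result line'); print('progress line', file=sys.stderr)",
]
combined = run_and_capture(demo)
```

Merging the two streams into one pipe gives the parser a single text to scan, regardless of which stream a given line was written to.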
@strawgate force-pushed the worktree-grok-regex-udfs branch from f4d5f5a to f5b7af3 on March 29, 2026 07:41
@strawgate merged commit 7b65073 into master on Mar 29, 2026
1 check failed
@strawgate deleted the worktree-grok-regex-udfs branch on March 29, 2026 07:41