docs: ARCHITECTURE.md + unified SIMD structural benchmarks by strawgate · Pull Request #321 · strawgate/fastforward

strawgate · 2026-03-31T03:28:46Z

Summary

Add dev-docs/ARCHITECTURE.md — primary entry point documenting the full pipeline data flow from input to output, layer interfaces, crate boundaries, buffer lifecycle, and verification strategy
Add structural_detect benchmark comparing separate passes vs unified SIMD (5-char, 9-char) — validates extending ChunkIndex into StructuralIndex (Unified SIMD StructuralIndex: extend ChunkIndex to detect all structural characters #313)
Update DIRECTION, PHASES, PROVEN_CORE, PROOF_AUDIT, ZERO_COPY_PIPELINE for current state

Benchmark results (NEON, ~760KB NDJSON)

Approach	Time	Throughput
Current (memchr + ChunkIndex 2-char)	95 µs	7.9 GiB/s
Unified SIMD 5-char	141 µs	5.4 GiB/s
Unified SIMD 9-char	256 µs	3.0 GiB/s

Scaling is linear at ~28µs per character. 9-char gives 11x headroom over 1M lines/sec target.
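The per-character figure can be checked against the table with a few lines of arithmetic. The sketch below is illustrative only: the helper names `gib_per_sec` and `marginal_us_per_char` are hypothetical and do not appear in the benchmark. Because the payload size is given only as "~760KB", the throughput helper takes the byte count as a parameter, and the reported GiB/s values may not reproduce exactly from a rounded size.

```rust
/// Throughput in GiB/s for `bytes` processed in `micros` microseconds.
/// (Hypothetical helper, not part of the PR.)
fn gib_per_sec(bytes: f64, micros: f64) -> f64 {
    let seconds = micros * 1e-6;
    bytes / seconds / (1024.0 * 1024.0 * 1024.0)
}

/// Marginal cost in microseconds per additional character between two
/// measurements `(chars_a, us_a)` and `(chars_b, us_b)`, with chars_b > chars_a.
/// (Hypothetical helper, not part of the PR.)
fn marginal_us_per_char(chars_a: u32, us_a: f64, chars_b: u32, us_b: f64) -> f64 {
    (us_b - us_a) / f64::from(chars_b - chars_a)
}
```

With the table's unified SIMD rows, `marginal_us_per_char(5, 141.0, 9, 256.0)` evaluates to 28.75 µs per added character, in line with the ~28µs figure quoted above.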

Test plan

CI passes (clippy, fmt, tests)
Benchmark compiles and runs: cargo bench -p logfwd-core --bench structural_detect
ARCHITECTURE.md accurately describes current pipeline flow
Dev-docs are internally consistent

🤖 Generated with Claude Code

Add ARCHITECTURE.md, the primary entry point documenting the full pipeline data flow from bytes on disk to serialized output. Covers each layer's responsibility, interface types, crate boundaries, buffer lifecycle, and verification strategy.

Add structural_detect benchmark comparing separate passes (memchr + ChunkIndex) vs unified SIMD (5-char, 9-char) vs hybrid scalar. Results: 9-char unified SIMD at 3 GiB/s with linear scaling at ~28µs per character, which validates extending ChunkIndex into StructuralIndex for format-agnostic structural scanning.

Update dev-docs:
- DIRECTION.md: add StructuralIndex vision, unified SIMD section, benchmark results
- PHASES.md: restructure into 8 phases reflecting current state (Phase 0-1.5 done, Phase 2 is StructuralIndex)
- PROVEN_CORE.md: add framer, aggregator, byte_search to module table with proof counts
- PROOF_AUDIT.md: update for content_correct oracle proof, close NewlineFramer correctness gap
- ZERO_COPY_PIPELINE.md: cross-reference to ARCHITECTURE.md

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-03-31T03:28:59Z

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: a1d79f3b-4d6f-4fc8-b03c-e1996412ff2e

📥 Commits

Reviewing files that changed from the base of the PR and between b8f8ebd and 2c4c876.

📒 Files selected for processing (7)

crates/logfwd-core/Cargo.toml
crates/logfwd-core/benches/structural_detect.rs
dev-docs/ARCHITECTURE.md
dev-docs/DIRECTION.md
dev-docs/PHASES.md
dev-docs/PROVEN_CORE.md
dev-docs/ZERO_COPY_PIPELINE.md

Walkthrough

This PR adds comprehensive benchmarking infrastructure and developer documentation for structural-character detection in logfwd-core. The main addition is a new Criterion benchmark file (structural_detect.rs) with 676 lines that evaluates multiple detection strategies: separate-pass baseline functions, unified 64-byte block scanning approaches, hybrid two-pass methods, and architecture-specific SIMD implementations (NEON on aarch64, AVX2 on x86_64) for both 5-character and 9-character sets. Accompanying updates to the crate Cargo.toml register the benchmark. Developer documentation is significantly expanded with a new comprehensive ARCHITECTURE.md file, alongside updates to DIRECTION.md, PHASES.md, PROVEN_CORE.md, and ZERO_COPY_PIPELINE.md that clarify the unified SIMD scanning approach and reorganize the project roadmap.
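To make the difference between the separate-pass and unified strategies concrete, here is a minimal scalar sketch. It is not the benchmark's code: the function names are hypothetical, and the 5-character set is an assumption, since the PR does not list the characters. The separate-pass version walks the whole buffer once per character, while the unified version walks it once in 64-byte blocks and emits one 64-bit mask per character per block.

```rust
/// Assumed 5-character set; the PR does not say which characters it uses.
const CHARS: [u8; 5] = [b'\n', b'"', b'\\', b',', b':'];

/// Separate passes: scan the entire buffer once for each character.
fn separate_passes(buf: &[u8]) -> [Vec<usize>; 5] {
    let mut out: [Vec<usize>; 5] = Default::default();
    for (slot, &c) in out.iter_mut().zip(CHARS.iter()) {
        for (i, &b) in buf.iter().enumerate() {
            if b == c {
                slot.push(i);
            }
        }
    }
    out
}

/// Unified pass: scan the buffer once in 64-byte blocks. Bit `i` of mask `k`
/// is set when byte `i` of the block equals `CHARS[k]`.
fn unified_pass(buf: &[u8]) -> Vec<[u64; 5]> {
    buf.chunks(64)
        .map(|chunk| {
            let mut masks = [0u64; 5];
            for (i, &b) in chunk.iter().enumerate() {
                for (m, &c) in masks.iter_mut().zip(CHARS.iter()) {
                    if b == c {
                        *m |= 1u64 << i;
                    }
                }
            }
            masks
        })
        .collect()
}

/// Expand the per-block masks for character `which` back into byte offsets.
fn positions(masks: &[[u64; 5]], which: usize) -> Vec<usize> {
    let mut out = Vec::new();
    for (block, m) in masks.iter().enumerate() {
        let mut bits = m[which];
        while bits != 0 {
            out.push(block * 64 + bits.trailing_zeros() as usize);
            bits &= bits - 1; // clear the lowest set bit
        }
    }
    out
}
```

Both strategies report the same offsets; they differ only in how many times the input is traversed, which is the cost the benchmark is measuring.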

Possibly related PRs

docs: dev-docs with architecture direction, proven core plan, and research #298 — Concurrently modifies the same developer documentation files (DIRECTION.md, PROVEN_CORE.md, PHASES.md) to align architectural descriptions and project phases
Replace platform-specific NEON intrinsics with portable sonic-simd #80 — Refactors structural-detection and ChunkIndex SIMD code paths that this PR's benchmarks directly exercise and measure


github-actions · 2026-03-31T03:40:04Z

+        results.push(unsafe { simd_9char::find_9chars_neon(&block) });
+
+        #[cfg(target_arch = "x86_64")]
+        if is_x86_feature_detected!("avx2") {


simd_9char_scan_buffer has no non-AVX2 fallback on x86_64, so on CPUs without AVX2 this benchmark does effectively no scanning work: the loop runs, but nothing is ever pushed to results. That makes unified_9char_simd timings artificially fast on those machines.

Please add an else branch (and ideally a non-x86_64 fallback too) similar to simd_unified_scan_buffer so this benchmark always measures real 9-char detection work.
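One way the requested fallback could look is sketched below. This is an illustration under stated assumptions, not the benchmark's implementation: the names `find_9chars_scalar`, `find_9chars_avx2`, `find_9chars`, and `scan_9char_with_fallback` are hypothetical, and the 9-character set is assumed because the PR does not list it. The point is that the dispatch function always returns a result, so every block contributes one entry regardless of CPU features. NEON is omitted here, so aarch64 also takes the scalar path in this sketch.

```rust
/// Assumed 9-character set; the PR does not list the actual characters.
const STRUCTURAL: [u8; 9] = [b'\n', b'"', b'\\', b',', b':', b'{', b'}', b'[', b']'];

/// One 64-bit mask per structural character for a 64-byte block.
type Masks = [u64; 9];

/// Portable scalar kernel: bit `i` of mask `k` is set when
/// `block[i] == STRUCTURAL[k]`.
fn find_9chars_scalar(block: &[u8; 64]) -> Masks {
    let mut masks = [0u64; 9];
    for (i, &b) in block.iter().enumerate() {
        for (m, &c) in masks.iter_mut().zip(STRUCTURAL.iter()) {
            if b == c {
                *m |= 1u64 << i;
            }
        }
    }
    masks
}

/// AVX2 kernel: two 32-byte loads, one compare and movemask per character.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn find_9chars_avx2(block: &[u8; 64]) -> Masks {
    use std::arch::x86_64::*;
    let lo = _mm256_loadu_si256(block.as_ptr() as *const __m256i);
    let hi = _mm256_loadu_si256(block.as_ptr().add(32) as *const __m256i);
    let mut masks = [0u64; 9];
    for (m, &c) in masks.iter_mut().zip(STRUCTURAL.iter()) {
        let needle = _mm256_set1_epi8(c as i8);
        let lo_bits = _mm256_movemask_epi8(_mm256_cmpeq_epi8(lo, needle)) as u32 as u64;
        let hi_bits = _mm256_movemask_epi8(_mm256_cmpeq_epi8(hi, needle)) as u32 as u64;
        *m = lo_bits | (hi_bits << 32);
    }
    masks
}

/// Dispatch with an explicit fallback so that every block yields a result.
fn find_9chars(block: &[u8; 64]) -> Masks {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 support was just confirmed at runtime.
            return unsafe { find_9chars_avx2(block) };
        }
    }
    // Reached on x86_64 without AVX2 and on every other architecture.
    find_9chars_scalar(block)
}

/// Scan a buffer in 64-byte blocks, pushing exactly one result per block.
fn scan_9char_with_fallback(buf: &[u8]) -> Vec<Masks> {
    buf.chunks(64)
        .map(|chunk| {
            // Zero-pad the tail block; 0 is not in the structural set.
            let mut block = [0u8; 64];
            block[..chunk.len()].copy_from_slice(chunk);
            find_9chars(&block)
        })
        .collect()
}
```

Because the scalar kernel is always reachable, the number of results equals the number of blocks on every machine, and the scalar output can double as an oracle for checking the SIMD path.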

strawgate merged commit 884a73f into master Mar 31, 2026
9 of 11 checks passed

strawgate deleted the docs/architecture-and-benchmarks branch March 31, 2026 03:30

github-actions Bot reviewed Mar 31, 2026


This was referenced Mar 31, 2026

fix: benchmark reporting — timeout detection, cached builds #330

Closed

docs: StructuralIndex research, benchmarks, and architecture decisions #329

Merged

feat: StructuralIndex — portable SIMD via wide, replace ChunkIndex #360

Merged

strawgate merged 1 commit into master from docs/architecture-and-benchmarks
