docs: ARCHITECTURE.md + unified SIMD structural benchmarks #321

Merged: strawgate merged 1 commit into master from docs/architecture-and-benchmarks on Mar 31, 2026

@strawgate
Owner

Summary

  • Add dev-docs/ARCHITECTURE.md — primary entry point documenting the full pipeline data flow from input to output, layer interfaces, crate boundaries, buffer lifecycle, and verification strategy
  • Add structural_detect benchmark comparing separate passes vs unified SIMD (5-char, 9-char); the results validate extending ChunkIndex into StructuralIndex (see #313, "Unified SIMD StructuralIndex: extend ChunkIndex to detect all structural characters")
  • Update DIRECTION, PHASES, PROVEN_CORE, PROOF_AUDIT, ZERO_COPY_PIPELINE for current state

Benchmark results (NEON, ~760KB NDJSON)

| Approach | Time | Throughput |
|---|---|---|
| Current (memchr + ChunkIndex 2-char) | 95 µs | 7.9 GiB/s |
| Unified SIMD 5-char | 141 µs | 5.4 GiB/s |
| Unified SIMD 9-char | 256 µs | 3.0 GiB/s |

Scaling is linear at ~28 µs per character. The 9-char variant gives 11x headroom over the 1M lines/sec target.
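
The per-character figure can be reproduced from the table by dividing each unified run's time by the number of characters it detects. The sketch below is only a sanity check of that arithmetic. The helper names are hypothetical, and the 760 KiB buffer size is an assumption read off the "~760KB" heading, so the computed throughput lands near the reported values rather than exactly on them.

```rust
/// Average cost per structural character, in microseconds (hypothetical helper).
fn micros_per_char(total_micros: f64, chars: u32) -> f64 {
    total_micros / f64::from(chars)
}

/// Throughput in GiB/s for `bytes` scanned in `micros` microseconds (hypothetical helper).
fn gib_per_sec(bytes: f64, micros: f64) -> f64 {
    let seconds = micros / 1_000_000.0;
    bytes / seconds / (1024.0 * 1024.0 * 1024.0)
}

/// Prints the derived figures for the two unified SIMD rows of the table.
fn report() {
    // Assumption: the input is about 760 KiB; the PR only says "~760KB".
    let bytes = 760.0 * 1024.0;
    for (chars, micros) in [(5u32, 141.0), (9u32, 256.0)] {
        println!(
            "{}-char: {:.1} µs/char, {:.2} GiB/s",
            chars,
            micros_per_char(micros, chars),
            gib_per_sec(bytes, micros)
        );
    }
}
```

Both unified rows come out close to 28 µs per character, which is what the linear-scaling claim rests on.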

Test plan

  • CI passes (clippy, fmt, tests)
  • Benchmark compiles and runs: cargo bench -p logfwd-core --bench structural_detect
  • ARCHITECTURE.md accurately describes current pipeline flow
  • Dev-docs are internally consistent

🤖 Generated with Claude Code

Add ARCHITECTURE.md — primary entry point documenting the full
pipeline data flow from bytes on disk to serialized output. Covers
each layer's responsibility, interface types, crate boundaries,
buffer lifecycle, and verification strategy.

Add structural_detect benchmark comparing separate passes (memchr +
ChunkIndex) vs unified SIMD (5-char, 9-char) vs hybrid scalar.
Results: 9-char unified SIMD at 3 GiB/s with linear scaling at
~28µs per character — validates extending ChunkIndex into
StructuralIndex for format-agnostic structural scanning.

Update dev-docs:
- DIRECTION.md: add StructuralIndex vision, unified SIMD section,
  benchmark results
- PHASES.md: restructure into 8 phases reflecting current state
  (Phase 0-1.5 done, Phase 2 is StructuralIndex)
- PROVEN_CORE.md: add framer, aggregator, byte_search to module
  table with proof counts
- PROOF_AUDIT.md: update for content_correct oracle proof, close
  NewlineFramer correctness gap
- ZERO_COPY_PIPELINE.md: cross-reference to ARCHITECTURE.md

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Mar 31, 2026

Caution

Review failed

The pull request is closed.

📥 Commits

Reviewing files that changed from the base of the PR and between b8f8ebd and 2c4c876.

📒 Files selected for processing (7)
  • crates/logfwd-core/Cargo.toml
  • crates/logfwd-core/benches/structural_detect.rs
  • dev-docs/ARCHITECTURE.md
  • dev-docs/DIRECTION.md
  • dev-docs/PHASES.md
  • dev-docs/PROVEN_CORE.md
  • dev-docs/ZERO_COPY_PIPELINE.md

Walkthrough

This PR adds comprehensive benchmarking infrastructure and developer documentation for structural-character detection in logfwd-core. The main addition is a new Criterion benchmark file (structural_detect.rs) with 676 lines that evaluates multiple detection strategies: separate-pass baseline functions, unified 64-byte block scanning approaches, hybrid two-pass methods, and architecture-specific SIMD implementations (NEON on aarch64, AVX2 on x86_64) for both 5-character and 9-character sets. Accompanying updates to the crate Cargo.toml register the benchmark. Developer documentation is significantly expanded with a new comprehensive ARCHITECTURE.md file, alongside updates to DIRECTION.md, PHASES.md, PROVEN_CORE.md, and ZERO_COPY_PIPELINE.md that clarify the unified SIMD scanning approach and reorganize the project roadmap.
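
To make the contrast between the separate-pass baseline and a unified scan concrete, here is a small scalar sketch. It is not the benchmark's code: the function names are hypothetical and no SIMD is involved. It only illustrates that a single pass classifying each byte against every target yields the same positions as one pass per target.

```rust
/// Separate passes: walk the whole buffer once per target byte.
fn separate_passes(buf: &[u8], targets: &[u8]) -> Vec<Vec<usize>> {
    targets
        .iter()
        .map(|&t| {
            buf.iter()
                .enumerate()
                .filter(|&(_, &b)| b == t)
                .map(|(i, _)| i)
                .collect()
        })
        .collect()
}

/// Unified pass: walk the buffer once and test each byte against all targets.
fn unified_pass(buf: &[u8], targets: &[u8]) -> Vec<Vec<usize>> {
    let mut out = vec![Vec::new(); targets.len()];
    for (i, &b) in buf.iter().enumerate() {
        for (k, &t) in targets.iter().enumerate() {
            if b == t {
                out[k].push(i);
            }
        }
    }
    out
}
```

Both functions return one position list per target, in the order the targets were given, so their outputs can be compared directly.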

@strawgate strawgate merged commit 884a73f into master Mar 31, 2026
9 of 11 checks passed
@strawgate strawgate deleted the docs/architecture-and-benchmarks branch March 31, 2026 03:30
results.push(unsafe { simd_9char::find_9chars_neon(&block) });

#[cfg(target_arch = "x86_64")]
if is_x86_feature_detected!("avx2") {
Contributor

simd_9char_scan_buffer has no non-AVX2 fallback on x86_64, so on CPUs without AVX2 this benchmark does effectively no scanning work: the loop runs, but nothing is ever pushed to results. That makes the unified_9char_simd timings artificially fast on those machines.

Please add an else branch (and ideally a non-x86_64 fallback too) similar to simd_unified_scan_buffer so this benchmark always measures real 9-char detection work.
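
One possible shape for the requested fallback is sketched below, under stated assumptions. The nine structural bytes, `find_9chars_scalar`, and `scan_buffer_9char` are hypothetical stand-ins and not the PR's code. The SIMD kernels are elided, so the AVX2 branch calls the scalar routine as a placeholder. The point is only that every cfg and feature-detection path pushes one result per block.

```rust
/// Hypothetical 9-byte structural set; the PR excerpt does not list the actual bytes.
const STRUCTURAL: [u8; 9] = [b'\n', b'"', b'\\', b',', b':', b'{', b'}', b'[', b']'];

/// Scalar reference kernel: one 64-bit mask per structural byte,
/// with bit `i` set when `block[i]` equals that byte.
fn find_9chars_scalar(block: &[u8; 64]) -> [u64; 9] {
    let mut masks = [0u64; 9];
    for (i, &b) in block.iter().enumerate() {
        for (k, &c) in STRUCTURAL.iter().enumerate() {
            if b == c {
                masks[k] |= 1u64 << i;
            }
        }
    }
    masks
}

/// Scans `buf` in 64-byte blocks and pushes one result per block on every path.
fn scan_buffer_9char(buf: &[u8]) -> Vec<[u64; 9]> {
    let mut results = Vec::with_capacity(buf.len() / 64 + 1);
    for chunk in buf.chunks(64) {
        // Pad a short final block with spaces, which are not structural.
        let mut block = [b' '; 64];
        block[..chunk.len()].copy_from_slice(chunk);

        #[cfg(target_arch = "x86_64")]
        {
            if std::is_x86_feature_detected!("avx2") {
                // Placeholder: a real benchmark would call its AVX2 kernel here.
                results.push(find_9chars_scalar(&block));
            } else {
                // The fallback the review asks for: still do real detection work.
                results.push(find_9chars_scalar(&block));
            }
        }

        #[cfg(not(target_arch = "x86_64"))]
        {
            // Placeholder for NEON or any other target: scalar path.
            results.push(find_9chars_scalar(&block));
        }
    }
    results
}
```

With this structure the number of results always equals the number of 64-byte blocks, so a benchmark built on it cannot silently measure an empty loop.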
