
feat: add BlocksByRange req/resp protocol #691

Merged
tcoratger merged 8 commits into leanEthereum:main from akronim26:req-resp
Apr 30, 2026

Conversation

@akronim26
Contributor

🗒️ Description

This PR adds the BlocksByRange req/resp protocol, which simplifies sync by fetching blocks by slot range instead of by root.

🔗 Related Issues or PRs

Fixes #688

✅ Checklist

  • Ran tox checks to avoid unnecessary CI fails:
    uvx tox
  • Considered adding appropriate tests for the changes.
  • Considered updating the online docs in the ./docs/ directory.

@akronim26 akronim26 marked this pull request as draft April 29, 2026 17:58
@akronim26 akronim26 marked this pull request as ready for review April 29, 2026 19:21
@tcoratger
Collaborator

@akronim26 Please can you rebase with the latest framework changes?

This commit adds the server-side handler for the BlocksByRange protocol, integrates it into the SyncService, and enhances the backfill synchronization with range-based fetching. It also includes a strict SSZ validation fix for fixed-size containers to ensure malformed oversized payloads are correctly rejected.
This commit addresses multiple linting violations (E501 line length), fixes missing imports in head_sync.py and test helpers, and restores/enhances test coverage for malformed SSZ payloads in the request/response layer.
@akronim26
Contributor Author

@akronim26 Please can you rebase with the latest framework changes?

Done!

tcoratger and others added 2 commits April 30, 2026 23:30
Apply review fixes spanning protocol semantics, client correctness, and
test discipline. The headline change is a sliding history window: replace
the absolute MIN_BLOCK_REQUESTS_HISTORY_SLOT floor with a window relative
to the responder's current slot, matching upstream consensus-spec behavior
and unblocking sync from genesis.

Protocol semantics:

- Sliding window: handle_blocks_by_range now rejects start_slot < max(0,
  current_slot - MIN_SLOTS_FOR_BLOCK_REQUESTS). Add CurrentSlotLookup
  callback on RequestHandler; missing lookup returns SERVER_ERROR rather
  than silently misreporting history.
- Crash guard: head_sync rejects gossip blocks at or below the finalized
  slot, fixing the SSZValueError underflow path on adversarial input.
- Gap-detection floor: head_slot + 1 (not finalized_slot) so slots already
  in the Store are never redownloaded.
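
The sliding-window rule above can be sketched in a few lines. This is a minimal illustration, not the handler from the PR: the constant's value, the function names, and the string result labels are assumptions made for the example. Only the `max(0, current_slot - MIN_SLOTS_FOR_BLOCK_REQUESTS)` floor and the SERVER_ERROR-on-missing-lookup behavior come from the commit message.

```python
from typing import Callable, Optional

# Illustrative value only; the real constant is defined by the spec.
MIN_SLOTS_FOR_BLOCK_REQUESTS = 3600


def earliest_servable_slot(
    current_slot: int, window: int = MIN_SLOTS_FOR_BLOCK_REQUESTS
) -> int:
    """Lowest start_slot a responder accepts; clamps to 0 near genesis."""
    return max(0, current_slot - window)


def classify_range_request(
    start_slot: int,
    current_slot_lookup: Optional[Callable[[], int]],
    window: int = MIN_SLOTS_FOR_BLOCK_REQUESTS,
) -> str:
    """Return a hypothetical label for how the request would be treated."""
    if current_slot_lookup is None:
        # Without a current slot the window cannot be computed, so the
        # responder reports an error instead of guessing at its history.
        return "SERVER_ERROR"
    if start_slot < earliest_servable_slot(current_slot_lookup(), window):
        return "REJECTED"
    return "OK"
```

Because the floor clamps to 0, a node whose current slot is still inside the first window accepts `start_slot == 0`, which is the genesis edge case the commit mentions.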

Client correctness (reqresp_client.request_blocks_by_range):

- Propagate CodecError so peer downscoring fires on protocol violations
  (was being swallowed by the broad except Exception).
- Parent-root continuity check applies across empty slots, not just
  consecutive slots, matching the canonical-chain MUST.
- Range bounds use int() arithmetic to avoid Slot overflow.
- Local validation: count <= MAX_REQUEST_BLOCKS and start_slot + count
  <= UINT64_MAX before opening a stream.
- Raise CodecError when peer sends more than count chunks.
- Log violations with conn.peer_id, not the raw connection repr.
- Bug fix: count == 0 compares against Uint64(0) (was raising TypeError).
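
A rough sketch of the client-side checks listed above, under stated assumptions: `RangeViolation`, `Block`, and both function names are hypothetical stand-ins (the PR raises CodecError and uses the spec's typed Slot/Uint64 values), and the `MAX_REQUEST_BLOCKS` value is assumed. Plain `int` arithmetic is used so the bounds check itself cannot overflow.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

UINT64_MAX = 2**64 - 1
MAX_REQUEST_BLOCKS = 1024  # assumed value for illustration


class RangeViolation(Exception):
    """Hypothetical stand-in for the protocol-violation error."""


@dataclass(frozen=True)
class Block:
    slot: int
    root: bytes
    parent_root: bytes


def validate_range_request(start_slot: int, count: int) -> None:
    """Local checks performed before any stream is opened."""
    if count > MAX_REQUEST_BLOCKS:
        raise RangeViolation("count exceeds MAX_REQUEST_BLOCKS")
    if start_slot + count > UINT64_MAX:
        raise RangeViolation("start_slot + count overflows uint64")


def check_range_response(
    blocks: Sequence[Block], start_slot: int, count: int
) -> None:
    """Check chunk count, slot bounds, ordering, and parent-root continuity."""
    if len(blocks) > count:
        raise RangeViolation("peer sent more than count chunks")
    end_slot = start_slot + count
    prev: Optional[Block] = None
    for block in blocks:
        if not (start_slot <= block.slot < end_slot):
            raise RangeViolation("slot outside requested range")
        if prev is not None:
            if block.slot <= prev.slot:
                raise RangeViolation("slots not strictly increasing")
            # Continuity holds across skipped slots too: the parent is the
            # previously returned block, however many empty slots separate them.
            if block.parent_root != prev.root:
                raise RangeViolation("parent root does not match previous block")
        prev = block
```

Letting the violation propagate to the caller, rather than catching it in a broad `except Exception`, is what allows peer downscoring to fire.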

Architecture:

- New StoreView Protocol (has_root / finalized_slot / head_slot) replaces
  two loose Callable | None callbacks on BackfillSync. _SyncStoreView
  adapter in service.py reads through a getter so live store mutations
  are observed.
- _max_range_slot watermark advances only after a completed (success or
  empty) range fetch; failed fetches stay retryable. reset() clears it.
- LiveNetworkEventSource exposes set_block_by_slot_lookup and
  set_current_slot_lookup, symmetric with set_block_lookup. __main__
  wires the current-slot side. Block-by-slot wiring requires SignedBlock
  storage and is a follow-up.
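
One way the StoreView idea could look, as a sketch only. The three member names come from the commit message; whether they are methods or properties, the `Store` shape, and the `GetterStoreView` adapter name are assumptions for illustration and are not the `_SyncStoreView` code from the PR.

```python
from dataclasses import dataclass, field
from typing import Callable, Protocol, Set


class StoreView(Protocol):
    """Read-only view of the store that backfill logic depends on."""

    def has_root(self, root: bytes) -> bool: ...
    def finalized_slot(self) -> int: ...
    def head_slot(self) -> int: ...


@dataclass
class Store:
    """Hypothetical, heavily simplified store."""

    roots: Set[bytes] = field(default_factory=set)
    finalized: int = 0
    head: int = 0


class GetterStoreView:
    """Adapter that re-reads the store through a getter on every call."""

    def __init__(self, get_store: Callable[[], Store]) -> None:
        self._get_store = get_store

    def has_root(self, root: bytes) -> bool:
        return root in self._get_store().roots

    def finalized_slot(self) -> int:
        return self._get_store().finalized

    def head_slot(self) -> int:
        return self._get_store().head
```

Holding a getter instead of a store reference means that if the owning service swaps in a new store object, the view observes the new one on its next call.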

Tests and style:

- Drop MagicMock store fixture; use a concrete FakeStoreView dataclass.
- Drop the get_finalized_slot=None mutation hack; that test now builds
  its own no-store-view backfill.
- Split MockNetworkRequester.request_log into typed root/range logs.
- Full-equality assertions across new tests; new coverage for failed-
  range non-advance, reset-clears-watermark, sliding-window genesis edge
  case, missing current_slot_lookup SERVER_ERROR.
- Drop step-numbered comments, nested test import, dead add_block(root=)
  parameter, and dead duplicate MAX_CONCURRENT_REQUESTS constant.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Add the unit tests that were missing after the previous review-fix commit:

- tests/lean_spec/subspecs/networking/client/test_reqresp_client_range.py:
  14 tests for the outbound BlocksByRange flow. Zero-count short-circuit,
  count-above-max rejection, start_slot+count overflow, no-connection,
  full-range happy path, partial response on early close, RESOURCE_UNAVAILABLE
  chunk skipping, slot monotonicity violation, out-of-range slot violation,
  parent-root continuity violation across a skipped slot (the upstream-
  aligned tightening), parent-root continuity holds across skipped slots,
  more-than-count chunk enforcement, timeout returns empty, SERVER_ERROR
  halts reading and returns the partial list.

- tests/lean_spec/subspecs/sync/test_head_sync_backfill_routing.py:
  5 tests for the post-review _cache_and_backfill routing. Silent rejection
  at the finalized slot, silent rejection below the finalized slot,
  single-slot gap above head uses root recursion, multi-slot gap above
  head uses a single range fetch, alt-fork gossip at-or-below head uses
  root recursion.

Drop the local UINT64_MAX constant in the reqresp client; use the canonical
Uint64.max_value() helper from the typed API instead.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@tcoratger tcoratger merged commit 8cdc4b0 into leanEthereum:main Apr 30, 2026
13 checks passed
ch4r10t33r added a commit to blockblaz/zeam that referenced this pull request May 6, 2026
* feat: implement blocksByRange RPC for efficient bulk sync (closes #823)

Implements the leanSpec blocksByRange protocol from leanEthereum/leanSpec#691.

Server side:
- Adds BlocksByRangeRequest type with start_slot and count fields
- Server walks canonical chain backward from head, collecting blocks whose
  slot falls within the requested range, then streams them in slot order
- Caps count at MAX_REQUEST_BLOCKS to bound work per request
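
The server-side walk described above might look roughly like the following. It is written in Python for illustration, not Zig; the `Block` shape, the dictionary-backed block lookup, and the default cap of 1024 are assumptions, not details taken from the zeam code.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Block:
    slot: int
    root: str
    parent_root: Optional[str]


def blocks_in_range(
    blocks_by_root: Dict[str, Block],
    head_root: str,
    start_slot: int,
    count: int,
    max_request_blocks: int = 1024,  # assumed cap
) -> List[Block]:
    """Walk parent links back from the head; return in-range blocks by slot."""
    count = min(count, max_request_blocks)  # bound the work per request
    end_slot = start_slot + count
    collected: List[Block] = []
    cursor: Optional[str] = head_root
    while cursor is not None:
        block = blocks_by_root.get(cursor)
        if block is None or block.slot < start_slot:
            break  # unknown ancestor, or walked past the requested range
        if block.slot < end_slot:
            collected.append(block)
        cursor = block.parent_root
    collected.reverse()  # the walk was head-first; respond in ascending slots
    return collected
```

Following parent links from the head means only canonical blocks are visited, so side forks and empty slots drop out without any extra filtering.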

Client side:
- requestBlocksByRange / sendBlocksByRangeRequest in Network
- BlockByRangeContext + PendingRPC variant for tracking
- processBlockByRangeChunk reuses chain.onBlock (with parent caching for
  out-of-order chunks), same as the by-root path

Sync trigger:
- When a peer status arrives showing they are far ahead (gap > 64 slots),
  initiate a single bulk blocks_by_range request instead of the recursive
  head-by-root walk. Falls back to head-by-root if the bulk request fails.
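
A sketch of the trigger logic, assuming hypothetical callbacks: `fetch_range` and `fetch_by_root` stand in for the real network calls, and starting the range at `local_head_slot + 1` is an assumption. The 64-slot threshold and the fallback come from the description above.

```python
from typing import Callable

RANGE_SYNC_GAP_THRESHOLD = 64  # slots, per the commit message


def sync_from_peer(
    local_head_slot: int,
    peer_head_slot: int,
    fetch_range: Callable[[int, int], bool],
    fetch_by_root: Callable[[], bool],
) -> str:
    """Pick a sync path and report which one ended up being used."""
    gap = max(0, peer_head_slot - local_head_slot)
    if gap > RANGE_SYNC_GAP_THRESHOLD:
        # Far behind: try one bulk request first.
        if fetch_range(local_head_slot + 1, gap):
            return "blocks_by_range"
    # Small gap, or the bulk request failed: walk back from the head by root.
    fetch_by_root()
    return "blocks_by_root"
```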

Wiring:
- Adds blocks_by_range to LeanSupportedProtocol enum (Zig and Rust sides)
- Pins enum ordinals to keep protocol_tag stable across the FFI boundary:
  blocks_by_root=0, status=1, blocks_by_range=2
- Registers the new protocol in the libp2p-glue ReqResp behaviour
- Updates mock network and exhaustive switches throughout
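
The ordinal-pinning idea can be illustrated in Python (not the Zig or Rust used by zeam). The enum name and the three values mirror the commit message; `protocol_from_tag` is a hypothetical helper playing the role of a checked integer-to-enum conversion.

```python
from enum import IntEnum


class LeanSupportedProtocol(IntEnum):
    # Explicit values: reordering the members cannot change the wire tag.
    BLOCKS_BY_ROOT = 0
    STATUS = 1
    BLOCKS_BY_RANGE = 2


def protocol_from_tag(tag: int) -> LeanSupportedProtocol:
    """Checked conversion from an integer tag; rejects unknown values."""
    try:
        return LeanSupportedProtocol(tag)
    except ValueError as exc:
        raise ValueError(f"unknown protocol tag: {tag}") from exc
```

The invariant worth testing is the round trip: converting each member to its integer and back must return the same member, and any out-of-range tag must be rejected.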

Tests: zig build test passes (all 11 test suites).

* network,libp2p-glue: address @ch4r10t33r review on PR #824

Two review comments on PR #824 (blocksByRange RPC, closes #823):

1. Rust LeanSupportedProtocol declaration order disagreed with the
   manual TryFrom<u32> mapping. Pre-fix: declaration order made
   `StatusV1 as u32 == 2` while `try_from(2) == BlocksByRangeV1` —
   silent Rust→u32 round-trip break on Status and BlocksByRange. No
   production caller exercises `as u32` today, but the asymmetry was
   a foot-shaped trap.

   Fix: add #[repr(u32)] + explicit discriminants matching TryFrom
   (BlocksByRootV1 = 0, StatusV1 = 1, BlocksByRangeV1 = 2). Keep the
   manual TryFrom impl as a sanity wrapper (no num_enum dep). Three
   new unit tests in `mod tests` pin the round-trip, the explicit
   discriminant values, and the out-of-range rejection.

   Updated the Zig comment at interface.zig::LeanSupportedProtocol to
   state the bidirectional FFI invariant + cross-reference the Rust
   test.

2. No dedicated unit test for the range RPC. Add a mock-network
   round-trip suite in pkgs/network/src/mock.zig covering 4 of the
   6 reviewer scenarios at the WIRE-contract layer:

   - multi-chunk in slot-ascending order: M chunks via stream
     sendResponse + finish; assert M chunks received in order +
     single completed event + zero failures. Pins the streaming
     contract + cloneResponse + DeferredResponseTask.dispatch
     ordering.
   - empty stream + clean finish (start_slot past head): server
     calls finish() with no responses; peer sees zero chunks +
     completed + no error. Pins the start_slot > head.slot path.
   - RESOURCE_UNAVAILABLE error path: server replies with
     sendError(3, ...); peer sees a failure event with code 3,
     never a chunk, never a bare completed. Pins the
     MIN_SLOTS_FOR_BLOCK_REQUESTS gate at node.zig:1196-1206.

   Scenarios deferred to follow-up issues (need BeamNode harness):
   - gaps preserved with empty-slot skip in finalized walk
   - sync-trigger gap > 64 → range vs head-by-root selection

Reviewer flagged three additional issues to be split into follow-up
tracking (NOT addressed in this PR):

  * sendBlocksByRangeRequest has a request-ID-vs-pending-state race:
    FFI sendRequest dispatches BEFORE pending_rpc_requests.put runs.
    A fast peer (or loopback) can land a response in the gap and
    snapshotPendingRequest returns null → response dropped silently.
    Same pre-existing pattern in ensureBlocksByRootRequest. Will be
    fixed in a follow-up PR that lands both paths in one diff.
  * No per-peer rate limit / DoS handling on the server. Follow-up.
  * @min(gap, MAX_REQUEST_BLOCKS) cycle-gates large catch-ups to ~5
    minutes for a 10k-slot deficit. Follow-up: chain follow-up
    requests after the first batch lands, or document cadence.
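
As an illustration of the "chain follow-up requests" option in the last item, a large gap can be planned as consecutive capped batches. This is a sketch in Python rather than Zig; the function name and the cap of 1024 are assumptions, and it only plans the requests without sending them.

```python
from typing import List, Tuple


def plan_range_requests(
    start_slot: int, gap: int, max_request_blocks: int = 1024
) -> List[Tuple[int, int]]:
    """Split a gap into consecutive (start_slot, count) requests."""
    requests: List[Tuple[int, int]] = []
    cursor = start_slot
    remaining = gap
    while remaining > 0:
        count = min(remaining, max_request_blocks)  # same cap as a single request
        requests.append((cursor, count))
        cursor += count
        remaining -= count
    return requests
```

Each request would be issued after the previous batch lands, so the catch-up is no longer limited to one capped request per sync cycle.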

Tests:
  cargo test --package libp2p-glue --lib req_resp::protocol_id::tests  # 3 pass
  zig build test                                                        # all 11 mock tests pass, 89 forkchoice/locking, etc. — no regressions

Reviewed-by: @ch4r10t33r

* libp2p-glue: rustfmt fix for protocol_id tests

CI lint job (cargo fmt --check) on a7776e9 flagged a long
`assert_eq!` line in `try_from_round_trip_matches_repr`. Apply the
exact diff cargo fmt suggested — break the macro across multiple
lines.

No behaviour change. 3 unit tests still pass:
  cargo test --package libp2p-glue --lib req_resp::protocol_id::tests

---------

Co-authored-by: zclawz <zclawz@users.noreply.github.com>
Co-authored-by: Parthasarathy Ramanujam <1627026+ch4r10t33r@users.noreply.github.com>
Co-authored-by: zclawz <zclawz@openclaw.ai>
MegaRedHand added a commit to lambdaclass/ethlambda that referenced this pull request May 12, 2026
## 🗒️ Description / Motivation

This PR adds inbound `BlocksByRange` request-response support to the P2P
req/resp protocol implementation.

The change follows the recently merged spec update:
- leanEthereum/leanSpec#691

This is needed so peers can request canonical blocks by slot range,
similar to the existing `BlocksByRoot` protocol.

The implementation:
- registers the new protocol
- adds SSZ request/response handling
- supports serving canonical blocks from local storage
- validates malformed requests

This improves interoperability with other clients implementing the
updated spec.

---
## What Changed
### Req/Resp Protocol
- Added `BlocksByRangeRequest`
- Added `BlocksByRange` response payload variant
- Added protocol ID:
  - `/leanconsensus/req/blocks_by_range/1/ssz_snappy`

### Codec
- Updated request/response codec read paths
- Updated request/response codec write paths

### Behaviour Registration
- Registered `BlocksByRange` in the libp2p request-response behaviour

### Inbound Request Handling
- Added inbound request handler for `BlocksByRange`
- Serves canonical blocks by walking backward from the current
fork-choice head
- Skips:
  - empty slots
  - side forks

### Validation
Added validation for:
- `step == 0`
- `count > 1024`

Invalid requests return protocol error responses.

### Tests
- Added unit test for canonical range selection and ordering
---

## Correctness / Behavior Guarantees
### Preserved Invariants
- Only canonical blocks are returned
- Returned blocks preserve requested slot ordering
- Empty slots are skipped
- Non-canonical side forks are ignored

### Behavior Notes
- Invalid requests are rejected early with error responses
- Maximum request size is capped at `1024` blocks
- Implementation mirrors existing `BlocksByRoot` handling patterns for
consistency

---
## Tests Added / Run
### Added
- `blocks_by_range_returns_canonical_blocks_in_requested_order`
### Verified With
```bash
cargo fmt --check
cargo check -p ethlambda-p2p
cargo test -p ethlambda-p2p blocks_by_range_returns_canonical_blocks_in_requested_order
git diff --check
```
---
## Related Issues / PRs
- Closes #346
- Related to leanEthereum/leanSpec#691

---------

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: Tomás Grüner <47506558+MegaRedHand@users.noreply.github.com>
@unnawut unnawut added the specs Scope: Changes to the specifications label May 14, 2026
@unnawut unnawut added this to the pq-devnet-4 milestone May 14, 2026

Labels

specs Scope: Changes to the specifications

Development

Successfully merging this pull request may close these issues.

Add BlocksByRange req/resp protocol

3 participants