Skip to content

feat(checkpoint-sync): fetch finalized block during checkpoint-sync#363

Merged
pablodeymo merged 8 commits into
mainfrom
feat/checkpoint-sync-fetch-finalized-block
May 13, 2026
Merged

feat(checkpoint-sync): fetch finalized block during checkpoint-sync#363
pablodeymo merged 8 commits into
mainfrom
feat/checkpoint-sync-fetch-finalized-block

Conversation

@MegaRedHand
Copy link
Copy Markdown
Collaborator

Summary

  • New fetch_finalized_anchor(url, ...) issues /lean/v0/states/finalized and /lean/v0/blocks/finalized in parallel and verifies the pair via the canonical block.state_root == hash_tree_root(state) invariant. On mismatch the caller is expected to retry (e.g. the peer advanced finalization between the two requests).
  • The fetched SignedBlock is persisted via Store::insert_signed_block, so the local node can serve a valid anchor over BlocksByRoot instead of synthesizing a block whose hash differs from latest_finalized.root (the scenario leanSpec PR #713 addresses).
  • The anchor-pair invariant check (anchor_pair_is_consistent) is hoisted into ethlambda_types::state. Store::get_forkchoice_store, the hive test driver, and the new checkpoint-sync client all delegate to it. Notably, the panic path in get_forkchoice_store previously checked only header equality + state self-consistency; it now also enforces block.state_root == hash_tree_root(state).

Compat

--checkpoint-sync-url is now documented as a base URL (e.g. http://peer:5052). The legacy .../lean/v0/states/finalized form is still accepted and the trailing path is stripped, so existing devnet scripts keep working.

Closes #354.

Test plan

  • cargo clippy --workspace --all-targets -- -D warnings
  • cargo test -p ethlambda (26 passed)
  • cargo test -p ethlambda-rpc --tests (8 passed, incl. anchor-pair regression coverage)
  • cargo test -p ethlambda-storage (33 passed)
  • Smoke test in a multi-node devnet: stop one node, restart it with --checkpoint-sync-url pointing at a peer's API, confirm it boots, requests blocks, and stays in sync.

Fetches the finalized state and signed block in parallel from a
checkpoint peer, verifies they pair (`signed_block.state_root ==
hash_tree_root(canonical_state)`), and persists the signed block so
the local node can serve a valid anchor over BlocksByRoot.

Without the matching signed block, peers requesting the anchor via
BlocksByRoot would receive a synthetic block whose hash differs from
`latest_finalized.root` and would score-penalize us (see leanSpec
PR #713).

The `--checkpoint-sync-url` CLI flag now takes a base URL but still
accepts the legacy `.../lean/v0/states/finalized` form to avoid
breaking existing devnet scripts.

Closes #354.
State+Block invariant check belongs alongside the types it operates on,
not in storage. Storage, rpc test-driver, and the checkpoint-sync client
all import directly from ethlambda_types::state.
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 12, 2026

Greptile Summary

This PR upgrades checkpoint sync to fetch the finalized state and signed block in parallel, verifying they form a consistent anchor pair before initializing the store. The fetched block is then persisted so the local node can serve a valid anchor over BlocksByRoot rather than a synthesized block with a mismatched hash. The anchor-pair consistency check is also moved to a shared function in ethlambda_types::state, replacing duplicated logic in test_driver.rs and store.rs.

  • fetch_finalized_anchor issues both /lean/v0/states/finalized and /lean/v0/blocks/finalized via tokio::try_join!, validates the pair with anchor_pair_is_consistent, and returns both to the caller for store initialization.
  • normalize_base_url provides backward-compatible URL normalization so existing devnet scripts pointing at the legacy full-state URL keep working.
  • Store::get_forkchoice_store now asserts the full anchor_pair_is_consistent invariant (header equality + block.state_root == hash_tree_root(state)) instead of the previous header-only check.

Confidence Score: 4/5

Safe to merge with minor follow-up: a trailing slash on the legacy state URL produces a double-path fetch failure, and a mid-startup finalization advance causes a hard error rather than a transparent retry.

The core parallel fetch, pair-validation, and block persistence logic is solid. The two issues are both in the URL normalization and startup UX paths, not in consensus-critical data handling. The normalize_base_url bug only triggers for an edge-case URL form not currently in use, and the missing retry means an uncommon race during a one-time startup operation surfaces as a manual-restart instead of a self-healing behavior.

bin/ethlambda/src/checkpoint_sync.rs for the URL normalization ordering bug and bin/ethlambda/src/main.rs for the missing retry path around fetch_finalized_anchor.

Important Files Changed

Filename Overview
bin/ethlambda/src/checkpoint_sync.rs Refactored to fetch state and block in parallel via tokio::try_join!; introduces normalize_base_url with a minor edge-case bug when the legacy URL has a trailing slash; anchor-pair validation delegates to the new shared anchor_pair_is_consistent.
bin/ethlambda/src/main.rs Updated fetch_initial_state to use fetch_finalized_anchor and persist the anchor block's signatures via insert_signed_block; no retry logic despite the PR noting retries are expected on AnchorPairingMismatch.
crates/common/types/src/state.rs Adds the public anchor_pair_is_consistent function, hoisting the previously duplicated invariant check into a single shared location; logic is unchanged from the prior implementation in test_driver.rs.
crates/net/rpc/src/test_driver.rs Removed the local copy of anchor_pair_is_consistent and replaced with the shared import; purely mechanical, no behavioural change.
crates/storage/src/store.rs get_forkchoice_store now delegates to anchor_pair_is_consistent for the consistency assert instead of the previous inline header comparison; the previous assert only checked header equality, missing the block.state_root == hash_tree_root(state) invariant now enforced.

Sequence Diagram

sequenceDiagram
    participant Main as main.rs
    participant FFA as fetch_finalized_anchor
    participant PeerAPI as Checkpoint Peer API
    participant Store as Store

    Main->>FFA: fetch_finalized_anchor(url, genesis_time, validators)
    FFA->>FFA: normalize_base_url(url)
    FFA->>FFA: build_client()
    par tokio::try_join!
        FFA->>PeerAPI: GET /lean/v0/states/finalized
        PeerAPI-->>FFA: SSZ State
        FFA->>FFA: verify_checkpoint_state(state, ...)
    and
        FFA->>PeerAPI: GET /lean/v0/blocks/finalized
        PeerAPI-->>FFA: SSZ SignedBlock
    end
    FFA->>FFA: "anchor_pair_is_consistent(&mut state, &block)"
    alt mismatch (peer advanced finalization)
        FFA-->>Main: Err(AnchorPairingMismatch)
        Note over Main: node startup fails (no retry)
    else consistent
        FFA-->>Main: Ok((state, signed_block))
    end
    Main->>Store: get_forkchoice_store(backend, state, block)
    Store->>Store: anchor_pair_is_consistent (assert)
    Store->>Store: init_store(backend, state, body)
    Main->>Store: insert_signed_block(anchor_root, signed_block)
    Note over Store: Persists signatures for BlocksByRoot serving
Loading
Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
bin/ethlambda/src/main.rs:600-602
**No retry on `AnchorPairingMismatch`**

The PR description explicitly says "the caller is expected to retry" if the peer advances finalization between the two parallel fetches, but `fetch_initial_state` propagates the error immediately. In practice, if finalization ticks over while the state and block requests are in-flight, the node will fail to start with a confusing `anchor block does not match anchor state` message that gives no hint to simply retry. Adding a bounded retry loop (e.g. 3 attempts with a short delay) around `fetch_finalized_anchor` would handle the described transient condition without operator intervention.

### Issue 2 of 2
bin/ethlambda/src/checkpoint_sync.rs:104-108
**`normalize_base_url` mishandles trailing slash on the legacy state path**

If the operator passes `http://peer:5052/lean/v0/states/finalized/` (trailing slash), `strip_suffix` fails (the string doesn't end with `/lean/v0/states/finalized`), and `trim_end_matches('/')` only removes the slash, leaving the state path embedded in the "base URL". Both constructed endpoints then become double-path URLs (`…/lean/v0/states/finalized/lean/v0/states/finalized`), causing HTTP 404 errors. Trimming trailing slashes before stripping the legacy suffix avoids this.

```suggestion
fn normalize_base_url(url: &str) -> &str {
    url.trim_end_matches('/')
        .strip_suffix(FINALIZED_STATE_PATH)
        .unwrap_or(url.trim_end_matches('/'))
}
```

Reviews (1): Last reviewed commit: "refactor: remove pub from private functi..." | Re-trigger Greptile

Comment thread bin/ethlambda/src/main.rs Outdated
Comment thread bin/ethlambda/src/checkpoint_sync.rs
…slash; test fix

- Add bounded retry (3 attempts, 1s delay) around fetch_finalized_anchor
  in fetch_initial_state. When the peer advances finalization between the
  parallel state+block requests the pair won't match; that's transient and
  the caller is supposed to retry, not fail node startup.

- normalize_base_url: trim trailing slashes BEFORE the legacy-suffix
  strip. Operators passing http://peer:5052/lean/v0/states/finalized/
  (trailing slash) used to leave the state path embedded in the base URL
  and double-prefix every request. Regression test added.

- crates/blockchain test on_block_rejects_duplicate_attestation_data:
  switch from get_forkchoice_store to from_anchor_state. The former now
  enforces block.state_root == hash_tree_root(state) via the shared
  anchor_pair_is_consistent helper, and a synthetic genesis block with
  zero state_root cannot satisfy that. Using from_anchor_state matches
  the architectural intent (no synthetic anchor block needed for tests
  that just want a store seeded from a state).
@MegaRedHand MegaRedHand changed the title feat(checkpoint-sync): fetch finalized block in parallel with state feat(checkpoint-sync): fetch finalized block during checkpoint-sync May 12, 2026
MegaRedHand and others added 3 commits May 12, 2026 17:23
- forkchoice_spectests: detect 'empty steps' fixtures (the
  test_store_from_anchor_rejects_mismatched_state_root case) via the
  non-panicking anchor_pair_is_consistent helper instead of letting
  Store::get_forkchoice_store's assert! panic out of the test harness.
  Cross-checks the consistency flag against whether steps is empty.

- checkpoint_sync.rs: apply rustfmt to the strip_suffix chain in
  normalize_base_url (multi-line layout).
@pablodeymo pablodeymo merged commit 4659377 into main May 13, 2026
2 checks passed
@pablodeymo pablodeymo deleted the feat/checkpoint-sync-fetch-finalized-block branch May 13, 2026 21:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Fetch finalized block during checkpoint-sync

2 participants