feat(checkpoint-sync): fetch finalized block during checkpoint-sync#363
Conversation
Fetches the finalized state and signed block in parallel from a checkpoint peer, verifies they pair (`signed_block.state_root == hash_tree_root(canonical_state)`), and persists the signed block so the local node can serve a valid anchor over BlocksByRoot. Without the matching signed block, peers requesting the anchor via BlocksByRoot would receive a synthetic block whose hash differs from `latest_finalized.root` and would score-penalize us (see leanSpec PR #713). The `--checkpoint-sync-url` CLI flag now takes a base URL but still accepts the legacy `.../lean/v0/states/finalized` form to avoid breaking existing devnet scripts. Closes #354.
State+Block invariant check belongs alongside the types it operates on, not in storage. Storage, rpc test-driver, and the checkpoint-sync client all import directly from ethlambda_types::state.
Greptile SummaryThis PR upgrades checkpoint sync to fetch the finalized state and signed block in parallel, verifying they form a consistent anchor pair before initializing the store. The fetched block is then persisted so the local node can serve a valid anchor over
Confidence Score: 4/5Safe to merge with minor follow-up: a trailing slash on the legacy state URL produces a double-path fetch failure, and a mid-startup finalization advance causes a hard error rather than a transparent retry. The core parallel fetch, pair-validation, and block persistence logic is solid. The two issues are both in the URL normalization and startup UX paths, not in consensus-critical data handling. The normalize_base_url bug only triggers for an edge-case URL form not currently in use, and the missing retry means an uncommon race during a one-time startup operation surfaces as a manual-restart instead of a self-healing behavior. bin/ethlambda/src/checkpoint_sync.rs for the URL normalization ordering bug and bin/ethlambda/src/main.rs for the missing retry path around fetch_finalized_anchor.
|
| Filename | Overview |
|---|---|
| bin/ethlambda/src/checkpoint_sync.rs | Refactored to fetch state and block in parallel via tokio::try_join!; introduces normalize_base_url with a minor edge-case bug when the legacy URL has a trailing slash; anchor-pair validation delegates to the new shared anchor_pair_is_consistent. |
| bin/ethlambda/src/main.rs | Updated fetch_initial_state to use fetch_finalized_anchor and persist the anchor block's signatures via insert_signed_block; no retry logic despite the PR noting retries are expected on AnchorPairingMismatch. |
| crates/common/types/src/state.rs | Adds the public anchor_pair_is_consistent function, hoisting the previously duplicated invariant check into a single shared location; logic is unchanged from the prior implementation in test_driver.rs. |
| crates/net/rpc/src/test_driver.rs | Removed the local copy of anchor_pair_is_consistent and replaced with the shared import; purely mechanical, no behavioural change. |
| crates/storage/src/store.rs | get_forkchoice_store now delegates to anchor_pair_is_consistent for the consistency assert instead of the previous inline header comparison; the previous assert only checked header equality, missing the block.state_root == hash_tree_root(state) invariant now enforced. |
Sequence Diagram
sequenceDiagram
participant Main as main.rs
participant FFA as fetch_finalized_anchor
participant PeerAPI as Checkpoint Peer API
participant Store as Store
Main->>FFA: fetch_finalized_anchor(url, genesis_time, validators)
FFA->>FFA: normalize_base_url(url)
FFA->>FFA: build_client()
par tokio::try_join!
FFA->>PeerAPI: GET /lean/v0/states/finalized
PeerAPI-->>FFA: SSZ State
FFA->>FFA: verify_checkpoint_state(state, ...)
and
FFA->>PeerAPI: GET /lean/v0/blocks/finalized
PeerAPI-->>FFA: SSZ SignedBlock
end
FFA->>FFA: "anchor_pair_is_consistent(&mut state, &block)"
alt mismatch (peer advanced finalization)
FFA-->>Main: Err(AnchorPairingMismatch)
Note over Main: node startup fails (no retry)
else consistent
FFA-->>Main: Ok((state, signed_block))
end
Main->>Store: get_forkchoice_store(backend, state, block)
Store->>Store: anchor_pair_is_consistent (assert)
Store->>Store: init_store(backend, state, body)
Main->>Store: insert_signed_block(anchor_root, signed_block)
Note over Store: Persists signatures for BlocksByRoot serving
Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 2
bin/ethlambda/src/main.rs:600-602
**No retry on `AnchorPairingMismatch`**
The PR description explicitly says "the caller is expected to retry" if the peer advances finalization between the two parallel fetches, but `fetch_initial_state` propagates the error immediately. In practice, if finalization ticks over while the state and block requests are in-flight, the node will fail to start with a confusing `anchor block does not match anchor state` message that gives no hint to simply retry. Adding a bounded retry loop (e.g. 3 attempts with a short delay) around `fetch_finalized_anchor` would handle the described transient condition without operator intervention.
### Issue 2 of 2
bin/ethlambda/src/checkpoint_sync.rs:104-108
**`normalize_base_url` mishandles trailing slash on the legacy state path**
If the operator passes `http://peer:5052/lean/v0/states/finalized/` (trailing slash), `strip_suffix` fails (the string doesn't end with `/lean/v0/states/finalized`), and `trim_end_matches('/')` only removes the slash, leaving the state path embedded in the "base URL". Both constructed endpoints then become double-path URLs (`…/lean/v0/states/finalized/lean/v0/states/finalized`), causing HTTP 404 errors. Trimming trailing slashes before stripping the legacy suffix avoids this.
```suggestion
fn normalize_base_url(url: &str) -> &str {
url.trim_end_matches('/')
.strip_suffix(FINALIZED_STATE_PATH)
.unwrap_or(url.trim_end_matches('/'))
}
```
Reviews (1): Last reviewed commit: "refactor: remove pub from private functi..." | Re-trigger Greptile
…fetch-finalized-block
…slash; test fix - Add bounded retry (3 attempts, 1s delay) around fetch_finalized_anchor in fetch_initial_state. When the peer advances finalization between the parallel state+block requests the pair won't match; that's transient and the caller is supposed to retry, not fail node startup. - normalize_base_url: trim trailing slashes BEFORE the legacy-suffix strip. Operators passing http://peer:5052/lean/v0/states/finalized/ (trailing slash) used to leave the state path embedded in the base URL and double-prefix every request. Regression test added. - crates/blockchain test on_block_rejects_duplicate_attestation_data: switch from get_forkchoice_store to from_anchor_state. The former now enforces block.state_root == hash_tree_root(state) via the shared anchor_pair_is_consistent helper, and a synthetic genesis block with zero state_root cannot satisfy that. Using from_anchor_state matches the architectural intent (no synthetic anchor block needed for tests that just want a store seeded from a state).
- forkchoice_spectests: detect 'empty steps' fixtures (the test_store_from_anchor_rejects_mismatched_state_root case) via the non-panicking anchor_pair_is_consistent helper instead of letting Store::get_forkchoice_store's assert! panic out of the test harness. Cross-checks the consistency flag against whether steps is empty. - checkpoint_sync.rs: apply rustfmt to the strip_suffix chain in normalize_base_url (multi-line layout).
Summary
fetch_finalized_anchor(url, ...)issues/lean/v0/states/finalizedand/lean/v0/blocks/finalizedin parallel and verifies the pair via the canonicalblock.state_root == hash_tree_root(state)invariant. On mismatch the caller is expected to retry (e.g. the peer advanced finalization between the two requests).SignedBlockis persisted viaStore::insert_signed_block, so the local node can serve a valid anchor overBlocksByRootinstead of synthesizing a block whose hash differs fromlatest_finalized.root(the scenario leanSpec PR #713 addresses).anchor_pair_is_consistent) is hoisted intoethlambda_types::state.Store::get_forkchoice_store, the hive test driver, and the new checkpoint-sync client all delegate to it. Notably, the panic path inget_forkchoice_storepreviously checked only header equality + state self-consistency; it now also enforcesblock.state_root == hash_tree_root(state).Compat
--checkpoint-sync-urlis now documented as a base URL (e.g.http://peer:5052). The legacy.../lean/v0/states/finalizedform is still accepted and the trailing path is stripped, so existing devnet scripts keep working.Closes #354.
Test plan
cargo clippy --workspace --all-targets -- -D warningscargo test -p ethlambda(26 passed)cargo test -p ethlambda-rpc --tests(8 passed, incl. anchor-pair regression coverage)cargo test -p ethlambda-storage(33 passed)--checkpoint-sync-urlpointing at a peer's API, confirm it boots, requests blocks, and stays in sync.