
Add lean_attestations_production_time_seconds metric from leanMetrics PR #30 (#312)

Merged: MegaRedHand merged 1 commit into main from add-attestations-production-time-metric on Apr 21, 2026
Conversation

@pablodeymo
Collaborator

Motivation

Implements the lean_attestations_production_time_seconds histogram defined in
leanEthereum/leanMetrics#30,
which measures the end-to-end time a node spends producing attestations
for all its validators in a given slot.

Description

New Metric

| Metric | Type | Buckets | Sample collection event |
|---|---|---|---|
| lean_attestations_production_time_seconds | Histogram | 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 0.75, 1 | On attestation production |

Implementation

  • crates/blockchain/src/metrics.rs: registers the histogram under the
    Validator Metrics family and exposes time_attestations_production(),
    which returns a TimingGuard that observes the duration on drop. Added
    to the init() eager-registration list so the metric appears on
    /metrics from process start.
  • crates/blockchain/src/lib.rs: instruments produce_attestations()
    with let _timing = metrics::time_attestations_production(); at the top
    of the function, capturing the full flow:
    1. Attestation data construction (store::produce_attestation_data)
    2. Per-validator signing
    3. Self-delivery via on_gossip_attestation
    4. Gossip publish

Documentation

  • docs/metrics.md: adds a Buckets column to the Validator Metrics
    table (consistent with the other tables in the doc) and lists the new
    metric alongside lean_validators_count and lean_is_aggregator.

How to Test

  1. make fmt - passes
  2. make lint - passes (clippy clean)
  3. make test - passes
  4. Run a local devnet and verify the metric appears on the /metrics
    endpoint (port 5054):
    curl -s localhost:5054/metrics | grep lean_attestations_production_time_seconds
    
    You should see the histogram buckets populate as the node produces
    attestations each slot.

Related

Measures the duration of the full attestation production flow in produce_attestations
(attestation data construction, signing, self-delivery, and gossip publish).
Implements leanMetrics PR #30 with buckets [0.01, 0.025, 0.05, 0.1, 0.25,
0.5, 0.75, 1] and updates docs/metrics.md.
@github-actions

🤖 Kimi Code Review

The PR is straightforward and correct. It follows the existing patterns for metrics in the codebase and properly instruments the produce_attestations function using RAII timing guards.

Minor issue:

crates/blockchain/src/metrics.rs:232 - Description inconsistency:

"Time taken to produce attestation",

Change to "Time taken to produce attestations" (plural) to match the metric name lean_attestations_production_time_seconds and the function name produce_attestations.

Verification notes:

  1. Scope correctness: The TimingGuard is assigned to _timing (line 173 of lib.rs), ensuring it lives for the entire function scope and measures the full duration of attestation production including the attestation data generation and all validator signatures.
  2. Bucket suitability: The buckets [0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 0.75, 1.0] (10ms to 1s) are appropriate for attestation production latency.
  3. Documentation: The markdown table correctly adds the "Buckets" column and maintains alignment for existing Gauge metrics (which have no buckets).

Security/Performance: No concerns. The metric uses thread-safe LazyLock initialization and the histogram recording has minimal overhead (atomic operations).


Automated review by Kimi (Moonshot AI) · kimi-k2.5 · custom prompt

@greptile-apps
Contributor

greptile-apps Bot commented Apr 21, 2026

Greptile Summary

This PR implements the lean_attestations_production_time_seconds histogram metric per the leanMetrics spec, measuring end-to-end attestation production time across all validators in a slot. The implementation follows established patterns in the codebase: a LazyLock static histogram, eager registration in init(), a TimingGuard-returning helper, and RAII instrumentation at the call site.

Confidence Score: 5/5

Safe to merge — clean, minimal change that follows existing patterns with no logic or correctness issues.

All findings are P2 style suggestions. The implementation is correct: the TimingGuard drops at function return capturing the full loop, buckets in code and docs are consistent, and eager registration ensures the metric appears from process start.

No files require special attention.

Important Files Changed

| Filename | Overview |
|---|---|
| crates/blockchain/src/metrics.rs | Adds LEAN_ATTESTATIONS_PRODUCTION_TIME_SECONDS histogram with correct buckets, eager-registers it in init(), and exposes time_attestations_production(). Minor: help string uses singular "attestation" while the metric name implies plural. |
| crates/blockchain/src/lib.rs | Instruments produce_attestations() with a TimingGuard at function entry; RAII correctly captures full end-to-end duration including signing and gossip publish for all validators. |
| docs/metrics.md | Adds Buckets column to Validator Metrics table and documents the new histogram with correct bucket values matching the code. |

Sequence Diagram

sequenceDiagram
    participant Scheduler
    participant BlockChainServer
    participant Metrics
    participant Store
    participant KeyManager
    participant Gossip

    Scheduler->>BlockChainServer: produce_attestations(slot)
    BlockChainServer->>Metrics: time_attestations_production()
    Note over Metrics: TimingGuard starts clock
    BlockChainServer->>Store: produce_attestation_data(slot)
    Store-->>BlockChainServer: attestation_data

    loop for each validator_id
        BlockChainServer->>KeyManager: sign_attestation(validator_id, data)
        KeyManager-->>BlockChainServer: signature
        BlockChainServer->>Store: on_gossip_attestation (self-deliver)
        BlockChainServer->>Gossip: publish signed_attestation
    end

    Note over Metrics: TimingGuard dropped → histogram observed
Review comment on crates/blockchain/src/metrics.rs:233:
**Singular vs. plural in metric description**

The metric name (`lean_attestations_production_time_seconds`) refers to producing attestations for *all* validators in a slot, but the help string says "Time taken to produce attestation" (singular). Other metrics in this file are consistent between their name and description. Consider aligning to "Time taken to produce attestations for all validators in a slot" for clarity.

```suggestion
            "Time taken to produce attestations for all validators in a slot",
```



@github-actions

🤖 Codex Code Review

No actionable findings.

This PR is observability-only. The new histogram is registered and exposed correctly in metrics.rs, and the added guard in lib.rs does not alter attestation contents, fork choice, justification/finalization, state transition, XMSS verification, or SSZ behavior.

Residual note: lean_attestations_production_time_seconds currently measures the whole produce_attestations batch for the slot, including self-delivery and gossip publish in lib.rs, not just local attestation construction. That is fine if intentional, but dashboards should treat it as end-to-end batch latency rather than per-attestation signing latency.

I could not complete a local cargo check in this environment because the pinned Rust toolchain (1.92.0) could not be fetched under the sandbox’s network restrictions.


Automated review by OpenAI Codex · gpt-5.4 · custom prompt

@github-actions

🤖 Claude Code Review

I have everything needed to write the review.


Review: PR #312 (lean_attestations_production_time_seconds metric)

Overall this is a clean, well-scoped addition that follows established patterns. No correctness or security issues. Two minor observations worth considering before merge.


What looks good

  • RAII guard placement (lib.rs:173): let _timing at the top of produce_attestations() correctly captures the full function scope — attestation data construction, the per-validator signing loop, self-delivery, and gossip publish — on every exit path, including per-validator continue branches.
  • Eager registration in init() is consistent with existing histograms, ensuring the metric (with zero observations) appears on /metrics immediately on process start rather than only after the first slot's attestation production.
  • TimingGuard pattern and LazyLock registration are idiomatic and match the surrounding code exactly.
  • Bucket selection matches the leanMetrics spec PR #30 ("feat: add functions to build attestations and blocks").

Observations

1. Metric description string is singular — should be plural

metrics.rs:233 and docs/metrics.md:71:

"Time taken to produce attestation"

The function iterates over all validators in a slot. Both the PR description and the leanMetrics spec describe this as the "end-to-end time a node spends producing attestations for all its validators in a given slot." The description should read:

"Time taken to produce attestations for all validators in a slot"

This matters for Grafana dashboards where the description appears in panel tooltips.


2. init() ordering breaks PQ sig grouping

metrics.rs:394:

std::sync::LazyLock::force(&LEAN_PQ_SIG_ATTESTATION_SIGNING_TIME_SECONDS);
std::sync::LazyLock::force(&LEAN_ATTESTATIONS_PRODUCTION_TIME_SECONDS);   // inserted here
std::sync::LazyLock::force(&LEAN_PQ_SIG_ATTESTATION_VERIFICATION_TIME_SECONDS);

The two LEAN_PQ_SIG_ATTESTATION_* entries belong together semantically. Moving the new entry after LEAN_PQ_SIG_ATTESTATION_VERIFICATION_TIME_SECONDS (i.e., after the block of all PQ sig metrics) would preserve the grouping without any functional difference.


3. Bucket upper bound — informational

lean_pq_sig_attestation_signing_time_seconds uses a 1.0s ceiling for a single XMSS signing operation. This metric wraps the full signing loop over all validators, so on a node managing N validators the total could exceed 1.0s. In current devnet deployments with few validators per node this is fine, and the spec defines these buckets — just something to revisit if validator-per-node counts grow.


4. Node Info table missing Buckets column — out of scope but noted

docs/metrics.md:11: The Node Info Metrics table still uses the old 6-column schema (no Buckets column). There are no histograms there so it's not functionally wrong, but it's inconsistent with all the other tables. Out of scope for this PR, but worth a follow-up.


Automated review by Claude (Anthropic) · sonnet · custom prompt

@MegaRedHand MegaRedHand merged commit 1e5a166 into main Apr 21, 2026
7 checks passed
@MegaRedHand MegaRedHand deleted the add-attestations-production-time-metric branch April 21, 2026 19:00
