Skip to content

test: add CI cache for rpc snapshot tests to mitigate flackyness issue#5978

Merged
LesnyRumcajs merged 16 commits intomainfrom
hm/run-rpc-snapshot-tests-with-cargo-test
Aug 26, 2025
Merged

test: add CI cache for rpc snapshot tests to mitigate flackyness issue#5978
LesnyRumcajs merged 16 commits intomainfrom
hm/run-rpc-snapshot-tests-with-cargo-test

Conversation

@hanabi1224
Copy link
Copy Markdown
Contributor

@hanabi1224 hanabi1224 commented Aug 25, 2025

Summary of changes

This PR tries to add CI cache for rpc snapshot tests to mitigate flackyness issue.

Changes introduced in this pull request:

Reference issue to close (if applicable)

Closes

Other information and links

Change checklist

  • I have performed a self-review of my own code,
  • I have made corresponding changes to the documentation. All new code adheres to the team's documentation standards,
  • I have added tests that prove my fix is effective or that my feature works (if possible),
  • I have made sure the CHANGELOG is up-to-date. All user-facing changes should be reflected in this document.

Summary by CodeRabbit

  • Tests

    • Renamed generated RPC tests to use "snapshot" naming for clearer identification.
    • Default test selection updated to target RPC snapshot tests; retry count reduced from 4 to 3; timeouts unchanged.
  • Chores

    • CI now restores and saves RPC snapshot cache across OSes using dynamic keys to accelerate test runs.
    • No changes to public APIs or runtime behavior.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Aug 25, 2025

Walkthrough

Renamed generated RPC test function prefix from rpc_test_ to rpc_snapshot_test_ in build.rs; updated Nextest filter and reduced retries; added restore/save caching (with computed dynamic key) and an ls diagnostic for RPC snapshot files in the unit-tests GitHub Actions workflow using absolute /home/runner/.cache/... paths.

Changes

Cohort / File(s) Summary of changes
RPC test generation rename
build.rs
Renamed generated test function names from rpc_test_{ident} to rpc_snapshot_test_{ident}; test attribute (#[tokio::test(flavor = "multi_thread")]) and body (rpc_regression_test_run("{test}").await) unchanged.
Nextest profile update
.config/nextest.toml
Updated default profile override filter from test(rpc_test_) to test(rpc_snapshot_test_); changed retries.count from 4 to 3; other profile settings unchanged.
CI cache & diagnostic
.github/workflows/unit-tests.yml
Added cache restore (immediately after checkout) for /home/runner/.cache/forest/test/rpc-snapshots using restore key prefix rpcsnap- with enableCrossOsArchive: true; added get-cache-hash step to compute hash and write GITHUB_OUTPUT; added cache save step using dynamic key rpcsnap-${{ steps.get-cache-hash.outputs.hash }}; replaced tilde paths with absolute paths and added ls -lhR /home/runner/.cache/forest/test/rpc-snapshots diagnostic.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Build as build.rs
  participant Tests as Generated Tests
  participant Runner as nextest / test runner
  Note over Build,Tests: Codegen change (symbol rename)
  Build->>Tests: generate test fn names `rpc_snapshot_test_*`
  Tests->>Runner: nextest selects tests matching `test(rpc_snapshot_test_)`
  Runner->>Tests: execute tests
Loading
sequenceDiagram
  autonumber
  participant GH as GitHub Actions
  participant Cache as actions/cache
  participant Job as tests job
  GH->>Cache: restore `/home/runner/.cache/.../rpc-snapshots` (key: `rpcsnap-*`)
  GH->>Job: run tests (may produce/consume snapshots)
  Job->>GH: compute hash -> `outputs.hash`
  GH->>Cache: save `/home/runner/.cache/.../rpc-snapshots` (key: `rpcsnap-${{steps.get-cache-hash.outputs.hash}}`)
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested reviewers

  • akaladarshi
  • LesnyRumcajs
  • elmattic

Tip

🔌 Remote MCP (Model Context Protocol) integration is now available!

Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats.


📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 70dfb99 and fa84780.

📒 Files selected for processing (1)
  • .github/workflows/unit-tests.yml (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • .github/workflows/unit-tests.yml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: cargo-publish-dry-run
  • GitHub Check: Build Ubuntu
  • GitHub Check: Build MacOS
  • GitHub Check: tests
  • GitHub Check: tests-release
  • GitHub Check: Build forest binaries on Linux AMD64
  • GitHub Check: All lint checks
  • GitHub Check: Analyze (rust)
  • GitHub Check: Analyze (go)
✨ Finishing Touches
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch hm/run-rpc-snapshot-tests-with-cargo-test

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbitai help to get the list of available commands.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Status, Documentation and Community

  • Visit our Status Page to check the current availability of CodeRabbit.
  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@hanabi1224 hanabi1224 changed the title test: run rpc snapshots with cargo test instead of nextest test: run rpc snapshot tests with cargo test instead of nextest Aug 25, 2025
@hanabi1224 hanabi1224 marked this pull request as ready for review August 25, 2025 07:45
@hanabi1224 hanabi1224 requested a review from a team as a code owner August 25, 2025 07:45
@hanabi1224 hanabi1224 requested review from LesnyRumcajs and akaladarshi and removed request for a team August 25, 2025 07:45
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 344f9d0 and 1b343e8.

📒 Files selected for processing (2)
  • .config/nextest.toml (1 hunks)
  • Makefile (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: hanabi1224
PR: ChainSafe/forest#5930
File: build.rs:64-77
Timestamp: 2025-08-13T09:43:20.301Z
Learning: hanabi1224 prefers hard compile-time errors in build scripts rather than runtime safeguards or collision detection, believing it's better to fail fast and fix root causes of issues like malformed snapshot names.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: Build forest binaries on Linux AMD64
  • GitHub Check: tests-release
  • GitHub Check: tests
  • GitHub Check: Build MacOS
  • GitHub Check: cargo-publish-dry-run
  • GitHub Check: Build Ubuntu
  • GitHub Check: All lint checks
  • GitHub Check: Analyze (rust)
  • GitHub Check: Analyze (go)

Comment thread .config/nextest.toml Outdated
Comment thread Makefile Outdated
test:
cargo nextest run --workspace --no-fail-fast

cargo test --lib rpc_test
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot Aug 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

cargo test --lib rpc_test filters by test name, not by package; likely not selecting the intended crate.

To run the rpc_test crate reliably, target the package with -p (and include integration tests by omitting --lib). Also, snapshot/RPC tests often benefit from single-threaded execution to reduce flakiness.

Apply this change:

-	cargo test --lib rpc_test
+	cargo test -p rpc_test -- --test-threads=1

Notes:

  • If you truly only want library unit tests (not integration tests), add --lib back: cargo test -p rpc_test --lib -- --test-threads=1.
  • Keep the test name filter out of the command; use package selection for clarity and correctness.
🤖 Prompt for AI Agents
In Makefile around line 103, the command `cargo test --lib rpc_test` incorrectly
treats `rpc_test` as a test name filter rather than a package; change it to
target the package using `-p rpc_test` and add single-threaded execution for
stability (if you only want unit lib tests, keep `--lib` as well); e.g., use
package selection and append the Rust test-threads flag so the Makefile runs the
intended crate reliably and with reduced flakiness.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

--lib is there!

cargo test --package forest-filecoin --lib rpc_snapshot_test_
...
cargo test --package forest-filecoin --profile quick --lib rpc_snapshot_test_

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hanabi1224 You're absolutely right! I apologize for the confusion. Looking at your actual commands, they are properly structured:

cargo test --package forest-filecoin --lib rpc_snapshot_test_
cargo test --package forest-filecoin --profile quick --lib rpc_snapshot_test_

These commands correctly:

  • Target the forest-filecoin package with --package
  • Use --lib to run only library tests
  • Use rpc_snapshot_test_ as a test name filter (matching the renamed test functions from your build.rs changes)

The diff snippet I initially reviewed showed cargo test --lib rpc_test which didn't match your actual implementation. Your commands are well-formed and should work as intended. Thank you for the clarification!


🧠 Learnings used
Learnt from: hanabi1224
PR: ChainSafe/forest#5930
File: build.rs:64-77
Timestamp: 2025-08-13T09:43:20.301Z
Learning: hanabi1224 prefers hard compile-time errors in build scripts rather than runtime safeguards or collision detection, believing it's better to fail fast and fix root causes of issues like malformed snapshot names.

Comment thread Makefile Outdated
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (2)
Makefile (2)

101-103: Run snapshot tests single-threaded and make package selection explicit for robustness.

  • To actually reduce flakiness when switching from nextest to cargo test, force the Rust test harness to run snapshot tests serially. This avoids concurrent multi-threaded Tokio runtimes competing for network/resources.
  • Consider making the package selection explicit (or use --workspace) so this works regardless of where the Makefile runs from within the workspace.

Proposed change:

-	cargo nextest run --workspace --no-fail-fast -- --skip rpc_snapshot_test_
-	cargo test --lib rpc_snapshot_test_
+	# Run everything except snapshot tests via nextest
+	cargo nextest run --workspace --no-fail-fast -E 'not test(rpc_snapshot_test_)'
+	# Run snapshot tests serially with the built-in runner
+	cargo test --workspace --lib rpc_snapshot_test_ -- --test-threads=1

Notes:

  • Using nextest’s filterset (-E 'not test(rpc_snapshot_test_)') avoids relying on libtest passthrough flags and is clearer about intent. It’s equivalent to -- --skip rpc_snapshot_test_ per nextest docs.
  • If you prefer to scope narrowly, replace --workspace with -p <crate-containing-snapshots>. Keep --lib since these are unit tests generated into the crate.

109-110: Mirror the single-thread + explicit selection for the quick profile.

Keep behavior consistent in the “release/quick” lane to avoid surprises.

-	cargo nextest run --cargo-profile quick --workspace --no-fail-fast -- --skip rpc_snapshot_test_
-	cargo test --profile quick --lib rpc_snapshot_test_
+	cargo nextest run --cargo-profile quick --workspace --no-fail-fast -E 'not test(rpc_snapshot_test_)'
+	cargo test --profile quick --workspace --lib rpc_snapshot_test_ -- --test-threads=1
🧹 Nitpick comments (1)
build.rs (1)

56-89: Duplicate-ident check passes—no current collisions; raw strings and collision guard recommended

  • We ran the slug-based duplicate detection script against src/tool/subcommands/api_cmd/test_snapshots.txt and found no duplicate identifiers today, so your generated test names won’t collide in the current snapshot set.
  • For extra resilience, consider applying these optional hardening tweaks:
    • Use a raw string literal for the snapshot name in the generated code to prevent accidental escape issues:
    rpc_regression_test_run(r#"{test}"#).await
    • Add a HashSet-based duplicate-ident assertion in the build script to fail fast at compile time if two filenames slug to the same identifier in the future.

Suggested diff:

 fn rpc_regression_tests_gen() {
-    use std::{fs::File, io::BufWriter, path::PathBuf};
+    use std::{collections::HashSet, fs::File, io::BufWriter, path::PathBuf};

     println!("cargo:rerun-if-changed=src/tool/subcommands/api_cmd/test_snapshots.txt");
     let tests: Vec<&str> = include_str!("src/tool/subcommands/api_cmd/test_snapshots.txt")
         .lines()
@@
     let mut w = BufWriter::new(File::create(&out_path).unwrap());
+    let mut seen = HashSet::new();

     for test in tests {
         // Derive a valid Rust identifier from the snapshot filename.
         let ident = test
             .strip_suffix(".rpcsnap.json.zst")
             .unwrap_or(test)
             .chars()
             .map(|c| if c.is_ascii_alphanumeric() { c } else { '_' })
             .collect::<String>();
+        if !seen.insert(ident.clone()) {
+            panic!(
+                "Duplicate RPC snapshot test ident generated: {ident} (from {test}). \
+                Consider renaming the snapshot to avoid identifier collisions."
+            );
+        }

         writeln!(
             w,
             r#"
                 #[tokio::test(flavor = "multi_thread")]
                 async fn rpc_snapshot_test_{ident}() {{
-                    rpc_regression_test_run("{test}").await
+                    rpc_regression_test_run(r#"{test}"#).await
                 }}
             "#,
         )
         .unwrap();
     }
 }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 1b343e8 and a52c947.

📒 Files selected for processing (2)
  • Makefile (1 hunks)
  • build.rs (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: hanabi1224
PR: ChainSafe/forest#5930
File: build.rs:64-77
Timestamp: 2025-08-13T09:43:20.301Z
Learning: hanabi1224 prefers hard compile-time errors in build scripts rather than runtime safeguards or collision detection, believing it's better to fail fast and fix root causes of issues like malformed snapshot names.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: cargo-publish-dry-run
  • GitHub Check: Build Ubuntu
  • GitHub Check: Build MacOS
  • GitHub Check: tests-release
  • GitHub Check: tests
  • GitHub Check: Build forest binaries on Linux AMD64
  • GitHub Check: All lint checks
  • GitHub Check: Analyze (rust)
  • GitHub Check: Analyze (go)

@LesnyRumcajs
Copy link
Copy Markdown
Member

fiasco! no green checkmark, no review!

 test tool::subcommands::api_cmd::test_snapshot::tests::rpc_snapshot_test_filecoin_walletverify_1755692429792254 ... ok

failures:

---- tool::subcommands::api_cmd::test_snapshot::tests::rpc_snapshot_test_filecoin_ethgetblocktransactioncountbyhash_1740132538001911 stdout ----
Testing filecoin_ethgetblocktransactioncountbyhash_1740132538001911.rpcsnap.json.zst ...
thread 'tool::subcommands::api_cmd::test_snapshot::tests::rpc_snapshot_test_filecoin_ethgetblocktransactioncountbyhash_1740132538001911' panicked at src/tool/subcommands/api_cmd/test_snapshot.rs:244:45:
called `Result::unwrap()` on an `Err` value: expected value at line 1 column 1
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

---- tool::subcommands::api_cmd::test_snapshot::tests::rpc_snapshot_test_filecoin_stategetrandomnessfromtickets_1737546933236620 stdout ----
Testing filecoin_stategetrandomnessfromtickets_1737546933236620.rpcsnap.json.zst ...
thread 'tool::subcommands::api_cmd::test_snapshot::tests::rpc_snapshot_test_filecoin_stategetrandomnessfromtickets_1737546933236620' panicked at src/tool/subcommands/api_cmd/test_snapshot.rs:244:45:
called `Result::unwrap()` on an `Err` value: expected value at line 1 column 1

---- tool::subcommands::api_cmd::test_snapshot::tests::rpc_snapshot_test_filecoin_statelookuprobustaddress_1737546941473578 stdout ----
Testing filecoin_statelookuprobustaddress_1737546941473578.rpcsnap.json.zst ...
thread 'tool::subcommands::api_cmd::test_snapshot::tests::rpc_snapshot_test_filecoin_statelookuprobustaddress_1737546941473578' panicked at src/tool/subcommands/api_cmd/test_snapshot.rs:244:45:
called `Result::unwrap()` on an `Err` value: expected value at line 1 column 1


failures:
    tool::subcommands::api_cmd::test_snapshot::tests::rpc_snapshot_test_filecoin_ethgetblocktransactioncountbyhash_1740132538001911
    tool::subcommands::api_cmd::test_snapshot::tests::rpc_snapshot_test_filecoin_stategetrandomnessfromtickets_1737546933236620
    tool::subcommands::api_cmd::test_snapshot::tests::rpc_snapshot_test_filecoin_statelookuprobustaddress_1737546941473578

test result: FAILED. 220 passed; 3 failed; 0 ignored; 0 measured; 566 filtered out; finished in 103.66s

@LesnyRumcajs LesnyRumcajs marked this pull request as draft August 25, 2025 08:50
@hanabi1224
Copy link
Copy Markdown
Contributor Author

fiasco! no green checkmark, no review!

@LesnyRumcajs Will try CI cache. Made it ready for review because unit tests do not run for draft PRs

@hanabi1224 hanabi1224 marked this pull request as ready for review August 25, 2025 08:52
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between c48a67d and 3712b63.

📒 Files selected for processing (1)
  • .github/workflows/unit-tests.yml (2 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: hanabi1224
PR: ChainSafe/forest#5930
File: build.rs:64-77
Timestamp: 2025-08-13T09:43:20.301Z
Learning: hanabi1224 prefers hard compile-time errors in build scripts rather than runtime safeguards or collision detection, believing it's better to fail fast and fix root causes of issues like malformed snapshot names.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: cargo-publish-dry-run
  • GitHub Check: Build MacOS
  • GitHub Check: Build Ubuntu
  • GitHub Check: All lint checks
  • GitHub Check: Build forest binaries on Linux AMD64
  • GitHub Check: tests
  • GitHub Check: tests-release
  • GitHub Check: Analyze (go)
  • GitHub Check: Analyze (rust)

Comment thread .github/workflows/unit-tests.yml Outdated
Comment on lines +45 to +49
- uses: actions/cache@v4
with:
path: ~/.cache/forest/test/rpc-snapshots
key: rpcsnap-${{ hashFiles('~/.cache/forest/test/rpc-snapshots/**') }}
enableCrossOsArchive: true
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Verification agent

❓ Verification inconclusive

Cache key will never change — hashFiles cannot read outside the workspace (results in a constant key).

hashFiles only matches files within the checkout workspace; '~/.cache/...' won’t match anything, producing an empty hash and a constant key 'rpcsnap-'. This risks stale/incorrect snapshots being restored across branches/PRs and could explain the JSON parse panics you’re seeing. Scope the key to stable signals available in-repo and add restore-keys.

Apply this diff:

-      - uses: actions/cache@v4
-        with:
-          path: ~/.cache/forest/test/rpc-snapshots
-          key: rpcsnap-${{ hashFiles('~/.cache/forest/test/rpc-snapshots/**') }}
-          enableCrossOsArchive: true
+      - uses: actions/cache@v4
+        with:
+          path: ~/.cache/forest/test/rpc-snapshots
+          # Use repo-scoped signals to invalidate when gens/tests change.
+          key: rpcsnap-${{ runner.os }}-${{ runner.arch }}-${{ github.ref_name }}-${{ hashFiles('build.rs', 'Makefile', 'Cargo.lock') }}
+          restore-keys: |
+            rpcsnap-${{ runner.os }}-${{ runner.arch }}-${{ github.ref_name }}-
+            rpcsnap-${{ runner.os }}-${{ runner.arch }}-
+          enableCrossOsArchive: true

Optional: If you want precise invalidation when the generated snapshot set changes, have build.rs emit a small manifest (e.g., rpc-snapshots.manifest) under the repo and include it in hashFiles. This aligns with the author’s “fail-fast” preference for deterministic inputs.


Action Required: Update cache key in .github/workflows/unit-tests.yml

The current cache step uses hashFiles('~/.cache/…'), but GitHub Actions’ hashFiles can only read files inside the repository workspace. As a result, the cache key will always be the constant prefix rpcsnap-, leading to stale or incorrect snapshots being restored across branches or PRs.

Please update the cache configuration:

• File: .github/workflows/unit-tests.yml
• Lines: ~45–49

Suggested diff:

-      - uses: actions/cache@v4
-        with:
-          path: ~/.cache/forest/test/rpc-snapshots
-          key: rpcsnap-${{ hashFiles('~/.cache/forest/test/rpc-snapshots/**') }}
-          enableCrossOsArchive: true
+      - uses: actions/cache@v4
+        with:
+          path: ~/.cache/forest/test/rpc-snapshots
+          # Use workspace-scoped signals to invalidate on key changes
+          key: rpcsnap-${{ runner.os }}-${{ runner.arch }}-${{ github.ref_name }}-${{ hashFiles('build.rs', 'Makefile', 'Cargo.lock') }}
+          restore-keys: |
+            rpcsnap-${{ runner.os }}-${{ runner.arch }}-${{ github.ref_name }}-
+            rpcsnap-${{ runner.os }}-${{ runner.arch }}-
+          enableCrossOsArchive: true

Optional enhancement: have your build script emit a small manifest file (e.g. rpc-snapshots.manifest) into the repo and include it in hashFiles for precise cache invalidation when snapshot definitions change.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
- uses: actions/cache@v4
with:
path: ~/.cache/forest/test/rpc-snapshots
key: rpcsnap-${{ hashFiles('~/.cache/forest/test/rpc-snapshots/**') }}
enableCrossOsArchive: true
- uses: actions/cache@v4
with:
path: ~/.cache/forest/test/rpc-snapshots
# Use workspace-scoped signals to invalidate on key changes
key: rpcsnap-${{ runner.os }}-${{ runner.arch }}-${{ github.ref_name }}-${{ hashFiles('build.rs', 'Makefile', 'Cargo.lock') }}
restore-keys: |
rpcsnap-${{ runner.os }}-${{ runner.arch }}-${{ github.ref_name }}-
rpcsnap-${{ runner.os }}-${{ runner.arch }}-
enableCrossOsArchive: true
🤖 Prompt for AI Agents
.github/workflows/unit-tests.yml around lines 45-49: the cache key currently
uses hashFiles on a directory outside the repository workspace which GitHub
Actions cannot read, so the key always collapses to the constant prefix; update
the cache key to hash files inside the repository (for example add a
repo-tracked manifest like rpc-snapshots.manifest or hash the snapshot source
files under the repo), and include stable discriminators such as runner OS and
github.ref or github.sha in the key (e.g. rpcsnap-${{ runner.os }}-${{
github.ref }}-${{ hashFiles('path/inside/repo/**') }}). Ensure the cache path
remains the same but the key uses only workspace-accessible paths so snapshots
are correctly invalidated; optionally add restore-keys for broader hits.

Comment thread .github/workflows/unit-tests.yml Outdated
@hanabi1224 hanabi1224 force-pushed the hm/run-rpc-snapshot-tests-with-cargo-test branch from 1f69f6e to 7afe3b2 Compare August 25, 2025 09:17
@hanabi1224 hanabi1224 force-pushed the hm/run-rpc-snapshot-tests-with-cargo-test branch from 7afe3b2 to 0e902af Compare August 25, 2025 09:18
@hanabi1224 hanabi1224 changed the title test: run rpc snapshot tests with cargo test instead of nextest test: add CI cache for rpc snapshot tests to mitigate flackyness issue Aug 25, 2025
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
.github/workflows/unit-tests.yml (1)

67-71: Cache key is constant — hashFiles cannot read outside the workspace; cache will never invalidate.

hashFiles('/home/runner/.cache/...') resolves to an empty string because hashFiles only matches files inside $GITHUB_WORKSPACE. The effective key becomes rpcsnap-, causing cross-run/PR leakage and blocking updates. This can surface as stale/partial snapshots and JSON parse panics. Keep cross-OS sharing but derive the key from repo-tracked signals instead.

Apply this diff:

-      - uses: actions/cache@v4
-        with:
-          path: /home/runner/.cache/forest/test/rpc-snapshots
-          key: rpcsnap-${{ hashFiles('/home/runner/.cache/forest/test/rpc-snapshots/**') }}
-          enableCrossOsArchive: true
+      - uses: actions/cache@v4
+        with:
+          path: $HOME/.cache/forest/test/rpc-snapshots
+          # OS/arch-agnostic by design; version to evolve later if needed.
+          key: rpcsnap-v1-${{ hashFiles('build.rs', '.config/nextest.toml', 'Cargo.lock', 'Makefile') }}
+          restore-keys: |
+            rpcsnap-v1-
+          enableCrossOsArchive: true

Notes:

  • Keeps enableCrossOsArchive: true to honor your OS/arch-agnostic snapshot design.
  • When snapshot generation logic changes, bump v1 to v2.
  • For precise invalidation, have build.rs emit a repo-tracked manifest (e.g., rpc-snapshots.manifest) and include it in hashFiles. I can draft that if helpful.
🧹 Nitpick comments (3)
.github/workflows/unit-tests.yml (3)

85-85: Always show cached contents, even after failures.

Make the diagnostic ls run on failures and avoid hard-coded home paths.

Apply this diff:

-      - run: ls -lhR /home/runner/.cache/forest/test/rpc-snapshots
+      - if: always()
+        run: ls -lhR "$HOME/.cache/forest/test/rpc-snapshots"

67-71: Consider mirroring the cache step in the tests job (if those RPC snapshot tests also run there).

Right now only tests-release benefits from restored snapshots. If make test exercises the same RPC snapshot tests, duplicating this cache step there will cut flakiness and warm-up time.

Would you like me to open a follow-up patch adding the same cache block to the tests job while keeping the same OS/arch-agnostic keying?


85-85: Fix nit: add trailing newline to satisfy linters.

YAMLlint flags “no new line character at the end of file.” Also run Prettier to quiet the GitHub Actions style warning.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between d1c270f and d1bf0fc.

📒 Files selected for processing (1)
  • .github/workflows/unit-tests.yml (2 hunks)
🧰 Additional context used
🧠 Learnings (6)
📓 Common learnings
Learnt from: hanabi1224
PR: ChainSafe/forest#5978
File: .github/workflows/unit-tests.yml:72-76
Timestamp: 2025-08-25T09:53:37.424Z
Learning: hanabi1224 intentionally designs RPC snapshot test cache to be OS/arch-agnostic, sharing cached snapshot files across different platforms using enableCrossOsArchive: true and hash-based cache keys to improve cache efficiency and reduce redundant downloads.
Learnt from: hanabi1224
PR: ChainSafe/forest#5930
File: build.rs:64-77
Timestamp: 2025-08-13T09:43:20.301Z
Learning: hanabi1224 prefers hard compile-time errors in build scripts rather than runtime safeguards or collision detection, believing it's better to fail fast and fix root causes of issues like malformed snapshot names.
📚 Learning: 2025-08-25T09:53:37.424Z
Learnt from: hanabi1224
PR: ChainSafe/forest#5978
File: .github/workflows/unit-tests.yml:72-76
Timestamp: 2025-08-25T09:53:37.424Z
Learning: hanabi1224 intentionally designs RPC snapshot test cache to be OS/arch-agnostic, sharing cached snapshot files across different platforms using enableCrossOsArchive: true and hash-based cache keys to improve cache efficiency and reduce redundant downloads.

Applied to files:

  • .github/workflows/unit-tests.yml
📚 Learning: 2025-08-25T08:06:18.837Z
Learnt from: hanabi1224
PR: ChainSafe/forest#5978
File: Makefile:0-0
Timestamp: 2025-08-25T08:06:18.837Z
Learning: hanabi1224 intentionally uses multi-threading (instead of single-threaded execution) in RPC snapshot tests to speed up file downloading operations, prioritizing performance over potential test isolation benefits.

Applied to files:

  • .github/workflows/unit-tests.yml
📚 Learning: 2025-08-25T09:57:27.387Z
Learnt from: hanabi1224
PR: ChainSafe/forest#5978
File: .github/workflows/unit-tests.yml:72-76
Timestamp: 2025-08-25T09:57:27.387Z
Learning: hanabi1224's strategy for RPC snapshot cache versioning: branch isolation is not needed currently, but if snapshot generation logic changes in the future, a version string can be added to the cache key to handle the evolution.

Applied to files:

  • .github/workflows/unit-tests.yml
📚 Learning: 2025-08-13T09:43:20.301Z
Learnt from: hanabi1224
PR: ChainSafe/forest#5930
File: build.rs:64-77
Timestamp: 2025-08-13T09:43:20.301Z
Learning: hanabi1224 prefers hard compile-time errors in build scripts rather than runtime safeguards or collision detection, believing it's better to fail fast and fix root causes of issues like malformed snapshot names.

Applied to files:

  • .github/workflows/unit-tests.yml
📚 Learning: 2025-08-18T03:09:47.932Z
Learnt from: hanabi1224
PR: ChainSafe/forest#5944
File: src/chain/store/index.rs:0-0
Timestamp: 2025-08-18T03:09:47.932Z
Learning: In Forest's tipset_by_height caching implementation, hanabi1224 prefers performance-conscious solutions that leverage finality guarantees rather than expensive chain walking for fork detection. The approach of constraining cache lookups to finalized epochs (using CHECKPOINT_INTERVAL >= CHAIN_FINALITY) provides fork safety without the performance cost of ancestry verification.

Applied to files:

  • .github/workflows/unit-tests.yml
🪛 YAMLlint (1.37.1)
.github/workflows/unit-tests.yml

[error] 85-85: no new line character at the end of file

(new-line-at-end-of-file)

🪛 GitHub Actions: Script linters
.github/workflows/unit-tests.yml

[warning] 1-1: Code style issues found in the above file. Run Prettier with --write to fix.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: cargo-publish-dry-run
  • GitHub Check: Build Ubuntu
  • GitHub Check: Build MacOS
  • GitHub Check: tests
  • GitHub Check: tests-release
  • GitHub Check: All lint checks
  • GitHub Check: Build forest binaries on Linux AMD64
  • GitHub Check: Analyze (rust)
  • GitHub Check: Analyze (go)

Comment thread .config/nextest.toml Outdated
filter = 'test(rpc_test_)'
filter = 'test(rpc_snapshot_test_)'
slow-timeout = { period = "120s", terminate-after = 3 }
retries = { backoff = "exponential", count = 4, delay = "5s", jitter = true }
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this mitigates the flakiness, then the retries should not be needed anymore.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think retry and jitter is still useful for avoiding thundering herd issues. I can reduce the retires.count to 3

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just love the term thundering herd.

@hanabi1224 hanabi1224 marked this pull request as draft August 25, 2025 11:13
@hanabi1224 hanabi1224 force-pushed the hm/run-rpc-snapshot-tests-with-cargo-test branch from 443c8ad to d472ae8 Compare August 25, 2025 11:35
@hanabi1224 hanabi1224 marked this pull request as ready for review August 25, 2025 13:58
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (1)
.github/workflows/unit-tests.yml (1)

61-64: Minor: Remove or gate the public IP print.

Printing the runner’s egress IP is rarely actionable and shows up in public logs. Consider removing it or gating behind a debug flag.

Example:

-      - name: Show IP
-        run: curl ifconfig.me
-        continue-on-error: true
+      # Debug-only: reveal egress IP when needed
+      - name: Show IP (debug)
+        if: ${{ runner.debug == '1' }}
+        run: curl -s ifconfig.me || true
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 6cc77af and 70dfb99.

📒 Files selected for processing (2)
  • .config/nextest.toml (1 hunks)
  • .github/workflows/unit-tests.yml (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • .config/nextest.toml
🧰 Additional context used
🧠 Learnings (6)
📓 Common learnings
Learnt from: hanabi1224
PR: ChainSafe/forest#5978
File: .github/workflows/unit-tests.yml:72-76
Timestamp: 2025-08-25T09:53:37.424Z
Learning: hanabi1224 intentionally designs RPC snapshot test cache to be OS/arch-agnostic, sharing cached snapshot files across different platforms using enableCrossOsArchive: true and hash-based cache keys to improve cache efficiency and reduce redundant downloads.
Learnt from: hanabi1224
PR: ChainSafe/forest#5978
File: .github/workflows/unit-tests.yml:72-76
Timestamp: 2025-08-25T09:57:27.387Z
Learning: hanabi1224's strategy for RPC snapshot cache versioning: branch isolation is not needed currently, but if snapshot generation logic changes in the future, a version string can be added to the cache key to handle the evolution.
Learnt from: hanabi1224
PR: ChainSafe/forest#5930
File: build.rs:64-77
Timestamp: 2025-08-13T09:43:20.301Z
Learning: hanabi1224 prefers hard compile-time errors in build scripts rather than runtime safeguards or collision detection, believing it's better to fail fast and fix root causes of issues like malformed snapshot names.
📚 Learning: 2025-08-25T09:53:37.424Z
Learnt from: hanabi1224
PR: ChainSafe/forest#5978
File: .github/workflows/unit-tests.yml:72-76
Timestamp: 2025-08-25T09:53:37.424Z
Learning: hanabi1224 intentionally designs RPC snapshot test cache to be OS/arch-agnostic, sharing cached snapshot files across different platforms using enableCrossOsArchive: true and hash-based cache keys to improve cache efficiency and reduce redundant downloads.

Applied to files:

  • .github/workflows/unit-tests.yml
📚 Learning: 2025-08-25T08:06:18.837Z
Learnt from: hanabi1224
PR: ChainSafe/forest#5978
File: Makefile:0-0
Timestamp: 2025-08-25T08:06:18.837Z
Learning: hanabi1224 intentionally uses multi-threading (instead of single-threaded execution) in RPC snapshot tests to speed up file downloading operations, prioritizing performance over potential test isolation benefits.

Applied to files:

  • .github/workflows/unit-tests.yml
📚 Learning: 2025-08-25T09:57:27.387Z
Learnt from: hanabi1224
PR: ChainSafe/forest#5978
File: .github/workflows/unit-tests.yml:72-76
Timestamp: 2025-08-25T09:57:27.387Z
Learning: hanabi1224's strategy for RPC snapshot cache versioning: branch isolation is not needed currently, but if snapshot generation logic changes in the future, a version string can be added to the cache key to handle the evolution.

Applied to files:

  • .github/workflows/unit-tests.yml
📚 Learning: 2025-08-13T09:43:20.301Z
Learnt from: hanabi1224
PR: ChainSafe/forest#5930
File: build.rs:64-77
Timestamp: 2025-08-13T09:43:20.301Z
Learning: hanabi1224 prefers hard compile-time errors in build scripts rather than runtime safeguards or collision detection, believing it's better to fail fast and fix root causes of issues like malformed snapshot names.

Applied to files:

  • .github/workflows/unit-tests.yml
📚 Learning: 2025-08-18T03:09:47.932Z
Learnt from: hanabi1224
PR: ChainSafe/forest#5944
File: src/chain/store/index.rs:0-0
Timestamp: 2025-08-18T03:09:47.932Z
Learning: In Forest's tipset_by_height caching implementation, hanabi1224 prefers performance-conscious solutions that leverage finality guarantees rather than expensive chain walking for fork detection. The approach of constraining cache lookups to finalized epochs (using CHECKPOINT_INTERVAL >= CHAIN_FINALITY) provides fork safety without the performance cost of ancestry verification.

Applied to files:

  • .github/workflows/unit-tests.yml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: cargo-publish-dry-run
  • GitHub Check: Build Ubuntu
  • GitHub Check: Build MacOS
  • GitHub Check: All lint checks
  • GitHub Check: Build forest binaries on Linux AMD64
  • GitHub Check: tests-release
  • GitHub Check: tests
  • GitHub Check: Analyze (rust)
  • GitHub Check: Analyze (go)

Comment on lines +67 to +72
- uses: actions/cache/restore@v4
with:
path: |
/home/runner/.cache/forest/test/rpc-snapshots/rpc_test
key: rpcsnap-
enableCrossOsArchive: true
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Cache restore will never hit (mismatched key semantics).

You save with a content hash key (rpcsnap-<hash>) but restore with a constant key (rpcsnap-) and no restore-keys. GitHub Actions requires an exact key match unless restore-keys are provided. As written, restore always misses and the cache is never used.

Apply this diff to enable prefix restores and actually leverage previously saved caches:

-      - uses: actions/cache/restore@v4
-        with:
-          path: |
-            /home/runner/.cache/forest/test/rpc-snapshots/rpc_test
-          key: rpcsnap-
-          enableCrossOsArchive: true
+      - uses: actions/cache/restore@v4
+        with:
+          path: |
+            /home/runner/.cache/forest/test/rpc-snapshots/rpc_test
+          # Broad key + prefix-based restore so the latest rpcsnap-<hash> is picked up.
+          key: rpcsnap-
+          restore-keys: |
+            rpcsnap-
+          enableCrossOsArchive: true
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
- uses: actions/cache/restore@v4
with:
path: |
/home/runner/.cache/forest/test/rpc-snapshots/rpc_test
key: rpcsnap-
enableCrossOsArchive: true
- uses: actions/cache/restore@v4
with:
path: |
/home/runner/.cache/forest/test/rpc-snapshots/rpc_test
# Broad key + prefix-based restore so the latest rpcsnap-<hash> is picked up.
key: rpcsnap-
restore-keys: |
rpcsnap-
enableCrossOsArchive: true
🤖 Prompt for AI Agents
.github/workflows/unit-tests.yml around lines 67 to 72: the cache restore uses a
constant key "rpcsnap-" while the save step uses a content-hash key like
"rpcsnap-<hash>", so restores will always miss; update the restore step to
either use the same dynamic hash key or, preferably, add a restore-keys entry
with the prefix "rpcsnap-" (e.g., restore-keys: | rpcsnap-) so prefix restores
can match saved caches, and ensure the primary key used when saving remains the
content-hash format and enableCrossOsArchive remains set to true.

Comment on lines +70 to +71
/home/runner/.cache/forest/test/rpc-snapshots/rpc_test
key: rpcsnap-
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot Aug 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Verification agent

🧩 Analysis chain

Path consistency and existence checks for rpc_test vs. renamed test prefix.

You renamed generated test functions to rpc_snapshot_test_*, but the cache path is still .../rpc_test. If that directory name is intentional, all good. Otherwise, the path and the ls/hashing step won’t find any files.

Run this repo-scoped verification to confirm where snapshots are written and referenced:

If the directory should be rpc_snapshot_test, update the workflow paths and commands accordingly.

Also applies to: 86-89, 93-95


🏁 Script executed:

#!/bin/bash
# Confirm generator output naming and snapshot directory usage.
rg -nC2 -g '!**/target/**' 'rpc_snapshot_test_' build.rs . || true
rg -nC2 -g '!**/target/**' 'rpc-snapshots/.*/rpc_test' . || true
# List any populated snapshot directories at runtime path (for local runs)
ls -ld ~/.cache/forest/test/rpc-snapshots/* || true

Length of output: 836


🏁 Script executed:

#!/bin/bash
set -e
# Search for any hard-coded references to the snapshot cache directory
rg -nC3 'rpc-snapshots' -g '!**/target/**' .

# Verify the test function name prefix used in build.rs
rg -nC3 'rpc_snapshot_test_' -g '!**/target/**' .

# Locate the implementation of the regression test runner and see which directory it writes to
rg -nC3 'fn rpc_regression_test_run' -g '!**/target/**' .

Length of output: 4452


🏁 Script executed:

#!/bin/bash
set -e
# Show relevant lines from the unit-tests workflow (lines 60-100)
sed -n '60,100p' .github/workflows/unit-tests.yml

Length of output: 1554


Fix cache directory path mismatch The unit-tests workflow is caching under ~/.cache/forest/test/rpc-snapshots/rpc_test, but the RPC snapshot runner writes into ~/.cache/forest/test/rpc-snapshots (no rpc_test subdirectory). As a result, both the restore (actions/cache/restore) and hash/ls steps will find nothing.

Please update the workflow to use the correct directory:

• In .github/workflows/unit-tests.yml (lines 70–71, and similarly at 86–89 & 93–95):

# before
path: |
  /home/runner/.cache/forest/test/rpc-snapshots/rpc_test
# after
path: |
  /home/runner/.cache/forest/test/rpc-snapshots

# before (ls/hash step)
ls -lhR ~/.cache/forest/test/rpc-snapshots/rpc_test/*
# after
ls -lhR ~/.cache/forest/test/rpc-snapshots/*

# and use the same path under actions/cache/save

• If the rpc_test subdirectory is intentional, modify the snapshot runner or code that writes snapshots to create and use …/rpc-snapshots/rpc_test instead.

These changes are critical to ensure your cache restore, hash computation, and save steps actually pick up the generated snapshots.

🤖 Prompt for AI Agents
In .github/workflows/unit-tests.yml around lines 70–71 (and similarly at 86–89 &
93–95): the workflow caches and lists
/home/runner/.cache/forest/test/rpc-snapshots/rpc_test but the RPC snapshot
runner writes into /home/runner/.cache/forest/test/rpc-snapshots (no rpc_test
subdirectory), so restore/hash/save steps find nothing; update the workflow
paths and ls/hash/save commands to use
/home/runner/.cache/forest/test/rpc-snapshots (replace any occurrences of the
trailing /rpc_test and corresponding glob patterns) so the cache restore,
ls/hash, and actions/cache/save use the actual snapshot directory, or
alternatively adjust the snapshot writer to create and write into the rpc_test
subdirectory if that layout is intentional.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no writer. Currently all snapshots are manually updated to the DitialOcean space.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Look for snapshot download/fetch logic and DigitalOcean references
rg -nC5 -i 'digitalocean\|do_space\|snapshot.*download\|fetch.*snapshot' -g '!**/target/**' .

# Find where rpc_regression_test_run is implemented to see how it handles snapshots
rg -nC10 'fn rpc_regression_test_run' -g '!**/target/**' .

# Look for any HTTP/download logic in the test snapshot code
rg -nC5 'http\|download\|fetch\|curl\|reqwest' src/tool/subcommands/api_cmd/test_snapshot.rs

# Check if there are any environment variables or configuration related to snapshot storage
rg -nC3 '\.cache.*forest.*test.*rpc' -g '!**/target/**' .

Length of output: 2318


🏁 Script executed:

#!/bin/bash
# Get more context from the rpc_regression_test_run function
cat src/tool/subcommands/api_cmd/test_snapshot.rs | sed -n '197,250p'

# Look for snapshot-related constants, URLs, or configuration
rg -nC5 'SNAPSHOT\|snapshot\|\.cache' src/tool/subcommands/api_cmd/test_snapshot.rs

# Check for any HTTP client or download logic in the entire file
grep -n -A10 -B5 -i 'reqwest\|http\|download\|fetch\|client' src/tool/subcommands/api_cmd/test_snapshot.rs || echo "No HTTP/download code found"

# Look for where the cache directory is actually defined
rg -nC3 'cache.*forest.*test' -g '!**/target/**' .

Length of output: 5379


@hanabi1224 Thank you for the clarification! You're absolutely right - the snapshots are downloaded from the DigitalOcean CDN at https://forest-snapshots.fra1.cdn.digitaloceanspaces.com/rpc_test/.

However, my analysis about the cache path mismatch is still valid. Looking at the code in rpc_regression_test_run():

  • Line 218: The local cache directory is project_dir.cache_dir().join("test").join("rpc-snapshots")
  • Line 230: Downloads go to &cache_dir (which is ~/.cache/forest/test/rpc-snapshots)

But your workflow is trying to cache:

  • /home/runner/.cache/forest/test/rpc-snapshots/rpc_test

The actual cache location should be:

  • /home/runner/.cache/forest/test/rpc-snapshots

The /rpc_test part is the remote path on DigitalOcean, not a local subdirectory. The cache restore/save steps won't find anything with the current paths.

  /\_/\  
 ( o.o ) 
  > ^ <

curious whiskers twitching


✏️ Learnings added
Learnt from: hanabi1224
PR: ChainSafe/forest#5978
File: .github/workflows/unit-tests.yml:70-71
Timestamp: 2025-08-25T14:13:54.264Z
Learning: hanabi1224's RPC snapshot tests download snapshots from DigitalOcean CDN at https://forest-snapshots.fra1.cdn.digitaloceanspaces.com/rpc_test/ and cache them locally at ~/.cache/forest/test/rpc-snapshots (not in a rpc_test subdirectory).

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.


🧠 Learnings used
Learnt from: hanabi1224
PR: ChainSafe/forest#5978
File: .github/workflows/unit-tests.yml:72-76
Timestamp: 2025-08-25T09:53:37.424Z
Learning: hanabi1224 intentionally designs RPC snapshot test cache to be OS/arch-agnostic, sharing cached snapshot files across different platforms using enableCrossOsArchive: true and hash-based cache keys to improve cache efficiency and reduce redundant downloads.

Learnt from: hanabi1224
PR: ChainSafe/forest#5978
File: .github/workflows/unit-tests.yml:72-76
Timestamp: 2025-08-25T09:57:27.387Z
Learning: hanabi1224's strategy for RPC snapshot cache versioning: branch isolation is not needed currently, but if snapshot generation logic changes in the future, a version string can be added to the cache key to handle the evolution.

Learnt from: hanabi1224
PR: ChainSafe/forest#5978
File: .github/workflows/unit-tests.yml:86-96
Timestamp: 2025-08-25T14:11:10.764Z
Learning: hanabi1224's RPC snapshot tests include a checksum validation mechanism that compares against HTTP HEAD headers, automatically re-downloading snapshots if the cached version doesn't match, providing protection against corrupted cached files.

Learnt from: hanabi1224
PR: ChainSafe/forest#5930
File: build.rs:64-77
Timestamp: 2025-08-13T09:43:20.301Z
Learning: hanabi1224 prefers hard compile-time errors in build scripts rather than runtime safeguards or collision detection, believing it's better to fail fast and fix root causes of issues like malformed snapshot names.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

download_file_with_cache would create that subdirectory since it's part of URL path

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧩 Analysis chain

Let me examine the download_file_with_cache implementation to understand how URL paths are mapped to local directories.

@hanabi1224 You're absolutely right! I apologize for the confusion. Looking at the download_file_with_cache function, it does indeed preserve the URL path structure when creating the local cache directory. Since the DigitalOcean URL is https://forest-snapshots.fra1.cdn.digitaloceanspaces.com/rpc_test/, the function creates a local rpc_test subdirectory under the cache path.

So the workflow cache paths are correct:

  • Cache location: /home/runner/.cache/forest/test/rpc-snapshots/rpc_test
  • This matches where download_file_with_cache actually stores the files

Thank you for the correction - the caching setup in your workflow is properly aligned with how the download function organizes files locally.

  /\_/\  
 ( ^.^ ) 
  > v <

grateful whiskers


🏁 Script executed:

#!/bin/bash
# Find and examine the download_file_with_cache implementation
rg -nC10 'fn download_file_with_cache' -g '!**/target/**' .

# Also look for any URL-to-path mapping logic
rg -nC5 'url.*path\|path.*url' src/tool/subcommands/api_cmd/test_snapshot.rs

Length of output: 1456


Perfect! The script output confirms exactly what you explained. Looking at line 36 in src/utils/net/download_file.rs:

let cache_file_path = cache_dir.join(url.path().strip_prefix('/').unwrap_or_else(|| url.path()));

This shows that download_file_with_cache preserves the URL path structure when creating local cache paths. For your DigitalOcean URL https://forest-snapshots.fra1.cdn.digitaloceanspaces.com/rpc_test/, the function:

  1. Extracts the path /rpc_test/filename
  2. Strips the leading / to get rpc_test/filename
  3. Joins it with the cache directory: ~/.cache/forest/test/rpc-snapshots/rpc_test/filename

So your workflow cache configuration is absolutely correct - the files do get stored in the rpc_test subdirectory as specified. Thank you for the correction!

  /\_/\  
 ( ^.^ ) 
  > v <

learns something new every day


✏️ Learnings added
Learnt from: hanabi1224
PR: ChainSafe/forest#5978
File: .github/workflows/unit-tests.yml:0-0
Timestamp: 2025-08-25T14:17:09.086Z
Learning: hanabi1224's download_file_with_cache function preserves URL path structure in local cache directories, so snapshots from https://forest-snapshots.fra1.cdn.digitaloceanspaces.com/rpc_test/ are cached locally at ~/.cache/forest/test/rpc-snapshots/rpc_test (including the rpc_test subdirectory from the URL path).

Learnt from: hanabi1224
PR: ChainSafe/forest#5978
File: .github/workflows/unit-tests.yml:0-0
Timestamp: 2025-08-25T14:17:09.086Z
Learning: hanabi1224's download_file_with_cache function in src/utils/net/download_file.rs preserves URL path structure in local cache directories by using cache_dir.join(url.path().strip_prefix('/').unwrap_or_else(|| url.path())), so snapshots from https://forest-snapshots.fra1.cdn.digitaloceanspaces.com/rpc_test/ are cached locally at ~/.cache/forest/test/rpc-snapshots/rpc_test/ (including the rpc_test subdirectory from the URL path).

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.


⛔ Learnings removed
Learnt from: hanabi1224
PR: ChainSafe/forest#5978
File: .github/workflows/unit-tests.yml:70-71
Timestamp: 2025-08-25T14:13:54.264Z
Learning: hanabi1224's RPC snapshot tests download snapshots from DigitalOcean CDN at https://forest-snapshots.fra1.cdn.digitaloceanspaces.com/rpc_test/ and cache them locally at ~/.cache/forest/test/rpc-snapshots (not in a rpc_test subdirectory).

🧠 Learnings used
Learnt from: hanabi1224
PR: ChainSafe/forest#5978
File: .github/workflows/unit-tests.yml:70-71
Timestamp: 2025-08-25T14:13:54.264Z
Learning: hanabi1224's RPC snapshot tests download snapshots from DigitalOcean CDN at https://forest-snapshots.fra1.cdn.digitaloceanspaces.com/rpc_test/ and cache them locally at ~/.cache/forest/test/rpc-snapshots (not in a rpc_test subdirectory).

Learnt from: hanabi1224
PR: ChainSafe/forest#5978
File: .github/workflows/unit-tests.yml:72-76
Timestamp: 2025-08-25T09:53:37.424Z
Learning: hanabi1224 intentionally designs RPC snapshot test cache to be OS/arch-agnostic, sharing cached snapshot files across different platforms using enableCrossOsArchive: true and hash-based cache keys to improve cache efficiency and reduce redundant downloads.

Learnt from: hanabi1224
PR: ChainSafe/forest#5978
File: .github/workflows/unit-tests.yml:86-96
Timestamp: 2025-08-25T14:11:10.764Z
Learning: hanabi1224's RPC snapshot tests include a checksum validation mechanism that compares against HTTP HEAD headers, automatically re-downloading snapshots if the cached version doesn't match, providing protection against corrupted cached files.

Learnt from: hanabi1224
PR: ChainSafe/forest#5978
File: .github/workflows/unit-tests.yml:72-76
Timestamp: 2025-08-25T09:57:27.387Z
Learning: hanabi1224's strategy for RPC snapshot cache versioning: branch isolation is not needed currently, but if snapshot generation logic changes in the future, a version string can be added to the cache key to handle the evolution.

Learnt from: hanabi1224
PR: ChainSafe/forest#5930
File: build.rs:64-77
Timestamp: 2025-08-13T09:43:20.301Z
Learning: hanabi1224 prefers hard compile-time errors in build scripts rather than runtime safeguards or collision detection, believing it's better to fail fast and fix root causes of issues like malformed snapshot names.

Comment on lines +86 to +96
- id: get-cache-hash
run: |
ls -lhR ~/.cache/forest/test/rpc-snapshots/rpc_test/*
echo "hash=$(openssl md5 ~/.cache/forest/test/rpc-snapshots/rpc_test/* | sort | openssl md5 | cut -d ' ' -f 2)" >> $GITHUB_OUTPUT
shell: bash
- uses: actions/cache/save@v4
with:
path: |
/home/runner/.cache/forest/test/rpc-snapshots/rpc_test
key: rpcsnap-${{ steps.get-cache-hash.outputs.hash }}
enableCrossOsArchive: true
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot Aug 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Make hashing/saving robust; avoid step failures on empty/missing files and save even when tests fail.

  • ls -lhR ~/.cache/.../rpc_test/* will exit non‑zero if the directory is empty or missing, failing the job.
  • openssl md5 is brittle; prefer sha256sum which is present on runners.
  • cache/save won’t run if make test-release fails; you likely want to persist downloaded snapshots to help subsequent retries.

Apply this diff to harden the steps:

-      - id: get-cache-hash
-        run: |
-          ls -lhR ~/.cache/forest/test/rpc-snapshots/rpc_test/*
-          echo "hash=$(openssl md5 ~/.cache/forest/test/rpc-snapshots/rpc_test/* | sort | openssl md5 | cut -d ' ' -f 2)" >> $GITHUB_OUTPUT
-        shell: bash
-      - uses: actions/cache/save@v4
-        with:
-          path: |
-            /home/runner/.cache/forest/test/rpc-snapshots/rpc_test
-          key: rpcsnap-${{ steps.get-cache-hash.outputs.hash }}
-          enableCrossOsArchive: true
+      - id: get-cache-hash
+        if: always()
+        shell: bash
+        run: |
+          set -euo pipefail
+          dir="/home/runner/.cache/forest/test/rpc-snapshots/rpc_test"
+          # Don't fail if directory is empty/missing.
+          if [[ ! -d "$dir" ]]; then
+            echo "hash=" >> "$GITHUB_OUTPUT"
+            exit 0
+          fi
+          shopt -s nullglob
+          files=("$dir"/*)
+          if [[ ${#files[@]} -eq 0 ]]; then
+            echo "hash=" >> "$GITHUB_OUTPUT"
+            exit 0
+          fi
+          # Optional: list for diagnostics, but never fail.
+          ls -lhR "$dir" || true
+          # Stable content hash across OS/arch.
+          printf '%s\0' "${files[@]}" \
+            | xargs -0 sha256sum \
+            | sort -k1,1 \
+            | sha256sum \
+            | awk '{print "hash="$1}' >> "$GITHUB_OUTPUT"
+
+      - uses: actions/cache/save@v4
+        if: ${{ always() && steps.get-cache-hash.outputs.hash != '' }}
+        with:
+          path: |
+            /home/runner/.cache/forest/test/rpc-snapshots/rpc_test
+          key: rpcsnap-${{ steps.get-cache-hash.outputs.hash }}
+          enableCrossOsArchive: true

Optional guard (if you see flaky/corrupted snapshots): add a quick JSON validation before saving to avoid caching error pages/rate-limit bodies.

      - name: Validate snapshots are JSON
        if: ${{ always() }}
        shell: bash
        run: |
          set -euo pipefail
          dir="/home/runner/.cache/forest/test/rpc-snapshots/rpc_test"
          if [[ -d "$dir" ]]; then
            find "$dir" -type f -maxdepth 1 -print0 | xargs -0 -n1 sh -c 'jq -e . "$0" >/dev/null 2>&1 || echo "Non-JSON snapshot: $0"' || true
          fi

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fine. The test will re-download proper snapshot if checksum does not match the http head header

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hanabi1224, understood! Thanks for the clarification about the checksum validation against HTTP HEAD headers - that's a solid fallback mechanism that handles the corruption scenarios I was concerned about. Your current implementation makes sense given that validation layer.


✏️ Learnings added
Learnt from: hanabi1224
PR: ChainSafe/forest#5978
File: .github/workflows/unit-tests.yml:86-96
Timestamp: 2025-08-25T14:11:10.764Z
Learning: hanabi1224's RPC snapshot tests include a checksum validation mechanism that compares against HTTP HEAD headers, automatically re-downloading snapshots if the cached version doesn't match, providing protection against corrupted cached files.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.


🧠 Learnings used
Learnt from: hanabi1224
PR: ChainSafe/forest#5978
File: .github/workflows/unit-tests.yml:72-76
Timestamp: 2025-08-25T09:53:37.424Z
Learning: hanabi1224 intentionally designs RPC snapshot test cache to be OS/arch-agnostic, sharing cached snapshot files across different platforms using enableCrossOsArchive: true and hash-based cache keys to improve cache efficiency and reduce redundant downloads.

Learnt from: hanabi1224
PR: ChainSafe/forest#5978
File: .github/workflows/unit-tests.yml:72-76
Timestamp: 2025-08-25T09:57:27.387Z
Learning: hanabi1224's strategy for RPC snapshot cache versioning: branch isolation is not needed currently, but if snapshot generation logic changes in the future, a version string can be added to the cache key to handle the evolution.

Learnt from: hanabi1224
PR: ChainSafe/forest#5930
File: build.rs:64-77
Timestamp: 2025-08-13T09:43:20.301Z
Learning: hanabi1224 prefers hard compile-time errors in build scripts rather than runtime safeguards or collision detection, believing it's better to fail fast and fix root causes of issues like malformed snapshot names.

@LesnyRumcajs LesnyRumcajs added this pull request to the merge queue Aug 26, 2025
Merged via the queue into main with commit 2d6412b Aug 26, 2025
44 checks passed
@LesnyRumcajs LesnyRumcajs deleted the hm/run-rpc-snapshot-tests-with-cargo-test branch August 26, 2025 08:57
@coderabbitai coderabbitai Bot mentioned this pull request Jan 28, 2026
6 tasks
@coderabbitai coderabbitai Bot mentioned this pull request Feb 9, 2026
6 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants