feat(nimbus): Always fetch newer results #15871

Open
RJAK11 wants to merge 3 commits into main from 15737

Conversation

RJAK11 commented Jun 9, 2026

Contributor

Because

  • the fetch task used experiment status and a fixed time window to decide when to fetch results

This commit

  • adds a field to track when an experiment's results were last loaded
  • finds the newest modified time among each experiment's statistics, metadata, and errors files
  • fetches results when that timestamp is newer than the stored one and saves the new timestamp
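
The comparison described in the bullets above could be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `ExperimentState`, `results_last_loaded`, `newest_modified`, `should_fetch`, and `refresh` are hypothetical names, and the modified times are passed in directly rather than read from the bucket.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Iterable, Optional


@dataclass
class ExperimentState:
    # Hypothetical stand-in for the experiment model; the field name is assumed.
    slug: str
    results_last_loaded: Optional[datetime] = None


def newest_modified(modified_times: Iterable[Optional[datetime]]) -> Optional[datetime]:
    """Return the latest modified time, ignoring files that do not exist (None)."""
    present = [t for t in modified_times if t is not None]
    return max(present) if present else None


def should_fetch(experiment: ExperimentState, newest: Optional[datetime]) -> bool:
    """Fetch only when some file is newer than what was last loaded."""
    if newest is None:
        return False  # nothing in the bucket yet
    if experiment.results_last_loaded is None:
        return True  # results have never been loaded
    return newest > experiment.results_last_loaded


def refresh(
    experiment: ExperimentState,
    modified_times: Iterable[Optional[datetime]],
    load_results: Callable[[ExperimentState], None],
) -> bool:
    """Load results if needed, then store the new timestamp. Returns True if loaded."""
    newest = newest_modified(modified_times)
    if not should_fetch(experiment, newest):
        return False
    load_results(experiment)
    experiment.results_last_loaded = newest  # saved only after a successful load
    return True


if __name__ == "__main__":
    exp = ExperimentState(slug="demo")
    # Assumed order: statistics, metadata, errors; None means the file is absent.
    stamps = [datetime(2026, 6, 1, tzinfo=timezone.utc), None, None]
    print(refresh(exp, stamps, lambda e: None))  # first call loads
    print(refresh(exp, stamps, lambda e: None))  # unchanged files, so no reload
```

Storing the timestamp only after loading means a failed load is retried on the next run, which matches the "saves the new timestamp after loading" wording in the later commit message.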

Fixes #15737

mikewilli left a comment

Contributor

Will this work with our error checks? Right now we inject an error inside the client code if we expect results but the file doesn't exist. I think with this new feature we'd just skip over them.

Comment thread experimenter/experimenter/experiments/tests/test_changelog_utils.py Outdated
Comment thread experimenter/experimenter/jetstream/tasks.py Outdated
Comment thread experimenter/experimenter/jetstream/tasks.py Outdated
Comment thread experimenter/experimenter/jetstream/tasks.py Outdated

RJAK11 commented Jun 11, 2026

Contributor Author

> Will this work with our error checks? Right now we inject an error inside the client code if we expect results but the file doesn't exist. I think with this new feature we'd just skip over them.

Thank you @mikewilli for the review! You are right, the changes would have skipped the error checks, and I didn't notice that. I added a new function so that the task also re-fetches when we expect a window whose file isn't in the bucket, so the client should still inject the error. The re-fetch is bounded the same way the original analysis check was, so a result that never arrives doesn't keep triggering refetches forever. This might be a messy solution, but I couldn't think of another approach. Let me know what you think!
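
To make the idea concrete, here is a rough sketch of the combined condition, under stated assumptions. The function names, the window labels, and the `grace_days` bound are all hypothetical; the actual bound used in the PR is not shown in this conversation, so it is modeled here as a fixed number of days after the experiment's end date.

```python
from datetime import date, timedelta
from typing import Optional, Set


def has_missing_expected_window(
    expected: Set[str],
    present: Set[str],
    end_date: Optional[date],
    today: date,
    grace_days: int = 3,  # assumed bound, for illustration only
) -> bool:
    """True when an expected results window has no file and we are still within the bound."""
    if not (expected - present):
        return False  # every expected window already has a file
    if end_date is None:
        return True  # experiment has not ended, keep checking
    # Stop re-fetching once the bound has passed, so a result that never
    # arrives does not trigger a refetch on every run.
    return today <= end_date + timedelta(days=grace_days)


def should_refetch(
    has_newer_files: bool,
    expected: Set[str],
    present: Set[str],
    end_date: Optional[date],
    today: date,
) -> bool:
    """Re-fetch when files are newer, or when an expected window is still missing."""
    return has_newer_files or has_missing_expected_window(
        expected, present, end_date, today
    )
```

With this shape, the newer-file check still wins on its own, so a forced rerun long after the end date is picked up even though the missing-window check has expired.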

RJAK11 requested a review from mikewilli June 16, 2026 13:27
Comment thread experimenter/experimenter/jetstream/client.py

jaredlockhart left a comment

Collaborator

Okay, I understand better now. Yes, it makes sense to compute whether we 'expect' results in order to compute errors, but if there's anything newer we always fetch it anyway, so it still catches the 'long since ended but force-rerun' cases and we don't have to manually reload. This is good, with great test coverage. One idea: use the stored results date instead of adding a new field 🙏

Comment thread experimenter/experimenter/jetstream/client.py Outdated
Comment thread experimenter/experimenter/jetstream/tasks.py Outdated
Rana Al-Khulaidi added 2 commits June 18, 2026 20:10
Because

* the fetch task used experiment status and a fixed time window to decide when to fetch results

This commit

* adds a field to track when an experiment's results were last loaded
* finds the newest modified time among each experiment's statistics, metadata, and errors files
* refreshes when that timestamp is newer than the stored one, and saves the new timestamp after loading
* still refreshes when an expected results window is missing from the bucket so that missing-results errors are injected

Fixes #15737

Development

Successfully merging this pull request may close these issues.

Always fetch new results

3 participants