Collapse in-flight reconnects by coverage to close the SequentialQueue dedup gap #94210

Open
adhorodyski wants to merge 7 commits into Expensify:main from callstack-internal:reconnect-patterns-amplifier

Conversation

@adhorodyski

@adhorodyski adhorodyski commented Jun 22, 2026

Contributor

@mountiny

Explanation of Change

The SequentialQueue sends one request at a time and parks the rest in a waiting line. The old duplicate-check only looked at the waiting line, never at the request already on the wire. So if a ReconnectApp was mid-send and another identical one came in, the second one sailed through. The NetworkState comment even claimed the opposite ("Duplicate reconnectApp() calls are safe, SQ deduplicates them"), which was wrong in exactly that window.

This PR adds a resolver, resolveReconnectDuplicationConflictAction, that also looks at the in-flight request and compares by coverage: how far back a reconnect re-fetches (updateIDFrom, where a full reconnect re-fetches everything). It has three outcomes:

  • An incoming ReconnectApp already covered by one in flight or queued is dropped (noAction).
  • A wider one is kept (push) and runs after, so we never re-fetch too little.
  • A live OpenApp counts as covering an incoming reconnect, since OpenApp re-fetches everything.

Concrete example: you are on a flaky connection that flips offline then online twice in a second. The first online edge fires a ReconnectApp and it starts sending. Before it finishes, the second edge fires another identical ReconnectApp. Before this PR, both went out: two large downloads. With this PR, the second one sees that the first is already fetching everything it would fetch, and drops itself: one download. If the second reconnect were wider instead (a full reconnect arriving while a small incremental one is in flight), it survives and runs next, so nothing is missed.
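
The three outcomes and the example above can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: SketchRequest, SketchConflictAction, isReconnectFamily, coverageOf and resolveSketch are hypothetical names, and the sketch assumes that a missing updateIDFrom means a full reconnect with coverage 0.

```typescript
// Hypothetical, simplified request shape; the real request type in the app carries more fields.
type SketchRequest = {
    command: string;
    data?: Record<string, unknown>;
};

type SketchConflictAction = {type: 'push'} | {type: 'noAction'};

const FULL_COVERAGE = 0;

// Assumption: only these two commands count as "reconnect family".
function isReconnectFamily(request: SketchRequest | null): request is SketchRequest {
    return request !== null && (request.command === 'ReconnectApp' || request.command === 'OpenApp');
}

// Lower number = re-fetches further back = more coverage.
// A full reconnect or an OpenApp maps to 0 (covers everything).
function coverageOf(request: SketchRequest): number {
    if (request.command === 'OpenApp') {
        return FULL_COVERAGE;
    }
    const updateIDFrom = request.data?.updateIDFrom;
    return typeof updateIDFrom === 'number' ? updateIDFrom : FULL_COVERAGE;
}

// Drop the incoming reconnect only if some live (in-flight or queued) request covers it.
function resolveSketch(queued: SketchRequest[], ongoing: SketchRequest | null, incoming: SketchRequest): SketchConflictAction {
    const incomingCoverage = coverageOf(incoming);
    const isCovered = [ongoing, ...queued].some((live) => isReconnectFamily(live) && coverageOf(live) <= incomingCoverage);
    return isCovered ? {type: 'noAction'} : {type: 'push'};
}

const fullReconnect: SketchRequest = {command: 'ReconnectApp'};
const incremental500: SketchRequest = {command: 'ReconnectApp', data: {updateIDFrom: 500}};

// An identical full reconnect arrives while one is in flight: dropped.
const duplicateResult = resolveSketch([], fullReconnect, {command: 'ReconnectApp'});
// A full reconnect arrives while a narrower incremental one is in flight: kept.
const widerResult = resolveSketch([], incremental500, fullReconnect);
```

The comparison never drops a request that is wider than everything live, which is the "never re-fetch too little" property described above.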

Only the RECONNECT_APP call site uses the new resolver. OpenApp keeps the behavior it had before this PR: it dedupes through the generic writeWithNoDuplicatesConflictAction (a queue-only replace where the latest one wins), because its successData carries caller-specific preservation writes that coverage cannot see, so an OpenApp that lands mid-flight still runs and nothing is silently dropped. No other command is affected.
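
For contrast, the queue-only, latest-wins behavior that OpenApp keeps can be sketched like this. The names QueuedRequest, ConflictAction, resolveLatestWins and applyAction are hypothetical and the id field exists only to tell requests apart; the point is that the in-flight request is never consulted, so an OpenApp arriving mid-flight is still pushed.

```typescript
// Hypothetical, simplified types; the real queue types in the app carry more fields.
type QueuedRequest = {command: string; id: number};
type ConflictAction = {type: 'push'} | {type: 'replace'; index: number};

// Queue-only, latest-wins: looks at waiting requests only, never at the one in flight.
function resolveLatestWins(queued: QueuedRequest[], matches: (request: QueuedRequest) => boolean): ConflictAction {
    const index = queued.findIndex(matches);
    return index === -1 ? {type: 'push'} : {type: 'replace', index};
}

// Applies the action to a copy of the waiting queue.
function applyAction(queued: QueuedRequest[], incoming: QueuedRequest, action: ConflictAction): QueuedRequest[] {
    if (action.type === 'push') {
        return [...queued, incoming];
    }
    const {index} = action;
    return queued.map((request, i) => (i === index ? incoming : request));
}

const isOpenApp = (request: QueuedRequest): boolean => request.command === 'OpenApp';

// Nothing is waiting (an OpenApp may be in flight, but that is not checked): the new one is pushed.
const emptyQueueAction = resolveLatestWins([], isOpenApp);

// An OpenApp is already waiting: the incoming one replaces it in place.
const waitingQueue: QueuedRequest[] = [
    {command: 'OpenApp', id: 1},
    {command: 'AddComment', id: 2},
];
const incomingOpenApp: QueuedRequest = {command: 'OpenApp', id: 3};
const replacedQueue = applyAction(waitingQueue, incomingOpenApp, resolveLatestWins(waitingQueue, isOpenApp));
```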

This is a safety net, not the main throttle. The bulk of the storm reduction lives in two other PRs that cut the cause: "Stop repeated ReconnectApp calls when the device clock is behind the server" (#93426) and "coalesce redundant GetMissingOnyxMessages while a fetch is in flight" (#93948). This PR adds the queue-level correctness on top: if a duplicate reconnect still slips through, the queue now catches it and drops it. It also fixes the wrong NetworkState comment and documents the resolver in SEQUENTIAL_QUEUE.md.

Fixed Issues

$ #92541
PROPOSAL:

Tests

  1. Run RequestConflictUtilsTest.ts. It covers the coverage-table cases: full/full drops, full/incremental drops, incremental/incremental compared by coverage, incremental/full keeps, and OpenApp during an in-flight reconnect always runs.
  2. Run SequentialQueueTest.ts. It checks the full flow: a duplicate ReconnectApp is dropped so exactly one request goes on the wire; an in-flight incremental plus an incoming full means the full one runs after; a different command is not collapsed; a waiting full is not overwritten by an incoming incremental; IS_LOADING_REPORT_DATA still settles to false after a drop; a failed in-flight reconnect is retried with no data loss; and an incoming OpenApp during an in-flight reconnect still runs.
  • Verify that no errors appear in the JS console

Offline tests

N/A

QA Steps

  1. With the app in the background, toggle airplane mode off, on, off rapidly (2 to 3 times within about 1 second). Confirm: the loading indicator appears once and resolves cleanly; no stuck loading bar; no duplicate network requests in the network inspector.
  2. Force-close the app and cold-boot it. Confirm: OpenApp runs, the home screen loads fully, and no loading state is stuck.
  3. Navigate to a chat, go offline for 5 seconds, then come back online. Confirm: messages sync correctly and the UI returns to a normal responsive state without a manual refresh.
  • Verify that no errors appear in the JS console

PR Author Checklist

  • I linked the correct issue in the ### Fixed Issues section above
  • I wrote clear testing steps that cover the changes made in this PR
    • I added steps for local testing in the Tests section
    • I added steps for the expected offline behavior in the Offline steps section
    • I added steps for Staging and/or Production testing in the QA steps section
    • I added steps to cover failure scenarios (i.e. verify an input displays the correct error message if the entered data is not correct)
    • I turned off my network connection and tested it while offline to ensure it matches the expected behavior (i.e. verify the default avatar icon is displayed if app is offline)
    • I tested this PR with a High Traffic account against the staging or production API to ensure there are no regressions (e.g. long loading states that impact usability).
  • I included screenshots or videos for tests on all platforms
  • I ran the tests on all platforms & verified they passed on:
    • Android: Native
    • Android: mWeb Chrome
    • iOS: Native
    • iOS: mWeb Safari
    • MacOS: Chrome / Safari
  • I verified there are no console errors (if there's a console error not related to the PR, report it or open an issue for it to be fixed)
  • I followed proper code patterns (see Reviewing the code)
    • I verified that any callback methods that were added or modified are named for what the method does and never what callback they handle (i.e. toggleReport and not onIconClick)
    • I verified that comments were added to code that is not self explanatory
    • I verified that any new or modified comments were clear, correct English, and explained "why" the code was doing something instead of only explaining "what" the code was doing.
    • I verified any copy / text that was added to the app is grammatically correct in English. It adheres to proper capitalization guidelines (note: only the first word of header/labels should be capitalized), and is either coming verbatim from figma or has been approved by marketing (in order to get marketing approval, ask the Bug Zero team member to add the Waiting for copy label to the issue)
  • If a new code pattern is added I verified it was agreed to be used by multiple Expensify engineers
  • I followed the guidelines as stated in the Review Guidelines
  • I tested other components that can be impacted by my changes (i.e. if the PR modifies a shared library or component like Avatar, I verified the components using Avatar are working as expected)
  • If any new file was added I verified that:
    • The file has a description of what it does and/or why is needed at the top of the file if the code is not self explanatory
  • If a new CSS style is added I verified that:
    • A similar style doesn't already exist
    • The style can't be created with an existing StyleUtils function (i.e. StyleUtils.getBackgroundAndBorderStyle(theme.componentBG))
  • If new assets were added or existing ones were modified, I verified that:
    • The assets are optimized and compressed (for SVG files, run npm run compress-svg)
    • The assets load correctly across all supported platforms.
  • If the PR modifies code that runs when editing or sending messages, I tested and verified there is no unexpected behavior for all supported markdown - URLs, single line code, code blocks, quotes, headings, bold, strikethrough, and italic.
  • If the PR modifies a generic component, I tested and verified that those changes do not break usages of that component in the rest of the App (i.e. if a shared library or component like Avatar is modified, I verified that Avatar is working as expected in all cases)
  • If the PR modifies a component related to any of the existing Storybook stories, I tested and verified all stories for that component are still working as expected.
  • If the PR modifies a component or page that can be accessed by a direct deeplink, I verified that the code functions as expected when the deeplink is used - from a logged in and logged out account.
  • If the PR modifies the UI (e.g. new buttons, new UI components, changing the padding/spacing/sizing, moving components, etc) or modifies the form input styles:
    • I verified that all the inputs inside a form are aligned with each other.
    • I added Design label and/or tagged @Expensify/design so the design team can review the changes.
  • I added unit tests for any new feature or bug fix in this PR to help automatically prevent regressions in this user flow.
  • If the main branch was merged into this PR after a review, I tested again and verified the outcome was still expected according to the Test steps.

Screenshots/Videos

Android: Native
Android: mWeb Chrome
iOS: Native
iOS: mWeb Safari
MacOS: Chrome / Safari

@codecov

codecov Bot commented Jun 22, 2026


Codecov Report

❌ Looks like you've decreased code coverage for some files. Please write tests to increase, or at least maintain, the existing level of code coverage. See our documentation here for how to interpret this table.

Files with missing lines                    Coverage Δ
src/libs/NetworkState.ts                    93.37% <ø> (ø)
src/libs/actions/App.ts                     52.65% <100.00%> (ø)
src/libs/actions/RequestConflictUtils.ts    100.00% <100.00%> (ø)
src/libs/API/index.ts                       92.45% <83.33%> (-0.55%) ⬇️
... and 16 files with indirect coverage changes

@adhorodyski

Contributor Author

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 00775b5dfb

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread src/libs/actions/RequestConflictUtils.ts
adhorodyski and others added 3 commits June 23, 2026 12:29
…aller-specific side effects always run

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…connectApp on the coverage resolver

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@adhorodyski

Contributor Author

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: fc7717e6dd


Comment thread src/libs/actions/RequestConflictUtils.ts
@mountiny mountiny self-requested a review June 23, 2026 14:04
@mountiny mountiny marked this pull request as ready for review June 23, 2026 14:04
@mountiny mountiny requested review from a team as code owners June 23, 2026 14:04
@melvin-bot melvin-bot Bot requested review from blimpich, flaviadefaria and mkhutornyi and removed request for a team June 23, 2026 14:04
@melvin-bot

melvin-bot Bot commented Jun 23, 2026


@mkhutornyi @blimpich One of you needs to copy/paste the Reviewer Checklist from here into a new comment on this PR and complete it. If you have the K2 extension, you can simply click: [this button]

@melvin-bot melvin-bot Bot removed the request for review from a team June 23, 2026 14:05
@mountiny mountiny marked this pull request as draft June 23, 2026 14:05
@mountiny

Contributor

@codex review

@mountiny

Contributor

@MelvinBot review

@MelvinBot

Contributor

Review

Verdict: The approach is sound and well-tested. I traced the two open codex P2s and the core logic — the design holds up. One of the P2s is already fixed; the other is not a real bug. Details below.

The unaddressed codex P2 (full-reconnect cutoff freshness) is a false alarm

Codex flagged that dropping a newer full reconnect could let an older in-flight one write a stale LAST_FULL_RECONNECT_TIME and re-trigger a loop. It can't, for two independent reasons:

  1. The guard is written synchronously, before the request. triggerFullReconnect merges LAST_FULL_RECONNECT_TIME up front, then calls reconnectApp(). Dropping the duplicate request doesn't undo that merge, so shouldTriggerFullReconnect is already false for the new cutoff regardless of which request wins the queue.
  2. The written value is floored at now. successData writes getLastFullReconnectTimeToRecord(serverReconnectCutoff) = max(now, cutoff) (FullReconnectUtils.ts), reading the live module-level serverReconnectCutoff at build time (App.ts:398), not a value snapshotted at trigger time. So it can't regress below now.
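
The "floored at now" argument in point 2 comes down to a single max(). A minimal sketch, assuming timestamps are epoch milliseconds (the real helper in FullReconnectUtils.ts may use a different representation) and using a hypothetical function name:

```typescript
// Hypothetical stand-in for the helper cited above.
// Assumption: both times are epoch milliseconds.
function lastFullReconnectTimeToRecordSketch(nowMs: number, serverReconnectCutoffMs: number): number {
    // The recorded value can never be earlier than the moment it is written.
    return Math.max(nowMs, serverReconnectCutoffMs);
}

const nowMs = 1_000_000;

// Cutoff in the past: "now" is recorded, so an older in-flight request cannot write a stale value.
const cutoffInPast = lastFullReconnectTimeToRecordSketch(nowMs, nowMs - 5_000);

// Cutoff ahead of a skewed client clock: the cutoff itself is recorded.
const cutoffAhead = lastFullReconnectTimeToRecordSketch(nowMs, nowMs + 5_000);
```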

The only residual edge is server cutoff set ahead of a skewed client clock — bounded to one extra, self-correcting reconnect, and it predates this PR (it's exactly what the max() guards). Worth a reply on that thread so it doesn't block.

Core logic checks out

  • Coverage math is correct. Lower updateIDFrom = more coverage; full reconnect / OpenApp = 0 = covers everything. Incoming dropped iff some live reconnect-family request has coverage <= incoming. Verified against the test table — incremental(500) covers incoming incremental(600) (drop), but incoming incremental(400) survives a live 500 (push, runs after). No under-fetch.
  • Reading getOngoingRequest() inside the closure is right. checkAndFixConflictingRequest runs twice — a read-only pass in prepareRequest and the authoritative pass in SequentialQueue.push. The ongoing request can change between them, so evaluating it live (not captured once) is what keeps both passes consistent with the real in-flight state.
  • The dropped-request spinner is safe. IS_LOADING_REPORT_DATA is a single global key set true/false in every reconnect-family request's optimistic/finallyData (App.ts:355-369). On noAction, optimistic data is suppressed (so the drop never sets it true), and the covering in-flight request clears the shared flag when it finishes. Covered by the IS_LOADING_REPORT_DATA test.
  • {conflictAction: {type: 'push'}} is type-complete (pushNewRequest belongs only to delete) and data: {updateIDFrom: unknown} type-checks against Record<string, unknown> with the runtime typeof === 'number' guard. No issues.
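
The second bullet (reading the ongoing request inside the closure) is easiest to see with a toy model. This sketch uses hypothetical names throughout and is not the app's queue; it only contrasts a checker that snapshots the in-flight request when it is built with one that reads it live on every evaluation.

```typescript
type InFlight = {command: string} | null;

// Stand-in for the queue's in-flight slot; hypothetical, not the app's module state.
let ongoingRequest: InFlight = null;
const getOngoingRequestSketch = (): InFlight => ongoingRequest;

// Captures the in-flight request once, when the checker is created.
function makeSnapshotChecker(): () => boolean {
    const snapshot = getOngoingRequestSketch();
    return () => snapshot?.command === 'ReconnectApp';
}

// Reads the in-flight request every time the checker runs.
function makeLiveChecker(): () => boolean {
    return () => getOngoingRequestSketch()?.command === 'ReconnectApp';
}

// Both checkers are built while nothing is in flight (think: before the first pass).
const snapshotChecker = makeSnapshotChecker();
const liveChecker = makeLiveChecker();

// Between the two passes, a reconnect starts sending.
ongoingRequest = {command: 'ReconnectApp'};

// Second pass: only the live read sees the real in-flight state.
const snapshotSeesReconnect = snapshotChecker();
const liveSeesReconnect = liveChecker();
```

Here the snapshot checker still reports no reconnect in flight, while the live checker reports one, which is why evaluating the getter inside the closure keeps both passes consistent.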

Notes (non-blocking)

  • The failing checklist check is the human Reviewer Checklist (@mkhutornyi/@blimpich), not a code failure.
  • Tests are thorough — the suite pins the resolver table, the queue wiring, the under-fetch fix, the spinner, and OpenApp-during-reconnect. Nice that they're wired through the real resolver rather than a stand-in matcher.
Investigation method

Spawned two researchers against main: one to trace the full-reconnect cutoff mechanism (subscribeToFullReconnect → triggerFullReconnect → getOnyxDataForOpenOrReconnect), one to validate the resolver wiring (two-pass conflict evaluation, optimistic-data suppression, type safety). Both returned high confidence; findings verified directly against source at the cited lines.


Next Steps: Reply with @MelvinBot <feedback> to dig deeper on any point.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: fc7717e6dd


Comment thread src/libs/actions/RequestConflictUtils.ts
Comment thread src/libs/actions/RequestConflictUtils.ts
@adhorodyski adhorodyski marked this pull request as ready for review June 23, 2026 15:59

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: fc7717e6dd


Comment thread src/libs/API/index.ts
blimpich
blimpich previously approved these changes Jun 24, 2026

@blimpich blimpich left a comment

Contributor

Few NAB comments

Comment thread src/libs/actions/RequestConflictUtils.ts Outdated
Comment thread src/libs/actions/RequestConflictUtils.ts Outdated
Comment thread src/libs/API/index.ts Outdated
…narrow readUpdateIDFrom return type

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@mkhutornyi

mkhutornyi commented Jun 24, 2026

Contributor

Reviewer Checklist

  • I have verified the author checklist is complete (all boxes are checked off).
  • I verified the correct issue is linked in the ### Fixed Issues section above
  • I verified testing steps are clear and they cover the changes made in this PR
    • I verified the steps for local testing are in the Tests section
    • I verified the steps for Staging and/or Production testing are in the QA steps section
    • I verified the steps cover any possible failure scenarios (i.e. verify an input displays the correct error message if the entered data is not correct)
    • I turned off my network connection and tested it while offline to ensure it matches the expected behavior (i.e. verify the default avatar icon is displayed if app is offline)
  • I checked that screenshots or videos are included for tests on all platforms
  • I included screenshots or videos for tests on all platforms
  • I verified that the composer does not automatically focus or open the keyboard on mobile unless explicitly intended. This includes checking that returning the app from the background does not unexpectedly open the keyboard.
  • I verified tests pass on all platforms & I tested again on:
    • Android: HybridApp
    • Android: mWeb Chrome
    • iOS: HybridApp
    • iOS: mWeb Safari
    • MacOS: Chrome / Safari
  • If there are any errors in the console that are unrelated to this PR, I either fixed them (preferred) or linked to where I reported them in Slack
  • I verified proper code patterns were followed (see Reviewing the code)
    • I verified that any callback methods that were added or modified are named for what the method does and never what callback they handle (i.e. toggleReport and not onIconClick).
    • I verified that comments were added to code that is not self explanatory
    • I verified that any new or modified comments were clear, correct English, and explained "why" the code was doing something instead of only explaining "what" the code was doing.
    • I verified any copy / text that was added to the app is grammatically correct in English. It adheres to proper capitalization guidelines (note: only the first word of header/labels should be capitalized), and is either coming verbatim from figma or has been approved by marketing (in order to get marketing approval, ask the Bug Zero team member to add the Waiting for copy label to the issue)
  • If a new code pattern is added I verified it was agreed to be used by multiple Expensify engineers
  • I verified that this PR follows the guidelines as stated in the Review Guidelines
  • I verified other components that can be impacted by these changes have been tested, and I retested again (i.e. if the PR modifies a shared library or component like Avatar, I verified the components using Avatar have been tested & I retested again)
  • If a new component is created I verified that:
    • A similar component doesn't exist in the codebase
    • All props are defined accurately and each prop has a /** comment above it */
    • The file is named correctly
    • The component has a clear name that is non-ambiguous and the purpose of the component can be inferred from the name alone
    • The only data being stored in the state is data necessary for rendering and nothing else
    • For Class Components, any internal methods passed to components event handlers are bound to this properly so there are no scoping issues (i.e. for onClick={this.submit} the method this.submit should be bound to this in the constructor)
    • Any internal methods bound to this are necessary to be bound (i.e. avoid this.submit = this.submit.bind(this); if this.submit is never passed to a component event handler like onClick)
    • All JSX used for rendering exists in the render method
    • The component has the minimum amount of code necessary for its purpose, and it is broken down into smaller components in order to separate concerns and functions
  • If any new file was added I verified that:
    • The file has a description of what it does and/or why is needed at the top of the file if the code is not self explanatory
  • If a new CSS style is added I verified that:
    • A similar style doesn't already exist
    • The style can't be created with an existing StyleUtils function (i.e. StyleUtils.getBackgroundAndBorderStyle(theme.componentBG))
  • If the PR modifies code that runs when editing or sending messages, I tested and verified there is no unexpected behavior for all supported markdown - URLs, single line code, code blocks, quotes, headings, bold, strikethrough, and italic.
  • If the PR modifies a generic component, I tested and verified that those changes do not break usages of that component in the rest of the App (i.e. if a shared library or component like Avatar is modified, I verified that Avatar is working as expected in all cases)
  • If the PR modifies a component related to any of the existing Storybook stories, I tested and verified all stories for that component are still working as expected.
  • If the PR modifies a component or page that can be accessed by a direct deeplink, I verified that the code functions as expected when the deeplink is used - from a logged in and logged out account.
  • If the PR modifies the UI (e.g. new buttons, new UI components, changing the padding/spacing/sizing, moving components, etc) or modifies the form input styles:
    • I verified that all the inputs inside a form are aligned with each other.
    • I added Design label and/or tagged @Expensify/design so the design team can review the changes.
  • For any bug fix or new feature in this PR, I verified that sufficient unit tests are included to prevent regressions in this flow.
  • If the main branch was merged into this PR after a review, I tested again and verified the outcome was still expected according to the Test steps.
  • I have checked off every checkbox in the PR reviewer checklist, including those that don't apply to this PR.

Screenshots/Videos

Android: HybridApp
android.mov
Android: mWeb Chrome
iOS: HybridApp
ios.mov
iOS: mWeb Safari
MacOS: Chrome / Safari
web.mov

@MelvinBot

Contributor

🤖 Review: looks correct and well-tested. The coverage model is sound, the wiring is right, and the test matrix covers the meaningful cases. One safe-but-worth-noting behavioral change vs. main, plus a couple of minor notes.

Core logic ✅

resolveReconnectDuplicationConflictAction drops an incoming reconnect iff some live (in-flight or queued) reconnect-family request has coverage <= incomingCoverage. Since lower updateIDFrom ⇒ strict superset of the range fetched ([from..now]), the <= comparison is monotone and the drop/keep decision is correct for every full/incremental pairing. Never drops a wider request → never under-fetches, which is the property that matters. The in-flight read is captured correctly: getOngoingRequest() is evaluated inside the closure, and params.updateIDFrom is only set for incremental reconnects in reconnectApp, so a full reconnect correctly maps to coverage 0.

One behavioral change to be aware of (safe)

This resolver only ever returns push or noAction, never delete/replace. The previous generic path (resolveDuplicationConflictAction → replace, latest-wins) meant a queued reconnect was always collapsed to one. Now, a wider incoming reconnect arriving while a narrower one is already queued is appended, and the narrower (now-redundant) queued one is left in place, so both run (one extra over-fetch). The reverse and identical cases are handled optimally; this is the only ordering that regresses, and it's strictly an over-fetch, never data loss. Fine for a safety net, but if you want symmetry you could delete the queued requests that the pushed wider one covers. Your call: I'd leave it given the index-fragility warnings in SEQUENTIAL_QUEUE.md.
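
The optional symmetry fix could look roughly like the sketch below. It is an illustration under assumptions, not a proposed patch: QueuedSketch, reconnectCoverageSketch and pushAndPruneCovered are hypothetical names, the function works on a plain array copy rather than the real persisted queue, and it deliberately prunes only ReconnectApp entries (never OpenApp or other commands, and never the in-flight request).

```typescript
// Hypothetical, simplified request shape.
type QueuedSketch = {command: string; data?: Record<string, unknown>};

// Lower number = more coverage; a missing updateIDFrom is treated as a full reconnect (0).
function reconnectCoverageSketch(request: QueuedSketch): number {
    const updateIDFrom = request.data?.updateIDFrom;
    return typeof updateIDFrom === 'number' ? updateIDFrom : 0;
}

// Appends the incoming reconnect and removes queued reconnects that it makes redundant.
function pushAndPruneCovered(queued: QueuedSketch[], incoming: QueuedSketch): QueuedSketch[] {
    const incomingCoverage = reconnectCoverageSketch(incoming);
    const survivors = queued.filter(
        // Keep everything that is not a ReconnectApp, and any reconnect strictly wider than the incoming one.
        (request) => request.command !== 'ReconnectApp' || reconnectCoverageSketch(request) < incomingCoverage,
    );
    return [...survivors, incoming];
}

const queuedBefore: QueuedSketch[] = [
    {command: 'ReconnectApp', data: {updateIDFrom: 500}},
    {command: 'AddComment'},
];

// A full reconnect covers the queued incremental(500), so only one reconnect remains.
const queuedAfter = pushAndPruneCovered(queuedBefore, {command: 'ReconnectApp'});
```

A real version would have to go through the queue's own delete path and respect the index-fragility warnings mentioned above; this sketch sidesteps that by returning a new array.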

Minor notes
  • Wiring coverage gap: SequentialQueueTest hand-rebuilds the resolver wiring via makeReconnectRequest rather than calling the real writeWithNoDuplicatesReconnectConflictAction, and SubscribeToFullReconnectTest mocks @libs/API wholesale. So no test exercises the actual writeWithNoDuplicatesReconnectConflictAction body end-to-end — a bug in how it builds incomingRequest or passes getOngoingRequest() wouldn't be caught. The pure-resolver and queue-behavior coverage is strong, so this is low-risk, but worth knowing.
  • Two coverage-read paths: the incoming request uses readUpdateIDFrom(params) then re-wraps into data.updateIDFrom, while persisted/ongoing requests read request.data?.updateIDFrom via reconnectCoverageFrom. Consistent today; just a small duplication.

✅ All CI green (typecheck, ESLint, tests, perf). The NetworkState comment fix and SEQUENTIAL_QUEUE.md documentation are accurate and a nice touch.


Next Steps: Reply with @MelvinBot <feedback> to refine this review.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 016db62b15


*/
function resolveReconnectDuplicationConflictAction(persistedRequests: AnyRequest[], ongoingRequest: AnyRequest | null, incomingRequest: AnyRequest): ConflictActionData {
const incomingCoverage = reconnectCoverageFrom(incomingRequest);
const isCovered = [ongoingRequest, ...persistedRequests].some((live) => isReconnectFamilyRequest(live) && reconnectCoverageFrom(live) <= incomingCoverage);


P2 Badge Preserve reconnect success data when dropping duplicates

When a full ReconnectApp was created before the sidebar loaded and a duplicate full reconnect is requested after isSidebarLoaded becomes true, this coverage check drops the later request. That later request is the one whose getOnyxDataForOpenOrReconnect(..., isSidebarLoaded) contains the public-room successData merges; the earlier queued/in-flight request does not, so its full reconnect response can remove public rooms instead of preserving them. Please either include/merge the reconnect Onyx side effects in the dedupe decision or keep the later request when those side effects differ.

Useful? React with 👍 / 👎.

@MelvinBot

Contributor

🤖 Regression scan: no data-loss regressions. One confirmed minor over-fetch regression, plus one edge I checked and cleared.

✅ No correctness / data-loss regression

The dedup only ever drops a reconnect that is a coverage subset of a request that is live (in-flight or queued). The covering request fetches the same-or-wider range and, if it fails, is retried by the queue (durable ongoing request). So a dropped duplicate can never cause a missed backfill. The new behavior where an incoming ReconnectApp is dropped while an OpenApp is in flight is also safe and intended — OpenApp re-fetches everything a reconnect would.

⚠️ Confirmed regression (minor, safe): extra redundant fetch

This is the only behavioral regression vs. main. resolveReconnectDuplicationConflictAction only returns push or noAction — never replace/delete. On main, ReconnectApp used the latest-wins replace resolver, so the waiting queue never held more than one reconnect.

Regressing ordering: a narrower reconnect is already waiting in the queue, then a wider one comes in. The wider one isn't covered, so it's pushed and the now-redundant narrower one is left in place → both run (one extra request). On main, replace collapsed them to one.

Strictly an over-fetch, never under-fetch, so no data loss — acceptable for a safety net. Optional fix: delete the queued requests the pushed-wider one covers (weigh against the index-fragility warnings in SEQUENTIAL_QUEUE.md). This is the same point I raised in my earlier review.

🔎 Edge I checked and cleared (no regression)

Because the resolver runs on two evaluation passes, an incoming duplicate can have its optimistic IS_LOADING_REPORT_DATA = true applied in pass 1 and then be dropped in pass 2 — its own finallyData never runs. I confirmed this does not strand the spinner: every reconnect variant (full and partial) sets only IS_LOADING_REPORT_DATA optimistically, and both the in-flight reconnect and OpenApp clear that exact key in finallyData. So the live request always resets the shared flag. The IS_LOADING_REPORT_DATA-after-drop test pins this.
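
The spinner argument can be modeled with a single shared boolean. This is a toy sketch with hypothetical names (FlagStore, applyOptimistic, applyFinally), not Onyx code; it models the worst case described above, where the dropped duplicate has already applied its optimistic write.

```typescript
// Hypothetical stand-in for the single global IS_LOADING_REPORT_DATA key.
type FlagStore = {isLoadingReportData: boolean};

const store: FlagStore = {isLoadingReportData: false};

// Every reconnect-family request sets the flag optimistically and clears it in its finally step.
const applyOptimistic = (target: FlagStore): void => {
    target.isLoadingReportData = true;
};
const applyFinally = (target: FlagStore): void => {
    target.isLoadingReportData = false;
};

// 1. The live reconnect starts sending.
applyOptimistic(store);

// 2. A duplicate arrives; its optimistic write is applied in the first pass.
applyOptimistic(store);

// 3. The second pass drops the duplicate, so its own finally step never runs.
const flagWhileInFlight = store.isLoadingReportData;

// 4. The live reconnect settles and clears the shared key.
applyFinally(store);
const flagAfterLiveRequestSettles = store.isLoadingReportData;
```

Because both requests write the same key, the live request's finally step is enough to reset it, regardless of how many dropped duplicates set it to true in between.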


Next Steps: Reply with @MelvinBot <feedback> to dig deeper into any vector.

@mkhutornyi

Contributor

Navigate to a chat, go offline for 5 seconds, then come back online. Confirm: messages sync correctly and the UI returns to a normal responsive state without a manual refresh.

This test fails intermittently on iOS: the loading bar doesn't disappear. It also happens on production.

bug.mov

@mkhutornyi mkhutornyi left a comment

Contributor

Please check above comments

@adhorodyski

Contributor Author

Hm, I think this should also remove the ones skipped by the resolver (as Melvin mentioned, leaving them is safe, but it's better to delete them). Not sure why I missed this one scenario. We have control there, so there should be no over-fetch either. This needs correction.



5 participants