@easwars easwars commented Aug 20, 2025

The change being reverted here (#8369) is a prime suspect for a race that can show up with the following sequence of events:

  • create a new gRPC channel with the xds:/// scheme
  • make an RPC
  • close the channel
  • repeat (possibly from multiple goroutines)

The observable symptom of the race is that the xDS client believes the control plane has removed a Listener resource when it has not. As a result, the user's gRPC channel moves to TRANSIENT_FAILURE and subsequent RPCs fail.

The above-mentioned PR is not being rolled back with git revert because the xds directory structure has changed significantly since the PR was originally merged. Performing the revert manually seemed much easier.

RELEASE NOTES:

  • xdsclient: Revert a change that introduces a race with xDS resource processing, leading to RPC failures

@easwars easwars requested a review from dfawley August 20, 2025 19:12
@easwars easwars added the Type: Bug and Area: xDS labels Aug 20, 2025
@easwars easwars added this to the 1.76 Release milestone Aug 20, 2025
codecov bot commented Aug 20, 2025

Codecov Report

❌ Patch coverage is 60.00000% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 81.91%. Comparing base (fa0d658) to head (7fec595).
⚠️ Report is 3 commits behind head on master.

Files with missing lines | Patch % | Lines
internal/xds/clients/xdsclient/authority.go | 60.00% | 4 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #8527      +/-   ##
==========================================
+ Coverage   81.86%   81.91%   +0.04%     
==========================================
  Files         412      412              
  Lines       40518    40451      -67     
==========================================
- Hits        33172    33137      -35     
+ Misses       5953     5924      -29     
+ Partials     1393     1390       -3     
Files with missing lines | Coverage Δ
internal/xds/clients/xdsclient/ads_stream.go | 82.95% <ø> (+0.14%) ⬆️
internal/xds/clients/xdsclient/channel.go | 77.56% <ø> (-1.39%) ⬇️
internal/xds/clients/xdsclient/xdsclient.go | 82.96% <ø> (+0.96%) ⬆️
internal/xds/clients/xdsclient/authority.go | 80.00% <60.00%> (+3.06%) ⬆️

... and 19 files with indirect coverage changes

@dfawley dfawley changed the title from "xdsclient: revert https://github.com/grpc/grpc-go/pull/8369" to "xdsclient: revert #8369: delay resource cache deletion" Aug 21, 2025
@dfawley dfawley assigned easwars and unassigned dfawley Aug 21, 2025
@dfawley dfawley modified the milestones: 1.76 Release, 1.75 Release Aug 21, 2025
@easwars easwars merged commit b0bc6dc into grpc:master Aug 21, 2025
19 of 23 checks passed
@easwars easwars deleted the revert_8369 branch August 21, 2025 17:52
eshitachandwani pushed a commit to eshitachandwani/grpc-go that referenced this pull request Aug 29, 2025