Fix a race condition in the tests #118
Merged
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #118      +/-   ##
==========================================
- Coverage   79.33%   79.24%   -0.09%
==========================================
  Files          10       10
  Lines        1916     1913       -3
==========================================
- Hits         1520     1516       -4
- Misses        396      397       +1
Member
Author
(bump)
vchuravy
approved these changes
Jan 15, 2025
vchuravy
left a comment
Member
Can you add a comment? I always find timeouts scary, since they may still fail.
Quick explanation of what the `GC tests for RemoteChannels` test does:

1. Create `RemoteChannel`s `rr` and `fstore` on worker 1 and worker 2 respectively from the master process. At this point only the master process knows about `rr` and `fstore`.
2. The master process calls `put!(fstore, rr)`, i.e. we remotecall worker 2 and put `rr` (which is owned by worker 1 but is currently known only to the master) into `fstore`.
3. Remotecall into worker 1 and check that it knows about `rr`.

Step 3 should succeed despite us never previously explicitly communicating with worker 1, because `serialize(::ClusterSerializer, ::RemoteChannel)` will send a message to the owner of the `RemoteChannel` to inform them of its existence (see `send_add_client()`). This happens asynchronously in step 2, and on rare occasions worker 1 would not process that message before step 3, causing the test to fail.

Now we give the check 10s to succeed.
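For illustration, the "give the check 10s to succeed" pattern can be sketched with `Base.timedwait`, which polls a callback until it returns `true` or a timeout expires. This is a minimal single-process sketch and not the code from this PR: `known_refs` is a hypothetical stand-in for the owner's table of known remote references, and the `@async` block only simulates the asynchronous message that `send_add_client()` would deliver.

```julia
# Hypothetical stand-in for worker 1's table of known remote references.
known_refs = Set{Symbol}()

# Simulate the asynchronous "add client" message arriving slightly later,
# as it would after `put!(fstore, rr)` serializes `rr`.
@async begin
    sleep(0.2)
    push!(known_refs, :rr)
end

# An immediate check such as `:rr in known_refs` could race with the message.
# Instead, poll every 50 ms until the condition holds or 10 seconds elapse.
result = timedwait(() -> :rr in known_refs, 10.0; pollint=0.05)

# `timedwait` returns `:ok` once the callback returned true,
# or `:timed_out` if the deadline passed first.
println(result)
```

In a test, the returned symbol would be compared against `:ok`, so a message that never arrives still fails the test, but only after the full timeout rather than on the first unlucky scheduling order.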
cc6aa30 to
f9d3d89
Member
Author
Sure, added in f9d3d89.
Cherry-picked from JuliaParallel/DistributedNext.jl#8.