[WIP/POC] ordered RPC#8430
Conversation
|
After implementing this, I'm not entirely sure if this is what we actually need/want. I'll have to think a little more about it. Either way, this doesn't handle many cases. |
Unit Test ResultsSee test report for an extended history of previous test failures. This is useful for diagnosing flaky tests. 27 files ± 0 27 suites ±0 9h 39m 36s ⏱️ - 11m 22s For more details on these failures, see this check. Results for commit 570af51. ± Comparison against base commit 1c74474. This pull request removes 2 and adds 5 tests. Note that renamed tests count towards both.♻️ This comment has been updated with latest results. |
|
I cleaned this up and took it a step further. This now moves the worker heartbeat to the scheduler stream. This was not possible earlier because the heartbeat expected a scheduler response to adjust its interval. This is the only reason why the heartbeat used a dedicated connection pool. |
|
I'm starting to like this after all. Moving the heartbeat was seamless and I believe this avoids a lot of anti patterns. The transfer of actual payload data should not be done over this interface so we can't get rid of the old |
|
Well, thinking about the above statement again, I think there is nothing wrong with payload data on that stream as long as we're not re-using/abusing the primary administrative stream between scheduler/worker. However, to use this interface for smth like P2P where we would want to use the same stream that is also used for all the task-finished/erred/etc. messages we'll likely need to extend this multiplexing to move data off-band after all and just transmit administrative metadata over the stream. |
| a.digest_metric(("execute", span.id, "x", "test", "test"), 1) | ||
| await asyncio.sleep(0) | ||
| await hb_task | ||
| assert n > 9 |
There was a problem hiding this comment.
This 9 feels very magical. since the heartbeat doesn't have to open a new comm, we need fewer ticks until it completes and this is not guaranteed to be past 9.
I'm not entirely convinced this test still works/make sense/is relevant.
|
@hendrikmakait this now moved the P2P logic to the ordered PRC. Do you recall which logic was put in P2P to ensure ordering? I would hope that we could remove some logic to test this (and ultimately reduce complexity) |
|
The one test failure right now is |
Closes #7480
cc @hendrikmakait