Fix deadlock on deserialization failure #116
Previously, we introduced disconnecting a counterparty who sends us bogus messages that we're unable to parse. Before we disconnect, we also send out a last `LSPSMessage::Invalid` in the hope that the counterparty understands it. However, when we added this logic we unfortunately overlooked that we lock the `request_id_to_method_map` `Mutex` for parsing the message, but also try to lock it when the `PeerHandler` calls `get_and_clear_pending_msg`. Here, we avoid the resulting deadlock by dropping the lock as soon as parsing no longer requires it.
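To make the locking issue concrete, here is a minimal sketch of the pattern described above. The `Handler` type, its fields, the toy `parse` function, and the string-based messages are hypothetical stand-ins and not the crate's actual types; only the names `request_id_to_method_map` and `get_and_clear_pending_msg` come from the commit message. The point is that the guard lives in an inner block, so it is released before the same thread calls back into `get_and_clear_pending_msg`.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical stand-in for the message handler; not the crate's real type.
struct Handler {
    // Maps outstanding request ids to the method they were sent for.
    request_id_to_method_map: Mutex<HashMap<String, String>>,
    // Messages queued for sending, as (peer, payload) pairs.
    pending_messages: Mutex<Vec<(String, String)>>,
}

impl Handler {
    fn new() -> Self {
        Handler {
            request_id_to_method_map: Mutex::new(HashMap::new()),
            pending_messages: Mutex::new(Vec::new()),
        }
    }

    /// Toy parser (assumption): accepts `id:payload` for a known request id.
    fn parse(map: &mut HashMap<String, String>, raw: &str) -> Option<String> {
        let (id, _payload) = raw.split_once(':')?;
        map.remove(id)
    }

    /// On success returns the matched method. On failure queues an "invalid"
    /// reply and returns the messages flushed to the peer.
    fn handle_message(&self, peer: &str, raw: &str) -> Result<String, Vec<(String, String)>> {
        // Hold the map lock only while parsing.
        let parsed = {
            let mut map = self.request_id_to_method_map.lock().unwrap();
            Self::parse(&mut map, raw)
        }; // The guard is dropped here, before anything else can re-lock.

        match parsed {
            Some(method) => Ok(method),
            None => {
                self.pending_messages
                    .lock()
                    .unwrap()
                    .push((peer.to_string(), "invalid".to_string()));
                // Stand-in for the peer handler processing events, which
                // calls back into us on the same thread. Had the guard above
                // still been alive, this second lock attempt would deadlock
                // (or panic) with `std::sync::Mutex`.
                Err(self.get_and_clear_pending_msg())
            }
        }
    }

    fn get_and_clear_pending_msg(&self) -> Vec<(String, String)> {
        // This sketch still takes the map lock here to mirror the situation
        // that caused the deadlock.
        let _map = self.request_id_to_method_map.lock().unwrap();
        let mut pending = self.pending_messages.lock().unwrap();
        std::mem::take(&mut *pending)
    }
}
```

If the guard were instead bound at function scope, the failure branch would attempt to lock a `Mutex` the current thread already holds, which is the situation this change avoids.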
.. in `get_and_clear_pending_msg`. We also remove a potentially dangerous (if we ever were to fail serialization for some reason) `unwrap`.
Disconnecting might be unfortunate if we have open channels with the counterparty. If they send us bogus data, we now just add them to a (currently unpersisted) ignore list. We should in the future figure out when to drop them from this list, so that peers have the chance to recover.
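A rough sketch of what such an in-memory ignore list could look like follows. Everything here is an assumption made for illustration: the `IgnoreList` type, the `PeerId` alias, the `Outcome` enum, and the toy `is_well_formed` check are hypothetical and not taken from the actual change. The list lives only in memory, so it is empty again after a restart, matching the "currently unpersisted" behaviour described above.

```rust
use std::collections::HashSet;
use std::sync::Mutex;

/// Hypothetical peer identifier; the real code would use a node public key type.
type PeerId = [u8; 33];

#[derive(Debug, PartialEq)]
enum Outcome {
    /// The message parsed and was handled.
    Processed,
    /// The peer was already on the ignore list; the message was dropped.
    Ignored,
    /// The message was bogus; the peer was added to the ignore list.
    NewlyIgnored,
}

/// In-memory only: the set is lost on restart.
struct IgnoreList {
    ignored: Mutex<HashSet<PeerId>>,
}

/// Toy validity check (assumption): anything starting with `{` counts as parseable.
fn is_well_formed(raw: &str) -> bool {
    raw.starts_with('{')
}

impl IgnoreList {
    fn new() -> Self {
        IgnoreList { ignored: Mutex::new(HashSet::new()) }
    }

    fn handle(&self, peer: PeerId, raw: &str) -> Outcome {
        // The temporary guard is released at the end of the `if` condition.
        if self.ignored.lock().unwrap().contains(&peer) {
            return Outcome::Ignored;
        }
        if is_well_formed(raw) {
            Outcome::Processed
        } else {
            // Instead of disconnecting, remember the peer and drop its
            // future messages, leaving the connection itself untouched.
            self.ignored.lock().unwrap().insert(peer);
            Outcome::NewlyIgnored
        }
    }
}
```

Note that once a peer is on the list, even its well-formed messages are dropped, which is why deciding when to remove entries matters for recovery.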
                action: ErrorAction::IgnoreAndLog(Level::Trace),
            });
        }
    }
Will this respond to peer with an error or just log it on our end? We've run into issues where we just silently fail/ignore requests (when user hits our rate limits) and the wallet user experience is odd because it just hangs. This could introduce that experience?
Yes, but in this case it's really intended as it's a DoS protection if the user sends us bogus data. The alternative is disconnecting them which I'd even prefer, but we don't know if the user has any channels with us that still need to be kept operational. While this just adds the user to the ignorelist until restart, we def. want to remove them after a while in the future, especially once we start persisting things, as tracked here: https://github.com/lightningdevkit/lightning-liquidity/issues/117
Previously, we introduced disconnecting a counterparty who sends us bogus messages that we're unable to parse. However, before we disconnect we also send out a last `LSPSMessage::Invalid` in the hopes that the counterparty understands this.
However, when we added this logic we unfortunately overlooked that we lock the `request_id_to_method_map` `Mutex` for parsing the message, but also try to lock when the `PeerHandler` calls `get_and_clear_pending_msg`. Here, we avoid the resulting deadlock by dropping the lock as soon as it's not required anymore after parsing.
In a second commit, we avoid unnecessary locking of `request_id_to_method_map` in `get_and_clear_pending_msg` and remove a potentially dangerous (if we ever were to fail serialization for some reason) `unwrap`.
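As an illustration of the `unwrap` removal, the following sketch shows one way to handle a fallible serialization step without panicking. The `Message` enum, `SerializeError`, and both functions are hypothetical and exist only for this example; the actual change may handle the error differently (for instance by logging it).

```rust
/// Hypothetical outbound message; `Broken` models a value that fails to serialize.
enum Message {
    Invalid,
    Response(String),
    Broken,
}

#[derive(Debug)]
struct SerializeError;

/// Toy serializer (assumption): returns an error instead of panicking.
fn serialize(msg: &Message) -> Result<String, SerializeError> {
    match msg {
        Message::Invalid => Ok("invalid".to_string()),
        Message::Response(body) => Ok(format!("response:{}", body)),
        Message::Broken => Err(SerializeError),
    }
}

/// Serializes what it can and counts the failures, rather than calling
/// `unwrap` and taking the whole process down on one bad message.
fn serialize_all(msgs: &[Message]) -> (Vec<String>, usize) {
    let mut out = Vec::new();
    let mut dropped = 0;
    for msg in msgs {
        match serialize(msg) {
            Ok(s) => out.push(s),
            Err(_) => dropped += 1,
        }
    }
    (out, dropped)
}
```

Returning the number of dropped messages leaves it to the caller to decide whether to log, retry, or ignore the failure.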