fix: double lock of deterministicMNManager->cs#5637
fix: double lock of deterministicMNManager->cs#5637PastaPastaPasta merged 3 commits intodashpay:developfrom
Conversation
bc9d552 to
cee5bf9
Compare
|
What was this introduced in? |
Logs:
node1 2023-10-21T16:46:03.990302Z (mocktime: 2014-12-04T17:33:19Z) [httpworker.1] DOUBLE LOCK DETECTED
node1 2023-10-21T16:46:03.990322Z (mocktime: 2014-12-04T17:33:19Z) [httpworker.1] Lock order:
node1 2023-10-21T16:46:03.990339Z (mocktime: 2014-12-04T17:33:19Z) [httpworker.1] 'cs_main' in miner.cpp:129 (in thread 'httpworker.1')
node1 2023-10-21T16:46:03.990353Z (mocktime: 2014-12-04T17:33:19Z) [httpworker.1] 'm_mempool.cs' in miner.cpp:129 (in thread 'httpworker.1')
node1 2023-10-21T16:46:03.990366Z (mocktime: 2014-12-04T17:33:19Z) [httpworker.1] (*) 'deterministicMNManager->cs' in evo/cbtx.cpp:114 (in thread 'httpworker.1')
node1 2023-10-21T16:46:03.990439Z (mocktime: 2014-12-04T17:33:19Z) [httpworker.1] (*) 'cs' in ./evo/deterministicmns.h:614 (in thread 'httpworker.1')
node1 2023-10-21T16:46:04.003619Z (mocktime: 2014-12-04T17:33:19Z) [httpworker.1] Posix Signal: Aborted
No debug information available for stacktrace. You should add debug information and then run:
dashd -printcrashinfo=bvcgc43iinzgc43ijfxgm3ybaacwiyltnbsbkudponuxqictnftw4ylmhiqecytpoj2gkzaphcbzaaaaaaaaayeurhimekiaaav6ldwqyiuqaafwsoe5bqrjaaahz6eh2dbcsaaakhhzaaaaaaaaanreseaaaaaaacgguliaaaaaaadyauwqaaaaaaacp3daaaaaaaaamxigcaaaaaaablpulyaaaaaaadovy3yaaaaaaahj7vbaaaaaaaadnbsdaaaaaaaaaa======
Part of backtrace:
dashpay#9 UniqueLock<AnnotatedMixin<std::mutex>, std::unique_lock<std::mutex> >::UniqueLock (fTry=false, nLine=615, pszFile=0x55a9f71a3710 "./evo/deterministicmns.h",
pszName=0x55a9f719caff "cs", mutexIn=..., this=0x7f7e1e71b250) at ./sync.h:164
dashpay#10 CDeterministicMNManager::GetListForBlock (this=0x55a9f84d06b0, pindex=0x7f7db03621c0) at ./evo/deterministicmns.h:615
dashpay#11 0x000055a9f6612258 in llmq::utils::ComputeQuorumMembersByQuarterRotation (pCycleQuorumBaseBlockIndex=0x7f7db03a6930, llmqParams=...) at /usr/include/c++/12/bits/unique_ptr.h:191
dashpay#12 llmq::utils::GetAllQuorumMembers (llmqType=<optimized out>, pQuorumBaseBlockIndex=0x7f7db0359bc0, reset_cache=reset_cache@entry=false) at llmq/utils.cpp:150
dashpay#13 0x000055a9f694d957 in CDeterministicMNManager::HandleQuorumCommitment (qc=..., pQuorumBaseBlockIndex=<optimized out>, mnList=..., debugLogs=debugLogs@entry=false)
at evo/deterministicmns.cpp:989
dashpay#14 0x000055a9f695c455 in CDeterministicMNManager::BuildNewListFromBlock (this=<optimized out>, block=..., pindexPrev=pindexPrev@entry=0x7f7db03c1ac0, state=..., view=...,
mnListRet=..., debugLogs=false) at evo/deterministicmns.cpp:918
dashpay#15 0x000055a9f692e7cd in CalcCbTxMerkleRootMNList (block=..., pindexPrev=pindexPrev@entry=0x7f7db03c1ac0, merkleRootRet=..., state=..., view=...)
at /usr/include/c++/12/bits/unique_ptr.h:191
dashpay#16 0x000055a9f6a352ed in BlockAssembler::CreateNewBlock (this=this@entry=0x7f7e1e71f0b0, scriptPubKeyIn=...) at ./validation.h:649
dashpay#17 0x000055a9f6771c49 in generateBlocks (chainman=..., mempool=..., evodb=..., llmq_ctx=..., coinbase_script=..., nGenerate=10, nMaxTries=<optimized out>) at rpc/mining.cpp:167
dashpay#18 0x000055a9f677a496 in generatetoaddress (request=...) at /usr/include/c++/12/bits/unique_ptr.h:191
dashpay#19 0x000055a9f671ea02 in CRPCCommand::CRPCCommand(char const*, char const*, UniValue (*)(JSONRPCRequest const&), std::initializer_list<char const*>)::{lambda(JSONRPCRequest const&, UniValue&, bool)#1}::operator()(JSONRPCRequest const&, UniValue&, bool) const (__closure=<optimized out>, result=..., request=...) at ./rpc/server.h:120
cee5bf9 to
75cc716
Compare
I reproduced it on local branch WIP during some asset-locks functional tests changes; maybe it's not gonna happen on production ever but double lock is double lock; better to fix it. |
|
Can you add a minimal regression test here? |
|
It started to happen when I changed (locally) feature_assetlocks.py:
I don't think so; the changes that I made in functional tests are not connected to issue directly and do not explain how this double lock never has been appeared before. I updated visibility of mutex from "public" to "private" and updated annotation of methods CDeterministicMN to increase error-prune over this mutex. |
PastaPastaPasta
left a comment
There was a problem hiding this comment.
here's a diff for a static mutex and atomics
Subject: [PATCH] use LogPrintf instead
---
Index: src/evo/cbtx.cpp
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/src/evo/cbtx.cpp b/src/evo/cbtx.cpp
--- a/src/evo/cbtx.cpp (revision 75cc7160ebbc2f3ad6eb76d88ac9a0ec9ebe9187)
+++ b/src/evo/cbtx.cpp (date 1697940729424)
@@ -112,9 +112,9 @@
bool CalcCbTxMerkleRootMNList(const CBlock& block, const CBlockIndex* pindexPrev, uint256& merkleRootRet, BlockValidationState& state, const CCoinsViewCache& view)
{
try {
- static int64_t nTimeDMN = 0;
- static int64_t nTimeSMNL = 0;
- static int64_t nTimeMerkle = 0;
+ static std::atomic<int64_t> nTimeDMN = 0;
+ static std::atomic<int64_t> nTimeSMNL = 0;
+ static std::atomic<int64_t> nTimeMerkle = 0;
int64_t nTime1 = GetTimeMicros();
@@ -132,10 +132,12 @@
int64_t nTime3 = GetTimeMicros(); nTimeSMNL += nTime3 - nTime2;
LogPrint(BCLog::BENCHMARK, " - CSimplifiedMNList: %.2fms [%.2fs]\n", 0.001 * (nTime3 - nTime2), nTimeSMNL * 0.000001);
- static CSimplifiedMNList smlCached;
- static uint256 merkleRootCached;
- static bool mutatedCached{false};
+ static Mutex cached_mutex;
+ static CSimplifiedMNList smlCached GUARDED_BY(cached_mutex);
+ static uint256 merkleRootCached GUARDED_BY(cached_mutex);
+ static bool mutatedCached GUARDED_BY(cached_mutex) {false};
+ LOCK(cached_mutex);
if (sml == smlCached) {
merkleRootRet = merkleRootCached;
if (mutatedCached) {
Co-authored-by: pasta <pasta@dashboost.org>
| int nHeight = pindexPrev->nHeight + 1; | ||
|
|
||
| CDeterministicMNList oldList = GetListForBlockInternal(pindexPrev); | ||
| CDeterministicMNList oldList = GetListForBlock(pindexPrev); |
There was a problem hiding this comment.
Related. Internal doesn't do CS, non -internal does, so, it must be changed
PastaPastaPasta
left a comment
There was a problem hiding this comment.
utACK for squash merge
Issue being fixed or feature implemented
fix: double lock of deterministicMNManager->cs
Logs:
Part of backtrace:
What was done?
CDeterministicMNManager::BuildNewListFromBlockdoesn't requirecslock anymoreHow Has This Been Tested?
Run unit/functional tests
Breaking Changes
N/A
Checklist: