Commit 49ef33f

Ole John Aske authored and dahlerlend committed
Bug#36066725 Regular mgmd hangs when sending it a stop node for ndbmtd
Root cause is that the mutexes 'theMultiTransporterMutex' and 'clusterMgrThreadMutex' are taken in different order in the two respective call chains:

1) ClusterMgr::threadMain() -> lock() -> NdbMutex_Lock(clusterMgrThreadMutex)
   - ::threadMain(), holding clusterMgrThreadMutex -> TransporterFacade::startConnecting()
   - TF::startConnecting -> lockMultiTransporters() <<<< HANG while holding clusterMgrThreadMutex

2) TransporterRegistry::report_disconnect() -> lockMultiTransporters()
   - ::report_disconnect(), holding theMultiTransporterMutex, -> TransporterFacade::reportDisconnect()
   - TF::reportDisconnect -> ClusterMgr::reportDisconnected()
   - ClusterMgr::reportDisconnected() -> lock()
   - lock() -> NdbMutex_Lock(clusterMgrThreadMutex) <<<< Held by 1)

Patch change TransporterRegistry::report_disconnect() such that the theMultiTransporterMutex is released before calling reportDisconnect(NodeId).

It should be sufficient to hold theMultiTransporterMutex while ::report_disconnect check if we are disconnecting a multiTransporter, and if all its Trps are in DISCONNECTED state. When this finished we have set up 'ready_to_disconnect' and can release theMultiTransporterMutex before -> reportDisconnect()

Change-Id: I19be0d9d92184efb8f20a92aa7189b9b85f069bc
1 parent cb8dc83 commit 49ef33f

1 file changed: +1 -1 lines changed
storage/ndb/src/common/transporter/TransporterRegistry.cpp

Lines changed: 1 addition & 1 deletion
@@ -2321,14 +2321,14 @@ void TransporterRegistry::report_disconnect(TransporterReceiveHandle &recvdata,
       remove_allTransporters(this_trp);
     }
   }  // End of multiTransporter DISCONNECT handling
+  unlockMultiTransporters();
 
   if (ready_to_disconnect)  // 5)
   {
     DEBUG_FPRINTF((stderr, "(%u) -> reportDisconnect(node_id=%u)\n",
                    localNodeId, node_id));
     recvdata.reportDisconnect(node_id, errnum);
   }
-  unlockMultiTransporters();
   DBUG_VOID_RETURN;
 }
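
The reordering in this patch follows a common pattern: decide under the lock, release it, then call out to code that may take another lock. The sketch below illustrates that pattern with std::mutex. It is not the NDB code: ClusterMgrSketch, RegistrySketch, multi_trp_held and callback_runs_unlocked are hypothetical names introduced for illustration, and the real code uses NdbMutex together with lockMultiTransporters() and unlockMultiTransporters().

```cpp
#include <functional>
#include <mutex>

// Illustrative stand-ins only; these are not the real NDB classes.
struct ClusterMgrSketch {
  std::mutex thread_mutex;  // plays the role of clusterMgrThreadMutex
  int disconnects_seen = 0;

  void reportDisconnected(int /*node_id*/) {
    std::lock_guard<std::mutex> guard(thread_mutex);
    ++disconnects_seen;
  }
};

struct RegistrySketch {
  std::mutex multi_trp_mutex;   // plays the role of theMultiTransporterMutex
  bool multi_trp_held = false;  // demo bookkeeping, written only under the mutex
  bool all_trps_disconnected = true;

  void report_disconnect(int node_id, const std::function<void(int)> &report) {
    bool ready_to_disconnect = false;
    {
      std::lock_guard<std::mutex> guard(multi_trp_mutex);
      multi_trp_held = true;
      // Only the state check needs the mutex.
      ready_to_disconnect = all_trps_disconnected;
      multi_trp_held = false;
    }  // mutex released here, before the callback runs
    if (ready_to_disconnect) report(node_id);
  }
};

// Returns true if the callback ran exactly once while the registry
// mutex was not held.
bool callback_runs_unlocked() {
  ClusterMgrSketch mgr;
  RegistrySketch reg;
  bool held_during_callback = true;
  reg.report_disconnect(7, [&](int node_id) {
    held_during_callback = reg.multi_trp_held;
    mgr.reportDisconnected(node_id);  // takes the other mutex
  });
  return mgr.disconnects_seen == 1 && !held_during_callback;
}
```

Because the callback no longer runs under multi_trp_mutex, a thread that holds thread_mutex and then waits for multi_trp_mutex (chain 1 in the commit message) cannot be blocked by this path.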
