Skip to content

Commit 787dcc6

Browse files
committed
MB-20351: Fix lock-order inversion in ~CheckpointManager
As identified by TSan. Seen whilst testing sherlock->watson merge, analysed the code and it seems this is a latent issue and hard to re-produce. The issue is that when the executor thread does a reset on the current task, the VBCBAdapator is the last one holding the ref-counted vbucket, so destruction occurs and ~VBucket calls the destructor of the checkpoint manager, which is the reverse locks ordering of a previous code path. Number of threads in play here, but main ones of interest: WARNING: ThreadSanitizer: lock-order-inversion (potential deadlock) (pid=170834) Cycle in lock order graph: M_checkpoint (0x7d640002e9a8) => M_exepool (0x7d4c00008288) => M_taskqueue (0x7d4400008e80) => M_exethread (0x7d380000df60) => M_checkpoint Mutex M_exepool acquired here while holding mutex M_checkpoint in thread T35: #0 pthread_mutex_lock <null> (engine_testapp+0x000000486760) #1 std::mutex::lock() /usr/bin/../lib/gcc/x86_64-linux-gnu/4.9/../../../../include/x86_64-linux-gnu/c++/4.9/bits/gthr-default.h:748 (ep.so+0x0000000f47b0) #2 ExecutorPool::wake(unsigned long) ep-engine/src/executorpool.cc:355 (ep.so+0x0000000f48f1) #3 Flusher::wake() ep-engine/src/flusher.cc:155 (ep.so+0x000000101ee6) #4 NotifyFlusherCB::callback(unsigned short&) ep-engine/src/flusher.h:88 (ep.so+0x00000010d194) #5 Checkpoint::queueDirty(SingleThreadedRCPtr<Item> const&, CheckpointManager*) ep-engine/src/checkpoint.h:675 (ep.so+0x0000000271b0) #6 CheckpointManager::closeOpenCheckpoint_UNLOCKED() ep-engine/src/checkpoint.cc:454 (ep.so+0x000000028dcb) #7 CheckpointManager::addNewCheckpoint_UNLOCKED(unsigned long, unsigned long, unsigned long) ep-engine/src/checkpoint.cc:371 (ep.so+0x00000002881f) #8 CheckpointManager::checkOpenCheckpoint_UNLOCKED(bool, bool) ep-engine/src/checkpoint.cc:361 (ep.so+0x00000002bd71) #9 CheckpointVisitor::visitBucket(RCPtr<VBucket>&) ep-engine/src/checkpoint_remover.cc:43 (ep.so+0x00000003c3bd) #10 VBCBAdaptor::run() ep-engine/src/ep.cc:3924 (ep.so+0x0000000a6174) #11 ExecutorThread::run() ep-engine/src/executorthread.cc:115 (ep.so+0x0000000fe1b6) #12 launch_executor_thread(void*) ep-engine/src/executorthread.cc:33 (ep.so+0x0000000fdd15) #13 platform_thread_wrap(void*) platform/src/cb_pthreads.cc:54 (libplatform.so.0.1.0+0x0000000057fb) ... Mutex M_checkpoint acquired here while holding mutex M_exethread in thread T36: #0 pthread_mutex_lock <null> (engine_testapp+0x000000486760) #1 CheckpointManager::~CheckpointManager() /usr/bin/../lib/gcc/x86_64-linux-gnu/4.9/../../../../include/x86_64-linux-gnu/c++/4.9/bits/gthr-default.h:748 (ep.so+0x000000027fdd) #2 VBucket::~VBucket() ep-engine/src/vbucket.cc:152 (ep.so+0x00000014a018) #3 PagingVisitor::~PagingVisitor() ep-engine/src/atomic.h:190 (ep.so+0x00000010a5e6) #4 PagingVisitor::~PagingVisitor() ep-engine/src/item_pager.cc:43 (ep.so+0x00000010a645) #5 std::_Sp_counted_ptr<PagingVisitor*, (__gnu_cxx::_Lock_policy)2>::_M_dispose() /usr/bin/../lib/gcc/x86_64-linux-gnu/4.9/../../../../include/c++/4.9/bits/shared_ptr_base.h:373 (ep.so+0x00000010a2b0) #6 VBCBAdaptor::~VBCBAdaptor() /usr/bin/../lib/gcc/x86_64-linux-gnu/4.9/../../../../include/c++/4.9/bits/shared_ptr_base.h:149 (ep.so+0x0000000aea7e) #7 ExecutorThread::run() ep-engine/src/atomic.h:325 (ep.so+0x0000000fdee4) #8 launch_executor_thread(void*) ep-engine/src/executorthread.cc:33 (ep.so+0x0000000fdd15) #9 platform_thread_wrap(void*) platform/src/cb_pthreads.cc:54 (libplatform.so.0.1.0+0x0000000057fb) Change-Id: I0a966b3d112963243e17647184123fd8b3200656 Reviewed-on: http://review.couchbase.org/66357 Well-Formed: buildbot <build@couchbase.com> Tested-by: buildbot <build@couchbase.com> Reviewed-by: Will Gardner <will.gardner@couchbase.com> Reviewed-by: Jim Walker <jim@couchbase.com>
1 parent 4035fc4 commit 787dcc6

1 file changed

Lines changed: 0 additions & 1 deletion

File tree

src/checkpoint.cc

Lines changed: 0 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -303,7 +303,6 @@ std::ostream& operator <<(std::ostream& os, const Checkpoint& c) {
303303
}
304304

305305
CheckpointManager::~CheckpointManager() {
306-
LockHolder lh(queueLock);
307306
std::list<Checkpoint*>::iterator it = checkpointList.begin();
308307
while(it != checkpointList.end()) {
309308
delete *it;

0 commit comments

Comments
 (0)