Posted to dev@kudu.apache.org by "Adar Dembo (Code Review)" <ge...@cloudera.org> on 2016/03/03 03:55:23 UTC

[kudu-CR] catalog_manager: wait for table visitors before shutting down catalog

Hello Todd Lipcon,

I'd like you to do a code review.  Please visit

    http://gerrit.cloudera.org:8080/2427

to review the following change.

Change subject: catalog_manager: wait for table visitors before shutting down catalog
......................................................................

catalog_manager: wait for table visitors before shutting down catalog

We observed the following crash:

  *** Aborted at 1456910767 (unix time) try "date -d @1456910767" if you are using GNU date ***
  PC: @     0x7f1892a8c3c0 base::subtle::NoBarrier_CompareAndSwap()
  *** SIGSEGV (@0x158) received by PID 6439 (TID 0x7f187b54e700) from PID 344; stack trace: ***
      @       0x3b44e0f710 (unknown) at ??:0
      @     0x7f1892a8c3c0 base::subtle::NoBarrier_CompareAndSwap() at ??:0
      @     0x7f1892a8c440 base::subtle::Acquire_CompareAndSwap() at ??:0
      @     0x7f1892a8c5ca base::SpinLock::Lock() at ??:0
      @     0x7f1892a8ce74 kudu::simple_spinlock::lock() at ??:0
      @     0x7f1892a9a90e boost::lock_guard<>::lock_guard() at ??:0
      @     0x7f1891f2a43a kudu::tablet::MvccManager::TakeSnapshot() at ??:0
      @     0x7f1891f2b00e kudu::tablet::MvccSnapshot::MvccSnapshot() at ??:0
      @     0x7f1891e63755 kudu::tablet::Tablet::NewRowIterator() at ??:0
      @     0x7f1892ade0c1 kudu::master::SysCatalogTable::VisitTablets() at ??:0
      @     0x7f1892a74497 kudu::master::CatalogManager::VisitTablesAndTabletsUnlocked() at ??:0
      @     0x7f1892a73f8e kudu::master::CatalogManager::VisitTablesAndTabletsTask() at ??:0

This can happen if the deferral of VisitTablesAndTabletsTask to the worker
thread pool is slow enough that the catalog is able to shut down before
the table visitor starts.
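
The ordering problem can be illustrated with a minimal, self-contained
sketch. It is not the code in this change: MiniCatalog, FakeSysCatalog,
SubmitVisitor and RunDemo are hypothetical names, and a vector of
std::thread stands in for the worker thread pool. The only point it makes
is that Shutdown() waits for every deferred visitor before releasing the
state the visitor reads.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <memory>
#include <thread>
#include <vector>

// Hypothetical stand-in for the state that the visitor reads
// (the sys catalog tablet in the stack trace above).
struct FakeSysCatalog {
  int rows = 42;
};

// Hypothetical miniature of a catalog manager; not Kudu's real class.
// SubmitVisitor() and Shutdown() are assumed to be called from one thread.
class MiniCatalog {
 public:
  MiniCatalog() : sys_catalog_(new FakeSysCatalog), visited_rows_(-1) {}
  ~MiniCatalog() { Shutdown(); }

  // Defers the visitor to another thread. The delay makes it likely that
  // Shutdown() is called before the visitor has started running.
  void SubmitVisitor(std::chrono::milliseconds delay) {
    visitors_.emplace_back([this, delay] {
      std::this_thread::sleep_for(delay);
      // Safe only because Shutdown() joins this thread before resetting
      // sys_catalog_. Without that wait, this would dereference null.
      visited_rows_ = sys_catalog_->rows;
    });
  }

  // Waits for all outstanding visitors, then tears down the state.
  void Shutdown() {
    for (std::thread& t : visitors_) {
      if (t.joinable()) t.join();
    }
    visitors_.clear();
    sys_catalog_.reset();
  }

  int visited_rows() const { return visited_rows_; }

 private:
  std::unique_ptr<FakeSysCatalog> sys_catalog_;
  std::atomic<int> visited_rows_;
  std::vector<std::thread> visitors_;
};

// Submits one slow visitor and shuts down immediately afterwards.
int RunDemo() {
  MiniCatalog catalog;
  catalog.SubmitVisitor(std::chrono::milliseconds(50));
  catalog.Shutdown();  // returns only after the delayed visitor has run
  return catalog.visited_rows();
}
```

Calling RunDemo() from a main() exercises the slow-deferral case. If the
join loop in Shutdown() were removed, the visitor could run after
sys_catalog_ had been reset, which is the same kind of use-after-shutdown
described above.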

I spent a bunch of time trying to reproduce this crash. The attached test is
simple, but it needs class friendship and I've only seen it trigger the
crash once in thousands of runs. As such, it may hurt more than it helps.

Change-Id: I142b8dbdf4356a324bcde0e63fa44ea63798d509
---
M src/kudu/master/catalog_manager.cc
M src/kudu/master/catalog_manager.h
M src/kudu/master/master-test.cc
3 files changed, 18 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/27/2427/1
-- 
To view, visit http://gerrit.cloudera.org:8080/2427
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I142b8dbdf4356a324bcde0e63fa44ea63798d509
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>