Posted to issues@kudu.apache.org by "Adar Dembo (JIRA)" <ji...@apache.org> on 2017/08/29 02:15:00 UTC

[jira] [Created] (KUDU-2118) Running RaftConsensus instances should not be destroyed by reactor threads

Adar Dembo created KUDU-2118:
--------------------------------

             Summary: Running RaftConsensus instances should not be destroyed by reactor threads
                 Key: KUDU-2118
                 URL: https://issues.apache.org/jira/browse/KUDU-2118
             Project: Kudu
          Issue Type: Bug
          Components: consensus
    Affects Versions: 1.5.0
            Reporter: Adar Dembo
            Priority: Critical
         Attachments: 0_create-table-stress-test.txt.gz

RaftConsensus is an object with shared ownership. One of its invariants is that the last ref may be dropped (and thus the object destroyed) by a reactor thread, but only if RaftConsensus has already been shut down: the act of shutting down may wait, and reactor threads aren't allowed to wait.
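To make the invariant concrete, here is a minimal sketch of it. This is not Kudu code: FakeConsensus, t_wait_allowed, g_violations and DropLastRefOnRestrictedThread are hypothetical stand-ins, and the real code fails a CHECK where this sketch just bumps a counter.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-in for a per-thread "may this thread block?" flag.
thread_local bool t_wait_allowed = true;

// Counts full stop sequences that ran on a thread that must not wait.
int g_violations = 0;

// Hypothetical stand-in for an object with shared ownership.
class FakeConsensus {
 public:
  // Idempotent: only the first call runs the (potentially blocking) stop.
  void Shutdown() {
    if (stopped_) return;                 // already shut down: no-op
    if (!t_wait_allowed) ++g_violations;  // the real code would CHECK-fail here
    stopped_ = true;
  }

  // The destructor calls Shutdown() as a safety net.
  ~FakeConsensus() { Shutdown(); }

 private:
  bool stopped_ = false;
};

// Drops the last ref while waiting is forbidden, as a reactor thread would.
// Returns the number of violations caused by this drop.
int DropLastRefOnRestrictedThread(std::shared_ptr<FakeConsensus> c) {
  const int before = g_violations;
  t_wait_allowed = false;
  c.reset();  // destroys the object if this was the last ref
  t_wait_allowed = true;
  return g_violations - before;
}
```

If Shutdown() was called before the last ref is handed over, the drop records no violation. If it wasn't, the destructor runs the full stop on the restricted thread and the counter goes up by one.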

And yet, here's a pre-commit test failure showing otherwise. In it, a reactor thread destroys a LeaderElection object, which destroys the embedded ElectionDecisionCallback, which held the last ref to RaftConsensus, so RaftConsensus is destroyed too. Normally the Shutdown() call in the destructor would be a no-op, but apparently it's going through the full stop sequence instead.

{noformat}
thread_restrictions.cc:79] Check failed: LoadTLS()->wait_allowed Waiting is not allowed to be used on this thread to prevent server-wide latency aberrations and deadlocks. Thread 3852 (name: "rpc reactor", category: "reactor")
    @     0x7fcfc8864507  kudu::ThreadRestrictions::AssertWaitAllowed() at ??:0
    @     0x7fcfc55de12f  kudu::consensus::RaftConsensus::Stop() at ??:0
    @     0x7fcfc55de6aa  kudu::consensus::RaftConsensus::Shutdown() at ??:0
    @     0x7fcfc55cdba4  kudu::consensus::RaftConsensus::~RaftConsensus() at ??:0
    @     0x7fcfc55fab95  __gnu_cxx::new_allocator<>::destroy<>() at ??:0
    @     0x7fcfc55fab47  std::allocator_traits<>::_S_destroy<>() at ??:0
    @     0x7fcfc55faae9  std::allocator_traits<>::destroy<>() at ??:0
    @     0x7fcfc55fa91b  std::_Sp_counted_ptr_inplace<>::_M_dispose() at ??:0
    @           0x4304fa  std::_Sp_counted_base<>::_M_release() at /usr/include/c++/4.8/bits/shared_ptr_base.h:158
    @           0x42e68f  std::__shared_count<>::~__shared_count() at /usr/include/c++/4.8/bits/shared_ptr_base.h:547
    @     0x7fcfcb8a4032  std::__shared_ptr<>::~__shared_ptr() at ??:0
    @     0x7fcfcb8a4072  std::shared_ptr<>::~shared_ptr() at ??:0
    @     0x7fcfc55ed4d4  std::_Head_base<>::~_Head_base() at ??:0
    @     0x7fcfc55ed4f2  _ZNSt11_Tuple_implILm0EJSt10shared_ptrIN4kudu9consensus13RaftConsensusEENS3_14ElectionReasonESt12_PlaceholderILi1EEEED1Ev at ??:0
    @     0x7fcfc55ed50c  std::tuple<>::~tuple() at ??:0
    @     0x7fcfc55ed52a  std::_Bind<>::~_Bind() at ??:0
    @     0x7fcfc55f6162  std::_Function_base::_Base_manager<>::_M_destroy() at ??:0
    @     0x7fcfc55f34ed  std::_Function_base::_Base_manager<>::_M_manager() at ??:0
    @     0x7fcfcbe5d5c5  std::_Function_base::~_Function_base() at ??:0
    @     0x7fcfc55b0d18  std::function<>::~function() at ??:0
    @     0x7fcfc55add9d  kudu::consensus::LeaderElection::~LeaderElection() at ??:0
    @     0x7fcfc55b699a  kudu::RefCountedThreadSafe<>::DeleteInternal() at ??:0
    @     0x7fcfc55b697a  kudu::DefaultRefCountedThreadSafeTraits<>::Destruct() at ??:0
    @     0x7fcfc55b6960  kudu::RefCountedThreadSafe<>::Release() at ??:0
    @     0x7fcfc55b6936  kudu::internal::MaybeRefcount<>::Release() at ??:0
    @     0x7fcfc55b68c4  kudu::internal::BindState<>::~BindState() at ??:0
    @     0x7fcfc55b6910  kudu::internal::BindState<>::~BindState() at ??:0
    @     0x7fcfcb44f23d  kudu::RefCountedThreadSafe<>::DeleteInternal() at ??:0
{noformat}
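The ownership chain in the trace can be reproduced in isolation. The sketch below is not Kudu code: Tracked, Election, Outcome and RunChain are hypothetical names, and the int arguments are placeholders for the real election reason and result types. The bound arguments are modeled on the tuple visible in the trace (a shared_ptr, a reason, and a placeholder).

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>

// Hypothetical stand-in for RaftConsensus; flips a flag when destroyed.
struct Tracked {
  explicit Tracked(bool* destroyed) : destroyed_(destroyed) {}
  ~Tracked() { *destroyed_ = true; }
  void OnDecision(int /*reason*/, int /*result*/) {}
  bool* destroyed_;
};

// Hypothetical stand-in for LeaderElection: it owns its decision callback.
struct Election {
  explicit Election(std::function<void(int)> cb) : decision_cb(std::move(cb)) {}
  std::function<void(int)> decision_cb;
};

struct Outcome {
  bool alive_after_owner_reset;  // object survived its original owner
  bool destroyed_with_election;  // object died when the election did
};

Outcome RunChain() {
  bool destroyed = false;
  auto tracked = std::make_shared<Tracked>(&destroyed);

  // std::bind copies the shared_ptr into the callback, taking a ref.
  std::unique_ptr<Election> election(new Election(
      std::bind(&Tracked::OnDecision, tracked, 0, std::placeholders::_1)));

  tracked.reset();  // the callback now holds the only ref
  const bool alive = !destroyed;

  // ~Election -> ~function -> ~shared_ptr -> ~Tracked, all on this thread.
  election.reset();
  return {alive, destroyed};
}
```

Whichever thread destroys the election object also runs the destructor of whatever the callback kept alive, which is how a reactor thread ends up inside ~RaftConsensus() here.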



