Posted to issues@kudu.apache.org by "Andrew Wong (Jira)" <ji...@apache.org> on 2019/09/24 18:48:00 UTC
[jira] [Resolved] (KUDU-2952) TServers reporting replica stats may
race with leadership change, hitting a DCHECK
[ https://issues.apache.org/jira/browse/KUDU-2952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrew Wong resolved KUDU-2952.
-------------------------------
Fix Version/s: 1.11.0
Resolution: Fixed
> TServers reporting replica stats may race with leadership change, hitting a DCHECK
> ----------------------------------------------------------------------------------
>
> Key: KUDU-2952
> URL: https://issues.apache.org/jira/browse/KUDU-2952
> Project: Kudu
> Issue Type: Bug
> Components: consensus, tserver
> Reporter: Andrew Wong
> Assignee: Andrew Wong
> Priority: Major
> Fix For: 1.11.0
>
> Attachments: master_hms-itest.txt
>
>
> I have a precommit that failed with:
> {code:java}
> F0924 00:08:46.821594 9670 catalog_manager.cc:4239] Check failed: ts_desc->permanent_uuid() == report.consensus_state().leader_uuid()
> *** Check failure stack trace: ***
> @ 0x7f5e442ea62d google::LogMessage::Fail() at ??:0
> @ 0x7f5e442ec64c google::LogMessage::SendToLog() at ??:0
> @ 0x7f5e442ea189 google::LogMessage::Flush() at ??:0
> @ 0x7f5e442ecfdf google::LogMessageFatal::~LogMessageFatal() at ??:0
> @ 0x7f5e45d89a01 kudu::master::CatalogManager::ProcessTabletReport() at ??:0
> @ 0x7f5e45e29ae7 kudu::master::MasterServiceImpl::TSHeartbeat() at ??:0
> @ 0x7f5e41f29cbc _ZZN4kudu6master15MasterServiceIfC1ERK13scoped_refptrINS_12MetricEntityEERKS2_INS_3rpc13ResultTrackerEEENKUlPKN6google8protobuf7MessageEPSE_PNS7_10RpcContextEE0_clESG_SH_SJ_ at ??:0
> @ 0x7f5e41f3068b _ZNSt17_Function_handlerIFvPKN6google8protobuf7MessageEPS2_PN4kudu3rpc10RpcContextEEZNS6_6master15MasterServiceIfC1ERK13scoped_refptrINS6_12MetricEntityEERKSD_INS7_13ResultTrackerEEEUlS4_S5_S9_E0_E9_M_invokeERKSt9_Any_dataS4_S5_S9_ at ??:0
> @ 0x7f5e3fea909e std::function<>::operator()() at ??:0
> @ 0x7f5e3fea88cf kudu::rpc::GeneratedServiceIf::Handle() at ??:0
> @ 0x7f5e3feab3b6 kudu::rpc::ServicePool::RunThread() at ??:0
> @ 0x7f5e3feac785 boost::_mfi::mf0<>::operator()() at ??:0
> @ 0x7f5e3feac5ac boost::_bi::list1<>::operator()<>() at ??:0
> @ 0x7f5e3feac493 boost::_bi::bind_t<>::operator()() at ??:0
> @ 0x7f5e3feac3c2 boost::detail::function::void_function_obj_invoker0<>::invoke() at ??:0
> @ 0x7f5e44db28d2 boost::function0<>::operator()() at ??:0
> @ 0x7f5e44daf65b kudu::Thread::SuperviseThread() at ??:0
> @ 0x7f5e41429184 start_thread at ??:0
> @ 0x7f5e438f4ffd clone at ??:0
> {code}
> Looking through the code, there appears to be a TOCTOU (time-of-check to time-of-use) race when tservers generate their tablet reports.
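
For illustration only, a check-then-use race of the kind described above can be sketched in C++ as follows. This is not Kudu's code: Replica, Report, BuildReportRacy, BuildReportSnapshot and ReportIsConsistent are hypothetical names. The sketch assumes that the report decides whether to attach leader-only stats from one read of the consensus state and records the leader from a second read, which is a guess at the shape of the bug rather than something the ticket confirms. The `between` hook stands in for a concurrent leadership change so that the interleaving is deterministic.

```cpp
#include <functional>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for a replica's consensus state (not Kudu's types).
struct ConsensusState {
  std::string leader_uuid;
};

// Hypothetical report sent from a tablet server to the master.
struct Report {
  bool has_leader_stats = false;  // attached only if the sender believed it was leader
  std::string leader_uuid;        // leader recorded in the reported consensus state
};

class Replica {
 public:
  Replica(std::string self_uuid, std::string leader_uuid)
      : self_uuid_(std::move(self_uuid)), state_{std::move(leader_uuid)} {}

  void SetLeader(const std::string& uuid) {
    std::lock_guard<std::mutex> l(lock_);
    state_.leader_uuid = uuid;
  }

  // Returns a copy of the state taken under the lock.
  ConsensusState GetState() const {
    std::lock_guard<std::mutex> l(lock_);
    return state_;
  }

  // Racy: leadership is checked in one read and reported from another,
  // so a leadership change in between makes the report self-contradictory.
  Report BuildReportRacy(const std::function<void()>& between = nullptr) const {
    Report r;
    r.has_leader_stats = (GetState().leader_uuid == self_uuid_);  // time of check
    if (between) between();  // simulated concurrent leadership change
    r.leader_uuid = GetState().leader_uuid;                       // time of use
    return r;
  }

  // Safer: take one snapshot and derive every field of the report from it.
  Report BuildReportSnapshot(const std::function<void()>& between = nullptr) const {
    const ConsensusState snap = GetState();
    if (between) between();  // a change here no longer affects this report
    Report r;
    r.has_leader_stats = (snap.leader_uuid == self_uuid_);
    r.leader_uuid = snap.leader_uuid;
    return r;
  }

 private:
  const std::string self_uuid_;
  mutable std::mutex lock_;
  ConsensusState state_;
};

// Assumed receiver-side invariant, modeled on the failed check in the trace:
// a report that carries leader stats must name its sender as the leader.
bool ReportIsConsistent(const Report& r, const std::string& sender_uuid) {
  return !r.has_leader_stats || r.leader_uuid == sender_uuid;
}
```

In this sketch, calling BuildReportRacy with a hook that changes the leader produces a report that fails ReportIsConsistent, while BuildReportSnapshot stays consistent because both fields come from a single copy of the state.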
--
This message was sent by Atlassian Jira
(v8.3.4#803005)