Posted to issues@kudu.apache.org by "Will Berkeley (JIRA)" <ji...@apache.org> on 2019/01/18 20:56:00 UTC
[jira] [Created] (KUDU-2664) Tablet server crashed when running kudu remote_replica unsafe_change
Will Berkeley created KUDU-2664:
-----------------------------------
Summary: Tablet server crashed when running kudu remote_replica unsafe_change
Key: KUDU-2664
URL: https://issues.apache.org/jira/browse/KUDU-2664
Project: Kudu
Issue Type: Bug
Affects Versions: 1.8.0
Reporter: Will Berkeley
While trying to reproduce a different issue, I ran the following command:
{noformat}
for i in 0 1; do bin/kudu remote_replica unsafe_change_config 127.0.0.1:7250 3ccbce6a3116487cbcc79ab4280a2ee5
{noformat}
and encountered the following tablet server crash:
{noformat}
F0118 10:45:42.696043 280514560 raft_consensus.cc:1286] T 3ccbce6a3116487cbcc79ab4280a2ee5 P 6ca21fa7dcf54761a5ec7017ff101a68 [term 6 FOLLOWER]: Unexpected new leader in same term! Existing leader UUID: kudu-tools, new leader UUID: 454b53ed77bd458a81a7710c892f214b
*** Check failure stack trace: ***
@ 0x10c91247f google::LogMessageFatal::~LogMessageFatal()
@ 0x10c90f259 google::LogMessageFatal::~LogMessageFatal()
@ 0x108b74c05 kudu::consensus::RaftConsensus::CheckLeaderRequestUnlocked()
@ 0x108b6c180 kudu::consensus::RaftConsensus::UpdateReplica()
@ 0x108b6b459 kudu::consensus::RaftConsensus::Update()
@ 0x107cf5106 kudu::tserver::ConsensusServiceImpl::UpdateConsensus()
@ 0x10b53b87d kudu::consensus::ConsensusServiceIf::ConsensusServiceIf()::$_1::operator()()
@ 0x10b53b819 _ZNSt3__128__invoke_void_return_wrapperIvE6__callIJRZN4kudu9consensus18ConsensusServiceIfC1ERK13scoped_refptrINS3_12MetricEntityEERKS6_INS3_3rpc13ResultTrackerEEE3$_1PKN6google8protobuf7MessageEPSK_PNSB_10RpcContextEEEEvDpOT_
@ 0x10b53b6a9 std::__1::__function::__func<>::operator()()
@ 0x10b843e07 std::__1::function<>::operator()()
@ 0x10b843a1a kudu::rpc::GeneratedServiceIf::Handle()
@ 0x10b846cb6 kudu::rpc::ServicePool::RunThread()
@ 0x10b849aa9 boost::_mfi::mf0<>::operator()()
@ 0x10b849a10 boost::_bi::list1<>::operator()<>()
@ 0x10b8499ba boost::_bi::bind_t<>::operator()()
@ 0x10b84979d boost::detail::function::void_function_obj_invoker0<>::invoke()
@ 0x10b7bb1fa boost::function0<>::operator()()
@ 0x10c2cc2f5 kudu::Thread::SuperviseThread()
@ 0x7fff5dc09305 _pthread_body
@ 0x7fff5dc0c26f _pthread_start
@ 0x7fff5dc08415 thread_start
{noformat}
The target of the config change was TS 6ca21fa7dcf54761a5ec7017ff101a68 at address 127.0.0.1:7250, and I was trying to kick out one of the three replicas while fishing for a repro of the other issue.
I couldn't get the crash to happen again, and I wasn't able to capture a minidump or core dump. I also accidentally deleted the logs, so I'm afraid the above is all there is to go on.
It's expected that funny stuff could happen when using unsafe_change_config; it's unsafe, after all. But it shouldn't be possible to crash the tablet server with it.
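For readers unfamiliar with the check that fired: the fatal message reflects the Raft rule that a given term has at most one leader. The log suggests the follower had recorded "kudu-tools" as the leader for term 6 and then received an update from 454b53ed77bd458a81a7710c892f214b in the same term. Below is a minimal Python sketch of that kind of check. It is not Kudu's actual code; FollowerState, check_leader_request, and SameTermLeaderConflict are hypothetical names, and the exception merely stands in for the fatal log line.

```python
from dataclasses import dataclass
from typing import Optional


class SameTermLeaderConflict(Exception):
    """Hypothetical stand-in for the fatal 'Unexpected new leader in same term!'."""


@dataclass
class FollowerState:
    # Hypothetical model of what a follower remembers; not Kudu's data structure.
    current_term: int
    leader_uuid: Optional[str] = None  # leader known for current_term, if any


def check_leader_request(state: FollowerState, request_term: int, caller_uuid: str) -> bool:
    """Return True if the update is accepted, False if it comes from a stale term."""
    if request_term < state.current_term:
        # The caller is behind; reject without changing any state.
        return False
    if request_term > state.current_term:
        # A newer term starts with no known leader.
        state.current_term = request_term
        state.leader_uuid = None
    if state.leader_uuid is None:
        # First leader seen for this term.
        state.leader_uuid = caller_uuid
    elif state.leader_uuid != caller_uuid:
        # Two different leaders claim the same term: the invariant is broken.
        raise SameTermLeaderConflict(
            f"Existing leader UUID: {state.leader_uuid}, new leader UUID: {caller_uuid}"
        )
    return True


# Mirrors the situation in the log above: term 6, recorded leader "kudu-tools".
state = FollowerState(current_term=6, leader_uuid="kudu-tools")
try:
    check_leader_request(state, 6, "454b53ed77bd458a81a7710c892f214b")
except SameTermLeaderConflict as e:
    print("conflict:", e)
```

In this sketch the conflict surfaces as an exception that the caller can handle, whereas the tablet server in the report aborted on it.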
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)