Posted to issues@kudu.apache.org by "Todd Lipcon (JIRA)" <ji...@apache.org> on 2016/12/19 09:03:58 UTC

[jira] [Updated] (KUDU-1169) SIGILL when aborting a replaced operation from previous leader

     [ https://issues.apache.org/jira/browse/KUDU-1169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated KUDU-1169:
------------------------------
    Fix Version/s: 0.5.0

> SIGILL when aborting a replaced operation from previous leader
> --------------------------------------------------------------
>
>                 Key: KUDU-1169
>                 URL: https://issues.apache.org/jira/browse/KUDU-1169
>             Project: Kudu
>          Issue Type: Bug
>          Components: consensus, tserver
>    Affects Versions: Private Beta
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>              Labels: crash
>             Fix For: 0.5.0
>
>
> We saw a SIGILL crash with the following stack:
> {code}
> kudu::rpc::InboundCall::Respond(google::protobuf::MessageLite const&, bool) + 79 in section .text of /opt/cloudera/parcels/KUDU-0.1.0-1.kudu0.1.0.p0.195/lib/kudu/sbin-release/kudu-tserver
> (gdb) info symbol 0x959d8b
> kudu::rpc::InboundCall::RespondSuccess(google::protobuf::MessageLite const&) + 75 in section .text of /opt/cloudera/parcels/KUDU-0.1.0-1.kudu0.1.0.p0.195/lib/kudu/sbin-release/kudu-tserver
> (gdb) info symbol 0x94d24c
> kudu::rpc::RpcContext::RespondSuccess() + 524 in section .text of /opt/cloudera/parcels/KUDU-0.1.0-1.kudu0.1.0.p0.195/lib/kudu/sbin-release/kudu-tserver
> (gdb) info symbol 0x8bcb6f
> kudu::consensus::RaftConsensus::NonTxRoundReplicationFinished(kudu::consensus::ConsensusRound*, kudu::Callback<void ()(kudu::Status const&)> const&, kudu::Status const&) + 367 in section .text of /opt/cloudera/parcels/KUDU-0.1.0-1.kudu0.1.0.p0.195/lib/kudu/sbin-release/kudu-tserver
> (gdb) info symbol 0x8ca675
> kudu::consensus::ReplicaState::AbortOpsAfterUnlocked(long) + 629 in section .text of /opt/cloudera/parcels/KUDU-0.1.0-1.kudu0.1.0.p0.195/lib/kudu/sbin-release/kudu-tserver
> (gdb) info symbol 0x8b9b49
> kudu::consensus::RaftConsensus::EnforceLogMatchingPropertyMatchesUnlocked(kudu::consensus::RaftConsensus::LeaderRequest const&, kudu::consensus::ConsensusResponsePB*) + 713 in section .text of /opt/cloudera/parcels/KUDU-0.1.0-1.kudu0.1.0.p0.195/lib/kudu/sbin-release/kudu-tserver
> (gdb) info symbol 0x8c304f
> kudu::consensus::RaftConsensus::CheckLeaderRequestUnlocked(kudu::consensus::ConsensusRequestPB const*, kudu::consensus::ConsensusResponsePB*, kudu::consensus::RaftConsensus::LeaderRequest*) + 815 in section .text of /opt/cloudera/parcels/KUDU-0.1.0-1.kudu0.1.0.p0.195/lib/kudu/sbin-release/kudu-tserver
> (gdb) info symbol 0x8c4850
> kudu::consensus::RaftConsensus::UpdateReplica(kudu::consensus::ConsensusRequestPB const*, kudu::consensus::ConsensusResponsePB*) + 624 in section .text of /opt/cloudera/parcels/KUDU-0.1.0-1.kudu0.1.0.p0.195/lib/kudu/sbin-release/kudu-tserver
> (gdb) info symbol 0x8c6b61
> kudu::consensus::RaftConsensus::Update(kudu::consensus::ConsensusRequestPB const*, kudu::consensus::ConsensusResponsePB*) + 417 in section .text of /opt/cloudera/parcels/KUDU-0.1.0-1.kudu0.1.0.p0.195/lib/kudu/sbin-release/kudu-tserver
> (gdb) info symbol 0x6f9086
> kudu::tserver::ConsensusServiceImpl::UpdateConsensus(kudu::consensus::ConsensusRequestPB const*, kudu::consensus::ConsensusResponsePB*, kudu::rpc::RpcContext*) + 710 in section .text of /opt/cloudera/parcels/KUDU-0.1.0-1.kudu0.1.0.p0.195/lib/kudu/sbin-release/kudu-tserver
> {code}
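> My guess is that we somehow ended up responding twice to the same transaction: the stack shows RespondSuccess() being called from NonTxRoundReplicationFinished() while AbortOpsAfterUnlocked() was aborting the replaced operation.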
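
To illustrate the suspected failure mode, here is a minimal sketch of a respond-once guard, assuming the crash does come from a second response being sent for the same call. OnceResponder, Respond(), and SimulateAbortAfterSuccess() are hypothetical names made up for this example; they are not Kudu's RpcContext or InboundCall API, and this is not the fix that went into 0.5.0.

```cpp
#include <atomic>
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for an RPC context. It only models the
// "at most one response per call" rule and is not Kudu code.
class OnceResponder {
 public:
  // Returns true if this call delivered the response. Returns false if a
  // response was already delivered; the duplicate is dropped.
  bool Respond(std::string payload) {
    bool expected = false;
    // Only the first caller flips the flag from false to true.
    if (!responded_.compare_exchange_strong(expected, true)) {
      return false;
    }
    payload_ = std::move(payload);
    return true;
  }

  bool responded() const { return responded_.load(); }
  const std::string& payload() const { return payload_; }

 private:
  std::atomic<bool> responded_{false};
  std::string payload_;
};

// Models the suspected sequence: the operation is answered once, then an
// abort path tries to answer the same call again.
// Returns the number of responses actually delivered.
int SimulateAbortAfterSuccess() {
  OnceResponder call;
  int delivered = 0;
  if (call.Respond("success")) ++delivered;  // normal completion path
  if (call.Respond("aborted")) ++delivered;  // abort path, same call
  assert(call.responded());
  return delivered;
}
```

With a guard like this, the second Respond() call becomes a detectable no-op rather than undefined behavior. Whether to drop the duplicate silently or fail loudly is a design choice; failing loudly in debug builds would make a double response easier to track down.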