Posted to issues@kudu.apache.org by "Todd Lipcon (JIRA)" <ji...@apache.org> on 2017/08/22 05:12:00 UTC
[jira] [Resolved] (KUDU-2030) Tablet server crashes on using deallocated Mutex object

     [ https://issues.apache.org/jira/browse/KUDU-2030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon resolved KUDU-2030.
-------------------------------
       Resolution: Duplicate
    Fix Version/s: n/a

This was fixed by KUDU-2088.

> Tablet server crashes on using deallocated Mutex object
> -------------------------------------------------------
>
>                 Key: KUDU-2030
>                 URL: https://issues.apache.org/jira/browse/KUDU-2030
>             Project: Kudu
>          Issue Type: Bug
>          Components: tserver
>    Affects Versions: 1.4.0
>            Reporter: Alexey Serbin
>              Labels: stability
>             Fix For: n/a
>
>
> The code in {{RaftConsensus::UpdateReplica()}} (src/kudu/consensus/raft_consensus.cc) instantiates a {{Synchronizer}} on the stack and then uses the {{StatusCallback}} derived from it in a way that, on certain code paths, leads to an attempt to use the already deallocated {{Mutex}} object {{CountDownLatch::lock_}}.  The {{CountDownLatch}} instance is a member of the {{Synchronizer}} object itself, so it is deallocated together with the {{Synchronizer}}.
> Under certain scenarios, the tserver crashes with the following stack trace:
> {noformat}
> F0605 18:22:23.583866 14144 mutex.cc:76] Check failed: rv == 0 || rv == 16 . Invalid argument. Owner tid: 23156096; Self tid: 144; To collect the owner stack trace, enable the flag --debug_mutex_collect_stacktrace
> *** Check failure stack trace: ***                                              
>     @     0x7fab619a62fd  google::LogMessage::Fail() at ??:0                    
>     @     0x7fab619a81bd  google::LogMessage::SendToLog() at ??:0               
>     @     0x7fab619a5e39  google::LogMessage::Flush() at ??:0                   
>     @     0x7fab619a8c5f  google::LogMessageFatal::~LogMessageFatal() at ??:0   
>     @     0x7fab627eb453  kudu::Mutex::TryAcquire() at ??:0                     
>     @     0x7fab627eb82c  kudu::Mutex::Acquire() at ??:0                        
>     @     0x7fab6aec6b7a  kudu::CountDownLatch::CountDown() at ??:0             
>     @     0x7fab6aec526a  kudu::CountDownLatch::CountDown() at ??:0             
>     @     0x7fab69339633  kudu::Synchronizer::StatusCB() at ??:0                
>     @     0x7fab69339a21  kudu::internal::RunnableAdapter<>::Run() at ??:0      
>     @     0x7fab69339964  kudu::internal::InvokeHelper<>::MakeItSo() at ??:0    
>     @     0x7fab693398f2  kudu::internal::Invoker<>::Run() at ??:0              
>     @     0x7fab692bce26  kudu::Callback<>::Run() at ??:0  
> {noformat}
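> The {{pthread_mutex_trylock()}} call at mutex.cc:74 returns {{EINVAL}} ("Invalid argument" in the log above) because the underlying pthread mutex has already been deallocated.  The check at mutex.cc:76 accepts only 0 or 16 ({{EBUSY}} on Linux), so it fails.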
> To reproduce, run {{ClientFailoverOnNegotiationTimeoutITest.Kudu1580ConnectToTServer}} from {{client-negotiation-failover-itest}}, built from revision {{5f8442ff67fe87b019c71a09f0556bdcb6868428}} in DEBUG configuration, about 1,000 times with {{--stress-cpu-threads=8}}.  A batch of 1,000 runs usually produces 3-4 crashes like the one above.
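
For readers unfamiliar with this class of bug, the lifetime problem described above can be sketched in standalone C++. This is only an illustration under stated assumptions: {{Latch}}, {{WaitWithTimeout}} and {{LateCallbackIsSafe}} are hypothetical names, not Kudu's {{CountDownLatch}} or {{Synchronizer}}, and the shared-ownership approach shown is one common remedy, not necessarily what KUDU-2088 did.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <memory>
#include <mutex>

// Hypothetical one-shot latch; a stand-in, not Kudu's CountDownLatch.
class Latch {
 public:
  void CountDown() {
    std::lock_guard<std::mutex> l(lock_);  // needs lock_ to still be alive
    done_ = true;
    cond_.notify_all();
  }
  // Returns true if CountDown() happened before the timeout expired.
  bool WaitFor(std::chrono::milliseconds timeout) {
    std::unique_lock<std::mutex> l(lock_);
    return cond_.wait_for(l, timeout, [this] { return done_; });
  }
 private:
  std::mutex lock_;
  std::condition_variable cond_;
  bool done_ = false;
};

// Hazardous pattern, shown only as a comment because it is undefined behavior:
//   Latch latch;                                // lives in this stack frame
//   auto cb = [&latch] { latch.CountDown(); };  // callback keeps a reference
//   if (!latch.WaitFor(timeout)) return;        // frame unwinds, latch is gone
//   // ...cb() fires later and locks a mutex that no longer exists.

// Safer sketch: the callback co-owns the latch, so the latch outlives the
// waiter's stack frame for as long as a copy of the callback exists.
bool WaitWithTimeout(std::function<void()>* cb_out,
                     std::chrono::milliseconds timeout) {
  auto latch = std::make_shared<Latch>();
  *cb_out = [latch] { latch->CountDown(); };  // captures the shared_ptr by value
  return latch->WaitFor(timeout);
}

// The waiter gives up first, then the callback runs late.
// Returns true if the wait timed out and the late callback ran without issue.
bool LateCallbackIsSafe() {
  std::function<void()> cb;
  bool completed = WaitWithTimeout(&cb, std::chrono::milliseconds(10));
  assert(cb);  // the callback was handed out before the wait returned
  cb();        // safe: the lambda still holds a reference to the latch
  return !completed;
}
```

Calling {{LateCallbackIsSafe()}} exercises the case where the waiter returns before the callback is invoked. With a stack-allocated latch captured by reference, the same sequence would touch a destroyed mutex, which matches the {{EINVAL}} failure reported above.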