Posted to issues@kudu.apache.org by "Adar Dembo (JIRA)" <ji...@apache.org> on 2018/07/05 16:48:00 UTC

[jira] [Commented] (KUDU-2244) spinlock contention in raft_consensus

    [ https://issues.apache.org/jira/browse/KUDU-2244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16533888#comment-16533888 ] 

Adar Dembo commented on KUDU-2244:
----------------------------------

Just to piggy-back on this: RaftConsensus::StartElection holds lock_ (a spinlock) while it creates a LeaderElection object. That construction in turn creates an RpcPeerProxy, which performs DNS resolution. Resolution may not always be fast (see KUDU-2395), so it doesn't make sense for lock_ to be a spinlock, or for it to be held during that time.
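
A minimal sketch of the pattern and one possible fix, in plain C++ with stand-in types (SimpleSpinLock, LeaderElection, and the election_ member are hypothetical simplifications, not the real Kudu classes): the slow construction, which may block on DNS, can be done before the spinlock is taken, so the lock only guards publishing the result.

#include <atomic>
#include <memory>
#include <mutex>

// Hypothetical stand-ins for the types named above; not the real Kudu API.
struct LeaderElection {
  LeaderElection() {
    // In the real code path, building the election's peer proxies can trigger
    // DNS resolution, which may block for a long time (see KUDU-2395).
  }
};

// A minimal test-and-set spinlock, standing in for RaftConsensus::lock_.
class SimpleSpinLock {
 public:
  void lock()   { while (flag_.test_and_set(std::memory_order_acquire)) {} }
  void unlock() { flag_.clear(std::memory_order_release); }
 private:
  std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};

class RaftConsensusSketch {
 public:
  // Problematic shape: the spinlock is held across potentially slow work,
  // so any concurrent caller burns CPU spinning for the whole duration.
  void StartElectionHoldingLock() {
    std::lock_guard<SimpleSpinLock> l(lock_);
    election_ = std::make_unique<LeaderElection>();  // DNS under the spinlock
  }

  // One possible shape of a fix: do the slow construction outside the lock
  // and take the lock only long enough to publish the result.
  void StartElectionOutsideLock() {
    auto election = std::make_unique<LeaderElection>();  // no lock held here
    std::lock_guard<SimpleSpinLock> l(lock_);
    election_ = std::move(election);
  }

 private:
  SimpleSpinLock lock_;
  std::unique_ptr<LeaderElection> election_;
};

The same reasoning applies even if lock_ stays a blocking mutex: the DNS work still doesn't need to happen inside the critical section.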

 

> spinlock contention in raft_consensus
> -------------------------------------
>
>                 Key: KUDU-2244
>                 URL: https://issues.apache.org/jira/browse/KUDU-2244
>             Project: Kudu
>          Issue Type: Improvement
>          Components: consensus
>            Reporter: Andrew Wong
>            Priority: Major
>
> I was going through the logs of a cluster that was seeing a bunch of kernel_stack_watchdog traces, and the slowness seemed to be caused by a lot of activity in consensus requests. E.g.
> W1214 18:57:29.514219 36138 kernel_stack_watchdog.cc:145] Thread 36317 stuck at /data/jenkins/workspace/generic-package-centos64-7-0-impala/topdir/BUILD/kudu-1.3.0-cdh5.11.0/src/kudu/rpc/outbound_call.cc:192 for 123ms:
> Kernel stack:
> [<ffffffff810b7ef5>] sys_sched_yield+0x65/0xd0
> [<ffffffff81645909>] system_call_fastpath+0x16/0x1b
> [<ffffffffffffffff>] 0xffffffffffffffff
> User stack:
>     @     0x7f72fab92057  __GI___sched_yield
>     @          0x19498bf  kudu::Thread::StartThread()
>     @          0x1952e7d  kudu::ThreadPool::CreateThreadUnlocked()
>     @          0x19534d3  kudu::ThreadPool::Submit()
>     @          0x1953a27  kudu::ThreadPool::SubmitFunc()
>     @          0x1953ecb  kudu::ThreadPool::SubmitClosure()
>     @           0x9c94ec  kudu::consensus::RaftConsensus::ElectionCallback()
>     @           0x9e6032  kudu::consensus::LeaderElection::CheckForDecision()
>     @           0x9e78c3  kudu::consensus::LeaderElection::VoteResponseRpcCallback()
>     @           0xa8b137  kudu::rpc::OutboundCall::CallCallback()
>     @           0xa8c2bc  kudu::rpc::OutboundCall::SetResponse()
>     @           0xa822c0  kudu::rpc::Connection::HandleCallResponse()
>     @           0xa83ffc  ev::base<>::method_thunk<>()
>     @          0x198a07f  ev_invoke_pending
>     @          0x198af71  ev_run
>     @           0xa5e049  kudu::rpc::ReactorThread::RunThread()
> So it seemed to be caused by some slowness in getting threads. Upon perusing the logs a bit more, I found a sizable number of spinlock profiling traces:
> W1214 18:54:27.897955 36379 rpcz_store.cc:238] Trace:
> 1214 18:54:26.766922 (+     0us) service_pool.cc:143] Inserting onto call queue
> 1214 18:54:26.771135 (+  4213us) service_pool.cc:202] Handling call
> 1214 18:54:26.771138 (+     3us) raft_consensus.cc:1126] Updating replica for 0 ops
> 1214 18:54:27.897699 (+1126561us) raft_consensus.cc:1165] Early marking committed up to index 0
> 1214 18:54:27.897700 (+     1us) raft_consensus.cc:1170] Triggering prepare for 0 ops
> 1214 18:54:27.897701 (+     1us) raft_consensus.cc:1282] Marking committed up to 1766
> 1214 18:54:27.897702 (+     1us) raft_consensus.cc:1332] Filling consensus response to leader.
> 1214 18:54:27.897736 (+    34us) spinlock_profiling.cc:255] Waited 991 ms on lock 0x120b3540. stack: 00000000019406c5 00000000009c60d7 00000000009c75f7 00000000007dc628 0000000000a7adfc 0000000000a7b9cd 000000000194d059 00007f72fbcc2dc4 00007f72fabad1cc ffffffffffffffff
> 1214 18:54:27.897737 (+     1us) raft_consensus.cc:1327] UpdateReplicas() finished
> 1214 18:54:27.897741 (+     4us) inbound_call.cc:130] Queueing success response
> Metrics: {"spinlock_wait_cycles":2478395136}
> Each of the traces noted on the order of 500-1000 ms of waiting on spinlocks. Looking at raft_consensus.cc, it seems we're holding a spinlock (update_lock_) while we call RaftConsensus::UpdateReplica(), which, according to its header, "won't return until all operations have been stored in the log and all Prepares() have been completed". While locking may be necessary, it's worth considering using a different kind of lock here.
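
A minimal sketch of the alternative raised above, with hypothetical names (UpdateSerializer and DoLongRunningUpdate are illustrative, not the actual Kudu code): for a critical section that can block for hundreds of milliseconds on log writes and Prepare() completion, a blocking mutex lets concurrent callers sleep instead of busy-waiting, which is what shows up as spinlock_wait_cycles in the traces above.

#include <mutex>

class UpdateSerializer {
 public:
  void UpdateReplica() {
    // update_lock_ serializes UpdateReplica() calls. Because the body below
    // can block for a long time, a blocking mutex parks waiting threads in
    // the kernel rather than having them spin on the lock.
    std::lock_guard<std::mutex> l(update_lock_);
    DoLongRunningUpdate();  // stands in for log appends and waiting on Prepares
  }

 private:
  void DoLongRunningUpdate() {}
  std::mutex update_lock_;  // blocking mutex instead of a spinlock
};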



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)