Posted to issues@ratis.apache.org by "Tsz Wo Nicholas Sze (JIRA)" <ji...@apache.org> on 2018/11/09 21:03:00 UTC
[jira] [Assigned] (RATIS-404) Deadlock in ratis between appendEntries and RaftLogWorker
[ https://issues.apache.org/jira/browse/RATIS-404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tsz Wo Nicholas Sze reassigned RATIS-404:
-----------------------------------------
Resolution: Fixed
Assignee: Mukul Kumar Singh
I have committed this. Thanks, [~msingh]!
> Deadlock in ratis between appendEntries and RaftLogWorker
> ---------------------------------------------------------
>
> Key: RATIS-404
> URL: https://issues.apache.org/jira/browse/RATIS-404
> Project: Ratis
> Issue Type: Bug
> Components: server
> Affects Versions: 0.3.0
> Reporter: Mukul Kumar Singh
> Assignee: Mukul Kumar Singh
> Priority: Major
> Fix For: 0.3.0
>
> Attachments: RATIS-404.001.patch
>
>
> The deadlock happens when the RaftLogWorker queue is completely full. The following thread is blocked trying to enqueue a task while it still holds the RaftServerImpl lock.
> {code}
> "grpc-default-executor-18" #459 daemon prio=5 os_prio=0 tid=0x00007f8cd4a4a000 nid=0x5f6 waiting on condition [0x00007f8c31df2000]
> java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for <0x0000000098dd53d0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
> at java.util.concurrent.ArrayBlockingQueue.put(ArrayBlockingQueue.java:353)
> at org.apache.ratis.server.storage.RaftLogWorker.addIOTask(RaftLogWorker.java:186)
> at org.apache.ratis.server.storage.RaftLogWorker.writeLogEntry(RaftLogWorker.java:300)
> at org.apache.ratis.server.storage.SegmentedRaftLog.appendEntry(SegmentedRaftLog.java:302)
> at org.apache.ratis.server.storage.SegmentedRaftLog.append(SegmentedRaftLog.java:379)
> at org.apache.ratis.server.impl.RaftServerImpl.appendEntriesAsync(RaftServerImpl.java:914)
> - locked <0x000000009893b638> (a org.apache.ratis.server.impl.RaftServerImpl)
> at org.apache.ratis.server.impl.RaftServerImpl.appendEntriesAsync(RaftServerImpl.java:821)
> at org.apache.ratis.server.impl.RaftServerProxy.lambda$appendEntriesAsync$18(RaftServerProxy.java:434)
> at org.apache.ratis.server.impl.RaftServerProxy$$Lambda$310/1439556067.apply(Unknown Source)
> at org.apache.ratis.server.impl.RaftServerProxy.lambda$null$5(RaftServerProxy.java:309)
> at org.apache.ratis.server.impl.RaftServerProxy$$Lambda$176/355487796.get(Unknown Source)
> at org.apache.ratis.util.JavaUtils.callAsUnchecked(JavaUtils.java:82)
> at org.apache.ratis.server.impl.RaftServerProxy.lambda$submitRequest$6(RaftServerProxy.java:309)
> at org.apache.ratis.server.impl.RaftServerProxy$$Lambda$175/1025132044.apply(Unknown Source)
> at java.util.concurrent.CompletableFuture.uniComposeStage(CompletableFuture.java:981)
> at java.util.concurrent.CompletableFuture.thenCompose(CompletableFuture.java:2124)
> at org.apache.ratis.server.impl.RaftServerProxy.submitRequest(RaftServerProxy.java:308)
> at org.apache.ratis.server.impl.RaftServerProxy.appendEntriesAsync(RaftServerProxy.java:434)
> at org.apache.ratis.grpc.server.GrpcServerProtocolService$1.onNext(GrpcServerProtocolService.java:76)
> at org.apache.ratis.grpc.server.GrpcServerProtocolService$1.onNext(GrpcServerProtocolService.java:66)
> at org.apache.ratis.thirdparty.io.grpc.stub.ServerCalls$StreamingServerCallHandler$StreamingServerCallListener.onMessage(ServerCalls.java:248)
> at org.apache.ratis.thirdparty.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailable(ServerCallImpl.java:263)
> at org.apache.ratis.thirdparty.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1MessagesAvailable.runInContext(ServerImpl.java:683)
> at org.apache.ratis.thirdparty.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
> at org.apache.ratis.thirdparty.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> {code}
> The RaftLogWorker thread, which would otherwise drain the queue, is in turn blocked waiting to acquire the same RaftServerImpl lock, as shown in the following trace.
> {code}
> "c5a4d441-cb73-47a2-94b5-fc8233061955-RaftLogWorker" #440 daemon prio=5 os_prio=0 tid=0x00000000026a2000 nid=0x5e3 waiting for monitor entry [0x00007f8c884aa000]
> java.lang.Thread.State: BLOCKED (on object monitor)
> at org.apache.ratis.server.impl.RaftServerImpl.lambda$appendEntriesAsync$21(RaftServerImpl.java:925)
> - waiting to lock <0x000000009893b638> (a org.apache.ratis.server.impl.RaftServerImpl)
> at org.apache.ratis.server.impl.RaftServerImpl$$Lambda$316/47202155.apply(Unknown Source)
> at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602)
> at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
> at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
> at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962)
> at org.apache.ratis.server.storage.SegmentedRaftLog$Task.done(SegmentedRaftLog.java:83)
> at org.apache.ratis.server.storage.RaftLogWorker.run(RaftLogWorker.java:220)
> at java.lang.Thread.run(Thread.java:748)
> {code}
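
The two traces above form a cycle: one thread holds a monitor and waits for space in a bounded queue, while the only thread that drains the queue waits for that monitor. The following is a minimal, self-contained sketch of that pattern, not Ratis code and not the change in RATIS-404.001.patch (whose contents are not shown here). The class name, method names, queue capacity of 1, and the use of a timed offer() in place of put() (so the demo terminates instead of hanging) are all assumptions made for illustration.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; none of these names exist in Ratis.
class QueueLockCycleSketch {
  // Stands in for the RaftServerImpl monitor seen in both traces.
  private final Object serverLock = new Object();
  // Stands in for the bounded worker queue; capacity 1 keeps the demo small.
  private final BlockingQueue<Runnable> queue = new ArrayBlockingQueue<>(1);

  // Worker loop: take a task, then run its completion callback under serverLock.
  private void workerLoop() {
    try {
      while (true) {
        Runnable callback = queue.take();
        synchronized (serverLock) {
          callback.run();
        }
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // exit the loop on interrupt
    }
  }

  // Returns true if an enqueue could not complete while serverLock was held.
  boolean enqueueStallsUnderLock(long timeoutMillis) {
    Thread worker = new Thread(this::workerLoop, "worker-sketch");
    worker.setDaemon(true);
    worker.start();
    boolean stalled;
    try {
      synchronized (serverLock) {
        queue.put(() -> {}); // the worker takes this, then blocks on serverLock
        queue.put(() -> {}); // fills the queue once the first task has been taken
        // A plain put() here would wait forever; the timed offer() lets the demo end.
        stalled = !queue.offer(() -> {}, timeoutMillis, TimeUnit.MILLISECONDS);
      }
      worker.interrupt();
      worker.join(1000);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("demo interrupted", e);
    }
    return stalled;
  }

  public static void main(String[] args) {
    boolean stalled = new QueueLockCycleSketch().enqueueStallsUnderLock(200);
    System.out.println("enqueue stalled while holding the lock: " + stalled);
  }
}
```

In this sketch the third enqueue cannot succeed while the lock is held, because the worker cannot finish its first callback without that lock. In general, a cycle of this shape is broken either by not blocking on the queue while holding the lock, or by not requiring the lock inside the worker's completion callback; which of these, if either, the committed patch uses is not stated in this message.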
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)