Posted to dev@kafka.apache.org by "Rajini Sivaram (JIRA)" <ji...@apache.org> on 2017/09/25 10:41:00 UTC

[jira] [Created] (KAFKA-5970) Deadlock due to locking of DelayedProduce and group

Rajini Sivaram created KAFKA-5970:
-------------------------------------

             Summary: Deadlock due to locking of DelayedProduce and group
                 Key: KAFKA-5970
                 URL: https://issues.apache.org/jira/browse/KAFKA-5970
             Project: Kafka
          Issue Type: Bug
          Components: clients
            Reporter: Rajini Sivaram
            Assignee: Rajini Sivaram
            Priority: Critical
             Fix For: 1.0.0


Observed in a local run of TransactionsBounceTest. It looks like we hold the group (GroupMetadata) lock while completing a DelayedProduce, which may in turn attempt to acquire the group lock, producing the deadlock below.

{quote}
Found one Java-level deadlock:
=============================
"kafka-request-handler-7":
  waiting to lock monitor 0x00007fe08891fb08 (object 0x000000074a9fbc50, a kafka.coordinator.group.GroupMetadata),
  which is held by "kafka-request-handler-4"
"kafka-request-handler-4":
  waiting to lock monitor 0x00007fe0869e4408 (object 0x0000000749be7bb8, a kafka.server.DelayedProduce),
  which is held by "kafka-request-handler-3"
"kafka-request-handler-3":
  waiting to lock monitor 0x00007fe08891fb08 (object 0x000000074a9fbc50, a kafka.coordinator.group.GroupMetadata),
  which is held by "kafka-request-handler-4"

Java stack information for the threads listed above:
===================================================
"kafka-request-handler-7":
        at kafka.coordinator.group.GroupMetadataManager$$anonfun$handleTxnCompletion$1.apply(GroupMetadataManager.scala:752)
          waiting to lock <0x000000074a9fbc50> (a kafka.coordinator.group.GroupMetadata)
        at kafka.coordinator.group.GroupMetadataManager$$anonfun$handleTxnCompletion$1.apply(GroupMetadataManager.scala:750)
        at scala.collection.mutable.HashSet.foreach(HashSet.scala:78)
        at kafka.coordinator.group.GroupMetadataManager.handleTxnCompletion(GroupMetadataManager.scala:750)
        at kafka.coordinator.group.GroupCoordinator.handleTxnCompletion(GroupCoordinator.scala:439)
        at kafka.server.KafkaApis.kafka$server$KafkaApis$$maybeSendResponseCallback$1(KafkaApis.scala:1556)
        at kafka.server.KafkaApis$$anonfun$handleWriteTxnMarkersRequest$1$$anonfun$apply$20.apply(KafkaApis.scala:1614)
        at kafka.server.KafkaApis$$anonfun$handleWriteTxnMarkersRequest$1$$anonfun$apply$20.apply(KafkaApis.scala:1614)
        at kafka.server.DelayedProduce.onComplete(DelayedProduce.scala:134)
        at kafka.server.DelayedOperation.forceComplete(DelayedOperation.scala:66)
        at kafka.server.DelayedProduce.tryComplete(DelayedProduce.scala:116)
        at kafka.server.DelayedProduce.safeTryComplete(DelayedProduce.scala:76)
          locked <0x000000074b21c968> (a kafka.server.DelayedProduce)
        at kafka.server.DelayedOperationPurgatory$Watchers.tryCompleteWatched(DelayedOperation.scala:338)
        at kafka.server.DelayedOperationPurgatory.checkAndComplete(DelayedOperation.scala:244)
        at kafka.server.ReplicaManager.tryCompleteDelayedProduce(ReplicaManager.scala:284)
        at kafka.cluster.Partition.tryCompleteDelayedRequests(Partition.scala:434)
        at kafka.cluster.Partition.updateReplicaLogReadResult(Partition.scala:285)
        at kafka.server.ReplicaManager$$anonfun$updateFollowerLogReadResults$2.apply(ReplicaManager.scala:1290)
        at kafka.server.ReplicaManager$$anonfun$updateFollowerLogReadResults$2.apply(ReplicaManager.scala:1286)
        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
        at kafka.server.ReplicaManager.updateFollowerLogReadResults(ReplicaManager.scala:1286)
        at kafka.server.ReplicaManager.fetchMessages(ReplicaManager.scala:786)
        at kafka.server.KafkaApis.handleFetchRequest(KafkaApis.scala:598)
        at kafka.server.KafkaApis.handle(KafkaApis.scala:100)
        at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:65)
        at java.lang.Thread.run(Thread.java:748)
"kafka-request-handler-4":
        at kafka.server.DelayedProduce.safeTryComplete(DelayedProduce.scala:75)
          waiting to lock <0x0000000749be7bb8> (a kafka.server.DelayedProduce)
        at kafka.server.DelayedOperationPurgatory$Watchers.tryCompleteWatched(DelayedOperation.scala:338)
        at kafka.server.DelayedOperationPurgatory.checkAndComplete(DelayedOperation.scala:244)
        at kafka.server.ReplicaManager.tryCompleteDelayedProduce(ReplicaManager.scala:284)
        at kafka.cluster.Partition.tryCompleteDelayedRequests(Partition.scala:434)
        at kafka.cluster.Partition.appendRecordsToLeader(Partition.scala:516)
        at kafka.server.ReplicaManager$$anonfun$appendToLocalLog$2.apply(ReplicaManager.scala:707)
        at kafka.server.ReplicaManager$$anonfun$appendToLocalLog$2.apply(ReplicaManager.scala:691)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
        at scala.collection.immutable.Map$Map1.foreach(Map.scala:116)
        at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
        at scala.collection.AbstractTraversable.map(Traversable.scala:104)
        at kafka.server.ReplicaManager.appendToLocalLog(ReplicaManager.scala:691)
        at kafka.server.ReplicaManager.appendRecords(ReplicaManager.scala:441)
        at kafka.coordinator.group.GroupMetadataManager.appendForGroup(GroupMetadataManager.scala:240)
        at kafka.coordinator.group.GroupMetadataManager.storeGroup(GroupMetadataManager.scala:228)
        at kafka.coordinator.group.GroupCoordinator.onCompleteJoin(GroupCoordinator.scala:731)
          locked <0x000000074a9fbc50> (a kafka.coordinator.group.GroupMetadata)
        at kafka.coordinator.group.DelayedJoin.onComplete(DelayedJoin.scala:44)
        at kafka.server.DelayedOperation.forceComplete(DelayedOperation.scala:66)
        at kafka.coordinator.group.DelayedJoin$$anonfun$tryComplete$1.apply$mcZ$sp(DelayedJoin.scala:42)
        at kafka.coordinator.group.GroupCoordinator.tryCompleteJoin(GroupCoordinator.scala:708)
          locked <0x000000074a9fbc50> (a kafka.coordinator.group.GroupMetadata)
        at kafka.coordinator.group.DelayedJoin.tryComplete(DelayedJoin.scala:42)
        at kafka.coordinator.group.DelayedJoin.safeTryComplete(DelayedJoin.scala:40)
        at kafka.server.DelayedOperationPurgatory.tryCompleteElseWatch(DelayedOperation.scala:199)
        at kafka.coordinator.group.GroupCoordinator.prepareRebalance(GroupCoordinator.scala:693)
        at kafka.coordinator.group.GroupCoordinator.kafka$coordinator$group$GroupCoordinator$$maybePrepareRebalance(GroupCoordinator.scala:668)
          locked <0x000000074a9fbc50> (a kafka.coordinator.group.GroupMetadata)
        at kafka.coordinator.group.GroupCoordinator.removeMemberAndUpdateGroup(GroupCoordinator.scala:700)
        at kafka.coordinator.group.GroupCoordinator.handleLeaveGroup(GroupCoordinator.scala:324)
          locked <0x000000074a9fbc50> (a kafka.coordinator.group.GroupMetadata)
        at kafka.server.KafkaApis.handleLeaveGroupRequest(KafkaApis.scala:1259)
        at kafka.server.KafkaApis.handle(KafkaApis.scala:112)
        at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:65)
        at java.lang.Thread.run(Thread.java:748)
"kafka-request-handler-3":
        at kafka.coordinator.group.GroupMetadataManager$$anonfun$handleTxnCompletion$1.apply(GroupMetadataManager.scala:752)
          waiting to lock <0x000000074a9fbc50> (a kafka.coordinator.group.GroupMetadata)
        at kafka.coordinator.group.GroupMetadataManager$$anonfun$handleTxnCompletion$1.apply(GroupMetadataManager.scala:750)
        at scala.collection.mutable.HashSet.foreach(HashSet.scala:78)
        at kafka.coordinator.group.GroupMetadataManager.handleTxnCompletion(GroupMetadataManager.scala:750)
        at kafka.coordinator.group.GroupCoordinator.handleTxnCompletion(GroupCoordinator.scala:439)
        at kafka.server.KafkaApis.kafka$server$KafkaApis$$maybeSendResponseCallback$1(KafkaApis.scala:1556)
        at kafka.server.KafkaApis$$anonfun$handleWriteTxnMarkersRequest$1$$anonfun$apply$20.apply(KafkaApis.scala:1614)
        at kafka.server.KafkaApis$$anonfun$handleWriteTxnMarkersRequest$1$$anonfun$apply$20.apply(KafkaApis.scala:1614)
        at kafka.server.DelayedProduce.onComplete(DelayedProduce.scala:134)
        at kafka.server.DelayedOperation.forceComplete(DelayedOperation.scala:66)
        at kafka.server.DelayedProduce.tryComplete(DelayedProduce.scala:116)
        at kafka.server.DelayedProduce.safeTryComplete(DelayedProduce.scala:76)
          locked <0x0000000749be7bb8> (a kafka.server.DelayedProduce)
        at kafka.server.DelayedOperationPurgatory$Watchers.tryCompleteWatched(DelayedOperation.scala:338)
        at kafka.server.DelayedOperationPurgatory.checkAndComplete(DelayedOperation.scala:244)
        at kafka.server.ReplicaManager.tryCompleteDelayedProduce(ReplicaManager.scala:284)
        at kafka.server.ReplicaManager$$anonfun$updateFollowerLogReadResults$2.apply(ReplicaManager.scala:1294)
        at kafka.server.ReplicaManager$$anonfun$updateFollowerLogReadResults$2.apply(ReplicaManager.scala:1286)
        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
        at kafka.server.ReplicaManager.updateFollowerLogReadResults(ReplicaManager.scala:1286)
        at kafka.server.ReplicaManager.fetchMessages(ReplicaManager.scala:786)
        at kafka.server.KafkaApis.handleFetchRequest(KafkaApis.scala:598)
        at kafka.server.KafkaApis.handle(KafkaApis.scala:100)
        at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:65)
        at java.lang.Thread.run(Thread.java:748)

Found 1 deadlock.
{quote}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)