Posted to dev@kafka.apache.org by "Joel Koshy (JIRA)" <ji...@apache.org> on 2014/04/09 20:36:17 UTC
[jira] [Commented] (KAFKA-1310) Zookeeper timeout causes deadlock in Controller
[ https://issues.apache.org/jira/browse/KAFKA-1310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13964521#comment-13964521 ]
Joel Koshy commented on KAFKA-1310:
-----------------------------------
Fixed by KAFKA-1317
> Zookeeper timeout causes deadlock in Controller
> -----------------------------------------------
>
> Key: KAFKA-1310
> URL: https://issues.apache.org/jira/browse/KAFKA-1310
> Project: Kafka
> Issue Type: Bug
> Affects Versions: 0.8.1
> Reporter: Fedor Korotkiy
> Assignee: Neha Narkhede
> Priority: Blocker
> Fix For: 0.8.1.1
>
>
> Steps to reproduce:
> 1. Check out and build the 0.8.1 branch from GitHub:
> git clone git@github.com:apache/kafka.git && cd kafka && git checkout origin/0.8.1 && ./gradlew jar
> 2. Start zookeeper server:
> ./bin/zookeeper-server-start.sh config/zookeeper.properties
> 3. Start kafka server:
> ./bin/kafka-server-start.sh config/server.properties
> 4. Suspend the ZooKeeper process for 10 seconds (Ctrl-Z, then %1 to resume it).
> 5. Kafka has not re-registered in ZooKeeper:
> ./bin/zookeeper-shell.sh
> ls /brokers/ids
> >> []
> The root cause seems to be a deadlock between DeleteTopicsThread and SessionExpirationListener in KafkaController:
> 1. DeleteTopicsThread acquires controllerLock and await()-s on deleteTopicsCond in awaitTopicDeletionNotification(), which releases the lock while it waits.
> 2. SessionExpirationListener fires. It acquires controllerLock and tries to shut down deleteTopicManager (in onControllerResignation()). This interrupts DeleteTopicsThread.
> 3. DeleteTopicsThread can't return from deleteTopicsCond.await() because it must first reacquire controllerLock, which the listener still holds. The result is a deadlock.
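The three steps quoted above can be reproduced outside Kafka with a minimal Java sketch. This is not the KafkaController code: the class and method names are hypothetical, `lock` and `cond` stand in for controllerLock and deleteTopicsCond, and the sketch assumes that shutting down the thread means interrupting it and then waiting for it to exit while the lock is still held. A bounded join() is used instead of an unbounded wait so that the example terminates.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration only; none of these names come from Kafka.
class AwaitInterruptSketch {
    // Stand-ins for controllerLock and deleteTopicsCond.
    static final ReentrantLock lock = new ReentrantLock();
    static final Condition cond = lock.newCondition();

    // Returns true if the interrupted worker is still alive after a bounded
    // join() performed while the caller holds the lock.
    static boolean workerStuckWhileLockHeld(long joinMillis) {
        CountDownLatch lockedByWorker = new CountDownLatch(1);
        Thread worker = new Thread(() -> {
            lock.lock();
            try {
                lockedByWorker.countDown();
                while (true) {
                    cond.await(); // releases the lock while waiting
                }
            } catch (InterruptedException e) {
                // Reached only after await() has reacquired the lock.
            } finally {
                lock.unlock();
            }
        });
        worker.start();
        try {
            lockedByWorker.await();
            boolean stuck;
            lock.lock(); // succeeds once the worker is waiting in await()
            try {
                worker.interrupt();
                // Assumed shutdown behaviour: wait for the thread to exit.
                // An unbounded join() here would never return.
                worker.join(joinMillis);
                stuck = worker.isAlive();
            } finally {
                lock.unlock();
            }
            worker.join(); // with the lock released, the worker can exit
            return stuck;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("worker stuck while lock held: "
                + workerStuckWhileLockHeld(200));
    }
}
```

The point of the sketch is that interrupting a thread blocked in Condition.await() is not enough to let it exit: it has to reacquire the lock before InterruptedException is thrown, so whoever interrupts it should not wait for it while holding that same lock.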
--
This message was sent by Atlassian JIRA
(v6.2#6252)