Posted to dev@kafka.apache.org by "Joel Koshy (JIRA)" <ji...@apache.org> on 2012/11/16 02:21:11 UTC
[jira] [Created] (KAFKA-618) Deadlock between leader-finder-thread and consumer-fetcher-thread during broker failure
Joel Koshy created KAFKA-618:
--------------------------------
Summary: Deadlock between leader-finder-thread and consumer-fetcher-thread during broker failure
Key: KAFKA-618
URL: https://issues.apache.org/jira/browse/KAFKA-618
Project: Kafka
Issue Type: Bug
Affects Versions: 0.8
Reporter: Joel Koshy
Priority: Blocker
Fix For: 0.8
This causes the test failure reported in KAFKA-607. It affects high-level consumers: if they hit the deadlock, they get wedged (at least until the consumer timeout fires).
Here is the thread dump output that shows the issue:
Found one Java-level deadlock:
=============================
"ConsumerFetcherThread-console-consumer-41755_jkoshy-ld-1353026496639-b0e24a70-0-1":
  waiting for ownable synchronizer 0x00007f2283ad0000, (a java.util.concurrent.locks.ReentrantLock$NonfairSync),
  which is held by "console-consumer-41755_jkoshy-ld-1353026496639-b0e24a70-leader-finder-thread"
"console-consumer-41755_jkoshy-ld-1353026496639-b0e24a70-leader-finder-thread":
  waiting to lock monitor 0x00007f2288297190 (object 0x00007f2283ab01d0, a java.lang.Object),
  which is held by "ConsumerFetcherThread-console-consumer-41755_jkoshy-ld-1353026496639-b0e24a70-0-1"

Java stack information for the threads listed above:
===================================================
"ConsumerFetcherThread-console-consumer-41755_jkoshy-ld-1353026496639-b0e24a70-0-1":
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0x00007f2283ad0000> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:811)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:842)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1178)
    at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:186)
    at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:262)
    at kafka.consumer.ConsumerFetcherManager.getPartitionTopicInfo(ConsumerFetcherManager.scala:131)
    at kafka.consumer.ConsumerFetcherThread.processPartitionData(ConsumerFetcherThread.scala:43)
    at kafka.server.AbstractFetcherThread$$anonfun$doWork$5.apply(AbstractFetcherThread.scala:116)
    at kafka.server.AbstractFetcherThread$$anonfun$doWork$5.apply(AbstractFetcherThread.scala:99)
    at scala.collection.immutable.Map$Map1.foreach(Map.scala:105)
    at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:99)
    - locked <0x00007f2283ab01d0> (a java.lang.Object)
    at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:50)
"console-consumer-41755_jkoshy-ld-1353026496639-b0e24a70-leader-finder-thread":
    at kafka.server.AbstractFetcherThread.addPartition(AbstractFetcherThread.scala:142)
    - waiting to lock <0x00007f2283ab01d0> (a java.lang.Object)
    at kafka.server.AbstractFetcherManager.addFetcher(AbstractFetcherManager.scala:49)
    - locked <0x00007f2283ab0338> (a java.lang.Object)
    at kafka.consumer.ConsumerFetcherManager$$anon$1$$anonfun$doWork$5.apply(ConsumerFetcherManager.scala:81)
    at kafka.consumer.ConsumerFetcherManager$$anon$1$$anonfun$doWork$5.apply(ConsumerFetcherManager.scala:76)
    at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:80)
    at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:80)
    at scala.collection.Iterator$class.foreach(Iterator.scala:631)
    at scala.collection.mutable.HashTable$$anon$1.foreach(HashTable.scala:161)
    at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:194)
    at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
    at scala.collection.mutable.HashMap.foreach(HashMap.scala:80)
    at kafka.consumer.ConsumerFetcherManager$$anon$1.doWork(ConsumerFetcherManager.scala:76)
    at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:50)

Found 1 deadlock.
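For readers unfamiliar with this failure mode, the dump above is a classic lock-ordering inversion: the fetcher thread holds the AbstractFetcherThread monitor and then tries to take the manager's ReentrantLock, while the leader-finder thread holds that ReentrantLock and then tries to enter the same monitor. The sketch below (hypothetical names, not the actual Kafka 0.8 code) reproduces the two acquisition orders; it runs the paths sequentially so it terminates, but interleaving them on two threads can deadlock exactly as in the dump:

```java
import java.util.concurrent.locks.ReentrantLock;

// Illustrative sketch of the KAFKA-618 lock-ordering inversion.
// Names are hypothetical; this is not the actual Kafka 0.8 code.
class DeadlockSketch {
    // Stands in for ConsumerFetcherManager's ReentrantLock.
    static final ReentrantLock managerLock = new ReentrantLock();
    // Stands in for the monitor AbstractFetcherThread synchronizes on.
    static final Object fetcherMonitor = new Object();

    // Path 1: ConsumerFetcherThread.doWork() holds the fetcher monitor,
    // then processPartitionData() -> getPartitionTopicInfo() takes managerLock.
    static void fetcherPath() {
        synchronized (fetcherMonitor) {          // lock A first
            managerLock.lock();                  // then lock B
            try { /* look up partition -> topic info */ }
            finally { managerLock.unlock(); }
        }
    }

    // Path 2: the leader-finder thread holds managerLock in doWork(),
    // then addFetcher() -> addPartition() enters the fetcher monitor.
    static void leaderFinderPath() {
        managerLock.lock();                      // lock B first
        try {
            synchronized (fetcherMonitor) {      // then lock A: opposite order
                /* register the partition with the fetcher */
            }
        } finally { managerLock.unlock(); }
    }

    public static void main(String[] args) {
        // Sequential execution terminates; run these two paths on separate
        // threads concurrently and the opposite acquisition orders can
        // deadlock, as in the thread dump above.
        fetcherPath();
        leaderFinderPath();
        System.out.println("both paths completed");
    }
}
```

The standard cure is to impose a single global acquisition order (or to remove one of the nested acquisitions entirely, which is the direction the fix takes).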
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (KAFKA-618) Deadlock between leader-finder-thread and consumer-fetcher-thread during broker failure
Posted by "Joel Koshy (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/KAFKA-618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Joel Koshy updated KAFKA-618:
-----------------------------
Attachment: KAFKA-618-v1.patch
Ran 20 iterations of test case 4011 and they all pass.
One potential concern is the partitionMap in ConsumerFetcherThread becoming null/inconsistent with respect to the partitionMap in ConsumerFetcherManager, but I looked at it closely and don't think that is possible.
[jira] [Commented] (KAFKA-618) Deadlock between leader-finder-thread and consumer-fetcher-thread during broker failure
Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/KAFKA-618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13498555#comment-13498555 ]
Jun Rao commented on KAFKA-618:
-------------------------------
This is a very good finding. The following is one way of breaking the deadlock.
In ConsumerFetcherManager, don't expose getPartitionTopicInfo(). Instead, pass partitionMap (which is immutable) to each newly created ConsumerFetcherThread. That way, ConsumerFetcherThread.processPartitionData() and ConsumerFetcherThread.handleOffsetOutOfRange() no longer depend on ConsumerFetcherManager. If we do that, we can also improve ConsumerFetcherManager.stopAllConnections() a bit: the clearing of noLeaderPartitionSet and partitionMap can then be done together, before calling closeAllFetchers(). Previously, partitionMap had to be cleared last because, until all fetchers are stopped, fetch-request processing still needs to read partitionMap and expects it to be non-null.
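As a rough illustration of the suggested direction (hypothetical names and simplified types; not the actual patch), the fetcher thread can be handed the shared partition map at construction time, so processing fetched data reads the map directly instead of calling back into the manager under its lock:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the proposed decoupling: the manager hands its partitionMap
// reference to each fetcher thread when the thread is created, so the
// fetcher never re-enters the manager (and its lock) while processing
// fetched data. Names and types are hypothetical simplifications of the
// Kafka 0.8 classes.
class PartitionTopicInfo {
    final String topic;
    volatile long fetchedOffset;
    PartitionTopicInfo(String topic, long offset) {
        this.topic = topic;
        this.fetchedOffset = offset;
    }
}

class FetcherThreadSketch {
    // Shared reference, fixed at construction; the map itself is thread-safe.
    private final Map<Integer, PartitionTopicInfo> partitionMap;

    FetcherThreadSketch(Map<Integer, PartitionTopicInfo> partitionMap) {
        this.partitionMap = partitionMap;
    }

    // Analogous to processPartitionData(): no callback into the manager,
    // hence no manager lock is taken while the fetcher holds its own monitor.
    void processPartitionData(int partition, long newOffset) {
        PartitionTopicInfo info = partitionMap.get(partition);
        if (info != null) {
            info.fetchedOffset = newOffset;
        }
    }
}

class ManagerSketch {
    final Map<Integer, PartitionTopicInfo> partitionMap = new ConcurrentHashMap<>();

    FetcherThreadSketch createFetcher() {
        // Pass the map in, rather than exposing getPartitionTopicInfo().
        return new FetcherThreadSketch(partitionMap);
    }
}
```

With this shape, the fetcher's copy of the reference is never null, which is what would allow stopAllConnections() to clear its state up front rather than last, as suggested above.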
[jira] [Closed] (KAFKA-618) Deadlock between leader-finder-thread and consumer-fetcher-thread during broker failure
Posted by "Joel Koshy (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/KAFKA-618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Joel Koshy closed KAFKA-618.
----------------------------
Committed to 0.8.
[jira] [Resolved] (KAFKA-618) Deadlock between leader-finder-thread and consumer-fetcher-thread during broker failure
Posted by "Joel Koshy (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/KAFKA-618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Joel Koshy resolved KAFKA-618.
------------------------------
Resolution: Fixed
[jira] [Commented] (KAFKA-618) Deadlock between leader-finder-thread and consumer-fetcher-thread during broker failure
Posted by "Jun Rao (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/KAFKA-618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13499309#comment-13499309 ]
Jun Rao commented on KAFKA-618:
-------------------------------
Thanks for the patch. +1