You are viewing a plain text version of this content. The canonical link for it is here.
Posted to jira@kafka.apache.org by "Goltseva Taisiia (Jira)" <ji...@apache.org> on 2020/08/24 08:46:00 UTC

[jira] [Created] (KAFKA-10426) Deadlock in KafkaConfigBackingStore

Goltseva Taisiia created KAFKA-10426:
----------------------------------------

             Summary: Deadlock in KafkaConfigBackingStore
                 Key: KAFKA-10426
                 URL: https://issues.apache.org/jira/browse/KAFKA-10426
             Project: Kafka
          Issue Type: Bug
          Components: KafkaConnect
    Affects Versions: 2.6.0, 2.4.1
            Reporter: Goltseva Taisiia


Hi, guys!

We faced the following deadlock:

 
{code:java}
KafkaBasedLog Work Thread - _streaming_service_config
priority:5 - threadId:0x00007f18ec22c000 - nativeId:0x950 - nativeId (decimal):2384 - state:BLOCKED
stackTrace:
java.lang.Thread.State: BLOCKED (on object monitor)
at com.company.streaming.platform.kafka.DistributedHerder$ConfigUpdateListener.onSessionKeyUpdate(DistributedHerder.java:1586)
- waiting to lock <0x00000000e6136808> (a com.company.streaming.platform.kafka.DistributedHerder)
at org.apache.kafka.connect.storage.KafkaConfigBackingStore$ConsumeCallback.onCompletion(KafkaConfigBackingStore.java:707)
- locked <0x00000000d8c3be40> (a java.lang.Object)
at org.apache.kafka.connect.storage.KafkaConfigBackingStore$ConsumeCallback.onCompletion(KafkaConfigBackingStore.java:481)
at org.apache.kafka.connect.util.KafkaBasedLog.poll(KafkaBasedLog.java:264)
at org.apache.kafka.connect.util.KafkaBasedLog.access$500(KafkaBasedLog.java:71)
at org.apache.kafka.connect.util.KafkaBasedLog$WorkThread.run(KafkaBasedLog.java:337)



CustomDistributedHerder-connect-1
priority:5 - threadId:0x00007f1a01e30800 - nativeId:0x93a - nativeId (decimal):2362 - state:BLOCKED
stackTrace:
java.lang.Thread.State: BLOCKED (on object monitor)
at org.apache.kafka.connect.storage.KafkaConfigBackingStore.snapshot(KafkaConfigBackingStore.java:285)
- waiting to lock <0x00000000d8c3be40> (a java.lang.Object)
at com.company.streaming.platform.kafka.DistributedHerder.updateConfigsWithIncrementalCooperative(DistributedHerder.java:514)
- locked <0x00000000e6136808> (a com.company.streaming.platform.kafka.DistributedHerder)
at com.company.streaming.platform.kafka.DistributedHerder.tick(DistributedHerder.java:402)
at com.company.streaming.platform.kafka.DistributedHerder.run(DistributedHerder.java:286)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748){code}

DistributedHerder went to updateConfigsWithIncrementalCooperative() synchronized method and called configBackingStore.snapshot() which take a lock on internal object in KafkaConfigBackingStore class.

 

Meanwhile KafkaConfigBackingStore in ConsumeCallback inside synchronized block on internal object got SESSION_KEY record and called updateListener.onSessionKeyUpdate() which take a lock on DistributedHerder.

 

As I can see the problem is here:

[https://github.com/apache/kafka/blob/trunk/connect/runtime/src/main/java/org/apache/kafka/connect/storage/KafkaConfigBackingStore.java#L737]

 

As I understand this call should be performed outside synchronized block:
{code:java}
if (started)
   updateListener.onSessionKeyUpdate(KafkaConfigBackingStore.this.sessionKey);{code}
 

I'm going to make a PR.

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)