You are viewing a plain text version of this content. The canonical link for it is here.
Posted to jira@kafka.apache.org by "Goltseva Taisiia (Jira)" <ji...@apache.org> on 2020/08/24 08:46:00 UTC
[jira] [Created] (KAFKA-10426) Deadlock in KafkaConfigBackingStore
Goltseva Taisiia created KAFKA-10426:
----------------------------------------
Summary: Deadlock in KafkaConfigBackingStore
Key: KAFKA-10426
URL: https://issues.apache.org/jira/browse/KAFKA-10426
Project: Kafka
Issue Type: Bug
Components: KafkaConnect
Affects Versions: 2.6.0, 2.4.1
Reporter: Goltseva Taisiia
Hi, guys!
We faced the following deadlock:
{code:java}
KafkaBasedLog Work Thread - _streaming_service_config
priority:5 - threadId:0x00007f18ec22c000 - nativeId:0x950 - nativeId (decimal):2384 - state:BLOCKED
stackTrace:
java.lang.Thread.State: BLOCKED (on object monitor)
at com.company.streaming.platform.kafka.DistributedHerder$ConfigUpdateListener.onSessionKeyUpdate(DistributedHerder.java:1586)
- waiting to lock <0x00000000e6136808> (a com.company.streaming.platform.kafka.DistributedHerder)
at org.apache.kafka.connect.storage.KafkaConfigBackingStore$ConsumeCallback.onCompletion(KafkaConfigBackingStore.java:707)
- locked <0x00000000d8c3be40> (a java.lang.Object)
at org.apache.kafka.connect.storage.KafkaConfigBackingStore$ConsumeCallback.onCompletion(KafkaConfigBackingStore.java:481)
at org.apache.kafka.connect.util.KafkaBasedLog.poll(KafkaBasedLog.java:264)
at org.apache.kafka.connect.util.KafkaBasedLog.access$500(KafkaBasedLog.java:71)
at org.apache.kafka.connect.util.KafkaBasedLog$WorkThread.run(KafkaBasedLog.java:337)
CustomDistributedHerder-connect-1
priority:5 - threadId:0x00007f1a01e30800 - nativeId:0x93a - nativeId (decimal):2362 - state:BLOCKED
stackTrace:
java.lang.Thread.State: BLOCKED (on object monitor)
at org.apache.kafka.connect.storage.KafkaConfigBackingStore.snapshot(KafkaConfigBackingStore.java:285)
- waiting to lock <0x00000000d8c3be40> (a java.lang.Object)
at com.company.streaming.platform.kafka.DistributedHerder.updateConfigsWithIncrementalCooperative(DistributedHerder.java:514)
- locked <0x00000000e6136808> (a com.company.streaming.platform.kafka.DistributedHerder)
at com.company.streaming.platform.kafka.DistributedHerder.tick(DistributedHerder.java:402)
at com.company.streaming.platform.kafka.DistributedHerder.run(DistributedHerder.java:286)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748){code}
DistributedHerder went to updateConfigsWithIncrementalCooperative() synchronized method and called configBackingStore.snapshot() which take a lock on internal object in KafkaConfigBackingStore class.
Meanwhile KafkaConfigBackingStore in ConsumeCallback inside synchronized block on internal object got SESSION_KEY record and called updateListener.onSessionKeyUpdate() which take a lock on DistributedHerder.
As I can see the problem is here:
[https://github.com/apache/kafka/blob/trunk/connect/runtime/src/main/java/org/apache/kafka/connect/storage/KafkaConfigBackingStore.java#L737]
As I understand this call should be performed outside synchronized block:
{code:java}
if (started)
updateListener.onSessionKeyUpdate(KafkaConfigBackingStore.this.sessionKey);{code}
I'm going to make a PR.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)