Posted to jira@kafka.apache.org by "Tom Bentley (Jira)" <ji...@apache.org> on 2021/02/24 15:12:00 UTC

[jira] [Commented] (KAFKA-12308) ConfigDef.parseType deadlock

    [ https://issues.apache.org/jira/browse/KAFKA-12308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17289998#comment-17289998 ] 

Tom Bentley commented on KAFKA-12308:
-------------------------------------

I think this is caused by the fact that the {{DelegatingClassLoader}} is not registered as parallel capable, but should be. It should be because it does not follow the acyclic delegation model. According to https://docs.oracle.com/javase/7/docs/technotes/guides/lang/cl-mt.html, to qualify for that model, "If the class is not found, the class loader asks its parent to locate the class. If the parent cannot find the class, the class loader attempts to locate the class itself.", but {{DelegatingClassLoader}} may actually ask the {{PluginClassLoader}} to load a class before it has tried {{super}}.
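To illustrate the delegation order I mean, here is a minimal sketch. It is not the Kafka Connect code: {{RecordingLoader}}, {{PluginFirstLoader}} and {{register}} are hypothetical names made up for this example, and the plugin loader's parent is the system class loader (in Connect the plugin loader's parent is the delegating loader, which is what makes the real hierarchy cyclic).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DelegationOrderSketch {

    /** Hypothetical stand-in for a plugin loader; it records every name it is asked to load. */
    public static class RecordingLoader extends ClassLoader {
        public final List<String> requested = new ArrayList<>();

        public RecordingLoader(ClassLoader parent) {
            super(parent);
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            requested.add(name);
            return super.loadClass(name, resolve); // ordinary parent-first lookup from here
        }
    }

    /** Hypothetical stand-in for a delegating loader: it consults a plugin loader BEFORE super. */
    public static class PluginFirstLoader extends ClassLoader {
        private final Map<String, ClassLoader> pluginLoaders = new HashMap<>();

        public PluginFirstLoader(ClassLoader parent) {
            super(parent);
        }

        public void register(String className, ClassLoader loader) {
            pluginLoaders.put(className, loader);
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            ClassLoader plugin = pluginLoaders.get(name);
            if (plugin != null) {
                // Not parent-first: another loader is asked before super has been tried.
                return plugin.loadClass(name);
            }
            return super.loadClass(name, resolve);
        }
    }

    /** Returns the names the plugin loader was asked for. */
    public static List<String> demo() {
        ClassLoader system = ClassLoader.getSystemClassLoader();
        RecordingLoader plugin = new RecordingLoader(system);
        PluginFirstLoader delegating = new PluginFirstLoader(system);
        delegating.register("java.lang.String", plugin);
        try {
            delegating.loadClass("java.lang.String");  // mapped: routed to the plugin loader first
            delegating.loadClass("java.lang.Integer"); // not mapped: goes straight to super
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        return plugin.requested;
    }

    public static void main(String[] args) {
        System.out.println("plugin loader was asked for: " + demo());
    }
}
```

Only the mapped class name reaches the plugin loader, and it gets there before {{super}} is consulted, which is the step that departs from the parent-first order quoted above.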

From the stack dump provided:
{noformat}
"StartAndStopExecutor-connect-1-5":
	at java.lang.ClassLoader.loadClass(ClassLoader.java:398)                                              // wait for DCL getClassLoadingLock
	- waiting to lock <0x00000006c222db00> (a org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader)
	at org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.loadClass(DelegatingClassLoader.java:397) // delegate to super
	at java.lang.ClassLoader.loadClass(ClassLoader.java:405)                                              // super delegates to parent (DCL)
	- locked <0x000000077b9bf3c0> (a java.lang.Object)                                                    // lock PCLY+name (super's getClassLoadingLock)
	at org.apache.kafka.connect.runtime.isolation.PluginClassLoader.loadClass(PluginClassLoader.java:104)
	- locked <0x000000077b9bf3c0> (a java.lang.Object)                                                    // lock PCLY+name (getClassLoadingLock)
	- locked <0x00000006c25b4e38> (a org.apache.kafka.connect.runtime.isolation.PluginClassLoader)        // lock PCLY (synchronized)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:348)
{noformat}

and 

{noformat}
"StartAndStopExecutor-connect-1-6":
	at org.apache.kafka.connect.runtime.isolation.PluginClassLoader.loadClass(PluginClassLoader.java:91) // lock PCLX (synchronized)
	- waiting to lock <0x00000006c25b4e38> (a org.apache.kafka.connect.runtime.isolation.PluginClassLoader)
	at org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.loadClass(DelegatingClassLoader.java:394) // delegated to PCL
	at java.lang.ClassLoader.loadClass(ClassLoader.java:351)                                             // ClassLoader.loadClass(String name) calling PCL.loadClass(String,
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:348)
{noformat}

The stack dump also says:

{noformat}
"StartAndStopExecutor-connect-1-5":
  waiting to lock monitor 0x00000203a553b6f8 (object 0x00000006c222db00, a org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader),
  which is held by "StartAndStopExecutor-connect-1-6"
{noformat}

The monitor {{0x00000006c222db00}} doesn't appear as locked anywhere in the stacktrace. I think that's because it's [held by the JVM itself|https://github.com/openjdk/jdk/blob/06170b7cbf6129274747b4406562184802d4ff07/src/hotspot/share/classfile/systemDictionary.cpp#L695].

If {{DelegatingClassLoader}} is registered as parallel capable, this won't happen:

{noformat}
	at java.lang.ClassLoader.loadClass(ClassLoader.java:398)                                              // wait for DCL getClassLoadingLock
	- waiting to lock <0x00000006c222db00> (a org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader)
{noformat}

That's because {{DCL.getClassLoadingLock}} will then return an object specific to the class being loaded, rather than the DCL instance itself, which is locked by the JVM.
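The difference in what {{getClassLoadingLock}} returns can be seen with a small sketch. {{SerialLoader}}, {{ParallelLoader}} and {{lockFor}} are hypothetical names for this example only, not Kafka classes; the JDK calls used are {{ClassLoader.registerAsParallelCapable()}} and {{ClassLoader.getClassLoadingLock(String)}}.

```java
public class ClassLoadingLockSketch {

    /** Hypothetical loader that is NOT registered as parallel capable. */
    public static class SerialLoader extends ClassLoader {
        public SerialLoader() {
            super(ClassLoader.getSystemClassLoader());
        }

        /** Exposes the protected getClassLoadingLock for inspection. */
        public Object lockFor(String className) {
            return getClassLoadingLock(className);
        }
    }

    /** Hypothetical loader that registers itself as parallel capable. */
    public static class ParallelLoader extends ClassLoader {
        static {
            // Runs during class initialization, so before any instance is constructed.
            ClassLoader.registerAsParallelCapable();
        }

        public ParallelLoader() {
            super(ClassLoader.getSystemClassLoader());
        }

        /** Exposes the protected getClassLoadingLock for inspection. */
        public Object lockFor(String className) {
            return getClassLoadingLock(className);
        }
    }

    /** Without registration, the lock is the loader instance itself. */
    public static boolean serialLockIsLoader() {
        SerialLoader loader = new SerialLoader();
        return loader.lockFor("a.A") == loader;
    }

    /** With registration, the lock is a dedicated object per class name. */
    public static boolean parallelLockIsPerClassName() {
        ParallelLoader loader = new ParallelLoader();
        Object lockA = loader.lockFor("a.A");
        return lockA != loader
                && lockA == loader.lockFor("a.A")
                && lockA != loader.lockFor("b.B");
    }

    public static void main(String[] args) {
        System.out.println("serial lock is the loader itself: " + serialLockIsLoader());
        System.out.println("parallel lock is per class name:  " + parallelLockIsPerClassName());
    }
}
```

With a per-class-name lock, a thread entering {{loadClass}} no longer needs the monitor of the loader instance, so it would not block on the monitor that the JVM holds in the dump above. Whether that alone is enough for {{DelegatingClassLoader}} is the assumption I'm asking about.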

Does this seem plausible to you [~kkonstantine] [~ChrisEgerton]?

> ConfigDef.parseType deadlock
> ----------------------------
>
>                 Key: KAFKA-12308
>                 URL: https://issues.apache.org/jira/browse/KAFKA-12308
>             Project: Kafka
>          Issue Type: Bug
>          Components: config
>    Affects Versions: 2.5.0
>         Environment: kafka 2.5.0
> centos7
> java version "1.8.0_231"
>            Reporter: cosmozhu
>            Priority: Major
>         Attachments: deadlock.log
>
>
> Hi,
>  I found this problem when I restarted *ConnectDistributed*.
> I restarted ConnectDistributed on a single node for testing, without deleting the connectors.
>  Sometimes the process stopped while creating connectors.
> I added some logging and found a deadlock in `ConfigDef.parseType`. My connectors always have the same transforms. My guess is that something goes wrong when connectors start up (in startAndStopExecutor, which defaults to 8 threads) and load the same class file.
> I attached the jstack log file.
> Thanks for any help.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)