Posted to issues@flink.apache.org by "Jeffrey Martin (Jira)" <ji...@apache.org> on 2019/09/18 18:39:00 UTC

[jira] [Comment Edited] (FLINK-14076) 'ClassNotFoundException: KafkaException' on Flink v1.9 w/ checkpointing

    [ https://issues.apache.org/jira/browse/FLINK-14076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16932737#comment-16932737 ] 

Jeffrey Martin edited comment on FLINK-14076 at 9/18/19 6:38 PM:
-----------------------------------------------------------------

Two steps to get the JobManager to stay up:
 # add 'org.apache.kafka:kafka-clients:2.2.0' (and all its transitive deps) to the JM and TM classpaths
 # add this config entry in the JM and TM: _classloader.parent-first-patterns.additional: "org.apache.kafka.;org.apache.commons."_

The first step is necessary so that the JM can deserialize exceptions from failed checkpoints. The second is necessary so that all code loads those classes from the same jars, which avoids conflicts of the form "can't assign _LinkedMap from TM classpath_ to field of type _LinkedMap from task classpath_".
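
To illustrate what the config entry in step 2 does, here is a minimal sketch of how a semicolon-separated list of parent-first patterns could be applied, assuming the patterns are treated as plain class-name prefixes. The class and method names ({{ParentFirstSketch}}, {{parsePatterns}}, {{isParentFirst}}) and the {{com.example.MyJob}} class are hypothetical and are not Flink's actual implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not Flink's real class-loading code.
public class ParentFirstSketch {

    // Split a semicolon-separated config value into prefixes, dropping blanks.
    static List<String> parsePatterns(String configValue) {
        List<String> patterns = new ArrayList<>();
        for (String part : configValue.split(";")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                patterns.add(trimmed);
            }
        }
        return patterns;
    }

    // Assumption: a class is resolved parent-first when its fully qualified
    // name starts with any configured prefix.
    static boolean isParentFirst(String className, List<String> patterns) {
        for (String prefix : patterns) {
            if (className.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // The value from step 2 above.
        List<String> patterns = parsePatterns("org.apache.kafka.;org.apache.commons.");

        String[] classNames = {
            "org.apache.kafka.common.KafkaException",
            "org.apache.commons.collections.map.LinkedMap",
            "com.example.MyJob" // hypothetical user class
        };
        for (String name : classNames) {
            String order = isParentFirst(name, patterns) ? "parent-first" : "child-first";
            System.out.println(name + " -> " + order);
        }
    }
}
```

Under that assumption, both KafkaException and LinkedMap would resolve from the JM/TM classpath rather than from the job jar, which is why the jars from step 1 have to be present there.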


was (Author: jdm2212):
Two steps to get the JobManager to stay up:
 # add 'org.apache.kafka:kafka-clients:2.2.0' (and all its transitive deps) to the JM and TM classpaths
 # add this config entry in the JM and TM: _classloader.parent-first-patterns.additional: "org.apache.kafka.;org.apache.commons."_

The first one is necessary to allow JM to deserialize exceptions from failed checkpoints. The second one is necessary to ensure that all code uses those jars and there aren't classpath conflicts.

> 'ClassNotFoundException: KafkaException' on Flink v1.9 w/ checkpointing
> -----------------------------------------------------------------------
>
>                 Key: FLINK-14076
>                 URL: https://issues.apache.org/jira/browse/FLINK-14076
>             Project: Flink
>          Issue Type: Bug
>          Components: Connectors / Kafka
>    Affects Versions: 1.9.0
>            Reporter: Jeffrey Martin
>            Priority: Major
>         Attachments: error.txt
>
>
> A Flink job that worked with checkpointing on a Flink v1.8.0 cluster fails on a Flink v1.9.0 cluster with checkpointing. It works on a Flink v1.9.0 cluster _without_ checkpointing. It is specifically _enabling checkpointing on v1.9.0_ that causes the JM to start throwing ClassNotFoundExceptions. Full stacktrace: [^error.txt]
> The job reads from Kafka via FlinkKafkaConsumer and writes to Kafka via FlinkKafkaProducer.
> The jobmanagers and taskmanagers are standalone.
> The exception is being raised deep in some Flink serialization code, so I'm not sure how to go about stepping through this in a debugger. The issue is happening in an internal repository at my job, but I can try to get a minimal repro on GitHub if it's not obvious from the error message alone what's broken.


