Posted to user@flink.apache.org by Stephan Ewen <se...@apache.org> on 2018/03/09 11:21:18 UTC

[DISCUSS] Inverted (child-first) class loading

Hi all!

Flink 1.4 introduces child-first classloading by default, for the
application libraries.

We added that, because it allows applications to use different versions of
many libraries, compared to what Flink uses in its core, or compared to
what other dependencies (like Hadoop) pull into the class path.

For example, applications can use different versions of Akka, Avro,
Protobuf, etc., compared to what Flink / Hadoop / etc. use.

Now, while that is nice, child-first classloading runs into trouble when
the application jars are not properly built, meaning when the application
JAR contains libraries that it should not (because they are already in the
classpath / lib folder).

For example, when the class path has the Kafka Connector (connector is in
the lib directory) and the application jar also contains Kafka, then we get
nasty errors due to class duplication and impossible class casts (X cannot
be cast to X).
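
As an illustration of the mechanism described above, here is a minimal Java sketch of a child-first class loader. It is not Flink's actual implementation; the class name ChildFirstClassLoader, its constructor, and the alwaysParentFirst prefix list are assumptions made for this example. The loader looks in the application jars first and only falls back to the parent (the class path / lib folder) when the class is not found there.

```java
import java.net.URL;
import java.net.URLClassLoader;

// Illustrative sketch only; names and structure are assumptions, not Flink's code.
class ChildFirstClassLoader extends URLClassLoader {

    // Package prefixes that are always resolved parent-first (for example "java.").
    private final String[] alwaysParentFirst;

    ChildFirstClassLoader(URL[] applicationJars, ClassLoader parent, String... alwaysParentFirst) {
        super(applicationJars, parent);
        this.alwaysParentFirst = alwaysParentFirst;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            // Reuse the class if this loader has already defined it.
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                if (isAlwaysParentFirst(name)) {
                    // Default ClassLoader behavior: ask the parent first.
                    c = super.loadClass(name, false);
                } else {
                    try {
                        // Child-first: search the application jars before the parent.
                        c = findClass(name);
                    } catch (ClassNotFoundException notInApplicationJars) {
                        // Not in the application jars, so fall back to the default lookup.
                        c = super.loadClass(name, false);
                    }
                }
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }

    private boolean isAlwaysParentFirst(String name) {
        for (String prefix : alwaysParentFirst) {
            if (name.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }
}
```

The JVM identifies a class by its name together with its defining class loader. A class that is present both in the application jar and in the lib folder can therefore end up as two distinct types under this scheme, which is where "X cannot be cast to X" comes from.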


What I would like to understand is how this change worked out for the
users. Based on that, we can keep this or revert this change in the next
release.

Please answer to this mail with:

  a. This was a great change, keep it and polish it.

  b. In the end, this caused more problems than it solved, so please set the
default back to "parent-first" in 1.5 and leave "child-first" as an
optional flag.


Thanks a lot,
Stephan

Re: [DISCUSS] Inverted (child-first) class loading

Posted by mingleizhang <zm...@163.com>.
Hi, Stephan


It is a great change, keep it and polish it. nice nice nice


I think users will encounter fewer NoClassDefFoundError or ClassNotFoundException problems in the future. But I would like to ask two questions about this functionality. If I am wrong, please help me out. Thank you in advance.
You said: Now, while that is nice, child-first classloading runs into trouble when the application jars are not properly built.
Q: Hmm, I think this seldom happens, as users (like me, sometimes) usually debug the application until it runs correctly on a local machine before they deploy to a production environment. If I set the child-first strategy, I can use whatever version of a library (Akka, Avro, Protobuf) I want, even if Flink core also uses that library, and this cannot cause class conflicts. Am I correct?


You said: when the class path has the Kafka Connector (connector is in the lib directory) and the application jar also contains Kafka, then we get nasty errors due to class duplication and impossible class casts (X cannot be cast to X).
Q: What is the class path here? Is it Flink's own runtime classpath? (I don't think it is.) And the application jar, I think, is the user jar that is uploaded to a Flink cluster?


Could you tell me more?


Thanks
Minglei.








Re: [DISCUSS] Inverted (child-first) class loading

Posted by Ken Krugler <kk...@transpac.com>.
Hi Kedar,

See some thoughts inline below.

I will admit that classpath issues (as in “I can see my damn class in the file, I can load it from my code, but I’m getting a freakin’ class not found exception at runtime…arghhh”) have been one of the biggest hassles with Flink-based projects that we’ve run into. 

— Ken

> On Mar 10, 2018, at 5:53 AM, kedar mhaswade <ke...@gmail.com> wrote:
> 
> This is an interesting question and it usually has consequences that are far-reaching in user experience.
> 
> If a Flink app is supposed to be a "standalone app" that any Flink installation should be able to run, then child-first classloading makes sense. This is how we build many of the Java application servers (e.g. GlassFish, JBoss, etc.). Doing this makes the application "self-contained" and perhaps portable. Of course, this increases the size of the Jar. The one issue to watch out for is an application using framework classes that are newer than the framework itself. For instance, should I expect my app with Flink 1.6 DataSet/DataStream classes to run smoothly on a Flink 1.5 installation?

I think this might be possible, once Flink has support for HDFS & S3 w/o requiring any Hadoop code.

Though I think there would still be logging entanglement, as you’d want to have your logging output be directed to the standard locations.

And I wonder if this model would really need a “no-parent” classloading mechanism - though it would obviously still need the JRE.

So not sure if this would be a good idea, just wanted to write down some thoughts.

But it would solve one issue we often run into, where our main class (which is often running on a separate “controller” server) needs to use Flink classes when building the workflow, before executing it. Currently we have to make sure we’ve got a version of Flink installed on that machine which matches the version of the cluster, and we add all of the Flink jars to the classpath.

> If a Flink app depends on a particular (version of the) Flink installation, then, if using parent-first classloading, the app can make use of the classes that the installation itself uses. This makes the app (comparatively) less self-contained, but this limits the size of the app's Jar. There are advantages of doing this, but it poses problems especially in upgrades.

Plus there’s the issue where code we’re using has dependencies on different versions of the same jars that are part of the Flink (and/or Hadoop/EMR) installation.

> Whether one or the other should be the behavior largely depends on how the applications are built, tested, and deployed. The application's build comes into the picture because in tools like Maven a dependency can be declared as "provided", which means that if you know that your app's dependency is also a dependency of your framework (i.e. Flink) and you, as an app developer, are okay with that, Maven won't bundle it in your app's Jar.
> 
> So, my recommendation is that since this appears to be a backward-incompatible change, Flink should provide an option to go back to parent-first classloading for a given app, at least for 1.5. Child-first classloading seems like the right thing to do given how (unnecessarily) complicated the deployments have become and given how frequently apps use library versions that are different from the framework's.
> 
> The Elasticsearch solution has merits too, but it is unclear if it helps at deployment time merely to identify that there is a duplicate (without knowing where it has come from). Ideally, when people build the so-called shadow Jar (one Jar with all dependencies), the build script should warn of the duplicates. Shadow Jars alleviate (but do not remove) the problems of "Jar Hell". But it seems to me that until we move to a modular Java (that is, Java 9; I think this is way out in the future), this is the preferred solution.
> 
> That said, I'd really like to see a classloading section in Flink docs (somewhere in dev/best_practices.html). Is a JIRA in order?
> 
> Regards,
> Kedar
> 
> On Fri, Mar 9, 2018 at 1:52 PM, Stephan Ewen <ewenstephan@gmail.com> wrote:
> @Ken very interesting thought.
> 
> One could have three options:
>   - forbid duplicate classes
>   - parent first conflict resolution
>   - child first conflict resolution
> 
> Having number one as the default and letting the error message suggest options two and three would definitely make users aware of the issue...
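
The three options can be written down as a small policy type. The sketch below is purely illustrative: ConflictPolicy, ConflictResolver, and the string results "parent" and "child" are hypothetical names chosen for this example, not an existing Flink API. With the first option as the default, the error message points the user at the other two.

```java
import java.util.Optional;

// Hypothetical policy names mirroring the three options listed above.
enum ConflictPolicy { FORBID_DUPLICATES, PARENT_FIRST, CHILD_FIRST }

final class ConflictResolver {

    private ConflictResolver() {}

    /**
     * Decides which copy of a class wins. Returns "parent" or "child",
     * or an empty Optional if the class is in neither location.
     */
    static Optional<String> resolve(
            ConflictPolicy policy, String className, boolean inParent, boolean inChild) {
        if (inParent && inChild) {
            switch (policy) {
                case FORBID_DUPLICATES:
                    // Fail fast and tell the user about the alternatives.
                    throw new IllegalStateException(
                            "Duplicate class " + className
                                    + " in application jar and class path;"
                                    + " remove one copy or choose PARENT_FIRST or CHILD_FIRST");
                case PARENT_FIRST:
                    return Optional.of("parent");
                case CHILD_FIRST:
                    return Optional.of("child");
            }
        }
        // No conflict: use whichever side has the class, if any.
        if (inParent) {
            return Optional.of("parent");
        }
        if (inChild) {
            return Optional.of("child");
        }
        return Optional.empty();
    }
}
```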
> 
> On Fri, Mar 9, 2018, 21:09 Ken Krugler <kkrugler_lists@transpac.com> wrote:
> I can’t believe I’m suggesting this, but perhaps the Elasticsearch “Hammer of Thor” (aka “jar hell”) approach would be appropriate here.
> 
> Basically they prevent a program from running if there are duplicate classes on the classpath.
> 
> This causes headaches when you really need a different version of library X, and that’s already on the class path.
> 
> See https://github.com/elastic/elasticsearch/issues/14348 for an example of the issues it can cause.
> 
> But it definitely catches a lot of oops-ish mistakes in building the jars, and makes debugging easier (they print out “class X jar1: <path to jar> jar2: <path to jar>”).
> 
>> Caused by: java.lang.IllegalStateException: jar hell!
>> class: jdk.packager.services.UserJvmOptionsService
>> jar1: /Library/Java/JavaVirtualMachines/jdk1.8.0_66.jdk/Contents/Home/lib/ant-javafx.jar
>> jar2: /Library/Java/JavaVirtualMachines/jdk1.8.0_66.jdk/Contents/Home/lib/packager.jar
> 
> — Ken
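
A duplicate check of this kind can be sketched in a few lines of Java. This is a simplified illustration, not Elasticsearch's actual implementation; DuplicateClassCheck and its methods are hypothetical names. It collects the .class entries of each jar and reports every entry that appears in more than one jar, together with the jars that contain it.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.jar.JarFile;
import java.util.stream.Collectors;

// Hypothetical helper for illustration; not part of Flink or Elasticsearch.
final class DuplicateClassCheck {

    private DuplicateClassCheck() {}

    /** Lists the names of all .class entries in one jar file. */
    static List<String> classEntries(Path jar) throws IOException {
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            return jarFile.stream()
                    .filter(entry -> !entry.isDirectory() && entry.getName().endsWith(".class"))
                    .map(entry -> entry.getName())
                    .collect(Collectors.toList());
        }
    }

    /**
     * Takes a map from jar name to that jar's class entries and returns
     * each entry found in more than one jar, mapped to the jars holding it.
     */
    static Map<String, List<String>> findDuplicates(Map<String, List<String>> entriesByJar) {
        Map<String, List<String>> owners = new TreeMap<>();
        for (Map.Entry<String, List<String>> jar : entriesByJar.entrySet()) {
            // De-duplicate within a jar so one jar is never counted twice.
            for (String entry : new LinkedHashSet<>(jar.getValue())) {
                owners.computeIfAbsent(entry, key -> new ArrayList<>()).add(jar.getKey());
            }
        }
        // Keep only the entries that occur in at least two jars.
        owners.values().removeIf(jars -> jars.size() < 2);
        return owners;
    }
}
```

Run as a build step, a check like this gives the early warning about duplicates in the application jar; run at deployment time against the lib folder plus the application jar, it corresponds to the "forbid duplicate classes" option.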
> 
> 

--------------------------------------------
http://about.me/kkrugler
+1 530-210-6378


Re: [DISCUSS] Inverted (child-first) class loading

Posted by kedar mhaswade <ke...@gmail.com>.
Many thanks Aljoscha! I am sorry I missed this section.

Regards,
Kedar

On Mon, Mar 12, 2018 at 9:16 AM, Aljoscha Krettek <al...@apache.org>
wrote:

> Hi Kedar,
>
> There is this section in the Flink docs:
> https://ci.apache.org/projects/flink/flink-docs-master/monitoring/debugging_classloading.html
>
> Best,
> Aljoscha
>
>

Re: [DISCUSS] Inverted (child-first) class loading

Posted by Aljoscha Krettek <al...@apache.org>.
Hi Kedar,

There is this section in the Flink docs: https://ci.apache.org/projects/flink/flink-docs-master/monitoring/debugging_classloading.html

Best,
Aljoscha


Re: [DISCUSS] Inverted (child-first) class loading

Posted by Ken Krugler <kk...@transpac.com>.
Hi Kedar,

See some thoughts inline below.

I will admit that classpath issues (as in “I can see my damn class in the file, I can load it from my code, but I’m getting a freakin’ class not found exception at runtime…arghhh”) have been one of the biggest hassles with Flink-based projects that we’ve run into. 

— Ken

> On Mar 10, 2018, at 5:53 AM, kedar mhaswade <ke...@gmail.com> wrote:
> 
> This is an interesting question and it usually has consequences that are far-reaching in user experience.
> 
> If a Flink app is supposed to be a "standalone app" that any Flink installation should be able to run, then the child-first classloading makes sense. This is how we build many of the Java application servers (e.g. GlassFish, JBoss etc). Doing this makes the application "self-contained" and perhaps portable. Of course, this increases the size of the Jar. The one issue to watch out for is application using framework classes that are newer than framework itself. For instance, should I expect my app with Flink 1.6 DataSet/DataStream classes to run smoothly on a Flink 1.5 installation?

I think this might be possible, once Flink has support for HDFS & S3 w/o requiring any Hadoop code.

Though I think there would still be logging entanglement, as you’d want to have your logging output be directed to the standard locations.

And I wonder if this model would really need a “no-parent” classloading mechanism - though it would obviously still need the JRE.

So not sure if this would be a good idea, just wanted to write down some thoughts.

But it would solve one issue we often run into, where our main class (which is often running on a separate “controller” server) needs to use Flink classes when building the workflow, before executing it. Currently we have to make sure we’ve got a version of Flink installed on that machine which matches the version of the cluster, and we add all of the Flink jars to the classpath.

> If a Flink app depends on a particular (version of the) Flink installation, then, if using parent-first classloading, the app can make use of the classes that the installation itself uses. This makes the app (comparatively) less self-contained, but this limits the size of the app's Jar. There are advantages of doing this, but it poses problems especially in upgrades.

Plus there’s the issue where code we’re using has dependencies on different versions of the same jars that are part of the Flink (and/or Hadoop/EMR) installation.

> Whether one or the other should be the behavior largely depends on how the applications are built, tested, and deployed. The application's build comes into the picture because in tools like Maven a dependency can be declared "provided", which means that if you know that your app's dependency is also a dependency of your framework (i.e. Flink) and you, as an app developer, are okay with that, Maven won't bundle it in your app's Jar.
> 
> So, my recommendation is that since this appears to be a backward-incompatible change, Flink should provide an option to go back to parent-first classloading for a given app, at least for 1.5. Child-first classloading seems like the right thing to do given how (unnecessarily) complicated the deployments have become and given how frequently apps use library versions that are different from the framework.
> 
> The Elasticsearch solution has merits too, but it is unclear if it helps at deployment time merely to identify that there is a duplicate (without knowing where it has come from). Ideally, when people build the so-called shadow Jar (one Jar with all dependencies), the build script should warn of the duplicates. Shadow Jars alleviate (but do not remove) the problems of "Jar Hell". But it seems to me that until we move to a modular Java (that is Java 9; I think this is way out in the future), this is the preferred solution.
> 
> That said, I'd really like to see a classloading section in Flink docs (somewhere in dev/best_practices.html). Is a JIRA in order?
> 
> Regards,
> Kedar
> 
> On Fri, Mar 9, 2018 at 1:52 PM, Stephan Ewen <ewenstephan@gmail.com <ma...@gmail.com>> wrote:
> @Ken very interesting thought.
> 
> One could have three options:
>   - forbid duplicate classes
>   - parent first conflict resolution
>   - child first conflict resolution
> 
> Having number one as the default and letting the error message suggest options two and three would definitely make users aware of the issue...
> 
> On Fri, Mar 9, 2018, 21:09 Ken Krugler <kkrugler_lists@transpac.com <ma...@transpac.com>> wrote:
> I can’t believe I’m suggesting this, but perhaps the Elasticsearch “Hammer of Thor” (aka “jar hell”) approach would be appropriate here.
> 
> Basically they prevent a program from running if there are duplicate classes on the classpath.
> 
> This causes headaches when you really need a different version of library X, and that’s already on the class path.
> 
> See https://github.com/elastic/elasticsearch/issues/14348 for an example of the issues it can cause.
> 
> But it definitely catches a lot of oops-ish mistakes in building the jars, and makes debugging easier (they print out “class X jar1: <path to jar> jar2: <path to jar>”).
> 
>> Caused by: java.lang.IllegalStateException: jar hell!
>> class: jdk.packager.services.UserJvmOptionsService
>> jar1: /Library/Java/JavaVirtualMachines/jdk1.8.0_66.jdk/Contents/Home/lib/ant-javafx.jar
>> jar2: /Library/Java/JavaVirtualMachines/jdk1.8.0_66.jdk/Contents/Home/lib/packager.jar
> 
> — Ken
> 
> 
>> On Mar 9, 2018, at 3:21 AM, Stephan Ewen <sewen@apache.org <ma...@apache.org>> wrote:
>> 
>> Hi all!
>> 
>> Flink 1.4 introduces child-first classloading by default, for the application libraries.
>> 
>> We added that, because it allows applications to use different versions of many libraries, compared to what Flink uses in its core, or compared to what other dependencies (like Hadoop) pull into the class path.
>> 
>> For example, applications can use different versions of akka, Avro, Protobuf, etc. Compared to what Flink / Hadoop / etc. uses.
>> 
>> Now, while that is nice, child-first classloading runs into trouble when the application jars are not properly built, meaning when the application JAR contains libraries that it should not (because they are already in the classpath / lib folder).
>> 
>> For example, when the class path has the Kafka Connector (connector is in the lib directory) and the application jar also contains Kafka, then we get nasty errors due to class duplication and impossible class casts (X cannot be cast to X).
>> 
>> 
>> What I would like to understand is how this change worked out for the users. Based on that, we can keep this or revert this change in the next release.
>> 
>> Please answer to this mail with:
>> 
>>   a. This was a great change, keep it and polish it.
>> 
>>   b. This caused in the end more problems than it solved, so please set the default back to "parent-first" in 1.5 and leave "child-first" as an optional flag.
>> 
>> 
>> Thanks a lot,
>> Stephan
>> 
> 
> --------------------------------------------
> http://about.me/kkrugler
> +1 530-210-6378
> 

--------------------------------------------
http://about.me/kkrugler
+1 530-210-6378


Re: [DISCUSS] Inverted (child-first) class loading

Posted by kedar mhaswade <ke...@gmail.com>.
This is an interesting question and it usually has consequences that are
far-reaching in user experience.

If a Flink app is supposed to be a "standalone app" that any Flink
installation should be able to run, then child-first classloading makes
sense. This is how we build many of the Java application servers (e.g.
GlassFish, JBoss, etc.). Doing this makes the application "self-contained"
and perhaps portable. Of course, this increases the size of the Jar. The
one issue to watch out for is an application using framework classes that
are newer than the framework itself. For instance, should I expect my app
with Flink *1.6* DataSet/DataStream classes to run smoothly on a Flink 1.5
installation?

If a Flink app depends on a particular (version of the) Flink installation,
then, if using parent-first classloading, the app can make use of the
classes that the installation itself uses. This makes the app
(comparatively) less self-contained, but this limits the size of the app's
Jar. There are advantages of doing this, but it poses problems especially
in upgrades.

Whether one or the other should be the behavior largely depends on how the
applications are built, tested, and deployed. The application's build comes
into the picture because in tools like Maven a dependency can be declared
"provided", which means that if you know that your app's dependency is also
a dependency of your framework (i.e. Flink) and you, as an app developer,
are okay with that, Maven won't bundle it in your app's Jar.

So, my recommendation is that since this appears to be a
backward-incompatible change, Flink should provide an option to go back to
parent-first classloading for a given app, at least for 1.5. Child-first
classloading seems like the right thing to do given how (unnecessarily)
complicated the deployments have become and given how frequently apps use
library versions that are different from the framework.

The Elasticsearch solution has merits too, but it is unclear if it helps *at
deployment time* merely to identify that there is a duplicate (without
knowing where it has come from). Ideally, when people build the so-called
shadow Jar (one Jar with all dependencies), the build script should warn of
the duplicates. Shadow Jars alleviate (but do not remove) the problems of
"Jar Hell". But it seems to me that until we move to a modular Java (that is
Java 9; I think this is way out in the future), this is the preferred
solution.

That said, I'd really like to see a classloading section in Flink docs
(somewhere in dev/best_practices.html). Is a JIRA in order?

Regards,
Kedar

On Fri, Mar 9, 2018 at 1:52 PM, Stephan Ewen <ew...@gmail.com> wrote:

> @Ken very interesting thought.
>
> One could have three options:
>   - forbid duplicate classes
>   - parent first conflict resolution
>   - child first conflict resolution
>
> Having number one as the default and letting the error message suggest
> options two and three would definitely make users aware of the issue...
>
> On Fri, Mar 9, 2018, 21:09 Ken Krugler <kk...@transpac.com>
> wrote:
>
>> I can’t believe I’m suggesting this, but perhaps the Elasticsearch
>> “Hammer of Thor” (aka “jar hell”) approach would be appropriate here.
>>
>> Basically they prevent a program from running if there are duplicate
>> classes on the classpath.
>>
>> This causes headaches when you really need a different version of library
>> X, and that’s already on the class path.
>>
>> See https://github.com/elastic/elasticsearch/issues/14348 for an example
>> of the issues it can cause.
>>
>> But it definitely catches a lot of oops-ish mistakes in building the
>> jars, and makes debugging easier (they print out “class X jar1: <path to
>> jar> jar2: <path to jar>”).
>>
>> Caused by: java.lang.IllegalStateException: jar hell!
>> class: jdk.packager.services.UserJvmOptionsService
>> jar1: /Library/Java/JavaVirtualMachines/jdk1.8.0_66.jdk/Contents/Home/lib/ant-javafx.jar
>> jar2: /Library/Java/JavaVirtualMachines/jdk1.8.0_66.jdk/Contents/Home/lib/packager.jar
>>
>> — Ken
>>
>>
>> On Mar 9, 2018, at 3:21 AM, Stephan Ewen <se...@apache.org> wrote:
>>
>> Hi all!
>>
>> Flink 1.4 introduces child-first classloading by default, for the
>> application libraries.
>>
>> We added that, because it allows applications to use different versions
>> of many libraries, compared to what Flink uses in its core, or compared to
>> what other dependencies (like Hadoop) pull into the class path.
>>
>> For example, applications can use different versions of akka, Avro,
>> Protobuf, etc. Compared to what Flink / Hadoop / etc. uses.
>>
>> Now, while that is nice, child-first classloading runs into trouble when
>> the application jars are not properly built, meaning when the application
>> JAR contains libraries that it should not (because they are already in the
>> classpath / lib folder).
>>
>> For example, when the class path has the Kafka Connector (connector is in
>> the lib directory) and the application jar also contains Kafka, then we get
>> nasty errors due to class duplication and impossible class casts (X cannot
>> be cast to X).
>>
>>
>> What I would like to understand is how this change worked out for the
>> users. Based on that, we can keep this or revert this change in the next
>> release.
>>
>> Please answer to this mail with:
>>
>>   a. This was a great change, keep it and polish it.
>>
>>   b. This caused in the end more problems than it solved, so please set
>> the default back to "parent-first" in 1.5 and leave "child-first" as an
>> optional flag.
>>
>>
>> Thanks a lot,
>> Stephan
>>
>>
>> --------------------------------------------
>> http://about.me/kkrugler
>> +1 530-210-6378
>>
>>

Re: [DISCUSS] Inverted (child-first) class loading

Posted by Stephan Ewen <ew...@gmail.com>.
@Ken very interesting thought.

One could have three options:
  - forbid duplicate classes
  - parent first conflict resolution
  - child first conflict resolution

Having number one as the default and letting the error message suggest
options two and three would definitely make users aware of the issue...
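
To make the difference between parent-first and child-first resolution
concrete, here is a minimal Java sketch. It is not Flink's actual class
loader; the class name ChildFirstSketchLoader and the "parent-first
prefixes" parameter are assumptions made up for this illustration. The
"forbid duplicates" option is not covered here.

```java
import java.net.URL;
import java.net.URLClassLoader;

/** Hypothetical illustration only; not Flink's implementation. */
final class ChildFirstSketchLoader extends URLClassLoader {

    // Packages that should always come from the parent (assumed parameter).
    private final String[] parentFirstPrefixes;

    ChildFirstSketchLoader(URL[] urls, ClassLoader parent, String... parentFirstPrefixes) {
        super(urls, parent);
        this.parentFirstPrefixes = parentFirstPrefixes;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                if (isParentFirst(name)) {
                    // Standard delegation: ask the parent before looking in our own URLs.
                    return super.loadClass(name, resolve);
                }
                try {
                    // Child first: look in the application jars before asking the parent.
                    c = findClass(name);
                } catch (ClassNotFoundException e) {
                    // Not in the application jars, so fall back to normal delegation.
                    return super.loadClass(name, resolve);
                }
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }

    /** JDK classes and the configured prefixes are always resolved parent-first. */
    boolean isParentFirst(String name) {
        if (name.startsWith("java.")) {
            return true;
        }
        for (String prefix : parentFirstPrefixes) {
            if (name.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }
}
```

With a loader like this, a class that exists both in the application jar and
in the parent class path is defined by the child. That is exactly the
situation in which the same class name can end up as two distinct runtime
classes.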

On Fri, Mar 9, 2018, 21:09 Ken Krugler <kk...@transpac.com> wrote:

> I can’t believe I’m suggesting this, but perhaps the Elasticsearch “Hammer
> of Thor” (aka “jar hell”) approach would be appropriate here.
>
> Basically they prevent a program from running if there are duplicate
> classes on the classpath.
>
> This causes headaches when you really need a different version of library
> X, and that’s already on the class path.
>
> See https://github.com/elastic/elasticsearch/issues/14348 for an example
> of the issues it can cause.
>
> But it definitely catches a lot of oops-ish mistakes in building the jars,
> and makes debugging easier (they print out “class X jar1: <path to jar>
> jar2: <path to jar>”).
>
> Caused by: java.lang.IllegalStateException: jar hell!
> class: jdk.packager.services.UserJvmOptionsService
> jar1: /Library/Java/JavaVirtualMachines/jdk1.8.0_66.jdk/Contents/Home/lib/ant-javafx.jar
> jar2: /Library/Java/JavaVirtualMachines/jdk1.8.0_66.jdk/Contents/Home/lib/packager.jar
>
> — Ken
>
>
> On Mar 9, 2018, at 3:21 AM, Stephan Ewen <se...@apache.org> wrote:
>
> Hi all!
>
> Flink 1.4 introduces child-first classloading by default, for the
> application libraries.
>
> We added that, because it allows applications to use different versions of
> many libraries, compared to what Flink uses in its core, or compared to
> what other dependencies (like Hadoop) pull into the class path.
>
> For example, applications can use different versions of akka, Avro,
> Protobuf, etc. Compared to what Flink / Hadoop / etc. uses.
>
> Now, while that is nice, child-first classloading runs into trouble when
> the application jars are not properly built, meaning when the application
> JAR contains libraries that it should not (because they are already in the
> classpath / lib folder).
>
> For example, when the class path has the Kafka Connector (connector is in
> the lib directory) and the application jar also contains Kafka, then we get
> nasty errors due to class duplication and impossible class casts (X cannot
> be cast to X).
>
>
> What I would like to understand is how this change worked out for the
> users. Based on that, we can keep this or revert this change in the next
> release.
>
> Please answer to this mail with:
>
>   a. This was a great change, keep it and polish it.
>
>   b. This caused in the end more problems than it solved, so please set
> the default back to "parent-first" in 1.5 and leave "child-first" as an
> optional flag.
>
>
> Thanks a lot,
> Stephan
>
>
> --------------------------------------------
> http://about.me/kkrugler
> +1 530-210-6378
>
>

Re: [DISCUSS] Inverted (child-first) class loading

Posted by Ken Krugler <kk...@transpac.com>.
I can’t believe I’m suggesting this, but perhaps the Elasticsearch “Hammer of Thor” (aka “jar hell”) approach would be appropriate here.

Basically they prevent a program from running if there are duplicate classes on the classpath.

This causes headaches when you really need a different version of library X, and that’s already on the class path.

See https://github.com/elastic/elasticsearch/issues/14348 for an example of the issues it can cause.

But it definitely catches a lot of oops-ish mistakes in building the jars, and makes debugging easier (they print out “class X jar1: <path to jar> jar2: <path to jar>”).

> Caused by: java.lang.IllegalStateException: jar hell!
> class: jdk.packager.services.UserJvmOptionsService
> jar1: /Library/Java/JavaVirtualMachines/jdk1.8.0_66.jdk/Contents/Home/lib/ant-javafx.jar
> jar2: /Library/Java/JavaVirtualMachines/jdk1.8.0_66.jdk/Contents/Home/lib/packager.jar
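
As a rough illustration of this kind of check, here is a minimal Java sketch.
It is not Elasticsearch's actual implementation; the class
DuplicateClassCheck and its method names are hypothetical. It lists the
classes in each jar, reports every class name that appears in more than one
jar, and fails with a message shaped like the one above. It ignores details
such as multi-release jars.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

/** Hypothetical helper for illustration; not Elasticsearch or Flink code. */
final class DuplicateClassCheck {

    /** Returns the fully qualified names of all .class entries in a jar. */
    static List<String> classNames(Path jar) throws IOException {
        List<String> names = new ArrayList<>();
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            Enumeration<JarEntry> entries = jarFile.entries();
            while (entries.hasMoreElements()) {
                String entry = entries.nextElement().getName();
                if (entry.endsWith(".class")) {
                    String stem = entry.substring(0, entry.length() - ".class".length());
                    names.add(stem.replace('/', '.'));
                }
            }
        }
        return names;
    }

    /** Maps each class name found in two or more jars to the jars that contain it. */
    static Map<String, List<String>> findDuplicates(Map<String, List<String>> classesByJar) {
        Map<String, List<String>> owners = new TreeMap<>();
        for (Map.Entry<String, List<String>> jar : classesByJar.entrySet()) {
            for (String className : jar.getValue()) {
                owners.computeIfAbsent(className, k -> new ArrayList<>()).add(jar.getKey());
            }
        }
        // Keep only the classes that occur in more than one jar.
        owners.values().removeIf(jars -> jars.size() < 2);
        return owners;
    }

    /** Throws on the first duplicate, naming the class and the first two jars. */
    static void failOnDuplicates(Map<String, List<String>> classesByJar) {
        for (Map.Entry<String, List<String>> dup : findDuplicates(classesByJar).entrySet()) {
            throw new IllegalStateException("jar hell!\nclass: " + dup.getKey()
                    + "\njar1: " + dup.getValue().get(0)
                    + "\njar2: " + dup.getValue().get(1));
        }
    }
}
```

In a real check, the input map would be built by calling classNames on the
application jar and on every jar in the lib directory.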

— Ken


> On Mar 9, 2018, at 3:21 AM, Stephan Ewen <se...@apache.org> wrote:
> 
> Hi all!
> 
> Flink 1.4 introduces child-first classloading by default, for the application libraries.
> 
> We added that, because it allows applications to use different versions of many libraries, compared to what Flink uses in its core, or compared to what other dependencies (like Hadoop) pull into the class path.
> 
> For example, applications can use different versions of akka, Avro, Protobuf, etc. Compared to what Flink / Hadoop / etc. uses.
> 
> Now, while that is nice, child-first classloading runs into trouble when the application jars are not properly built, meaning when the application JAR contains libraries that it should not (because they are already in the classpath / lib folder).
> 
> For example, when the class path has the Kafka Connector (connector is in the lib directory) and the application jar also contains Kafka, then we get nasty errors due to class duplication and impossible class casts (X cannot be cast to X).
> 
> 
> What I would like to understand is how this change worked out for the users. Based on that, we can keep this or revert this change in the next release.
> 
> Please answer to this mail with:
> 
>   a. This was a great change, keep it and polish it.
> 
>   b. This caused in the end more problems than it solved, so please set the default back to "parent-first" in 1.5 and leave "child-first" as an optional flag.
> 
> 
> Thanks a lot,
> Stephan
> 

--------------------------------------------
http://about.me/kkrugler
+1 530-210-6378


Re: [DISCUSS] Inverted (child-first) class loading

Posted by mingleizhang <zm...@163.com>.
Hi, Stephan


It is a great change, keep it and polish it. nice nice nice


I think users will encounter fewer NoClassDefFoundError or ClassNotFoundException errors in the future. But I would like to ask two questions about this functionality. If I am wrong, please help me out. Thank you in advance.
You said: Now, while that is nice, child-first classloading runs into trouble when the application jars are not properly built.
Q: Hmm, I think this seldom happens, since users (like me) usually debug the application until it runs correctly on a local machine before deploying it to a production environment. If I set the child-first strategy, I can use whatever version of a library (akka, Avro, Protobuf) I want, even if Flink core also uses that library, and it will not cause class conflicts. Am I correct?


You said: when the class path has the Kafka Connector (connector is in the lib directory) and the application jar also contains Kafka, then we get nasty errors due to class duplication and impossible class casts (X cannot be cast to X).
Q: What is the class path here? Is it Flink's own runtime classpath? (I don't think it is.) And the application jar: I think that is the user jar that is uploaded to a Flink cluster?


Could you tell me more?


Thanks
Minglei.







At 2018-03-09 19:21:18, "Stephan Ewen" <se...@apache.org> wrote:
>Hi all!
>
>Flink 1.4 introduces child-first classloading by default, for the
>application libraries.
>
>We added that, because it allows applications to use different versions of
>many libraries, compared to what Flink uses in its core, or compared to
>what other dependencies (like Hadoop) pull into the class path.
>
>For example, applications can use different versions of akka, Avro,
>Protobuf, etc. Compared to what Flink / Hadoop / etc. uses.
>
>Now, while that is nice, child-first classloading runs into trouble when
>the application jars are not properly built, meaning when the application
>JAR contains libraries that it should not (because they are already in the
>classpath / lib folder).
>
>For example, when the class path has the Kafka Connector (connector is in
>the lib directory) and the application jar also contains Kafka, then we get
>nasty errors due to class duplication and impossible class casts (X cannot
>be cast to X).
>
>
>What I would like to understand is how this change worked out for the
>users. Based on that, we can keep this or revert this change in the next
>release.
>
>Please answer to this mail with:
>
>  a. This was a great change, keep it and polish it.
>
>  b. This caused in the end more problems than it solved, so please set the
>default back to "parent-first" in 1.5 and leave "child-first" as an
>optional flag.
>
>
>Thanks a lot,
>Stephan