Posted to user@flink.apache.org by Federico D'Ambrosio <fe...@smartlab.ws> on 2017/09/23 15:53:44 UTC

Unable to run flink job after adding a new dependency (cannot find Main-class)

Hello everyone,

I'd like to submit to you this weird issue I'm having, hoping you could
help me.
Premise: I'm building with sbt 0.13.6, Scala 2.11.8 and Flink 1.3.2
compiled from sources against Hadoop 2.7.3.2.6.1.0-129 (HDP 2.6).
I'm trying to implement a sink for Hive, so I added the following
dependency to my build.sbt:

"org.apache.hive.hcatalog" % "hive-hcatalog-streaming" %
"1.2.1000.2.6.1.0-129"

in order to use Hive streaming capabilities.

After adding this dependency, without even using it, if I try to run the
job with flink run, I get

org.apache.flink.client.program.ProgramInvocationException: The program's
entry point class 'package.MainObj' was not found in the jar file.

If I remove the dependency, everything goes back to normal.
What is weird is that if I run the job with sbt run, *it does find the
Main class* and then obviously crashes because of the missing Flink core
dependencies (AbstractStateBackend missing and whatnot).
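
For reference, the usual workaround so that sbt run also sees the
"provided" dependencies (a sketch along the lines of the standard Flink
sbt quickstart template, exact syntax may depend on the sbt version) is
to re-wire the run task in build.sbt:

run in Compile := Defaults.runTask(
  fullClasspath in Compile,
  mainClass in (Compile, run),
  runner in (Compile, run)
).evaluated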

Here are the complete dependencies of the project:

"org.apache.flink" %% "flink-scala" % flinkVersion % "provided",
"org.apache.flink" %% "flink-streaming-scala" % flinkVersion % "provided",
"org.apache.flink" %% "flink-connector-kafka-0.10" % flinkVersion,
"org.apache.flink" %% "flink-cep-scala" % flinkVersion,
"org.apache.hive.hcatalog" % "hive-hcatalog-streaming" %
"1.2.1000.2.6.1.0-129",
"org.joda" % "joda-convert" % "1.8.3",
"com.typesafe.play" %% "play-json" % "2.6.2",
"org.mongodb.mongo-hadoop" % "mongo-hadoop-core" % "2.0.2",
"org.scalactic" %% "scalactic" % "3.0.1",
"org.scalatest" %% "scalatest" % "3.0.1" % "test",
"de.javakaffee" % "kryo-serializers" % "0.42"

Could it be a dependency conflict between the Hadoop versions pulled in
by mongo-hadoop and Hive (2.7.1 and 2.7.3.2.6.1.0-129 respectively, even
though there was no such issue between mongo-hadoop and Flink)? I'm even
starting to think that Flink cannot handle big jars well when it comes
to classpath loading (before the new dependency the jar was 44M,
afterwards it became 115M)?
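
For what it's worth, a quick sanity check of what the client actually
sees (jar path and class name here are placeholders):

jar tf target/scala-2.11/myjob-assembly.jar | grep MainObj
unzip -p target/scala-2.11/myjob-assembly.jar META-INF/MANIFEST.MF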

Any help would be really appreciated,
Kind regards,
Federico

Re: Unable to run flink job after adding a new dependency (cannot find Main-class)

Posted by Federico D'Ambrosio <fe...@smartlab.ws>.
As a little update, the pattern for the exclusion of those files in
sbt-assembly is the following:

assemblyMergeStrategy in assembly := {
  // discard the signature files that break manifest verification
  case PathList(ps @ _*) if ps.last.endsWith(".SF") ||
                            ps.last.endsWith(".DSA") ||
                            ps.last.endsWith(".RSA") =>
    MergeStrategy.discard
  // ...other MergeStrategies...
  case x =>
    val oldStrategy = (assemblyMergeStrategy in assembly).value
    oldStrategy(x)
}
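
To check that the discard worked, one can verify that no signature files
survive in the assembled jar (jar path is a placeholder):

unzip -l target/scala-2.11/myjob-assembly.jar | grep -E '\.(SF|DSA|RSA)$'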

Re: Unable to run flink job after adding a new dependency (cannot find Main-class)

Posted by Federico D'Ambrosio <fe...@smartlab.ws>.
Hi Urs,

Thank you very much for your advice, I will look into excluding those files
directly during the assembly.

Re: Unable to run flink job after adding a new dependency (cannot find Main-class)

Posted by Urs Schoenenberger <ur...@tngtech.com>.
Hi Federico,

oh, I remember running into this problem some time ago. If I recall
correctly, this is not a Flink issue, but an issue with technically
incorrect jars from dependencies, whose leftover signature files prevent
the verification of the manifest. I was using the maven-shade plugin
back then and configured an exclusion for these file types. I assume
that sbt/sbt-assembly has a similar option; this should be more stable
than manually stripping the jar.
Alternatively, you could try to find out which dependency puts the
.SF/etc. files there and exclude that dependency altogether: it might be
a transitive dependency that comes with Hadoop anyway, or simply one
that you don't need.
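
A minimal build.sbt sketch of that second option, with a purely
hypothetical offending organization (sbt's excludeAll/ExclusionRule API
is standard):

"org.apache.hive.hcatalog" % "hive-hcatalog-streaming" % "1.2.1000.2.6.1.0-129" excludeAll(
  // hypothetical: replace with whichever organization actually ships the signed jar
  ExclusionRule(organization = "org.example.signed")
)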

Best,
Urs

-- 
Urs Schönenberger - urs.schoenenberger@tngtech.com
TNG Technology Consulting GmbH, Betastr. 13a, 85774 Unterföhring
Geschäftsführer: Henrik Klagges, Dr. Robert Dahlke, Gerhard Müller
Sitz: Unterföhring * Amtsgericht München * HRB 135082

Re: Unable to run flink job after adding a new dependency (cannot find Main-class)

Posted by Federico D'Ambrosio <fe...@smartlab.ws>.
Hi Urs,

Yes the main class is set, just like you said.

Still, I might have managed to get it working: during the assembly some
.SF, .DSA and .RSA files are put inside the META-INF folder of the jar,
possibly coming from some of the new dependencies in the deps tree.
Apparently, this is what caused this weird issue. Using an appropriate
pattern for discarding these files during the assembly, or removing them
via zip -d, should be enough (I sure hope so, since this is one of the
worst issues I've come across).
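
For reference, the zip -d variant looks like this (jar name is a
placeholder; the patterns are quoted so the shell does not expand them):

zip -d myjob-assembly.jar 'META-INF/*.SF' 'META-INF/*.DSA' 'META-INF/*.RSA'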


Federico D'Ambrosio

Re: Unable to run flink job after adding a new dependency (cannot find Main-class)

Posted by Urs Schoenenberger <ur...@tngtech.com>.
Hi Federico,

just guessing, but are you explicitly setting the Main-Class manifest
attribute for the jar that you are building?

Should be something like

mainClass in (Compile, packageBin) :=
Some("org.yourorg.YourFlinkJobMainClass")

Best,
Urs


-- 
Urs Schönenberger - urs.schoenenberger@tngtech.com

TNG Technology Consulting GmbH, Betastr. 13a, 85774 Unterföhring
Geschäftsführer: Henrik Klagges, Dr. Robert Dahlke, Gerhard Müller
Sitz: Unterföhring * Amtsgericht München * HRB 135082