Posted to user@flink.apache.org by Albert Giménez <al...@datacamp.com> on 2017/08/29 09:46:02 UTC

Flink session on Yarn - ClassNotFoundException

Hi,

I’m trying to start a Flink (1.3.2) session as explained in the docs (https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/yarn_setup.html#start-a-session), but I keep getting the “ClassNotFoundException” you can see below.

I’m running an HDInsight cluster on Azure (HDFS and YARN v2.7.3), and I’m exporting my HADOOP_CONF_DIR before running the yarn-session command.
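For reference, this is roughly what I’m running (the config path here is illustrative, not the exact one from my cluster):

export HADOOP_CONF_DIR=/etc/hadoop/conf    # wherever your cluster's Hadoop/YARN config lives
# start a YARN session with 4 TaskManagers, 4 GB each (flags per the 1.3 docs)
./bin/yarn-session.sh -n 4 -tm 4096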

Error while deploying YARN cluster: Couldn't deploy Yarn cluster
java.lang.RuntimeException: Couldn't deploy Yarn cluster
	at org.apache.flink.yarn.AbstractYarnClusterDescriptor.deploy(AbstractYarnClusterDescriptor.java:443)
	at org.apache.flink.yarn.cli.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:630)
	at org.apache.flink.yarn.cli.FlinkYarnSessionCli$1.call(FlinkYarnSessionCli.java:486)
	at org.apache.flink.yarn.cli.FlinkYarnSessionCli$1.call(FlinkYarnSessionCli.java:483)
	at org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:43)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
	at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:40)
	at org.apache.flink.yarn.cli.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:483)
Caused by: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.yarn.client.RequestHedgingRMFailoverProxyProvider not found
	at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2227)
	at org.apache.hadoop.yarn.client.RMProxy.createRMFailoverProxyProvider(RMProxy.java:161)
	at org.apache.hadoop.yarn.client.RMProxy.createRMProxy(RMProxy.java:94)
	at org.apache.hadoop.yarn.client.ClientRMProxy.createRMProxy(ClientRMProxy.java:72)
	at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.serviceStart(YarnClientImpl.java:187)
	at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
	at org.apache.flink.yarn.AbstractYarnClusterDescriptor.getYarnClient(AbstractYarnClusterDescriptor.java:381)
	at org.apache.flink.yarn.AbstractYarnClusterDescriptor.deployInternal(AbstractYarnClusterDescriptor.java:458)
	at org.apache.flink.yarn.AbstractYarnClusterDescriptor.deploy(AbstractYarnClusterDescriptor.java:441)
	... 9 more
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.yarn.client.RequestHedgingRMFailoverProxyProvider not found
	at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2195)
	at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2219)
	... 17 more
Caused by: java.lang.ClassNotFoundException: Class org.apache.hadoop.yarn.client.RequestHedgingRMFailoverProxyProvider not found
	at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2101)
	at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2193)
	... 18 more


So far, I’ve checked that the YARN classpath is correct (it is), and I also tried manually linking the “hadoop-yarn-common” jar from my /usr/hdp directories into Flink’s “lib” directory (commands sketched after the trace below). In that case, I get an “IllegalAccessError”:

Exception in thread "main" java.lang.IllegalAccessError: tried to access method org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider.getProxyInternal()Ljava/lang/Object; from class org.apache.hadoop.yarn.client.RequestHedgingRMFailoverProxyProvider
	at org.apache.hadoop.yarn.client.RequestHedgingRMFailoverProxyProvider.init(RequestHedgingRMFailoverProxyProvider.java:75)
	at org.apache.hadoop.yarn.client.RMProxy.createRMFailoverProxyProvider(RMProxy.java:163)
	at org.apache.hadoop.yarn.client.RMProxy.createRMProxy(RMProxy.java:94)
	at org.apache.hadoop.yarn.client.ClientRMProxy.createRMProxy(ClientRMProxy.java:72)
	at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.serviceStart(YarnClientImpl.java:187)
	at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
	at org.apache.flink.yarn.AbstractYarnClusterDescriptor.getYarnClient(AbstractYarnClusterDescriptor.java:381)
	at org.apache.flink.yarn.AbstractYarnClusterDescriptor.deployInternal(AbstractYarnClusterDescriptor.java:458)
	at org.apache.flink.yarn.AbstractYarnClusterDescriptor.deploy(AbstractYarnClusterDescriptor.java:441)
	at org.apache.flink.yarn.cli.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:630)
	at org.apache.flink.yarn.cli.FlinkYarnSessionCli$1.call(FlinkYarnSessionCli.java:486)
	at org.apache.flink.yarn.cli.FlinkYarnSessionCli$1.call(FlinkYarnSessionCli.java:483)
	at org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:43)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
	at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:40)
	at org.apache.flink.yarn.cli.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:483)
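For the record, the classpath check and the symlink I tried looked roughly like this (the exact jar and install paths are illustrative):

yarn classpath | tr ':' '\n' | grep yarn-common    # confirm hadoop-yarn-common is on YARN's classpath
ln -s /usr/hdp/current/hadoop-yarn-client/hadoop-yarn-common.jar /opt/flink-1.3.2/lib/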

Any ideas on what might be going on? This last error makes me think the problem may have to do with the Hadoop version Flink is compiled against, but I’m not sure.

Thanks a lot!

Re: Flink session on Yarn - ClassNotFoundException

Posted by nishutayal <ni...@gmail.com>.
Hi Albert,

Your thread is really helpful for setting up Flink on an HDInsight cluster. I
followed the same steps and now I'm hitting another error:

Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.conf.Configuration.reloadExistingConfigurations()V
	at org.apache.hadoop.fs.adl.AdlConfKeys.addDeprecatedKeys(AdlConfKeys.java:114)
	at org.apache.hadoop.fs.adl.AdlFileSystem.<clinit>(AdlFileSystem.java:92)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:348)
	at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:2134)
	at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2099)
	at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2193)
	at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2638)
	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2651)

Do you have any pointers for the above error?

Regards,
Nishu




Re: Flink session on Yarn - ClassNotFoundException

Posted by Albert Giménez <al...@datacamp.com>.
Thanks for the replies :)

I managed to get it working following the instructions here <https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/building.html#vendor-specific-versions>, but I found a few issues that I guess were specific to HDInsight, or at least to the HDP version it uses. Trying to summarize:

Hadoop version 
After running “hadoop version”, the result was “2.7.3.2.6.1.3-4”.
However, when building I got errors because some dependencies could not be found in the Hortonworks repo, for instance ZooKeeper “3.4.6.2.6.1.3-4”.
I browsed to the Hortonworks repo <http://repo.hortonworks.com/content/repositories/releases/org/apache/zookeeper/zookeeper/> to find a suitable version, so I ended up using 2.7.3.2.6.1.31-3 instead.
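Roughly how I checked which builds were actually published (the grep pattern is just an example):

hadoop version | head -1
# list the Hadoop builds available in the Hortonworks releases repo
curl -s http://repo.hortonworks.com/content/repositories/releases/org/apache/hadoop/hadoop-common/ | grep -oE '2\.7\.3[0-9.-]*' | sort -u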

Scala version
I also had issues with dependencies when using Scala 2.11.11, so I compiled against 2.11.7.

So, the full Maven command I used was this:
mvn install -DskipTests -Dscala.version=2.11.7 -Pvendor-repos -Dhadoop.version=2.7.3.2.6.1.31-3

Azure Jars
With all that, I still had “class not found” errors when trying to start my Flink session, for instance "java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.adl.HdiAdlFileSystem”.
To fix that, I had to find the Azure-specific jars I needed. For that, I checked which jars Spark was using and copied / symlinked them into Flink’s “lib” directory:
/usr/hdp/current/spark2-client/jars/*azure*.jar
/usr/lib/hdinsight-datalake/adls2-oauth2-token-provider.jar
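Roughly, with an example Flink install path:

# link the Azure / Data Lake jars Spark ships with into Flink's lib directory
ln -s /usr/hdp/current/spark2-client/jars/*azure*.jar /opt/flink-1.3.2/lib/
ln -s /usr/lib/hdinsight-datalake/adls2-oauth2-token-provider.jar /opt/flink-1.3.2/lib/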

Guava Jar
Finally, my jobs were failing for some reason (the Cassandra driver was complaining that the Guava version was too old, although I had the right version in my assembled jar).
I just downloaded the version I needed (in my case, 23.0 <http://central.maven.org/maven2/com/google/guava/guava/23.0/guava-23.0.jar>) and also put it into Flink’s lib directory.
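For example, something like this (the install path is illustrative):

# fetch Guava 23.0 (URL above) straight into Flink's lib directory
wget -P /opt/flink-1.3.2/lib http://central.maven.org/maven2/com/google/guava/guava/23.0/guava-23.0.jar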

I hope it helps other people trying to run Flink on Azure HDInsight :)

Kind regards,

Albert


Re: Flink session on Yarn - ClassNotFoundException

Posted by Banias H <ba...@gmail.com>.
We had the same issue. Get the HDP version, from
/usr/hdp/current/hadoop-client/hadoop-common-<version>.jar for example.
Then rebuild Flink from source:

mvn clean install -DskipTests -Pvendor-repos -Dhadoop.version=<version>

for example: mvn clean install -DskipTests -Pvendor-repos -Dhadoop.version=2.7.3.2.6.1.0-129

Copy build-target/ to the cluster and set it up. Export HADOOP_CONF_DIR and
YARN_CONF_DIR according to your environment. You should have no problem starting
the session.
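A rough sketch of those last steps (the host and paths are placeholders):

# build-target/ is the ready-to-run Flink distribution produced by the build
rsync -a build-target/ user@cluster-headnode:/opt/flink-1.3.2/
export HADOOP_CONF_DIR=/etc/hadoop/conf
export YARN_CONF_DIR=/etc/hadoop/conf
/opt/flink-1.3.2/bin/yarn-session.sh -n 2 -tm 2048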


Re: Flink session on Yarn - ClassNotFoundException

Posted by Federico D'Ambrosio <fe...@gmail.com>.
Hi,
What is your "hadoop version" output? I'm asking because you said your
Hadoop distribution is in /usr/hdp, so it looks like you're using
Hortonworks HDP, just like me. That makes it a third-party distribution,
and you'd need to build Flink from source according to this:
https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/building.html#vendor-specific-versions

Federico D'Ambrosio


Re: Flink session on Yarn - ClassNotFoundException

Posted by albert <al...@datacamp.com>.
Hi Chesnay,

Thanks for your reply. I did download the binaries matching my Hadoop
version (2.7); that's why I was wondering whether the issue has to do with
the exact Hadoop version Flink is compiled against, or whether something is
missing in my environment.




Re: Flink session on Yarn - ClassNotFoundException

Posted by Chesnay Schepler <ch...@apache.org>.
Hello,

The ClassNotFoundException indicates that you are using a Flink version
that wasn't compiled against Hadoop 2.7.
Replacing one part of the Hadoop dependency will most likely not cut it
(or will fail mysteriously down the line), so I would suggest checking the
downloads <http://flink.apache.org/downloads.html> page and picking the
binary matching your Hadoop version.

It should be enough to replace the flink-shaded-hadoop2-uber-1.3.2.jar under /lib.
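For example, something along these lines (all paths are illustrative):

cd /opt/flink-1.3.2                                # your Flink install
mv lib/flink-shaded-hadoop2-uber-1.3.2.jar /tmp/   # keep the old shaded jar just in case
# drop in the shaded jar from the Hadoop-2.7 binary distribution you downloaded
cp ~/flink-1.3.2-hadoop27/lib/flink-shaded-hadoop2-uber-1.3.2.jar lib/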
