Posted to user@spark.apache.org by Haopu Wang <HW...@qilinsoft.com> on 2014/12/19 15:11:14 UTC

Can Spark 1.1.0 save checkpoint to HDFS 2.5.1?

I’m using Spark 1.1.0 built for HDFS 2.4.

My application enables checkpointing (to HDFS 2.5.1) and it builds fine. But when I run it, I get the error below:

 

Exception in thread "main" org.apache.hadoop.ipc.RemoteException: Server IPC version 9 cannot communicate with client version 4
    at org.apache.hadoop.ipc.Client.call(Client.java:1070)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
    at com.sun.proxy.$Proxy6.getProtocolVersion(Unknown Source)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
    at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:238)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:203)
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
    at org.apache.spark.streaming.StreamingContext.checkpoint(StreamingContext.scala:201)

 

Does that mean I have to use HDFS 2.4 to save checkpoints? Thank you!

 


Re: Can Spark 1.1.0 save checkpoint to HDFS 2.5.1?

Posted by Marcelo Vanzin <va...@cloudera.com>.
On Fri, Dec 19, 2014 at 4:05 PM, Haopu Wang <HW...@qilinsoft.com> wrote:
> My application doesn’t depend on hadoop-client directly.
>
> It only depends on spark-core_2.10, which depends on hadoop-client 1.0.4.
> This can be checked in the Maven repository at
> http://mvnrepository.com/artifact/org.apache.spark/spark-core_2.10/1.1.0

You can declare an explicit dependency on the newer hadoop libraries.
That would override Spark's dependencies.
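
For example, a minimal pom.xml fragment (a sketch only; 2.5.1 here because that's the HDFS version you're targeting). Maven's "nearest wins" mediation makes a direct dependency take precedence over the hadoop-client 1.0.4 that spark-core_2.10 pulls in transitively:

    <!-- Sketch: a direct hadoop-client dependency overrides the
         1.0.4 version inherited transitively from spark-core_2.10. -->
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client</artifactId>
      <version>2.5.1</version>
    </dependency>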

-- 
Marcelo



RE: Can Spark 1.1.0 save checkpoint to HDFS 2.5.1?

Posted by Haopu Wang <HW...@qilinsoft.com>.
Hi Sean,

 

I changed Spark to a provided dependency and declared hadoop-client 2.5.1 as a compile dependency.

Now I see the error below when I run “mvn package”. Do you know what the reason could be?

 

[INFO] --- scala-maven-plugin:3.1.3:compile (default) @ testspark ---
[WARNING]  Expected all dependencies to require Scala version: 2.10.0
[WARNING]  com.vitria:testspark:0.0.1-SNAPSHOT requires scala version: 2.10.0
[WARNING]  org.specs2:scalaz-core_2.10:7.0.0 requires scala version: 2.10.0
[WARNING]  org.specs2:scalaz-effect_2.10:7.0.0 requires scala version: 2.10.0
[WARNING]  org.specs2:scalaz-concurrent_2.10:7.0.0 requires scala version: 2.10.0
[WARNING]  org.scalatest:scalatest_2.10:2.0 requires scala version: 2.10.0
[WARNING]  org.scala-lang:scala-reflect:2.10.0 requires scala version: 2.10.0
[WARNING]  com.twitter:chill_2.10:0.3.6 requires scala version: 2.10.3
[WARNING] Multiple versions of scala libraries detected!
[INFO] D:\spark\workspace\testspark\src\main\scala:-1: info: compiling
[INFO] Compiling 1 source files to D:\spark\workspace\testspark\target\classes at 1419035872083
[ERROR] error: error while loading <root>, invalid CEN header (bad signature)
[ERROR] error: scala.reflect.internal.MissingRequirementError: object scala.runtime in compiler mirror not found.
[ERROR]         at scala.reflect.internal.MissingRequirementError$.signal(MissingRequirementError.scala:16)
[ERROR]         at scala.reflect.internal.MissingRequirementError$.notFound(MissingRequirementError.scala:17)
...

 

________________________________

From: Sean Owen [mailto:sowen@cloudera.com] 
Sent: Saturday, December 20, 2014 8:12 AM
To: Haopu Wang
Cc: user@spark.apache.org; Raghavendra Pandey
Subject: RE: Can Spark 1.1.0 save checkpoint to HDFS 2.5.1?

 

That's exactly the problem. You should mark Spark as a provided dependency only, and declare a direct dependency on the correct version of hadoop-client.

On Dec 20, 2014 12:04 AM, "Haopu Wang" <HW...@qilinsoft.com> wrote:

My application doesn’t depend on hadoop-client directly.

It only depends on spark-core_2.10, which depends on hadoop-client 1.0.4. This can be checked in the Maven repository at http://mvnrepository.com/artifact/org.apache.spark/spark-core_2.10/1.1.0

 

That’s strange. How can I work around the issue? Thanks for any suggestions.

 

________________________________

From: Raghavendra Pandey [mailto:raghavendra.pandey@gmail.com] 
Sent: Saturday, December 20, 2014 12:08 AM
To: Sean Owen; Haopu Wang
Cc: user@spark.apache.org
Subject: Re: Can Spark 1.1.0 save checkpoint to HDFS 2.5.1?

 

It seems there is Hadoop 1 somewhere on the classpath.

On Fri, Dec 19, 2014, 21:24 Sean Owen <so...@cloudera.com> wrote:

Yes, but your error indicates that your application is actually using
Hadoop 1.x of some kind. Check your dependencies, especially
hadoop-client.

On Fri, Dec 19, 2014 at 2:11 PM, Haopu Wang <HW...@qilinsoft.com> wrote:
> I’m using Spark 1.1.0 built for HDFS 2.4.
>
> My application enables checkpointing (to HDFS 2.5.1) and it builds fine. But
> when I run it, I get the error below:
>
>
>
> Exception in thread "main" org.apache.hadoop.ipc.RemoteException: Server IPC
> version 9 cannot communicate with client version 4
>
>     [stack trace snipped]
>
> Does that mean I have to use HDFS 2.4 to save checkpoints? Thank you!
>
>



RE: Can Spark 1.1.0 save checkpoint to HDFS 2.5.1?

Posted by Sean Owen <so...@cloudera.com>.
That's exactly the problem. You should mark Spark as a provided dependency
only, and declare a direct dependency on the correct version of
hadoop-client.
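
A minimal sketch of the two pom.xml entries (coordinates taken from the mvnrepository link you posted; adjust versions to match your cluster):

    <!-- Sketch only: Spark is provided (available at compile time but
         not bundled or propagated at runtime, since the cluster supplies
         it); hadoop-client is declared directly at the cluster's version. -->
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.10</artifactId>
      <version>1.1.0</version>
      <scope>provided</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client</artifactId>
      <version>2.5.1</version>
    </dependency>
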
On Dec 20, 2014 12:04 AM, "Haopu Wang" <HW...@qilinsoft.com> wrote:

>    My application doesn’t depend on hadoop-client directly.
>
> It only depends on spark-core_2.10, which depends on hadoop-client 1.0.4.
> This can be checked in the Maven repository at
> http://mvnrepository.com/artifact/org.apache.spark/spark-core_2.10/1.1.0
>
>
>
> That’s strange. How can I work around the issue? Thanks for any suggestions.
>
>
>  ________________________________
>
> From: Raghavendra Pandey [mailto:raghavendra.pandey@gmail.com]
> Sent: Saturday, December 20, 2014 12:08 AM
> To: Sean Owen; Haopu Wang
> Cc: user@spark.apache.org
> Subject: Re: Can Spark 1.1.0 save checkpoint to HDFS 2.5.1?
>
>
>
> It seems there is Hadoop 1 somewhere on the classpath.
>
> On Fri, Dec 19, 2014, 21:24 Sean Owen <so...@cloudera.com> wrote:
>
> Yes, but your error indicates that your application is actually using
> Hadoop 1.x of some kind. Check your dependencies, especially
> hadoop-client.
>
> On Fri, Dec 19, 2014 at 2:11 PM, Haopu Wang <HW...@qilinsoft.com> wrote:
> > I’m using Spark 1.1.0 built for HDFS 2.4.
> >
> > My application enables checkpointing (to HDFS 2.5.1) and it builds fine. But
> > when I run it, I get the error below:
> >
> >
> >
> > Exception in thread "main" org.apache.hadoop.ipc.RemoteException: Server IPC
> > version 9 cannot communicate with client version 4
> >
> >     [stack trace snipped]
> >
> > Does that mean I have to use HDFS 2.4 to save checkpoints? Thank you!
> >
> >
>

RE: Can Spark 1.1.0 save checkpoint to HDFS 2.5.1?

Posted by Haopu Wang <HW...@qilinsoft.com>.
My application doesn’t depend on hadoop-client directly.

It only depends on spark-core_2.10, which depends on hadoop-client 1.0.4. This can be checked in the Maven repository at http://mvnrepository.com/artifact/org.apache.spark/spark-core_2.10/1.1.0

 

That’s strange. How can I work around the issue? Thanks for any suggestions.

 

________________________________

From: Raghavendra Pandey [mailto:raghavendra.pandey@gmail.com] 
Sent: Saturday, December 20, 2014 12:08 AM
To: Sean Owen; Haopu Wang
Cc: user@spark.apache.org
Subject: Re: Can Spark 1.1.0 save checkpoint to HDFS 2.5.1?

 

It seems there is Hadoop 1 somewhere on the classpath.

On Fri, Dec 19, 2014, 21:24 Sean Owen <so...@cloudera.com> wrote:

Yes, but your error indicates that your application is actually using
Hadoop 1.x of some kind. Check your dependencies, especially
hadoop-client.

On Fri, Dec 19, 2014 at 2:11 PM, Haopu Wang <HW...@qilinsoft.com> wrote:
> I’m using Spark 1.1.0 built for HDFS 2.4.
>
> My application enables checkpointing (to HDFS 2.5.1) and it builds fine. But
> when I run it, I get the error below:
>
>
>
> Exception in thread "main" org.apache.hadoop.ipc.RemoteException: Server IPC
> version 9 cannot communicate with client version 4
>
>     [stack trace snipped]
>
> Does that mean I have to use HDFS 2.4 to save checkpoints? Thank you!
>
>



Re: Can Spark 1.1.0 save checkpoint to HDFS 2.5.1?

Posted by Raghavendra Pandey <ra...@gmail.com>.
It seems there is Hadoop 1 somewhere on the classpath.

On Fri, Dec 19, 2014, 21:24 Sean Owen <so...@cloudera.com> wrote:

> Yes, but your error indicates that your application is actually using
> Hadoop 1.x of some kind. Check your dependencies, especially
> hadoop-client.
>
> On Fri, Dec 19, 2014 at 2:11 PM, Haopu Wang <HW...@qilinsoft.com> wrote:
> > I’m using Spark 1.1.0 built for HDFS 2.4.
> >
> > My application enables checkpointing (to HDFS 2.5.1) and it builds fine. But
> > when I run it, I get the error below:
> >
> >
> >
> > Exception in thread "main" org.apache.hadoop.ipc.RemoteException: Server IPC
> > version 9 cannot communicate with client version 4
> >
> >     [stack trace snipped]
> >
> > Does that mean I have to use HDFS 2.4 to save checkpoints? Thank you!
> >
> >
>

Re: Can Spark 1.1.0 save checkpoint to HDFS 2.5.1?

Posted by Sean Owen <so...@cloudera.com>.
Yes, but your error indicates that your application is actually using
Hadoop 1.x of some kind. Check your dependencies, especially
hadoop-client.
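
A quick way to do that is to run "mvn dependency:tree" on your project and
look for any 1.x Hadoop artifact (hadoop-core, hadoop-client) in the output.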

On Fri, Dec 19, 2014 at 2:11 PM, Haopu Wang <HW...@qilinsoft.com> wrote:
> I’m using Spark 1.1.0 built for HDFS 2.4.
>
> My application enables checkpointing (to HDFS 2.5.1) and it builds fine. But
> when I run it, I get the error below:
>
>
>
> Exception in thread "main" org.apache.hadoop.ipc.RemoteException: Server IPC
> version 9 cannot communicate with client version 4
>
>     [stack trace snipped]
>
> Does that mean I have to use HDFS 2.4 to save checkpoints? Thank you!
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org