Posted to user@spark.apache.org by Hu...@Dell.com on 2013/11/21 18:42:03 UTC

setting SPARK_HADOOP_VERSION

Hi,

Can someone please advise on the correct setting of SPARK_HADOOP_VERSION?

I have the Spark 0.8 pre-compiled binary for CDH4, downloaded from the Spark site, running in a CDH4 environment.

Running hadoop version shows

[spark@kserv06 ~]$ hadoop version
Hadoop 2.0.0-cdh4.4.0
Subversion file:///data/1/jenkins/workspace/generic-package-rhel64-6-0/topdir/BUILD/hadoop-2.0.0-cdh4.4.0/src/hadoop-common-project/hadoop-common -r c0eba6cd38c984557e96a16ccd7356b7de835e79
Compiled by jenkins on Tue Sep  3 19:33:17 PDT 2013
From source with checksum ac7e170aa709b3ace13dc5f775487180
This command was run using /opt/cloudera/parcels/CDH-4.4.0-1.cdh4.4.0.p0.39/lib/hadoop/hadoop-common-2.0.0-cdh4.4.0.jar
[spark@kserv06 ~]$

So per checking http://spark.incubator.apache.org/docs/latest/hadoop-third-party-distributions.html

SPARK_HADOOP_VERSION should be 1.2.0 ... is that correct?

Not sure if it's related, but I am getting this exception when invoking javaRDD.saveAsTextFile:

exception 'org.apache.hadoop.ipc.RemoteException: Server IPC version 7 cannot communicate with client version 4'


Thanks,
Hussam
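
For background, this error means the client side is speaking Hadoop 1.x RPC (IPC version 4) to a Hadoop 2.x / CDH4 server (IPC version 7), i.e. the Spark build in use carries a Hadoop 1.x HDFS client. A quick way to see which Hadoop version a Spark 0.8 assembly was built against is the assembly jar name, which embeds it. This is a minimal sketch assuming the default layout of a 0.8 sbt build; the exact file name shown in the comments is illustrative.

$ ls assembly/target/scala-2.9.3/
# a CDH4 build should list something like spark-assembly-0.8.0-incubating-hadoop2.0.0-cdh4.4.0.jar
# a stock build lists a hadoop1.x suffix instead, which would produce exactly the IPC mismatch above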

RE: setting SPARK_HADOOP_VERSION

Posted by Hu...@Dell.com.
It is not working; I am still getting the exception
org.apache.hadoop.ipc.RemoteException: Server IPC version 7 cannot communicate with client version 4

when using the setting SPARK_HADOOP_VERSION=2.0.0-cdh4.2.0.

The only option I have now is to compile the Spark 0.8 branch in my CDH4 environment.

The prebuilt Spark binaries do not work in my CDH4 environment.

Thanks,
Hussam
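
For anyone hitting the same wall: in Spark 0.8 the Hadoop client version is fixed at assembly time via SPARK_HADOOP_VERSION, so rebuilding is indeed the way to match CDH4. A minimal sketch, assuming the sbt launcher shipped with the 0.8 source tree and the MRv1 (non-YARN) CDH 4.4.0 cluster shown earlier:

# rebuild the Spark assembly against the CDH4 MRv1 client
$ SPARK_HADOOP_VERSION=2.0.0-mr1-cdh4.4.0 sbt/sbt clean assembly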



RE: setting SPARK_HADOOP_VERSION

Posted by Hu...@Dell.com.
So my setting should be

SPARK_HADOOP_VERSION=2.0.0-cdh4.2.0

when using the prebuilt Spark binary from the download site.

Let me try it, because I am still getting this exception when trying to run with 2.0.0-cdh4.4.0 or 2.0.0-mr1-cdh4.4.0:
exception 'org.apache.hadoop.ipc.RemoteException: Server IPC version 7 cannot communicate with client version 4'





RE: setting SPARK_HADOOP_VERSION

Posted by Hu...@Dell.com.
Thank you, Kapil.

I was able to get it working after I built Spark in a different environment. For some reason, the original environment was pulling Hadoop 1.x libraries into the final Spark package, which is why it did not work.

Hussam
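
For reference, one way to confirm that a rebuilt package picked up the Hadoop 2.x client rather than stray Hadoop 1.x jars is to look inside the assembly for a class that exists only in Hadoop 2. This is just a sketch: ProtobufRpcEngine is an illustrative marker class, and the jar path assumes the default 0.8 sbt layout.

$ jar tf assembly/target/scala-2.9.3/spark-assembly-*.jar | grep ProtobufRpcEngine
# a match  -> the Hadoop 2.x / CDH4 client is bundled
# no match -> a Hadoop 1.x client slipped in, which reproduces the 'Server IPC version 7' error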



RE: setting SPARK_HADOOP_VERSION

Posted by Kapil Malik <km...@adobe.com>.
Hi Hussam,

So you successfully built Spark 0.8.0 (the latest download from the site) against Hadoop 2.0.0-cdh4.4.0 (i.e. sbt clean and sbt assembly), and it still cannot connect to CDH4?
That's strange, because I'm using the exact same versions of Spark and Hadoop (for a standalone cluster), and I am able to connect from my client machine after building.

Regards,

Kapil Malik | kmalik@adobe.com



RE: setting SPARK_HADOOP_VERSION

Posted by Hu...@Dell.com.
Pat, can you please be more specific about which Hadoop version the prebuilt Spark binaries for CDH4 were built against?

Is it 2.0.0-cdh4.2.0 or 2.0.0-mr1-cdh4.2.0?


I am blocked here, because all my efforts to build Spark 0.8.0 with different Hadoop settings did not resolve the exception I get when invoking javaRDD.saveAsTextFile:

org.apache.hadoop.ipc.RemoteException: Server IPC version 7 cannot communicate with client version 4
        at org.apache.hadoop.ipc.Client.call(Client.java:1113)

The same code works if I run it under Hadoop 1.2.1, but not under CDH4, which has Hadoop 2.0.0-cdh4.4.0.


Thanks,
Hussam


Re: setting SPARK_HADOOP_VERSION

Posted by Patrick Wendell <pw...@gmail.com>.
It was 4.2.0 - we've kept it this way for the last few releases for consistency.



Re: setting SPARK_HADOOP_VERSION

Posted by Pierre Borckmans <pi...@realimpactanalytics.com>.
You are right about the typo. I did a copy-paste from the website, which contains this mistake.
I forgot about this even though I submitted the correction to GitHub last week ;)



Message sent from a mobile service - excuse typos and abbreviations


RE: setting SPARK_HADOOP_VERSION

Posted by Hu...@Dell.com.
Thank you, Pierre. I think you had a typo; it should be cdh and not chd.

Does anyone know which Hadoop version the prebuilt Spark binaries for CDH4 were built against?

http://spark.incubator.apache.org/downloads.html

Was it 2.0.0-cdh4.4.0 or 2.0.0-mr1-cdh4.4.0?

Thanks,
Hussam

Re: setting SPARK_HADOOP_VERSION

Posted by Pierre Borckmans <pi...@realimpactanalytics.com>.
Given your cdh4 distribution, you should use 

2.0.0-mr1-chd4.4.0

That is, if you don't want to use YARN.

Otherwise you should use:
2.0.0-chd4.4.0

I think you were misreading the table; the column you mentioned is for the HDP distribution.

HTH

Pierre

Sent from a mobile, excuse typos and abbreviations 
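
In build terms, the two version strings above (with the cdh spelling) correspond to two different Spark 0.8 assemblies. A rough sketch, assuming the sbt launcher bundled with the source; the SPARK_YARN flag is taken from the 0.8 build documentation rather than from this thread:

# MapReduce v1, no YARN
$ SPARK_HADOOP_VERSION=2.0.0-mr1-cdh4.4.0 sbt/sbt assembly

# YARN
$ SPARK_HADOOP_VERSION=2.0.0-cdh4.4.0 SPARK_YARN=true sbt/sbt assembly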
