Posted to dev@spark.apache.org by Patrick Wendell <pw...@gmail.com> on 2014/08/04 10:15:53 UTC

Re: Issues with HDP 2.4.0.2.1.3.0-563

For Hortonworks, I believe it should work to just link against the
corresponding upstream version, i.e. just set the Hadoop version to "2.4.0".

Does that work?

- Patrick
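
In build terms that suggestion comes out as something like the following (a
sketch; the profile and property flags are the ones from the Maven build
docs linked later in the thread):

    mvn -Pyarn -Phive -Phadoop-2.4 -Dhadoop.version=2.4.0 \
        -DskipTests clean package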


On Mon, Aug 4, 2014 at 12:13 AM, Ron's Yahoo! <zl...@yahoo.com.invalid>
wrote:

> Hi,
>   Not sure whose issue this is, but if I run make-distribution using HDP
> 2.4.0.2.1.3.0-563 as the hadoop version (replacing it in
> make-distribution.sh), I get a strange error with the exception below. If I
> use a slightly older version of HDP (2.4.0.2.1.2.0-402) with
> make-distribution, using the generated assembly all works fine for me.
> Either 1.0.0 or 1.0.1 will work fine.
>
>   Should I file a JIRA or is this a known issue?
>
> Thanks,
> Ron
>
> Exception in thread "main" org.apache.spark.SparkException: Job aborted
> due to stage failure: Task 0.0:0 failed 1 times, most recent failure:
> Exception failure in TID 0 on host localhost:
> java.lang.IncompatibleClassChangeError: Found interface
> org.apache.hadoop.mapreduce.TaskAttemptContext, but class was expected
>         org.apache.avro.mapreduce.AvroKeyInputFormat.createRecordReader(AvroKeyInputFormat.java:47)
>         org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:111)
>         org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:99)
>         org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:61)
>         org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
>         org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
>         org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
>         org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
>         org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:77)
>         org.apache.spark.rdd.RDD.iterator(RDD.scala:227)
>         org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
>         org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
>         org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
>         org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:111)
>         org.apache.spark.scheduler.Task.run(Task.scala:51)
>         org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:187)
>         java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         java.lang.Thread.run(Thread.java:745)
>
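
A note on the error itself: in Hadoop 1.x,
org.apache.hadoop.mapreduce.TaskAttemptContext is a class, while in Hadoop
2.x it became an interface, so bytecode compiled against one API fails at
runtime on the other with exactly this IncompatibleClassChangeError. One way
to check which API the Avro classes bundled into an assembly were compiled
against is to inspect the bytecode directly (a sketch; the jar name is
illustrative):

    # Pull the failing class out of the assembly jar.
    unzip -o spark-assembly-1.0.1-hadoop2.4.0.2.1.3.0-563.jar \
        'org/apache/avro/mapreduce/AvroKeyInputFormat.class' -d /tmp/check

    # invokevirtual on TaskAttemptContext methods means the class was
    # compiled against Hadoop 1 (where it is a class); invokeinterface
    # means it was compiled against Hadoop 2 (where it is an interface).
    javap -c -p /tmp/check/org/apache/avro/mapreduce/AvroKeyInputFormat.class \
        | grep TaskAttemptContext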

Re: Issues with HDP 2.4.0.2.1.3.0-563

Posted by Ron Gonzalez <zl...@yahoo.com.INVALID>.
One key thing I forgot to mention is that I changed the avro version to 1.7.7 to get AVRO-1476.

I took a closer look at the jars, and what I noticed is that the assembly jars that work do not have the org.apache.avro.mapreduce package packaged into the assembly. For Spark 1.0.1, org.apache.avro.mapreduce is always found. When creating an assembly from an older download of Spark 1.0.0, this package doesn't exist, but in a recent download of Spark 1.0.0 the generated assembly has org.apache.avro.mapreduce regardless of the HDP version. I recompiled against the new download, and it shows the same problem even with the older version of HDP.

So I think the bottom line is that the generated assemblies that include org.apache.avro.mapreduce are the ones that cause this issue. If I use the older Spark 1.0.0 download, I am able to create assemblies that work. I noticed that assemblies generated from the newer downloads are indeed bigger, so it seems a bug was perhaps fixed to ensure that all dependencies are pulled into the final assembly, and that fix is now causing the symptom I reported...

Thanks,
Ron
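
For anyone who wants to reproduce this comparison, listing the assembly
contents is enough (a sketch; the jar path is illustrative and depends on
where make-distribution.sh put it):

    jar tf dist/lib/spark-assembly-1.0.0-hadoop2.4.0.2.1.3.0-563.jar \
        | grep 'org/apache/avro/mapreduce'

If the package is present, it came in via the avro-mapred artifact, which
Avro 1.7.x publishes in two flavors: the default build, compiled against the
Hadoop 1 mapreduce API, and a second build under the hadoop2 classifier. An
assembly bundling the default flavor would reproduce the
IncompatibleClassChangeError above on any Hadoop 2 cluster; whether that is
what happened here is an inference from the stack trace, not something the
thread confirms.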



Re: Issues with HDP 2.4.0.2.1.3.0-563

Posted by Steve Nunez <sn...@hortonworks.com>.
Hmm. Fair enough. I hadn't given that answer much thought and on
reflection think you're right in that a profile would just be a bad hack.





Re: Issues with HDP 2.4.0.2.1.3.0-563

Posted by Sean Owen <so...@cloudera.com>.
What would such a profile do though? In general building for a
specific vendor version means setting hadoop.version and/or
yarn.version. Any hard-coded value is unlikely to match what a
particular user needs. Setting protobuf versions and so on is already
done by the generic profiles.

In a similar vein, I am not clear on why there's a mapr profile in the
build. Its versions are about to be out of date and won't work with
upcoming HBase changes, for example.

(Elsewhere in the build I think it wouldn't hurt to clear out
cloudera-specific profiles and releases too -- they're not in the pom
but are in the distribution script. It's the vendor's problem.)

This isn't any argument about being purist but just that I am not sure
these are things that the project can meaningfully bother with.

It makes sense to set vendor repos in the pom for convenience, and
makes sense to run smoke tests in Jenkins against particular versions.

$0.02
Sean

On Mon, Aug 4, 2014 at 6:21 PM, Steve Nunez <sn...@hortonworks.com> wrote:
> I don’t think there is an hwx profile, but there probably should be.
>
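
Sean's point can be checked directly: the generic profiles only set Maven
properties, and the standard maven-help-plugin will show what any
combination resolves to without hard-coding a vendor version (a sketch):

    # What hadoop.version resolves to under the generic profile:
    mvn help:evaluate -Dexpression=hadoop.version -Phadoop-2.4 | grep -v '^\['

    # What yarn.version resolves to when a vendor hadoop.version is supplied:
    mvn help:evaluate -Dexpression=yarn.version -Phadoop-2.4 \
        -Dhadoop.version=2.4.0.2.1.3.0-563 | grep -v '^\['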



Re: Issues with HDP 2.4.0.2.1.3.0-563

Posted by Steve Nunez <sn...@hortonworks.com>.
I don’t think there is an hwx profile, but there probably should be.

- Steve


Re: Issues with HDP 2.4.0.2.1.3.0-563

Posted by Sean Owen <so...@cloudera.com>.
The profile does set it automatically:
https://github.com/apache/spark/blob/master/pom.xml#L1086

yarn.version should default to hadoop.version.
It shouldn't hurt, and should work, to set it to any other specific
version. If one HDP version works and another doesn't, are you sure
the repo has the desired version?
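
One way to answer that question directly is to probe the repository for the
artifact (a sketch; the repository URL and its layout are assumptions based
on standard Maven repo conventions, not taken from this thread):

    curl -sI 'http://repo.hortonworks.com/content/repositories/releases/org/apache/hadoop/hadoop-client/2.4.0.2.1.3.0-563/hadoop-client-2.4.0.2.1.3.0-563.pom' \
        | head -1

An HTTP 200 means the version is published there; a 404 means the build
could never have resolved it from that repository.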


Re: Issues with HDP 2.4.0.2.1.3.0-563

Posted by Patrick Wendell <pw...@gmail.com>.
Ah I see, yeah you might need to set hadoop.version and yarn.version. I
thought the profile set this automatically.
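
Spelled out, that suggestion is (a sketch; the HDP build number is the one
from this thread, with both properties deliberately set to the same value,
since the command quoted later in the thread mixes 2.4.0.2.1.2.0-563 and
2.4.0.2.1.3.0-563, which may be its own source of trouble):

    mvn -Pyarn -Phive -Phadoop-2.4 \
        -Dhadoop.version=2.4.0.2.1.3.0-563 \
        -Dyarn.version=2.4.0.2.1.3.0-563 \
        -DskipTests clean package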



Re: Issues with HDP 2.4.0.2.1.3.0-563

Posted by Ron's Yahoo! <zl...@yahoo.com.INVALID>.
I meant yarn and hadoop defaulted to 1.0.4 so the yarn build fails since 1.0.4 doesn’t exist for yarn...

Thanks,
Ron


Re: Issues with HDP 2.4.0.2.1.3.0-563

Posted by Ron's Yahoo! <zl...@yahoo.com.INVALID>.
That failed since it defaulted the versions for yarn and hadoop.
I’ll give it a try with just 2.4.0 for both yarn and hadoop…

Thanks,
Ron

On Aug 4, 2014, at 9:44 AM, Patrick Wendell <pw...@gmail.com> wrote:

> Can you try building without any of the special `hadoop.version` flags and just building only with -Phadoop-2.4? In the past users have reported issues trying to build random spot versions... I think HW is supposed to be compatible with the normal 2.4.0 build.
> 
> 
> On Mon, Aug 4, 2014 at 8:35 AM, Ron's Yahoo! <zl...@yahoo.com.invalid> wrote:
> Thanks, I ensured that $SPARK_HOME/pom.xml had the HDP repository under the repositories element. I also confirmed that if the build couldn’t find the version, it would fail fast so it seems as if it’s able to get the versions it needs to build the distribution.
> I ran the following (generated from make-distribution.sh), but it did not address the problem, while building with an older version (2.4.0.2.1.2.0-402) worked. Any other thing I can try?
> 
> mvn clean package -Phadoop-2.4 -Phive -Pyarn -Dyarn.version=2.4.0.2.1.2.0-563 -Dhadoop.version=2.4.0.2.1.3.0-563 -DskipTests
> 
> 
> Thanks,
> Ron
> 
> 
> On Aug 4, 2014, at 7:13 AM, Steve Nunez <sn...@hortonworks.com> wrote:
> 
>> Provided you¹ve got the HWX repo in your pom.xml, you can build with this
>> line:
>> 
>> mvn -Pyarn -Phive -Phadoop-2.4 -Dhadoop.version=2.4.0.2.1.1.0-385
>> -DskipTests clean package
>> 
>> I haven¹t tried building a distro, but it should be similar.
>> 
>> 
>> 	- SteveN
>> 
>> On 8/4/14, 1:25, "Sean Owen" <so...@cloudera.com> wrote:
>> 
>>> For any Hadoop 2.4 distro, yes, set hadoop.version but also set
>>> -Phadoop-2.4. http://spark.apache.org/docs/latest/building-with-maven.html
>>> 
>>> On Mon, Aug 4, 2014 at 9:15 AM, Patrick Wendell <pw...@gmail.com>
>>> wrote:
>>>> For hortonworks, I believe it should work to just link against the
>>>> corresponding upstream version. I.e. just set the Hadoop version to
>>>> "2.4.0"
>>>> 
>>>> Does that work?
>>>> 
>>>> - Patrick
>>>> 
>>>> 
>>>> On Mon, Aug 4, 2014 at 12:13 AM, Ron's Yahoo!
>>>> <zl...@yahoo.com.invalid>
>>>> wrote:
>>>>> 
>>>>> Hi,
>>>>>  Not sure whose issue this is, but if I run make-distribution using
>>>>> HDP
>>>>> 2.4.0.2.1.3.0-563 as the hadoop version (replacing it in
>>>>> make-distribution.sh), I get a strange error with the exception below.
>>>>> If I
>>>>> use a slightly older version of HDP (2.4.0.2.1.2.0-402) with
>>>>> make-distribution, using the generated assembly all works fine for me.
>>>>> Either 1.0.0 or 1.0.1 will work fine.
>>>>> 
>>>>>  Should I file a JIRA or is this a known issue?
>>>>> 
>>>>> Thanks,
>>>>> Ron
>>>>> 
>>>>> Exception in thread "main" org.apache.spark.SparkException: Job aborted
>>>>> due to stage failure: Task 0.0:0 failed 1 times, most recent failure:
>>>>> Exception failure in TID 0 on host localhost:
>>>>> java.lang.IncompatibleClassChangeError: Found interface
>>>>> org.apache.hadoop.mapreduce.TaskAttemptContext, but class was expected
>>>>> 
>>>>> 
>>>>> org.apache.avro.mapreduce.AvroKeyInputFormat.createRecordReader(AvroKeyI
>>>>> nputFormat.java:47)
>>>>> 
>>>>> 
>>>>> org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:111)
>>>>> 
>>>>> org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:99)
>>>>> 
>>>>> org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:61)
>>>>>        org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
>>>>>        org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
>>>>>        org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
>>>>>        org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
>>>>> 
>>>>> org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:77)
>>>>>        org.apache.spark.rdd.RDD.iterator(RDD.scala:227)
>>>>>        org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
>>>>>        org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
>>>>>        org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
>>>>> 
>>>>> org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:111)
>>>>>        org.apache.spark.scheduler.Task.run(Task.scala:51)
>>>>> 
>>>>> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:187)
>>>>> 
>>>>> 
>>>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>> 
>>>>> 
>>>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>>        java.lang.Thread.run(Thread.java:745)
>>>> 
>>>> 
>>> 
>>> 
>> 
> 
> 


Re: Issues with HDP 2.4.0.2.1.3.0-563

Posted by Patrick Wendell <pw...@gmail.com>.
Can you try building without any of the special `hadoop.version` flags and
just building only with -Phadoop-2.4? In the past users have reported
issues trying to build random spot versions... I think HW is supposed to be
compatible with the normal 2.4.0 build.
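
In other words, something like this (from memory, untested):

mvn -Pyarn -Phive -Phadoop-2.4 -DskipTests clean package

or the same line with an explicit -Dhadoop.version=2.4.0 added, if the
profile alone doesn't pin the version.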


On Mon, Aug 4, 2014 at 8:35 AM, Ron's Yahoo! <zl...@yahoo.com.invalid>
wrote:

> Thanks, I ensured that $SPARK_HOME/pom.xml had the HDP repository under
> the repositories element. I also confirmed that if the build couldn't find
> the version, it would fail fast so it seems as if it's able to get the
> versions it needs to build the distribution.
> I ran the following (generated from make-distribution.sh), but it did not
> address the problem, while building with an older version
> (2.4.0.2.1.2.0-402) worked. Any other thing I can try?
>
> mvn clean package -Phadoop-2.4 -Phive -Pyarn
> -Dyarn.version=2.4.0.2.1.2.0-563 -Dhadoop.version=2.4.0.2.1.3.0-563
> -DskipTests
>
>
> Thanks,
> Ron
>
>
> On Aug 4, 2014, at 7:13 AM, Steve Nunez <sn...@hortonworks.com> wrote:
>
> Provided you've got the HWX repo in your pom.xml, you can build with this
> line:
>
> mvn -Pyarn -Phive -Phadoop-2.4 -Dhadoop.version=2.4.0.2.1.1.0-385
> -DskipTests clean package
>
> I haven't tried building a distro, but it should be similar.
>
>
> - SteveN
>
> On 8/4/14, 1:25, "Sean Owen" <so...@cloudera.com> wrote:
>
> For any Hadoop 2.4 distro, yes, set hadoop.version but also set
> -Phadoop-2.4. http://spark.apache.org/docs/latest/building-with-maven.html
>
> On Mon, Aug 4, 2014 at 9:15 AM, Patrick Wendell <pw...@gmail.com>
> wrote:
>
> For hortonworks, I believe it should work to just link against the
> corresponding upstream version. I.e. just set the Hadoop version to
> "2.4.0"
>
> Does that work?
>
> - Patrick
>
>
> On Mon, Aug 4, 2014 at 12:13 AM, Ron's Yahoo!
> <zl...@yahoo.com.invalid>
> wrote:
>
>
> Hi,
>  Not sure whose issue this is, but if I run make-distribution using
> HDP
> 2.4.0.2.1.3.0-563 as the hadoop version (replacing it in
> make-distribution.sh), I get a strange error with the exception below.
> If I
> use a slightly older version of HDP (2.4.0.2.1.2.0-402) with
> make-distribution, using the generated assembly all works fine for me.
> Either 1.0.0 or 1.0.1 will work fine.
>
>  Should I file a JIRA or is this a known issue?
>
> Thanks,
> Ron
>
> Exception in thread "main" org.apache.spark.SparkException: Job aborted
> due to stage failure: Task 0.0:0 failed 1 times, most recent failure:
> Exception failure in TID 0 on host localhost:
> java.lang.IncompatibleClassChangeError: Found interface
> org.apache.hadoop.mapreduce.TaskAttemptContext, but class was expected
>
>
> org.apache.avro.mapreduce.AvroKeyInputFormat.createRecordReader(AvroKeyInputFormat.java:47)
>
>
> org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:111)
>
> org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:99)
>
> org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:61)
>        org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
>        org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
>        org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
>        org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
>
> org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:77)
>        org.apache.spark.rdd.RDD.iterator(RDD.scala:227)
>        org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
>        org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
>        org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
>
> org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:111)
>        org.apache.spark.scheduler.Task.run(Task.scala:51)
>
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:187)
>
>
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>
>
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>        java.lang.Thread.run(Thread.java:745)
>
>
>
>
>

Re: Issues with HDP 2.4.0.2.1.3.0-563

Posted by Ron's Yahoo! <zl...@yahoo.com.INVALID>.
Thanks, I ensured that $SPARK_HOME/pom.xml had the HDP repository under the repositories element. I also confirmed that if the build couldn’t find the version, it would fail fast so it seems as if it’s able to get the versions it needs to build the distribution.
I ran the following (generated from make-distribution.sh), but it did not address the problem, while building with an older version (2.4.0.2.1.2.0-402) worked. Any other thing I can try?

mvn clean package -Phadoop-2.4 -Phive -Pyarn -Dyarn.version=2.4.0.2.1.2.0-563 -Dhadoop.version=2.4.0.2.1.3.0-563 -DskipTests
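
In case it helps narrow things down, I can also dump what actually gets
resolved (the invocation below is from memory, and the assembly path is a
guess based on my layout):

mvn -Phadoop-2.4 -Phive -Pyarn -Dhadoop.version=2.4.0.2.1.3.0-563 dependency:tree -Dincludes=org.apache.hadoop,org.apache.avro

jar tf assembly/target/scala-2.10/spark-assembly-*.jar | grep org/apache/avro/mapreduce

The first shows which hadoop/avro artifacts the build pulls in; the second
shows whether the avro mapreduce classes end up in the assembly.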


Thanks,
Ron

On Aug 4, 2014, at 7:13 AM, Steve Nunez <sn...@hortonworks.com> wrote:

> Provided you've got the HWX repo in your pom.xml, you can build with this
> line:
> 
> mvn -Pyarn -Phive -Phadoop-2.4 -Dhadoop.version=2.4.0.2.1.1.0-385
> -DskipTests clean package
> 
> I haven't tried building a distro, but it should be similar.
> 
> 
> 	- SteveN
> 
> On 8/4/14, 1:25, "Sean Owen" <so...@cloudera.com> wrote:
> 
>> For any Hadoop 2.4 distro, yes, set hadoop.version but also set
>> -Phadoop-2.4. http://spark.apache.org/docs/latest/building-with-maven.html
>> 
>> On Mon, Aug 4, 2014 at 9:15 AM, Patrick Wendell <pw...@gmail.com>
>> wrote:
>>> For hortonworks, I believe it should work to just link against the
>>> corresponding upstream version. I.e. just set the Hadoop version to
>>> "2.4.0"
>>> 
>>> Does that work?
>>> 
>>> - Patrick
>>> 
>>> 
>>> On Mon, Aug 4, 2014 at 12:13 AM, Ron's Yahoo!
>>> <zl...@yahoo.com.invalid>
>>> wrote:
>>>> 
>>>> Hi,
>>>>  Not sure whose issue this is, but if I run make-distribution using
>>>> HDP
>>>> 2.4.0.2.1.3.0-563 as the hadoop version (replacing it in
>>>> make-distribution.sh), I get a strange error with the exception below.
>>>> If I
>>>> use a slightly older version of HDP (2.4.0.2.1.2.0-402) with
>>>> make-distribution, using the generated assembly all works fine for me.
>>>> Either 1.0.0 or 1.0.1 will work fine.
>>>> 
>>>>  Should I file a JIRA or is this a known issue?
>>>> 
>>>> Thanks,
>>>> Ron
>>>> 
>>>> Exception in thread "main" org.apache.spark.SparkException: Job aborted
>>>> due to stage failure: Task 0.0:0 failed 1 times, most recent failure:
>>>> Exception failure in TID 0 on host localhost:
>>>> java.lang.IncompatibleClassChangeError: Found interface
>>>> org.apache.hadoop.mapreduce.TaskAttemptContext, but class was expected
>>>> 
>>>> 
>>>> org.apache.avro.mapreduce.AvroKeyInputFormat.createRecordReader(AvroKeyInputFormat.java:47)
>>>> 
>>>> 
>>>> org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:111)
>>>> 
>>>> org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:99)
>>>> 
>>>> org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:61)
>>>>        org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
>>>>        org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
>>>>        org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
>>>>        org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
>>>> 
>>>> org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:77)
>>>>        org.apache.spark.rdd.RDD.iterator(RDD.scala:227)
>>>>        org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
>>>>        org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
>>>>        org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
>>>> 
>>>> org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:111)
>>>>        org.apache.spark.scheduler.Task.run(Task.scala:51)
>>>> 
>>>> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:187)
>>>> 
>>>> 
>>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>> 
>>>> 
>>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>        java.lang.Thread.run(Thread.java:745)
>>> 
>>> 
>> 
>> 
> 


Re: Issues with HDP 2.4.0.2.1.3.0-563

Posted by Steve Nunez <sn...@hortonworks.com>.
Provided you've got the HWX repo in your pom.xml, you can build with this
line:

mvn -Pyarn -Phive -Phadoop-2.4 -Dhadoop.version=2.4.0.2.1.1.0-385
-DskipTests clean package

I haven't tried building a distro, but it should be similar.
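
Untested, but assuming make-distribution.sh still takes the same flags,
probably something along the lines of:

./make-distribution.sh --hadoop 2.4.0.2.1.1.0-385 --with-yarn --with-hive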


	- SteveN

On 8/4/14, 1:25, "Sean Owen" <so...@cloudera.com> wrote:

>For any Hadoop 2.4 distro, yes, set hadoop.version but also set
>-Phadoop-2.4. http://spark.apache.org/docs/latest/building-with-maven.html
>
>On Mon, Aug 4, 2014 at 9:15 AM, Patrick Wendell <pw...@gmail.com>
>wrote:
>> For hortonworks, I believe it should work to just link against the
>> corresponding upstream version. I.e. just set the Hadoop version to
>>"2.4.0"
>>
>> Does that work?
>>
>> - Patrick
>>
>>
>> On Mon, Aug 4, 2014 at 12:13 AM, Ron's Yahoo!
>><zl...@yahoo.com.invalid>
>> wrote:
>>>
>>> Hi,
>>>   Not sure whose issue this is, but if I run make-distribution using
>>>HDP
>>> 2.4.0.2.1.3.0-563 as the hadoop version (replacing it in
>>> make-distribution.sh), I get a strange error with the exception below.
>>>If I
>>> use a slightly older version of HDP (2.4.0.2.1.2.0-402) with
>>> make-distribution, using the generated assembly all works fine for me.
>>> Either 1.0.0 or 1.0.1 will work fine.
>>>
>>>   Should I file a JIRA or is this a known issue?
>>>
>>> Thanks,
>>> Ron
>>>
>>> Exception in thread "main" org.apache.spark.SparkException: Job aborted
>>> due to stage failure: Task 0.0:0 failed 1 times, most recent failure:
>>> Exception failure in TID 0 on host localhost:
>>> java.lang.IncompatibleClassChangeError: Found interface
>>> org.apache.hadoop.mapreduce.TaskAttemptContext, but class was expected
>>>
>>> 
>>>org.apache.avro.mapreduce.AvroKeyInputFormat.createRecordReader(AvroKeyInputFormat.java:47)
>>>
>>> 
>>>org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:111)
>>>         org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:99)
>>>         org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:61)
>>>         org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
>>>         org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
>>>         org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
>>>         org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
>>>         org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:77)
>>>         org.apache.spark.rdd.RDD.iterator(RDD.scala:227)
>>>         org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
>>>         org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
>>>         org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
>>>
>>> org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:111)
>>>         org.apache.spark.scheduler.Task.run(Task.scala:51)
>>>
>>> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:187)
>>>
>>> 
>>>java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>
>>> 
>>>java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>         java.lang.Thread.run(Thread.java:745)
>>
>>
>
>



---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org


Re: Issues with HDP 2.4.0.2.1.3.0-563

Posted by Sean Owen <so...@cloudera.com>.
For any Hadoop 2.4 distro, yes, set hadoop.version but also set
-Phadoop-2.4. http://spark.apache.org/docs/latest/building-with-maven.html
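
For example, for an HDP-based build, something like:

mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0.2.1.3.0-563 -DskipTests clean package

Both parts matter: the property pins the exact artifact versions, and the
profile switches on the 2.4-specific dependency settings.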

On Mon, Aug 4, 2014 at 9:15 AM, Patrick Wendell <pw...@gmail.com> wrote:
> For hortonworks, I believe it should work to just link against the
> corresponding upstream version. I.e. just set the Hadoop version to "2.4.0"
>
> Does that work?
>
> - Patrick
>
>
> On Mon, Aug 4, 2014 at 12:13 AM, Ron's Yahoo! <zl...@yahoo.com.invalid>
> wrote:
>>
>> Hi,
>>   Not sure whose issue this is, but if I run make-distribution using HDP
>> 2.4.0.2.1.3.0-563 as the hadoop version (replacing it in
>> make-distribution.sh), I get a strange error with the exception below. If I
>> use a slightly older version of HDP (2.4.0.2.1.2.0-402) with
>> make-distribution, using the generated assembly all works fine for me.
>> Either 1.0.0 or 1.0.1 will work fine.
>>
>>   Should I file a JIRA or is this a known issue?
>>
>> Thanks,
>> Ron
>>
>> Exception in thread "main" org.apache.spark.SparkException: Job aborted
>> due to stage failure: Task 0.0:0 failed 1 times, most recent failure:
>> Exception failure in TID 0 on host localhost:
>> java.lang.IncompatibleClassChangeError: Found interface
>> org.apache.hadoop.mapreduce.TaskAttemptContext, but class was expected
>>
>> org.apache.avro.mapreduce.AvroKeyInputFormat.createRecordReader(AvroKeyInputFormat.java:47)
>>
>> org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:111)
>>         org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:99)
>>         org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:61)
>>         org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
>>         org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
>>         org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
>>         org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
>>         org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:77)
>>         org.apache.spark.rdd.RDD.iterator(RDD.scala:227)
>>         org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
>>         org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
>>         org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
>>
>> org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:111)
>>         org.apache.spark.scheduler.Task.run(Task.scala:51)
>>
>> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:187)
>>
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>         java.lang.Thread.run(Thread.java:745)
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org

