
Posted to issues@spark.apache.org by "Sean Owen (JIRA)" <ji...@apache.org> on 2015/06/12 06:49:01 UTC

[jira] [Resolved] (SPARK-8311) saveAsTextFile with Hadoop1 could lead to errors

     [ https://issues.apache.org/jira/browse/SPARK-8311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen resolved SPARK-8311.
------------------------------
    Resolution: Duplicate

Yes, 95% sure that's a duplicate.

> saveAsTextFile with Hadoop1 could lead to errors
> ------------------------------------------------
>
>                 Key: SPARK-8311
>                 URL: https://issues.apache.org/jira/browse/SPARK-8311
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.3.1
>            Reporter: Shivaram Venkataraman
>
> I've run into this bug a couple of times and wanted to document what I have found so far in a JIRA. From what I see, if an application is linked against Hadoop1 and runs on a Spark 1.3.1 + Hadoop1 cluster, then the saveAsTextFile call consistently fails with errors of the form:
> {code}
> 15/06/11 19:47:10 WARN scheduler.TaskSetManager: Lost task 3.0 in stage 3.0 (TID 13, ip-10-212-141-222.us-west-2.compute.internal): java.lang.IncompatibleClassChangeError: Found class org.apache.hadoop.mapreduce.TaskAttemptContext, but interface was expected
>         at org.apache.spark.mapred.SparkHadoopMapRedUtil$.commitTask(SparkHadoopMapRedUtil.scala:95)
>         at org.apache.spark.SparkHadoopWriter.commit(SparkHadoopWriter.scala:106)
>         at org.apache.spark.rdd.PairRDDFunctions$$anonfun$13.apply(PairRDDFunctions.scala:1082)
>         at org.apache.spark.rdd.PairRDDFunctions$$anonfun$13.apply(PairRDDFunctions.scala:1059)
> {code}
> This does not happen in 1.2.1.
> I think the bug is caused by the following commit:
> https://github.com/apache/spark/commit/fde6945417355ae57500b67d034c9cad4f20d240 where the function `commitTask` assumes that the mrTaskContext is always a `mapreduce.TaskContext`, while it is a `mapred.TaskContext` in Hadoop1. But this is just a hypothesis, as I haven't tried reverting the commit to see if the problem goes away.
> cc [~liancheng]
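
Editorial note: the JVM raises "Found class ..., but interface was expected" when the calling code was compiled against a version of a type that was an interface, while the version loaded at runtime is a class. As far as I know, org.apache.hadoop.mapreduce.TaskAttemptContext is a class in Hadoop1 and an interface in Hadoop2, so one way to narrow this down is to check which form is on the classpath of the driver and executors. The following is only a minimal sketch of such a check, not part of Spark or Hadoop; the class name HadoopApiCheck and the method describe are hypothetical names chosen for illustration.

```java
public class HadoopApiCheck {
    /**
     * Reports how the named type is declared on the current classpath:
     * "interface", "class", or "missing" if it cannot be loaded.
     * (Hypothetical helper, not a Spark or Hadoop API.)
     */
    public static String describe(String className) {
        try {
            // initialize=false: load the type without running its static initializers
            Class<?> c = Class.forName(className, false,
                    HadoopApiCheck.class.getClassLoader());
            return c.isInterface() ? "interface" : "class";
        } catch (ClassNotFoundException | LinkageError e) {
            return "missing";
        }
    }

    public static void main(String[] args) {
        // Default to the type named in the stack trace above
        String name = args.length > 0
                ? args[0]
                : "org.apache.hadoop.mapreduce.TaskAttemptContext";
        System.out.println(name + " -> " + describe(name));
    }
}
```

Running this with the same classpath as the failing application should show whether the Hadoop jars that are actually loaded match the ones the application and Spark were built against. It does not confirm or rule out the hypothesis about the commit above.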


