Posted to user@spark.apache.org by gaganbm <ga...@gmail.com> on 2014/03/26 06:43:22 UTC

Re: rdd.saveAsTextFile problem

Hi Folks,

Has this issue been resolved? If so, could you please shed some light on how
to fix it?

I am facing the same problem when writing to text files.

When I do 

stream.foreachRDD { rdd =>
  rdd.saveAsTextFile(<"Some path">)
}

This works fine for me, but it creates one text file per partition within
each RDD.

So I tried coalesce to merge the results into a single file per RDD:

stream.foreachRDD { rdd =>
  rdd.coalesce(1, true).saveAsTextFile(<"Some path">)
}

This fails with:
org.apache.spark.SparkException: Job aborted: Task 75.0:0 failed 1 times
(most recent failure: Exception failure: java.lang.IllegalStateException:
unread block data)

I am using Spark Streaming 0.9.0

Any clue what's going wrong when using coalesce?
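In case it is useful context: as I understand it, coalesce(1, true) is equivalent to repartition(1) (both force a shuffle so all records land in a single partition), so a minimal sketch of the pattern I expect to work is below. The output path is just a placeholder, not my real code:

```scala
stream.foreachRDD { rdd =>
  // repartition(1) == coalesce(1, shuffle = true): shuffle all records
  // into one partition so saveAsTextFile writes a single part-00000 file.
  rdd.repartition(1).saveAsTextFile("/some/output/path")
}
```

(Each batch still needs its own output directory, since saveAsTextFile fails if the target directory already exists.)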





--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/rdd-saveAsTextFile-problem-tp176p3238.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

Re: rdd.saveAsTextFile problem

Posted by Tathagata Das <ta...@gmail.com>.
Can you give us the more detailed exception and stack trace from the logs? It
should be in the driver log. If not, please look through the executor logs,
via the web UI, to find the stack trace.

TD


On Tue, Mar 25, 2014 at 10:43 PM, gaganbm <ga...@gmail.com> wrote:

> Hi Folks,
>
> Has this issue been resolved? If so, could you please shed some light on
> how to fix it?
>
> I am facing the same problem when writing to text files.
>
> When I do
>
> stream.foreachRDD { rdd =>
>   rdd.saveAsTextFile(<"Some path">)
> }
>
> This works fine for me, but it creates one text file per partition within
> each RDD.
>
> So I tried coalesce to merge the results into a single file per RDD:
>
> stream.foreachRDD { rdd =>
>   rdd.coalesce(1, true).saveAsTextFile(<"Some path">)
> }
>
> This fails with:
> org.apache.spark.SparkException: Job aborted: Task 75.0:0 failed 1 times
> (most recent failure: Exception failure: java.lang.IllegalStateException:
> unread block data)
>
> I am using Spark Streaming 0.9.0
>
> Any clue what's going wrong when using coalesce?
>
>
>
>
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/rdd-saveAsTextFile-problem-tp176p3238.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>