Posted to user@spark.apache.org by Arun Ahuja <aa...@gmail.com> on 2014/09/26 15:11:30 UTC

java.io.IOException Error in task deserialization

Has anyone else seen this error in task deserialization?  The task is
processing a small amount of data and doesn't seem to have much data
hanging off the closure.  I've only seen this with Spark 1.1.

Job aborted due to stage failure: Task 975 in stage 8.0 failed 4
times, most recent failure: Lost task 975.3 in stage 8.0 (TID 24777,
host.com): java.io.IOException: unexpected exception type
        java.io.ObjectStreamClass.throwMiscException(ObjectStreamClass.java:1538)
        java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1025)
        java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
        java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
        java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
        java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
        java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
        java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
        java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
        java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
        org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:62)
        org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:87)
        org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:159)
        java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        java.lang.Thread.run(Thread.java:744)

Re: java.io.IOException Error in task deserialization

Posted by Sung Hwan Chung <co...@cs.stanford.edu>.
I can tell you what the environment and rough processes are like:

CDH5 YARN
15 executors (16GB for the driver, 8GB per executor)
Total cached data of about 10GB
Shuffled data size per iteration of ~1GB: a map, followed by a groupBy,
followed by a map, followed by a collect (roughly the shape sketched below)
I'd imagine that every time map/groupBy is called, the environment data
serialized with the map/groupBy closures maxes out at around 250MB.
Periodic checkpointing
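
A minimal Scala sketch of that job shape (map, then groupBy, then map, then
collect, with periodic checkpointing). The input path, record format,
checkpoint directory, and iteration counts below are illustrative
assumptions, not the actual application:

    // Hypothetical sketch of the job shape described above; names, paths,
    // sizes and the iteration count are assumptions.
    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.SparkContext._   // pair-RDD functions (Spark 1.1)
    import org.apache.spark.rdd.RDD

    object JobShapeSketch {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("job-shape-sketch"))
        sc.setCheckpointDir("hdfs:///tmp/checkpoints")        // assumed path

        // Assumed input: tab-separated "key<TAB>value" lines, cached in memory.
        var current: RDD[(String, Double)] = sc
          .textFile("hdfs:///tmp/input")
          .map { line => val p = line.split("\t"); (p(0), p(1).toDouble) }
          .cache()

        for (iteration <- 1 to 20) {
          // map -> groupBy -> map -> collect, as described above.
          val summary = current
            .map { case (k, v) => (k, v * 2.0) }
            .groupByKey()
            .map { case (k, vs) => (k, vs.sum) }
            .collect()
          println(s"iteration $iteration: ${summary.length} keys")

          // Build the next iteration's RDD; mark it for checkpointing
          // before any job runs on it, then materialize it.
          current = current.mapValues(_ + 1.0).cache()
          if (iteration % 5 == 0) current.checkpoint()
          current.count()
        }
        sc.stop()
      }
    }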



On Fri, Oct 10, 2014 at 10:34 AM, Davies Liu <da...@databricks.com> wrote:

> Maybe; TorrentBroadcast is more complicated than HttpBroadcast. Could
> you tell us how to reproduce this issue? That would help us a lot in
> improving TorrentBroadcast.
>
> Thanks!
>
> On Fri, Oct 10, 2014 at 8:46 AM, Sung Hwan Chung
> <co...@cs.stanford.edu> wrote:
> > I haven't seen this at all since switching to HttpBroadcast. It seems
> > TorrentBroadcast might have some issues?
> >
> > On Thu, Oct 9, 2014 at 4:28 PM, Sung Hwan Chung <codedeft@cs.stanford.edu>
> > wrote:
> >>
> >> I don't think that I saw any other error message. This is all I saw.
> >>
> >> I'm currently experimenting to see if this can be alleviated by using
> >> HttpBroadcastFactory instead of TorrentBroadcast. So far, with
> >> HttpBroadcast, I haven't seen this recurring as of yet. I'll keep you
> >> posted.
> >>
> >> On Thu, Oct 9, 2014 at 4:21 PM, Davies Liu <da...@databricks.com>
> >> wrote:
> >>>
> >>> This exception should be caused by another one, could you paste all of
> >>> them here?
> >>>
> >>> Also, it would be great if you could provide a script to reproduce this
> >>> problem.
> >>>
> >>> Thanks!

Re: java.io.IOException Error in task deserialization

Posted by Davies Liu <da...@databricks.com>.
Maybe; TorrentBroadcast is more complicated than HttpBroadcast. Could
you tell us how to reproduce this issue? That would help us a lot in
improving TorrentBroadcast.

Thanks!

On Fri, Oct 10, 2014 at 8:46 AM, Sung Hwan Chung
<co...@cs.stanford.edu> wrote:
> I haven't seen this at all since switching to HttpBroadcast. It seems
> TorrentBroadcast might have some issues?
>
> On Thu, Oct 9, 2014 at 4:28 PM, Sung Hwan Chung <co...@cs.stanford.edu>
> wrote:
>>
>> I don't think that I saw any other error message. This is all I saw.
>>
>> I'm currently experimenting to see if this can be alleviated by using
>> HttpBroadcastFactory instead of TorrentBroadcast. So far, with
>> HttpBroadcast, I haven't seen this recurring as of yet. I'll keep you
>> posted.
>>
>> On Thu, Oct 9, 2014 at 4:21 PM, Davies Liu <da...@databricks.com> wrote:
>>>
>>> This exception should be caused by another one, could you paste all of
>>> them here?
>>>
>>> Also, it would be great if you could provide a script to reproduce this
>>> problem.
>>>
>>> Thanks!



Re: java.io.IOException Error in task deserialization

Posted by Sung Hwan Chung <co...@cs.stanford.edu>.
I haven't seen this at all since switching to HttpBroadcast. It seems
TorrentBroadcast might have some issues?
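
For reference, in Spark 1.x the broadcast implementation is selected with
the spark.broadcast.factory setting. A minimal sketch of switching to
HttpBroadcastFactory (the application name and job body below are
placeholders, not code from this thread):

    import org.apache.spark.{SparkConf, SparkContext}

    object HttpBroadcastSketch {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf()
          .setAppName("http-broadcast-sketch")
          // Default is org.apache.spark.broadcast.TorrentBroadcastFactory.
          .set("spark.broadcast.factory",
               "org.apache.spark.broadcast.HttpBroadcastFactory")
        val sc = new SparkContext(conf)
        // ... rest of the job unchanged ...
        sc.stop()
      }
    }

The same key can also go into conf/spark-defaults.conf instead of being set
in code.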

On Thu, Oct 9, 2014 at 4:28 PM, Sung Hwan Chung <co...@cs.stanford.edu>
wrote:

> I don't think that I saw any other error message. This is all I saw.
>
> I'm currently experimenting to see if this can be alleviated by using
> HttpBroadcastFactory instead of TorrentBroadcast. So far, with
> HttpBroadcast, I haven't seen this recurring as of yet. I'll keep you
> posted.
>
> On Thu, Oct 9, 2014 at 4:21 PM, Davies Liu <da...@databricks.com> wrote:
>
>> This exception should be caused by another one, could you paste all of
>> them here?
>>
>> Also, it would be great if you could provide a script to reproduce this
>> problem.
>>
>> Thanks!

Re: java.io.IOException Error in task deserialization

Posted by Sung Hwan Chung <co...@cs.stanford.edu>.
I don't think that I saw any other error message. This is all I saw.

I'm currently experimenting to see if this can be alleviated by using
HttpBroadcastFactory instead of TorrentBroadcast. So far, with
HttpBroadcast, I haven't seen this recurring as of yet. I'll keep you
posted.

On Thu, Oct 9, 2014 at 4:21 PM, Davies Liu <da...@databricks.com> wrote:

> This exception should be caused by another one, could you paste all of
> them here?
>
> Also, it would be great if you could provide a script to reproduce this
> problem.
>
> Thanks!

Re: java.io.IOException Error in task deserialization

Posted by Davies Liu <da...@databricks.com>.
This exception should be caused by another one, could you paste all of
them here?

Also, it would be great if you could provide a script to reproduce this problem.

Thanks!




Re: java.io.IOException Error in task deserialization

Posted by Davies Liu <da...@databricks.com>.
Could you provide a script to reproduce this problem?

Thanks!

On Wed, Oct 8, 2014 at 9:13 PM, Sung Hwan Chung
<co...@cs.stanford.edu> wrote:
> This is also happening to me on a regular basis, when the job is large with
> relatively large serialized objects used in each RDD lineage. A bad thing
> about this is that this exception always stops the whole job.
>
>
> On Fri, Sep 26, 2014 at 11:17 AM, Brad Miller <bm...@eecs.berkeley.edu>
> wrote:
>>
>> FWIW I suspect that each count operation is an opportunity for you to
>> trigger the bug, and each filter operation increases the likelihood of
>> setting up the bug.  I normally don't come across this error until my job
>> has been running for an hour or two and had a chance to build up longer
>> lineages for some RDDs.  It sounds like your data is a bit smaller and it's
>> more feasible for you to build up longer lineages more quickly.
>>
>> If you can reduce your number of filter operations (for example by
>> combining some into a single function) that may help.  It may also help to
>> introduce persistence or checkpointing at intermediate stages so that the
>> length of the lineages that have to get replayed isn't as long.
>>
>> On Fri, Sep 26, 2014 at 11:10 AM, Arun Ahuja <aa...@gmail.com> wrote:
>>>
>>> No for me as well it is non-deterministic.  It happens in a piece of code
>>> that does many filter and counts on a small set of records (~1k-10k).  The
>>> original set is persisted in memory and we have a Kryo serializer set for
>>> it.  The task itself takes in just a few filtering parameters.  This with
>>> the same setting has sometimes completed successfully and sometimes failed
>>> during this step.
>>>
>>> Arun
>>>
>>> On Fri, Sep 26, 2014 at 1:32 PM, Brad Miller <bm...@eecs.berkeley.edu>
>>> wrote:
>>>>
>>>> I've had multiple jobs crash due to "java.io.IOException: unexpected
>>>> exception type"; I've been running the 1.1 branch for some time and am now
>>>> running the 1.1 release binaries. Note that I only use PySpark. I haven't
>>>> kept detailed notes or the tracebacks around since there are other problems
>>>> that have caused me greater grief (namely "key not found" errors).
>>>>
>>>> For me the exception seems to occur non-deterministically, which is a
>>>> bit interesting since the error message shows that the same stage has failed
>>>> multiple times.  Are you able to consistently reproduce the bug across
>>>> multiple invocations at the same place?



Re: java.io.IOException Error in task deserialization

Posted by Sung Hwan Chung <co...@cs.stanford.edu>.
This is also happening to me on a regular basis when the job is large, with
relatively large serialized objects used in each RDD lineage. A bad thing
about this is that the exception always stops the whole job.


On Fri, Sep 26, 2014 at 11:17 AM, Brad Miller <bm...@eecs.berkeley.edu>
wrote:

> FWIW I suspect that each count operation is an opportunity for you to
> trigger the bug, and each filter operation increases the likelihood of
> setting up the bug.  I normally don't come across this error until my job
> has been running for an hour or two and had a chance to build up longer
> lineages for some RDDs.  It sounds like your data is a bit smaller and it's
> more feasible for you to build up longer lineages more quickly.
>
> If you can reduce your number of filter operations (for example by
> combining some into a single function) that may help.  It may also help to
> introduce persistence or checkpointing at intermediate stages so that the
> length of the lineages that have to get replayed isn't as long.
>
> On Fri, Sep 26, 2014 at 11:10 AM, Arun Ahuja <aa...@gmail.com> wrote:
>
>> No for me as well it is non-deterministic.  It happens in a piece of code
>> that does many filter and counts on a small set of records (~1k-10k).  The
>> original set is persisted in memory and we have a Kryo serializer set for
>> it.  The task itself takes in just a few filtering parameters.  This with
>> the same setting has sometimes completed successfully and sometimes failed
>> during this step.
>>
>> Arun
>>
>> On Fri, Sep 26, 2014 at 1:32 PM, Brad Miller <bm...@eecs.berkeley.edu>
>> wrote:
>>
>>> I've had multiple jobs crash due to "java.io.IOException: unexpected
>>> exception type"; I've been running the 1.1 branch for some time and am now
>>> running the 1.1 release binaries. Note that I only use PySpark. I haven't
>>> kept detailed notes or the tracebacks around since there are other problems
>>> that have caused me greater grief (namely "key not found" errors).
>>>
>>> For me the exception seems to occur non-deterministically, which is a
>>> bit interesting since the error message shows that the same stage has
>>> failed multiple times.  Are you able to consistently reproduce the bug
>>> across multiple invocations at the same place?

Re: java.io.IOException Error in task deserialization

Posted by Brad Miller <bm...@eecs.berkeley.edu>.
FWIW, I suspect that each count operation is an opportunity for you to
trigger the bug, and each filter operation increases the likelihood of
setting up the bug.  I normally don't come across this error until my job
has been running for an hour or two and has had a chance to build up longer
lineages for some RDDs.  It sounds like your data is a bit smaller, so it's
more feasible for you to build up longer lineages more quickly.

If you can reduce your number of filter operations (for example by
combining some into a single function), that may help.  It may also help to
introduce persistence or checkpointing at intermediate stages so that the
lineages that have to be replayed aren't as long.
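
As a rough illustration of both suggestions (the input path, the predicates,
and the checkpoint directory below are assumptions, not the actual job),
combining several filters into a single pass and truncating the lineage
could look like this:

    // Hypothetical sketch: combine several filters into one predicate and
    // persist/checkpoint an intermediate RDD so its lineage stays short.
    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.storage.StorageLevel

    object LineageSketch {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("lineage-sketch"))
        sc.setCheckpointDir("hdfs:///tmp/checkpoints")   // assumed path

        val records = sc.textFile("hdfs:///tmp/records") // assumed input

        // Instead of chaining records.filter(p1).filter(p2).filter(p3),
        // evaluate all the predicates in a single pass:
        val filtered = records.filter { line =>
          line.nonEmpty && !line.startsWith("#") && line.contains("\t")
        }

        // Persist and checkpoint the intermediate result so downstream
        // stages replay a short lineage rather than the whole chain.
        filtered.persist(StorageLevel.MEMORY_AND_DISK)
        filtered.checkpoint()
        println(s"kept ${filtered.count()} records")     // materializes both

        sc.stop()
      }
    }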

On Fri, Sep 26, 2014 at 11:10 AM, Arun Ahuja <aa...@gmail.com> wrote:

> No for me as well it is non-deterministic.  It happens in a piece of code
> that does many filter and counts on a small set of records (~1k-10k).  The
> original set is persisted in memory and we have a Kryo serializer set for
> it.  The task itself takes in just a few filtering parameters.  This with
> the same setting has sometimes completed successfully and sometimes failed
> during this step.
>
> Arun
>
> On Fri, Sep 26, 2014 at 1:32 PM, Brad Miller <bm...@eecs.berkeley.edu>
> wrote:
>
>> I've had multiple jobs crash due to "java.io.IOException: unexpected
>> exception type"; I've been running the 1.1 branch for some time and am now
>> running the 1.1 release binaries. Note that I only use PySpark. I haven't
>> kept detailed notes or the tracebacks around since there are other problems
>> that have caused me greater grief (namely "key not found" errors).
>>
>> For me the exception seems to occur non-deterministically, which is a bit
>> interesting since the error message shows that the same stage has failed
>> multiple times.  Are you able to consistently reproduce the bug across
>> multiple invocations at the same place?

Re: java.io.IOException Error in task deserialization

Posted by Arun Ahuja <aa...@gmail.com>.
No, for me as well it is non-deterministic.  It happens in a piece of code
that does many filters and counts on a small set of records (~1k-10k).  The
original set is persisted in memory and we have a Kryo serializer set for
it.  The task itself takes in just a few filtering parameters.  With the
same settings it has sometimes completed successfully and sometimes failed
during this step.

Arun
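
For illustration only (the record type, values, and parameters below are
invented, not the actual code), the pattern described above, a small
Kryo-serialized cached RDD with filter closures that capture just a few
parameters, might look like:

    // Hypothetical sketch: a small cached RDD, Kryo serialization, and
    // repeated parameterized filter + count jobs.
    import org.apache.spark.{SparkConf, SparkContext}

    case class Record(sample: String, score: Double)   // assumed record type

    object FilterCountSketch {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf()
          .setAppName("filter-count-sketch")
          .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
        val sc = new SparkContext(conf)

        // A small set (~1k-10k records) persisted in memory.
        val records = sc
          .parallelize(1 to 10000)
          .map(i => Record(s"sample$i", i / 10000.0))
          .cache()

        // Each task closure captures only the small filter parameters.
        def countMatching(minScore: Double, prefix: String): Long =
          records
            .filter(r => r.score >= minScore && r.sample.startsWith(prefix))
            .count()

        println(countMatching(0.5, "sample1"))
        println(countMatching(0.9, "sample9"))
        sc.stop()
      }
    }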

On Fri, Sep 26, 2014 at 1:32 PM, Brad Miller <bm...@eecs.berkeley.edu>
wrote:

> I've had multiple jobs crash due to "java.io.IOException: unexpected
> exception type"; I've been running the 1.1 branch for some time and am now
> running the 1.1 release binaries. Note that I only use PySpark. I haven't
> kept detailed notes or the tracebacks around since there are other problems
> that have caused me greater grief (namely "key not found" errors).
>
> For me the exception seems to occur non-deterministically, which is a bit
> interesting since the error message shows that the same stage has failed
> multiple times.  Are you able to consistently reproduce the bug across
> multiple invocations at the same place?

Re: java.io.IOException Error in task deserialization

Posted by Brad Miller <bm...@eecs.berkeley.edu>.
I've had multiple jobs crash due to "java.io.IOException: unexpected
exception type"; I've been running the 1.1 branch for some time and am now
running the 1.1 release binaries. Note that I only use PySpark. I haven't
kept detailed notes or the tracebacks around, since there are other problems
that have caused me greater grief (namely "key not found" errors).

For me the exception seems to occur non-deterministically, which is a bit
interesting since the error message shows that the same stage has failed
multiple times.  Are you able to consistently reproduce the bug across
multiple invocations at the same place?
