Posted to user@spark.apache.org by Chen Song <ch...@gmail.com> on 2015/07/15 21:46:23 UTC

NotSerializableException in spark 1.4.0

The streaming job has been running fine on 1.2 and 1.3. After I upgraded to
1.4, I started seeing the error below. It appears to fail in the validate
method of StreamingContext. Did anything change in 1.4.0 w.r.t.
DStream checkpointing?

Detailed error from driver:

15/07/15 18:00:39 ERROR yarn.ApplicationMaster: User class threw
exception: java.io.NotSerializableException: DStream checkpointing has
been enabled but the DStreams with their functions are not serializable
Serialization stack:

java.io.NotSerializableException: DStream checkpointing has been enabled
but the DStreams with their functions are not serializable
Serialization stack:

at org.apache.spark.streaming.StreamingContext.validate(StreamingContext.scala:550)
at org.apache.spark.streaming.StreamingContext.liftedTree1$1(StreamingContext.scala:587)
at org.apache.spark.streaming.StreamingContext.start(StreamingContext.scala:586)

-- 
Chen Song
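For context on what this error means mechanically: when checkpointing is enabled, Spark Java-serializes the DStream graph together with the function closures it holds, so any closure that captures a non-serializable object fails the check. The sketch below reproduces that behavior with plain JDK serialization; the class and names are hypothetical, not taken from the job in this thread.

```java
import java.io.ByteArrayOutputStream;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.Function;

public class ClosureCapture {
    // Stands in for something like a DB client held by the driver program.
    static class NonSerializableClient {
        String lookup(String k) { return "v:" + k; }
    }

    // A Function that is also Serializable, as Spark requires of closures.
    interface SerFn<A, B> extends Function<A, B>, Serializable {}

    static byte[] serialize(Object o) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(o);
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        NonSerializableClient client = new NonSerializableClient();

        // Captures `client`, so the closure itself becomes non-serializable.
        SerFn<String, String> bad = k -> client.lookup(k);
        try {
            serialize(bad);
            System.out.println("bad: serialized");
        } catch (NotSerializableException e) {
            System.out.println("bad: NotSerializableException");
        }

        // Creates the client inside the closure: nothing bad is captured.
        SerFn<String, String> good = k -> new NonSerializableClient().lookup(k);
        System.out.println("good: " + serialize(good).length + " bytes");
    }
}
```

The usual fixes follow from this: create the resource inside the closure (or inside `foreachRDD`), or mark the offending field `transient` and rebuild it lazily.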

Re: NotSerializableException in spark 1.4.0

Posted by Chen Song <ch...@gmail.com>.
Ah, cool. Thanks.

On Wed, Jul 15, 2015 at 5:58 PM, Tathagata Das <td...@databricks.com> wrote:



-- 
Chen Song

Re: NotSerializableException in spark 1.4.0

Posted by Tathagata Das <td...@databricks.com>.
Spark 1.4.1 just got released! So just download that. Yay for timing.

On Wed, Jul 15, 2015 at 2:47 PM, Ted Yu <yu...@gmail.com> wrote:


Re: NotSerializableException in spark 1.4.0

Posted by Ted Yu <yu...@gmail.com>.
Should be this one:
    [SPARK-7180] [SPARK-8090] [SPARK-8091] Fix a number of
SerializationDebugger bugs and limitations
...
    Closes #6625 from tdas/SPARK-7180 and squashes the following commits:
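The SPARK-7180 patch referenced here fixes SerializationDebugger, the component that is supposed to fill in that empty "Serialization stack:" by walking the object graph to the first non-serializable reference. A rough sketch of the idea follows; it is not Spark's implementation, and it skips arrays and superclass fields for brevity. The `Job`/`Handle` classes are hypothetical stand-ins.

```java
import java.io.Serializable;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

public class MiniSerializationDebugger {

    // Hypothetical job object: Serializable itself, but holding a
    // non-serializable resource handle -- the usual culprit.
    static class Handle { }
    static class Job implements Serializable {
        String name = "count-events";
        Handle handle = new Handle();
    }

    /** Field path from the root to the first non-serializable object,
     *  innermost entry first; empty if the graph serializes cleanly. */
    static List<String> findPath(Object root) {
        List<String> path = new ArrayList<>();
        visit(root, "(root)", path,
              Collections.newSetFromMap(new IdentityHashMap<>()));
        return path;
    }

    static boolean visit(Object o, String label, List<String> path, Set<Object> seen) {
        if (o == null || seen.contains(o)) return false;   // null fields serialize fine
        seen.add(o);
        if (!(o instanceof Serializable)) {
            path.add(label + " -> " + o.getClass().getName() + " is not Serializable");
            return true;
        }
        if (o.getClass().getName().startsWith("java.")) return false; // trust JDK types
        for (Field f : o.getClass().getDeclaredFields()) {
            int m = f.getModifiers();
            if (Modifier.isStatic(m) || Modifier.isTransient(m) || f.getType().isPrimitive())
                continue;                                  // these never block serialization
            f.setAccessible(true);
            Object child;
            try { child = f.get(o); } catch (IllegalAccessException e) { continue; }
            if (visit(child, "field \"" + f.getName() + "\" of "
                      + o.getClass().getSimpleName(), path, seen)) {
                path.add(label);                           // unwind: record the chain
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Printed innermost-first, like the "Serialization stack" Spark reports.
        findPath(new Job()).forEach(System.out::println);
    }
}
```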

On Wed, Jul 15, 2015 at 2:37 PM, Chen Song <ch...@gmail.com> wrote:


Re: NotSerializableException in spark 1.4.0

Posted by Chen Song <ch...@gmail.com>.
Thanks

Can you point me to the patch to fix the serialization stack? Maybe I can
pull it in and rerun my job.

Chen

On Wed, Jul 15, 2015 at 4:40 PM, Tathagata Das <td...@databricks.com> wrote:



-- 
Chen Song

Re: NotSerializableException in spark 1.4.0

Posted by Tathagata Das <td...@databricks.com>.
Your streaming job may have seemed to be running fine, but DStream
checkpointing must have been failing in the background; that would have been
visible in the log4j logs. In 1.4.0 we made this fail fast, so that
checkpointing failures don't stay hidden in the background.

The serialization stack not being shown correctly is a known bug in Spark
1.4.0, but it is fixed in 1.4.1, which is due out in the next couple of
days. That should help you narrow down the culprit preventing serialization.
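The fail-fast check described here boils down to: when checkpointing is enabled, attempt to Java-serialize every registered function once at startup, and refuse to start if any attempt fails, instead of letting each later checkpoint fail quietly. A minimal sketch of that pattern with illustrative names (this is not Spark's actual validate() code):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.List;
import java.util.function.UnaryOperator;

public class FailFastValidation {

    // Serializable function type, as Spark requires of DStream closures.
    interface SerFn extends UnaryOperator<String>, Serializable {}

    /** Try to serialize every registered function once, up front, and abort
     *  with one clear error instead of letting each later checkpoint fail. */
    static void validate(List<SerFn> functions) {
        for (SerFn fn : functions) {
            try (ObjectOutputStream oos =
                     new ObjectOutputStream(new ByteArrayOutputStream())) {
                oos.writeObject(fn);
            } catch (IOException e) {
                throw new IllegalStateException(
                    "checkpointing enabled but a function is not serializable", e);
            }
        }
    }

    public static void main(String[] args) {
        Object handle = new Object();             // stands in for a driver-side resource
        SerFn bad = s -> s + handle.hashCode();   // captures `handle`: not serializable
        SerFn good = s -> s.toUpperCase();

        validate(List.of(good));                  // passes: nothing bad captured
        try {
            validate(List.of(good, bad));         // fails before any batch runs
        } catch (IllegalStateException e) {
            System.out.println("validate failed fast: " + e.getMessage());
        }
    }
}
```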

On Wed, Jul 15, 2015 at 1:12 PM, Ted Yu <yu...@gmail.com> wrote:


Re: NotSerializableException in spark 1.4.0

Posted by Ted Yu <yu...@gmail.com>.
Can you show us your function(s)?

Thanks

On Wed, Jul 15, 2015 at 12:46 PM, Chen Song <ch...@gmail.com> wrote:
