Posted to user@spark.apache.org by Chen Song <ch...@gmail.com> on 2015/07/07 21:43:57 UTC

(de)serialize DStream

In Spark Streaming, updateStateByKey requires the generated DStream to be
checkpointed.

Checkpointing always seems to use JavaSerializer, no matter what I set for
spark.serializer. Can I use KryoSerializer for checkpointing? If not, I
assume the key and value types have to be Serializable?

Chen

Re: (de)serialize DStream

Posted by Shixiong Zhu <zs...@gmail.com>.
The DStream itself must be Serializable, because it is written out as part
of metadata checkpointing. You can still use KryoSerializer for data
checkpointing, though. Data checkpointing goes through RDD.checkpoint, and
the serializer it uses is the one set by spark.serializer.
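
To make the distinction concrete, here is a minimal sketch of a stateful
job with Kryo configured. The app name, checkpoint directory, host, and
port are placeholders I made up, and the word-count logic is only an
illustration, not anything from the original question. Going by the
explanation above, the spark.serializer setting should apply to the state
RDDs written by data checkpointing, while the DStream graph (including
the update function) is still written with Java serialization.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object StatefulWordCount {
  // Placeholder path; use a fault-tolerant store such as HDFS in practice.
  val checkpointDir = "/tmp/spark-streaming-checkpoint"

  def createContext(): StreamingContext = {
    val conf = new SparkConf()
      .setAppName("StatefulWordCount") // placeholder name
      // Serializer used for RDD data, including data checkpointing.
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")

    val ssc = new StreamingContext(conf, Seconds(10))
    // updateStateByKey needs a checkpoint directory to be set.
    ssc.checkpoint(checkpointDir)

    // Placeholder source: text lines from a local socket.
    val lines = ssc.socketTextStream("localhost", 9999)
    val pairs = lines.flatMap(_.split(" ")).map(word => (word, 1L))

    // Add the new values for a key to its running total.
    val updateCount: (Seq[Long], Option[Long]) => Option[Long] =
      (values, state) => Some(state.getOrElse(0L) + values.sum)

    pairs.updateStateByKey[Long](updateCount).print()
    ssc
  }

  def main(args: Array[String]): Unit = {
    // Recover from the checkpoint if one exists, otherwise build a new context.
    val ssc = StreamingContext.getOrCreate(checkpointDir, () => createContext())
    ssc.start()
    ssc.awaitTermination()
  }
}
```

Since the DStream graph is serialized with Java serialization, anything
the update function closes over would still need to be Serializable, even
with Kryo enabled for the data.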

Best Regards,
Shixiong Zhu
