You are viewing a plain text version of this content. The canonical link for it is here.

Posted to user@spark.apache.org by manasdebashiskar <po...@gmail.com> on 2016/09/18 02:49:40 UTC

Re: Recovered state for updateStateByKey and incremental streams processing

If you are using spark 1.6 onwards there is a better solution for you.
It is called mapwithState

mapwithState takes a state function and an initial RDD.

1) When you start your program for the first time/OR version changes and new
code can't use the checkpoint, the initialRDD comes handy.
2) For the rest of the occasion(i.e. program re-start after failure, or
regular stop/start for the same version) the checkpoint works for you.

Also, mapwithstate is easier to reason about then updatestatebykey and is
optimized to handle larger amount of data for the same amount of memory.

I use the same mechanism in production to great success.



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Recovered-state-for-updateStateByKey-and-incremental-streams-processing-tp22229p27747.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscribe@spark.apache.org