Posted to user@spark.apache.org by "Yuval.Itzchakov" <yu...@gmail.com> on 2017/07/03 05:22:06 UTC

Re: How to reduce the amount of data that is getting written to the checkpoint from Spark Streaming

You can't. Spark doesn't let you control which data gets checkpointed, as
it's an internal implementation detail.



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/How-to-reduce-the-amount-of-data-that-is-getting-written-to-the-checkpoint-from-Spark-Streaming-tp28798p28815.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscribe@spark.apache.org


Re: How to reduce the amount of data that is getting written to the checkpoint from Spark Streaming

Posted by "Yuval.Itzchakov" <yu...@gmail.com>.
Using a long period between checkpoints may cause a long lineage of the
graph's computations to build up, since Spark uses checkpointing to truncate
it. A long lineage can in turn cause delays in the streaming job.
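
To make the trade-off concrete, here is a rough sketch in plain Python. It is
a toy model, not Spark code: the function name and the numbers are made up for
illustration, and it assumes the lineage grows by one step per batch until the
next checkpoint cuts it. In the DStream API the interval itself is set with
DStream.checkpoint(interval), and it should be a multiple of the batch interval.

```python
def checkpoint_tradeoff(batch_interval_s, checkpoint_interval_s, run_time_s):
    """Toy model (hypothetical helper, not part of Spark).

    Returns a pair:
      - how many batches accumulate in the lineage between two checkpoints
      - how many checkpoints get written over the whole run
    """
    if checkpoint_interval_s % batch_interval_s != 0:
        raise ValueError("checkpoint interval must be a multiple of the batch interval")
    # Assumption: the lineage grows by one step per batch until it is cut.
    batches_per_checkpoint = checkpoint_interval_s // batch_interval_s
    # Fewer checkpoints means less data written, but a longer lineage.
    checkpoints_written = run_time_s // checkpoint_interval_s
    return batches_per_checkpoint, checkpoints_written


if __name__ == "__main__":
    # 10-second batches, one hour of runtime, three candidate intervals.
    for interval in (10, 50, 300):
        depth, writes = checkpoint_tradeoff(10, interval, 3600)
        print(f"interval={interval}s lineage_depth={depth} checkpoints={writes}")
```

Raising the interval lowers the number of checkpoint writes but lets the
lineage grow proportionally, which is the delay risk described above.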



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/How-to-reduce-the-amount-of-data-that-is-getting-written-to-the-checkpoint-from-Spark-Streaming-tp28798p28820.html