Posted to user@spark.apache.org by diplomatic Guru <di...@gmail.com> on 2015/10/27 16:05:46 UTC

[Spark Streaming] Why are some uncached RDDs growing?

Hello All,

When I checked my running streaming job in the web UI, I could see some RDDs
listed that were not requested to be cached. What's more, they are
growing! I've not asked them to be cached. What are they? Are they the
state (updateStateByKey)?

Only the rows in white were requested to be cached. Where do the
RDDs highlighted in yellow come from?

Re: [Spark Streaming] Why are some uncached RDDs growing?

Posted by Tathagata Das <td...@databricks.com>.
UpdateStateByKey automatically caches its RDDs.
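
Because the state RDDs are persisted on your behalf, their size follows the number of keys held in the state. If the growth comes from keys that are never removed (an assumption, the thread does not confirm it), one option is to have the update function return None for idle keys, which drops them from the state. The following is only a minimal sketch in PySpark style: `update_count`, `MAX_IDLE_BATCHES`, and the (count, idle_batches) state layout are hypothetical names chosen for illustration, not anything from the original job.

```python
from typing import List, Optional, Tuple

# Hypothetical threshold: drop a key after this many consecutive
# batches in which it received no new values.
MAX_IDLE_BATCHES = 3

# Illustrative state layout: (running_count, idle_batches).
State = Tuple[int, int]


def update_count(new_values: List[int], state: Optional[State]) -> Optional[State]:
    """Update function in the shape updateStateByKey expects.

    It is called once per key per batch with the new values for that key
    (possibly an empty list) and the previous state (None for a new key).
    Returning None removes the key from the state.
    """
    count, idle = state if state is not None else (0, 0)

    if new_values:
        # Fresh data: add it to the running count and reset the idle counter.
        return (count + sum(new_values), 0)

    # No data for this key in the current batch.
    idle += 1
    if idle >= MAX_IDLE_BATCHES:
        return None  # evict the key so the state stops growing
    return (count, idle)


# Wiring it into a job (needs pyspark; `ssc` and `pairs` are assumed to be
# an existing StreamingContext and a DStream of (key, int) pairs):
#   ssc.checkpoint("/tmp/checkpoint")   # hypothetical path; stateful ops need checkpointing
#   state = pairs.updateStateByKey(update_count)
```

With this kind of update function, a key that stays silent for MAX_IDLE_BATCHES batches disappears from the state, so the cached state RDDs level off instead of growing with every key ever seen.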
