Posted to user@spark.apache.org by Amit Sharma <re...@gmail.com> on 2019/07/21 23:18:51 UTC

spark dataset.cache is not thread safe

Hi, I wrote some code inside a Future block that reads data from a dataset
and caches it for use later in the code. I ran into an issue where the data
cached by data.cache() is replaced by a concurrently running thread. Is
there any way to avoid this?

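val dailyData = callDetailsDS.collect.toList
// One Future per collected row; each Future filters and caches its own Dataset.
val adjustedData = dailyData.map(callDataPerDay => Future {
  // Placeholder condition: keep rows on or after some conditional date.
  val data = callDetailsDS.filter(callDetailsDS(DateColumn) geq (some conditional date))
  data.cache()

  ....................

})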

Re: spark dataset.cache is not thread safe

Posted by Amit Sharma <re...@gmail.com>.
Please let me know if anyone knows how to handle this.
