Posted to user@spark.apache.org by Amit Sharma <re...@gmail.com> on 2019/07/21 23:18:51 UTC
spark dataset.cache is not thread safe
Hi, I wrote some code inside a Future block that reads data from a Dataset
and caches it for use later in the code. I ran into an issue where the data
cached by data.cache() gets replaced by a concurrently running thread. Is
there any way to avoid this?
val dailyData = callDetailsDS.collect.toList
val adjustedData = dailyData.map(callDataPerDay => Future {
  val data = callDetailsDS.filter(callDetailsDS(DateColumn) geq (some conditional date))
  data.cache()
  ....................
})
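One thing worth ruling out is a cached reference shared between threads. The sketch below keeps each filtered Dataset in a local val inside its own Future, persists it, uses it, and unpersists it before the Future completes. It is only a minimal illustration under assumptions, not a confirmed fix for the behaviour described above: the CallDetail type, the "callDate" column name, the countFrom helper, and the sample rows are all hypothetical, since the post does not show the schema or the elided logic.

```scala
import org.apache.spark.sql.{Dataset, SparkSession}
import org.apache.spark.sql.functions.{col, lit}
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Hypothetical record type; the original post does not show the schema.
case class CallDetail(callDate: String, durationSec: Long)

object PerFutureCacheSketch {
  // Hypothetical column name standing in for DateColumn in the post.
  val DateColumn = "callDate"

  // Counts the rows on or after fromDate. The filtered Dataset is held in a
  // local val, so no other thread can reassign the reference.
  def countFrom(callDetailsDS: Dataset[CallDetail], fromDate: String): Future[(String, Long)] =
    Future {
      val data = callDetailsDS.filter(col(DateColumn).geq(lit(fromDate)))
      data.persist()
      try {
        // Stand-in for the elided logic that reuses the cached data.
        (fromDate, data.count())
      } finally {
        // Release the cached blocks once this Future no longer needs them.
        data.unpersist()
      }
    }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("per-future-cache-sketch")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample rows, for illustration only.
    val callDetailsDS = Seq(
      CallDetail("2019-07-19", 10L),
      CallDetail("2019-07-20", 20L),
      CallDetail("2019-07-21", 30L)
    ).toDS()

    val fromDates = List("2019-07-19", "2019-07-20", "2019-07-21")
    val results = Await.result(
      Future.sequence(fromDates.map(d => countFrom(callDetailsDS, d))),
      2.minutes
    )
    results.foreach { case (date, n) => println(s"$date -> $n") }

    spark.stop()
  }
}
```

If the real code stores the cached Dataset in a var or a field that several Futures write to, that shared reference, rather than cache() itself, may be what appears to be replaced; that is a guess, as the elided part of the code is not shown.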
Re: spark dataset.cache is not thread safe
Posted by Amit Sharma <re...@gmail.com>.
Please let me know if anyone knows how to handle this.