Posted to user@spark.apache.org by Michel Hubert <mi...@vsnsystemen.nl> on 2015/07/15 14:42:03 UTC

updateStateByKey schedule time

Hi,


I want to implement a time-out mechanism in the updateStateByKey(...) routine.

But is there a way to retrieve the start time of the batch corresponding to the call to my updateStateByKey routine?

Suppose the streaming job has built up some delay; then System.currentTimeMillis() will not be the time at which the batch was scheduled.

I want to retrieve the job/task schedule time of the batch for which my updateStateByKey(..) routine is called.

Is this possible?

With kind regards,
Michel Hubert



Re: updateStateByKey schedule time

Posted by Tathagata Das <td...@databricks.com>.
For future readers of this thread, Spark 1.6 adds trackStateByKey (renamed
to mapWithState in the final 1.6 release), which has native support for
timeouts.
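
To make the timeout idea in this thread concrete, here is a minimal sketch of the per-key logic, assuming the batch time is handed to the update function explicitly. It has no Spark dependency: TimeoutSketch, TimedState, update, batchTimeMs and timeoutMs are all hypothetical names chosen for illustration, not part of any Spark API. In Spark 1.6 the framework can do this expiry itself through the timeout setting on StateSpec, so treat this only as an illustration of the comparison against batch time rather than wall-clock time.

```java
import java.util.Optional;

// Illustrative only: a plain-Java model of "expire a key after it has been idle
// for timeoutMs, measured in batch time". None of these names come from Spark.
final class TimeoutSketch {

    /** Per-key state: a running count plus the batch time (ms) when data last arrived. */
    static final class TimedState {
        final long count;
        final long lastSeenBatchMs;

        TimedState(long count, long lastSeenBatchMs) {
            this.count = count;
            this.lastSeenBatchMs = lastSeenBatchMs;
        }
    }

    /**
     * Computes the next state for one key in one batch.
     *
     * @param batchTimeMs scheduled time of the batch (assumed to be supplied by the caller)
     * @param newValues   number of new records for this key in the batch
     * @param old         previous state, if any
     * @param timeoutMs   idle period after which the state is dropped
     * @return the new state, or empty if the key has timed out or never existed
     */
    static Optional<TimedState> update(long batchTimeMs, int newValues,
                                       Optional<TimedState> old, long timeoutMs) {
        if (newValues > 0) {
            // New data arrived: add to the count and refresh the last-seen batch time.
            long previous = old.map(s -> s.count).orElse(0L);
            return Optional.of(new TimedState(previous + newValues, batchTimeMs));
        }
        // No new data: keep the state only while the idle period is below the timeout.
        return old.filter(s -> batchTimeMs - s.lastSeenBatchMs < timeoutMs);
    }
}
```

Because the comparison uses the batch's scheduled time, a backlog that delays processing does not cause keys to expire early or late, which is the concern raised in the original question.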

Re: updateStateByKey schedule time

Posted by Anand Nalya <an...@gmail.com>.
I also ran into a similar use case. Is this possible?