Posted to dev@apex.apache.org by Siyuan Hua <si...@datatorrent.com> on 2015/12/04 00:53:54 UTC

Is committed window id same sequence as checkpointed window id?

As far as I know, stram only knows that window k is fully complete once all
operators have checkpointed window k. So can I assume that I am guaranteed
to see the same sequence of window ids in the checkpointed and committed
callbacks?

Regards,
Siyuan

Re: Is committed window id same sequence as checkpointed window id?

Posted by Thomas Weise <th...@datatorrent.com>.
There is no guarantee that every checkpoint will result in a committed
callback. The app master periodically calculates the committed window based
on the recently received checkpointed windows and communicates it back to
the containers. In a scenario where two subsequent checkpoints were
received before a recovery checkpoint update, only the second checkpoint
will be reported as committed.
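
One practical consequence for operator code is that cleanup should not wait
for a committed callback that exactly matches each checkpointed id. Below is
a minimal sketch of that idea. The CheckpointCallbacks interface is a
hypothetical stand-in for the two callbacks discussed in this thread, not
the real Apex interface, and PendingStateTracker and its fields are
illustrative names only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

public class CheckpointTrackingSketch {

    // Hypothetical stand-in for the checkpointed/committed callbacks;
    // this is not the actual Apex API.
    interface CheckpointCallbacks {
        void checkpointed(long windowId);
        void committed(long windowId);
    }

    static class PendingStateTracker implements CheckpointCallbacks {
        // State held per checkpointed window until a commit covers it.
        private final TreeMap<Long, String> pending = new TreeMap<>();
        // Window ids whose state has been released, in release order.
        final List<Long> released = new ArrayList<>();

        @Override
        public void checkpointed(long windowId) {
            pending.put(windowId, "state@" + windowId);
        }

        @Override
        public void committed(long windowId) {
            // Release everything up to and including the committed id,
            // because earlier checkpoints may never be reported on their own.
            NavigableMap<Long, String> done = pending.headMap(windowId, true);
            released.addAll(done.keySet());
            done.clear();
        }
    }

    public static void main(String[] args) {
        PendingStateTracker tracker = new PendingStateTracker();
        tracker.checkpointed(30);
        tracker.checkpointed(60);
        tracker.committed(60); // no committed(30) was ever delivered
        System.out.println(tracker.released); // prints [30, 60]
    }
}
```

Treating the committed id as a high-water mark keeps the operator correct
whether or not an intermediate checkpoint is ever reported as committed.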



Re: Is committed window id same sequence as checkpointed window id?

Posted by Ashwin Chandra Putta <as...@gmail.com>.
The stram app master determines the committed id in a heartbeat loop that
executes once every 1000 ms.

Regards,
Ashwin.


Re: Is committed window id same sequence as checkpointed window id?

Posted by Ashwin Chandra Putta <as...@gmail.com>.
I think the committed callback is not guaranteed for every checkpoint. For a
given operator, the window ids we get in the committed callback are a subset
of the window ids we get in the checkpointed callback.

The committed callback is usually called for all checkpointed window ids.
However, when a given partition is blocked for some reason (say, disk I/O or
external system I/O), it might wait in one window (say window id 25) for
some time while the other partitions pass through a checkpoint (say
checkpoint id 30). When this partition gets unblocked, it might quickly
catch up to the latest window (say 62, doing checkpoints at 30 and 60).
Meanwhile, the other partitions have checkpointed and sent id 60. When
deciding on the committed window id, the app master looks through all the
latest checkpoint window ids it has received, and at that time it is
possible that the latest checkpoint window for all the operators is window
id 60. So it sends the committed window id as 60 and not 30.
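
The scenario above can be replayed with a small sketch. It assumes, as a
simplification of whatever stram actually does, that the committed id is the
smallest of the latest checkpoint ids reported by all operators at each
heartbeat. The class, method, and partition names are hypothetical, and 0
stands in for an unspecified earlier checkpoint.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CommittedWindowSketch {

    // Assumption: committed id = minimum of the latest checkpoint ids.
    static long committedWindow(Map<String, Long> latestCheckpoint) {
        return Collections.min(latestCheckpoint.values());
    }

    // One heartbeat: announce the committed id only if it advanced.
    static long heartbeat(Map<String, Long> latest, long lastCommitted,
                          List<Long> announced) {
        long candidate = committedWindow(latest);
        if (candidate > lastCommitted) {
            announced.add(candidate);
            return candidate;
        }
        return lastCommitted;
    }

    // Replays the example and returns every committed id that was announced.
    static List<Long> replayScenario() {
        Map<String, Long> latest = new LinkedHashMap<>();
        latest.put("p1", 0L);
        latest.put("p2", 0L);
        latest.put("blocked", 0L); // 0 = placeholder for an earlier checkpoint
        List<Long> announced = new ArrayList<>();
        long committed = 0L;

        // Heartbeat 1: p1 and p2 checkpointed 30; "blocked" is stuck in window 25.
        latest.put("p1", 30L);
        latest.put("p2", 30L);
        committed = heartbeat(latest, committed, announced);

        // Before the next heartbeat, "blocked" catches up and checkpoints
        // 30 and then 60, while p1 and p2 also reach checkpoint 60.
        latest.put("blocked", 30L);
        latest.put("blocked", 60L);
        latest.put("p1", 60L);
        latest.put("p2", 60L);
        committed = heartbeat(latest, committed, announced);

        return announced;
    }

    public static void main(String[] args) {
        System.out.println(replayScenario()); // prints [60]
    }
}
```

Every partition checkpointed both 30 and 60, yet under this assumption only
60 is ever announced as committed.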

Please correct me if my understanding is wrong.

Regards,
Ashwin.
