You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@samza.apache.org by Yi Pan <ni...@gmail.com> on 2022/12/02 04:06:33 UTC

Re: SEP-31: Pipeline Drain: Support the ability to drain pipelines to allow incompatible intermediate schema changes

Hi, Ajo,

Sorry to reply this late. Could you clarify one thing in the design: For
watermark triggered window draining, is the infinitive watermark trigger
happen first, or the drain token in all source SSP happen first? Shouldn't
it be the following sequence: a) all drain token from all input source SSPs
(except for intermediate streams) are received by tasks ==> b) infinite
watermark triggers from the source and flush all window/triggers in the
pipeline ==> c) once the infinite watermark is propagated through all
stages in the pipeline, stops the tasks. Could you confirm?

Thanks a lot!

-Yi

On Thu, Nov 17, 2022 at 9:48 AM Ajo Thomas <aj...@gmail.com> wrote:

> Hi All,
>
> Samza currently doesn't have a way to gracefully drain pipelines before
> making a backward-incompatible intermediate schema change. We have added a
> feature called Pipeline Drain to the samza engine to address this problem.
> Here is the SEP page for it:
>
> https://cwiki.apache.org/confluence/display/SAMZA/SEP-31%3A+Pipeline+Drain%3A+Support+the+ability+to+drain+pipelines+to+allow+incompatible+intermediate+schema+changes
>
>
> If there are no major blockers, we are tentatively seeking to open a vote
> on Monday, Nov 28th, 2022.
>
> Thanks,
> Ajo
>

Re: SEP-31: Pipeline Drain: Support the ability to drain pipelines to allow incompatible intermediate schema changes

Posted by Yi Pan <ni...@gmail.com>.
As discussed offline and see the clarifications in the SEP, +1 (binding)

On Fri, Dec 2, 2022 at 8:05 AM Ajo Thomas <aj...@gmail.com> wrote:

> Hi Yi,
>
> The order currently is infinity watermark followed by drain control message
> for every source SSP (all input SSPs - intermediate SSPs) to insert in the
> in-memory buffer in SystemConsumers. Prior to this step, we also
> stop calling refresh in Chooser to make sure that the last messages in the
> in-memory SSP buffer are the watermark and drain messages.
> Infinity watermark is essentially tasked with flushing windows and
> triggers.
> Drain message essentially signals to the processing logic that it is the
> last message for SSP and it should shutdown. We track the SSPs that have
> received this token in a task. Once all SSPs have been drained, the task is
> marked ready to shutdown. Once all tasks are ready to shutdown, RunLoop
> shuts down.
>
> Do you see any issues with it ?
>
> - Ajo
>
>
> On Thu, 1 Dec 2022 at 20:06, Yi Pan <ni...@gmail.com> wrote:
>
> > Hi, Ajo,
> >
> > Sorry to reply this late. Could you clarify one thing in the design: For
> > watermark triggered window draining, is the infinitive watermark trigger
> > happen first, or the drain token in all source SSP happen first?
> Shouldn't
> > it be the following sequence: a) all drain token from all input source
> SSPs
> > (except for intermediate streams) are received by tasks ==> b) infinite
> > watermark triggers from the source and flush all window/triggers in the
> > pipeline ==> c) once the infinite watermark is propagated through all
> > stages in the pipeline, stops the tasks. Could you confirm?
> >
> > Thanks a lot!
> >
> > -Yi
> >
> > On Thu, Nov 17, 2022 at 9:48 AM Ajo Thomas <aj...@gmail.com>
> wrote:
> >
> > > Hi All,
> > >
> > > Samza currently doesn't have a way to gracefully drain pipelines before
> > > making a backward-incompatible intermediate schema change. We have
> added
> > a
> > > feature called Pipeline Drain to the samza engine to address this
> > problem.
> > > Here is the SEP page for it:
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/SAMZA/SEP-31%3A+Pipeline+Drain%3A+Support+the+ability+to+drain+pipelines+to+allow+incompatible+intermediate+schema+changes
> > >
> > >
> > > If there are no major blockers, we are tentatively seeking to open a
> vote
> > > on Monday, Nov 28th, 2022.
> > >
> > > Thanks,
> > > Ajo
> > >
> >
>

Re: SEP-31: Pipeline Drain: Support the ability to drain pipelines to allow incompatible intermediate schema changes

Posted by Ajo Thomas <aj...@gmail.com>.
Hi Yi,

The order currently is infinity watermark followed by drain control message
for every source SSP (all input SSPs - intermediate SSPs) to insert in the
in-memory buffer in SystemConsumers. Prior to this step, we also
stop calling refresh in Chooser to make sure that the last messages in the
in-memory SSP buffer are the watermark and drain messages.
Infinity watermark is essentially tasked with flushing windows and triggers.
Drain message essentially signals to the processing logic that it is the
last message for SSP and it should shutdown. We track the SSPs that have
received this token in a task. Once all SSPs have been drained, the task is
marked ready to shutdown. Once all tasks are ready to shutdown, RunLoop
shuts down.

Do you see any issues with it ?

- Ajo


On Thu, 1 Dec 2022 at 20:06, Yi Pan <ni...@gmail.com> wrote:

> Hi, Ajo,
>
> Sorry to reply this late. Could you clarify one thing in the design: For
> watermark triggered window draining, is the infinitive watermark trigger
> happen first, or the drain token in all source SSP happen first? Shouldn't
> it be the following sequence: a) all drain token from all input source SSPs
> (except for intermediate streams) are received by tasks ==> b) infinite
> watermark triggers from the source and flush all window/triggers in the
> pipeline ==> c) once the infinite watermark is propagated through all
> stages in the pipeline, stops the tasks. Could you confirm?
>
> Thanks a lot!
>
> -Yi
>
> On Thu, Nov 17, 2022 at 9:48 AM Ajo Thomas <aj...@gmail.com> wrote:
>
> > Hi All,
> >
> > Samza currently doesn't have a way to gracefully drain pipelines before
> > making a backward-incompatible intermediate schema change. We have added
> a
> > feature called Pipeline Drain to the samza engine to address this
> problem.
> > Here is the SEP page for it:
> >
> >
> https://cwiki.apache.org/confluence/display/SAMZA/SEP-31%3A+Pipeline+Drain%3A+Support+the+ability+to+drain+pipelines+to+allow+incompatible+intermediate+schema+changes
> >
> >
> > If there are no major blockers, we are tentatively seeking to open a vote
> > on Monday, Nov 28th, 2022.
> >
> > Thanks,
> > Ajo
> >
>