Posted to dev@flink.apache.org by Dong Lin <li...@gmail.com> on 2023/06/29 03:02:01 UTC

[VOTE] FLIP-309: Support using larger checkpointing interval when source is processing backlog

Hi all,

We would like to start the vote for FLIP-309: Support using larger
checkpointing interval when source is processing backlog [1]. This FLIP was
discussed in this thread [2].

The Flink 1.18 feature freeze is on July 11. We hope to make this
feature available in Flink 1.18.

The vote will be open until at least July 4th (at least 72 hours), following
the consensus voting process.

Cheers,
Yunfeng and Dong

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-309%3A+Support+using+larger+checkpointing+interval+when+source+is+processing+backlog
[2] https://lists.apache.org/thread/l1l7f30h7zldjp6ow97y70dcthx7tl37

Re: [VOTE] FLIP-309: Support using larger checkpointing interval when source is processing backlog

Posted by Dong Lin <li...@gmail.com>.
Hi Chesnay, can you put your comments in the discussion thread, so that we
can continue the technical discussion there?

Re: [VOTE] FLIP-309: Support using larger checkpointing interval when source is processing backlog

Posted by Dong Lin <li...@gmail.com>.
Hi Chesnay,

Thank you for your comments. I would be happy to work with you to find a
solution.

I just want to note that the discussion thread for this FLIP has been open
for almost two months for everyone to leave comments. I would really
appreciate it if, in the future, you could provide comments earlier in the
discussion thread so that I (and probably other contributors) have the
chance to address your concerns and reach consensus sooner rather than
later. I hope we can be more considerate and help each other in the
community be more productive.

Thanks,
Dong

On Fri, Jun 30, 2023 at 11:18 PM Chesnay Schepler <ch...@apache.org> wrote:

> -1 (binding)
>
> I feel like this FLIP needs a bit more time in the oven.
>
> It seems to be very light on actual details; you can summarize the
> entire changes section as "The enumerator calls this method and then
> another checkpoint interval is used."
> I would love to know how this is wired into the triggering of
> checkpoints, what the behavior is with multiple sources, if a sink is
> allowed to set this at any point or just once, what the semantics of a
> "backlog" are for sources other than Hybrid/MySQL CDC (because catching
> up after a failover is a common enough pattern), whether/how this
> information could also be interesting for the scheduler (because we may
> want to avoid rescalings during the backlog processing), whether the
> backlog processing should be exposed as a metric for users (or for that
> matter, how we inform users at all that we're using a different
> checkpoint interval at this time).
>
> Following my discussion with Piotr and Stefan I'm also not sure how
> future-proof the proposed API really is. Already I feel like the name
> "setIsProcessingBacklog()" is rather specific for the state of the
> source (making it technically wrong to call it in other situations like
> being backpressured (again, depending on what "backlog processing" even
> means)), while not being clear on what this actually results in. The
> javadocs don't even mention the checkpointing interval at all, but
> instead reference downstream optimizations that, afaict, aren't
> mentioned in the FLIP.
>
> I'd be very hesitant with marking it as public from the get-go. Ideally
> it would maybe even be added as a separate interface (somehow).

Re: [VOTE] FLIP-309: Support using larger checkpointing interval when source is processing backlog

Posted by Chesnay Schepler <ch...@apache.org>.
-1 (binding)

I feel like this FLIP needs a bit more time in the oven.

It seems to be very light on actual details; you can summarize the 
entire changes section as "The enumerator calls this method and then 
another checkpoint interval is used."
I would love to know how this is wired into the triggering of 
checkpoints, what the behavior is with multiple sources, if a sink is 
allowed to set this at any point or just once, what the semantics of a
"backlog" are for sources other than Hybrid/MySQL CDC (because catching
up after a failover is a common enough pattern), whether/how this 
information could also be interesting for the scheduler (because we may 
want to avoid rescalings during the backlog processing), whether the 
backlog processing should be exposed as a metric for users (or for that 
matter, how we inform users at all that we're using a different 
checkpoint interval at this time).

Following my discussion with Piotr and Stefan I'm also not sure how 
future-proof the proposed API really is. Already I feel like the name 
"setIsProcessingBacklog()" is rather specific for the state of the 
source (making it technically wrong to call it in other situations like 
being backpressured (again, depending on what "backlog processing" even 
means)), while not being clear on what this actually results in. The 
javadocs don't even mention the checkpointing interval at all, but 
instead reference downstream optimizations that, afaict, aren't 
mentioned in the FLIP.

I'd be very hesitant with marking it as public from the get-go. Ideally 
it would maybe even be added as a separate interface (somehow).

On 30/06/2023 16:37, Piotr Nowojski wrote:
> Hey,
>
> Sorry to disturb this voting, but after discussing this thoroughly with
> Chesnay and Stefan Richter I have to vote:
>   -1 (binding)
> mainly to suspend the current voting thread. Please take a look at my mail
> on the dev mailing list.
>
> Best,
> Piotrek



Re: [VOTE] FLIP-309: Support using larger checkpointing interval when source is processing backlog

Posted by Piotr Nowojski <pn...@apache.org>.
Hey,

Sorry to disturb this voting, but after discussing this thoroughly with
Chesnay and Stefan Richter I have to vote:
 -1 (binding)
mainly to suspend the current voting thread. Please take a look at my mail
on the dev mailing list.

Best,
Piotrek

On Thu, Jun 29, 2023 at 2:59 PM feng xiangyu <xi...@gmail.com> wrote:

> +1 (non-binding)
>
> Best,
> Xiangyu

Re: [VOTE] FLIP-309: Support using larger checkpointing interval when source is processing backlog

Posted by feng xiangyu <xi...@gmail.com>.
+1 (non-binding)

Best,
Xiangyu

On Thu, Jun 29, 2023 at 8:44 PM yuxia <lu...@alumni.sjtu.edu.cn> wrote:

> +1 (binding)
>
> Best regards,
> Yuxia

Re: [VOTE] FLIP-309: Support using larger checkpointing interval when source is processing backlog

Posted by yuxia <lu...@alumni.sjtu.edu.cn>.
+1 (binding)

Best regards,
Yuxia

----- Original Message -----
From: "Yuepeng Pan" <fl...@126.com>
To: "dev" <de...@flink.apache.org>
Sent: Thursday, June 29, 2023 8:21:14 PM
Subject: Re: [VOTE] FLIP-309: Support using larger checkpointing interval when source is processing backlog

+1  non-binding.


Best.
Yuepeng Pan



Re: [VOTE] FLIP-309: Support using larger checkpointing interval when source is processing backlog

Posted by Yuepeng Pan <fl...@126.com>.
+1  non-binding.


Best.
Yuepeng Pan


---- Replied Message ----
| From | Jingsong Li<ji...@gmail.com> |
| Date | 06/29/2023 13:25 |
| To | dev<de...@flink.apache.org> |
| Cc | flink.zhouyunfeng<fl...@gmail.com> |
| Subject | Re: [VOTE] FLIP-309: Support using larger checkpointing interval when source is processing backlog |
+1 binding


Re: [VOTE] FLIP-309: Support using larger checkpointing interval when source is processing backlog

Posted by Jark Wu <im...@gmail.com>.
+1 (binding)

Best,
Jark

> On Jun 29, 2023, at 6:12 PM, Jing Ge <ji...@ververica.com.INVALID> wrote:
>
> +1 (binding)


Re: [VOTE] FLIP-309: Support using larger checkpointing interval when source is processing backlog

Posted by Jing Ge <ji...@ververica.com.INVALID>.
+1 (binding)

On Thu, Jun 29, 2023 at 7:47 AM Leonard Xu <xb...@gmail.com> wrote:

> +1 (binding)
>
> Best,
> Leonard

Re: [VOTE] FLIP-309: Support using larger checkpointing interval when source is processing backlog

Posted by Leonard Xu <xb...@gmail.com>.
+1 (binding)

Best,
Leonard

> On Jun 29, 2023, at 1:25 PM, Jingsong Li <ji...@gmail.com> wrote:
> 
> +1 binding


Re: [VOTE] FLIP-309: Support using larger checkpointing interval when source is processing backlog

Posted by Jingsong Li <ji...@gmail.com>.
+1 binding

On Thu, Jun 29, 2023 at 11:03 AM Dong Lin <li...@gmail.com> wrote:
>
> Hi all,
>
> We would like to start the vote for FLIP-309: Support using larger
> checkpointing interval when source is processing backlog [1]. This FLIP was
> discussed in this thread [2].
>
> The Flink 1.18 feature freeze is on July 11. We hope to make this
> feature available in Flink 1.18.
>
> The vote will be open until at least July 4th (at least 72 hours), following
> the consensus voting process.
>
> Cheers,
> Yunfeng and Dong
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-309%3A+Support+using+larger+checkpointing+interval+when+source+is+processing+backlog
> [2] https://lists.apache.org/thread/l1l7f30h7zldjp6ow97y70dcthx7tl37