Posted to user@flink.apache.org by nick toker <ni...@gmail.com> on 2020/12/21 15:52:34 UTC

checkpoint delay consume message

Hello,

We noticed the following behavior:
When we enable Flink checkpoints, we see a delay between
the time we write a message to the Kafka topic and the time the Flink Kafka
connector consumes it.
The delay is closely related to checkpointInterval and/or
minPauseBetweenCheckpoints, meaning that the maximum delay when consuming a
message from Kafka equals one of these parameters.

Could you please advise how we can remove/control this delay?

We use Flink 1.11.2.

BR
nick
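
For readers weighing the two parameters named in the question, here is a minimal sketch of how they might combine into the spacing between checkpoints. It is a simplified model, not Flink code: the class and method names are hypothetical, the checkpoint duration is an invented input, and it assumes the interval is measured between checkpoint starts while the minimum pause is measured from the end of one checkpoint to the start of the next.

```java
import java.time.Duration;

/** Simplified model (not part of Flink) of the spacing between checkpoint starts. */
final class CheckpointCadence {

    private CheckpointCadence() {}

    /**
     * Returns the assumed time between two consecutive checkpoint starts.
     *
     * @param interval the configured checkpointInterval (start to start)
     * @param minPause the configured minPauseBetweenCheckpoints (end to next start)
     * @param checkpointDuration how long one checkpoint takes (hypothetical input)
     */
    static Duration effectivePeriod(Duration interval, Duration minPause, Duration checkpointDuration) {
        // The next checkpoint cannot start before the previous one ended plus the pause.
        Duration pauseBound = checkpointDuration.plus(minPause);
        // Whichever constraint is larger decides the spacing.
        return interval.compareTo(pauseBound) >= 0 ? interval : pauseBound;
    }

    public static void main(String[] args) {
        Duration fast = effectivePeriod(Duration.ofSeconds(10), Duration.ofSeconds(5), Duration.ofSeconds(2));
        Duration slow = effectivePeriod(Duration.ofSeconds(10), Duration.ofSeconds(9), Duration.ofSeconds(3));
        System.out.println("interval dominates: " + fast.getSeconds() + " s");
        System.out.println("pause dominates: " + slow.getSeconds() + " s");
    }
}
```

Under these assumptions, whichever of the two constraints is larger sets the spacing, which may be why the observed delay tracks one parameter or the other.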

Re: Re: checkpoint delay consume message

Posted by Arvid Heise <ar...@ververica.com>.
There seems to be a double-post with the mail "Long latency when consuming
a message from KAFKA and checkpoint is enabled". Let's continue discussion
there.

On Sun, Dec 27, 2020 at 8:33 AM nick toker <ni...@gmail.com> wrote:

> Hi,
> We think we are using the default values, unless we are missing
> something.
> So this doesn't explain the problem we are facing.
> Could you please tell us how to choose synchronous or asynchronous
> checkpoints, just to be sure we are using the correct configuration?
> BR,
> Nick

-- 

Arvid Heise | Senior Java Developer


Re: Re: checkpoint delay consume message

Posted by nick toker <ni...@gmail.com>.
Hi,
We think we are using the default values, unless we are missing
something.
So this doesn't explain the problem we are facing.
Could you please tell us how to choose synchronous or asynchronous
checkpoints, just to be sure we are using the correct configuration?
BR,
Nick

On Thu, Dec 24, 2020 at 3:36, lec ssmi <shicheng31604@gmail.com> wrote:

> Checkpoints can be taken synchronously or asynchronously; the latter is
> the default.
> If you choose the synchronous mode, it may cause this problem.

Re: Re: checkpoint delay consume message

Posted by lec ssmi <sh...@gmail.com>.
Checkpoints can be taken synchronously or asynchronously; the latter is
the default.
If you choose the synchronous mode, it may cause this problem.

On Wed, Dec 23, 2020 at 3:53 PM nick toker <ni...@gmail.com> wrote:

> Hi Yun,
>
> Sorry, but we didn't understand your questions.
> The delay we are experiencing is on the *read* side.
> The message is written to the Kafka topic and consumed by Flink with a delay
> that depends on the checkpoint interval.
> When we disable checkpoints, the messages are consumed immediately.
> We use EXACTLY_ONCE semantics.
>
> Please advise.
> BR,
> Nick

Re: Re: checkpoint delay consume message

Posted by nick toker <ni...@gmail.com>.
Hi Yun,

Sorry, but we didn't understand your questions.
The delay we are experiencing is on the *read* side.
The message is written to the Kafka topic and consumed by Flink with a delay
that depends on the checkpoint interval.
When we disable checkpoints, the messages are consumed immediately.
We use EXACTLY_ONCE semantics.

Please advise.
BR,
Nick

On Tue, Dec 22, 2020 at 9:32, Yun Gao <yungao.gy@aliyun.com> wrote:

> Hi nick,
>
>    Sorry, I initially thought that the data was also written into Kafka by
> Flink. So it is certain that there is no delay on the write side,
> right? Does the delay on the read side persist?
>
> Best,
>  Yun

Re: Re: checkpoint delay consume message

Posted by Yun Gao <yu...@aliyun.com>.
Hi nick,

   Sorry, I initially thought that the data was also written into Kafka by Flink. So it is certain that there is no delay on the write side, right? Does the delay on the read side persist?

Best,
 Yun




 ------------------Original Mail ------------------
Sender:nick toker <ni...@gmail.com>
Send Date:Tue Dec 22 01:43:50 2020
Recipients:Yun Gao <yu...@aliyun.com>
CC:user <us...@flink.apache.org>
Subject:Re: checkpoint delay consume message

hi

i am confused

The delay is in the source, when reading messages, not in the sink.

nick

Re: checkpoint delay consume message

Posted by nick toker <ni...@gmail.com>.
hi

i am confused

The delay is in the source, when reading messages, not in the sink.

nick

On Mon, Dec 21, 2020 at 18:12, Yun Gao <yungao.gy@aliyun.com> wrote:

>  Hi Nick,
>
>     Are you using EXACTLY_ONCE semantics? If so, the sink uses
> transactions and only commits a transaction when a checkpoint completes, to
> ensure end-to-end exactly-once. A detailed description can be found in [1].
>
>
> Best,
>  Yun
>
>
> [1]
> https://flink.apache.org/features/2018/03/01/end-to-end-exactly-once-apache-flink.html

Re: checkpoint delay consume message

Posted by Yun Gao <yu...@aliyun.com>.
 Hi Nick,

    Are you using EXACTLY_ONCE semantics? If so, the sink uses transactions and only commits a transaction when a checkpoint completes, to ensure end-to-end exactly-once. A detailed description can be found in [1].


Best,
 Yun


[1] https://flink.apache.org/features/2018/03/01/end-to-end-exactly-once-apache-flink.html
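
The commit-on-checkpoint behavior described in this reply can be sketched as a toy model. This is not Flink or Kafka code: the class and method names are hypothetical, checkpoints are assumed to complete instantly at exact multiples of the interval, and the reader is assumed to see only committed records, which the thread does not confirm for Nick's setup.

```java
/** Toy model (not Flink or Kafka code) of a sink that commits only on checkpoints. */
final class CommitOnCheckpointModel {

    private CommitOnCheckpointModel() {}

    /**
     * Returns how long a record stays invisible to a reader that sees only committed data.
     *
     * @param writeTimeMs time the record was written into the open transaction
     * @param checkpointIntervalMs assumed spacing between completed checkpoints
     */
    static long visibilityDelayMs(long writeTimeMs, long checkpointIntervalMs) {
        if (writeTimeMs < 0 || checkpointIntervalMs <= 0) {
            throw new IllegalArgumentException("writeTimeMs must be >= 0 and checkpointIntervalMs > 0");
        }
        // A record written at time t joins the open transaction, which is committed
        // at the first checkpoint strictly after t.
        long nextCheckpointMs = (writeTimeMs / checkpointIntervalMs + 1) * checkpointIntervalMs;
        return nextCheckpointMs - writeTimeMs;
    }

    public static void main(String[] args) {
        long intervalMs = 5_000;
        for (long writeTimeMs : new long[] {0, 1_200, 4_999, 5_000}) {
            System.out.println("written at " + writeTimeMs + " ms, visible after "
                    + visibilityDelayMs(writeTimeMs, intervalMs) + " ms");
        }
    }
}
```

In this model the delay never exceeds the checkpoint interval, which is consistent with the upper bound reported in the original question.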
