Posted to user@flink.apache.org by Aeden Jameson <ae...@gmail.com> on 2021/03/04 03:46:31 UTC

pipeline.auto-watermark-interval vs setAutoWatermarkInterval

I'm hoping to have my confusion clarified regarding the settings,

1. pipeline.auto-watermark-interval
https://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/flink/api/common/ExecutionConfig.html#setAutoWatermarkInterval-long-

2. setAutoWatermarkInterval
https://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/flink/api/common/ExecutionConfig.html#setAutoWatermarkInterval-long-

I noticed that the default value of pipeline.auto-watermark-interval is 0.
According to these docs,
https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/table/sql/create.html#watermark
"If watermark interval is 0ms, the generated watermarks
will be emitted per-record if it is not null and greater than the last
emitted one." However, the documentation for
setAutoWatermarkInterval says that the value 0 disables watermark emission.

* Are they intended to be the same setting? If not, how are they
different? Is one for Flink SQL and the other for the DataStream API?

-- 
Thank you,
Aeden

Re: pipeline.auto-watermark-interval vs setAutoWatermarkInterval

Posted by Matthias Pohl <ma...@ververica.com>.
Thanks for double-checking, Dawid, and thanks for clarifying, Jark. I will
leave the Jira issue open, since Jark suggested improving the documentation
on this point.

Best,
Matthias

On Fri, Mar 26, 2021 at 7:43 AM Jark Wu <im...@gmail.com> wrote:

> IIUC, pipeline.auto-watermark-interval = 0 just disables **periodic**
> watermark emission;
> it doesn't mean the watermark will never be emitted.
> In Table API/SQL, it has the same meaning. If watermark interval = 0, we
> disable periodic watermark emission
> and emit a watermark once it advances.
>
> So I think the SQL documentation is correct.
>
> Best,
> Jark
>
> On Tue, 23 Mar 2021 at 22:29, Dawid Wysakowicz <dw...@apache.org>
> wrote:
>
>> Hey,
>>
>> I would like to double check this with Jark and/or Timo. As far as
>> DataStream is concerned the javadoc is correct. Moreover the
>> pipeline.auto-watermark-interval and setAutoWatermarkInterval are
>> effectively the same setting/option. However I am not sure if Table API
>> interprets it in the same way as DataStream API. The documentation you
>> linked, Aeden, describes the SQL API.
>>
>> @Jark @Timo Could you verify if the SQL documentation is correct?
>>
>> Best,
>>
>> Dawid
>> On 23/03/2021 15:20, Matthias Pohl wrote:
>>
>> Hi Aeden,
>> sorry for the late reply. I looked through the code and verified that the
>> JavaDoc is correct. Setting pipeline.auto-watermark-interval to 0 will
>> disable the automatic watermark generation. I created FLINK-21931 [1] to
>> cover this.
>>
>> Thanks,
>> Matthias
>>
>> [1] https://issues.apache.org/jira/browse/FLINK-21931
>>
>> On Thu, Mar 4, 2021 at 9:53 PM Aeden Jameson <ae...@gmail.com>
>> wrote:
>>
>>> Correction: The first link was supposed to be,
>>>
>>> 1. pipeline.auto-watermark-interval
>>>
>>> https://ci.apache.org/projects/flink/flink-docs-release-1.12/deployment/config.html#pipeline-auto-watermark-interval
>>>
>>> On Wed, Mar 3, 2021 at 7:46 PM Aeden Jameson <ae...@gmail.com>
>>> wrote:
>>> >
>>> > I'm hoping to have my confusion clarified regarding the settings,
>>> >
>>> > 1. pipeline.auto-watermark-interval
>>> >
>>> https://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/flink/api/common/ExecutionConfig.html#setAutoWatermarkInterval-long-
>>> >
>>> > 2. setAutoWatermarkInterval
>>> >
>>> https://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/flink/api/common/ExecutionConfig.html#setAutoWatermarkInterval-long-
>>> >
>>> > I noticed the default value of pipeline.auto-watermark-interval is 0
>>> > and according to these docs,
>>> >
>>> https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/table/sql/create.html#watermark
>>> ,
>>> > it states, "If watermark interval is 0ms, the generated watermarks
>>> > will be emitted per-record if it is not null and greater than the last
>>> > emitted one." However in the documentation for
>>> > setAutoWatermarkInterval the value 0 disables watermark emission.
>>> >
>>> > * Are they intended to be the same setting? If not how are they
>>> > different? Is one for FlinkSql and the other DataStream API?
>>> >
>>> > --
>>> > Thank you,
>>> > Aeden
>>
>>

Re: pipeline.auto-watermark-interval vs setAutoWatermarkInterval

Posted by Jark Wu <im...@gmail.com>.
IIUC, pipeline.auto-watermark-interval = 0 just disables **periodic**
watermark emission;
it doesn't mean the watermark will never be emitted.
In Table API/SQL, it has the same meaning. If watermark interval = 0, we
disable periodic watermark emission
and emit a watermark once it advances.

So I think the SQL documentation is correct.

Best,
Jark
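
As an illustration of the distinction above, here is a minimal sketch in
plain Java. It is a toy model, not Flink code: the class
WatermarkEmissionSketch, its methods, and the sample values are all
hypothetical, and the timer handling is an assumption made only to keep the
example small. emitOnAdvance models interval = 0 (emit each non-null
watermark that is greater than the last emitted one); emitPeriodically
models a positive interval (track the highest watermark seen and emit only
when the timer fires).

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the two emission modes discussed in this thread; not Flink code. */
final class WatermarkEmissionSketch {

    /** Models interval == 0: emit a watermark as soon as a record advances it. */
    static List<Long> emitOnAdvance(Long[] candidates) {
        List<Long> emitted = new ArrayList<>();
        long lastEmitted = Long.MIN_VALUE;
        for (Long candidate : candidates) {
            // Skip null candidates and candidates that do not advance the watermark.
            if (candidate != null && candidate > lastEmitted) {
                emitted.add(candidate);
                lastEmitted = candidate;
            }
        }
        return emitted;
    }

    /**
     * Models interval > 0: remember the highest candidate and emit it only when
     * a timer fires. Timers fire at intervalMs, 2 * intervalMs, and so on
     * (an assumption of this sketch), plus one final firing after the last record.
     */
    static List<Long> emitPeriodically(long[] arrivalMs, Long[] candidates, long intervalMs) {
        if (intervalMs <= 0) {
            throw new IllegalArgumentException("intervalMs must be positive");
        }
        List<Long> emitted = new ArrayList<>();
        long current = Long.MIN_VALUE;      // highest candidate seen so far
        long lastEmitted = Long.MIN_VALUE;  // last watermark actually emitted
        long nextFire = intervalMs;
        for (int i = 0; i < candidates.length; i++) {
            // Fire every timer that is due at or before this record's arrival.
            while (nextFire <= arrivalMs[i]) {
                if (current > lastEmitted) {
                    emitted.add(current);
                    lastEmitted = current;
                }
                nextFire += intervalMs;
            }
            if (candidates[i] != null && candidates[i] > current) {
                current = candidates[i];
            }
        }
        // Final firing after the last record.
        if (current > lastEmitted) {
            emitted.add(current);
        }
        return emitted;
    }

    public static void main(String[] args) {
        Long[] candidates = {1L, 3L, null, 2L, 5L};   // watermark candidate per record
        long[] arrivalMs = {10, 20, 30, 210, 220};    // arrival time of each record

        System.out.println(emitOnAdvance(candidates));                    // prints [1, 3, 5]
        System.out.println(emitPeriodically(arrivalMs, candidates, 200)); // prints [3, 5]
    }
}
```

With this sample input the per-record mode emits three watermarks and the
200 ms periodic mode emits two: interval = 0 does not stop watermarks, it
only changes when they are sent.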

Re: pipeline.auto-watermark-interval vs setAutoWatermarkInterval

Posted by Dawid Wysakowicz <dw...@apache.org>.
Hey,

I would like to double check this with Jark and/or Timo. As far as
DataStream is concerned the javadoc is correct. Moreover the
pipeline.auto-watermark-interval and setAutoWatermarkInterval are
effectively the same setting/option. However I am not sure if Table API
interprets it in the same way as DataStream API. The documentation you
linked, Aeden, describes the SQL API.

@Jark @Timo Could you verify if the SQL documentation is correct?

Best,

Dawid

Re: pipeline.auto-watermark-interval vs setAutoWatermarkInterval

Posted by Matthias Pohl <ma...@ververica.com>.
Hi Aeden,
sorry for the late reply. I looked through the code and verified that the
JavaDoc is correct. Setting pipeline.auto-watermark-interval to 0 will
disable the automatic watermark generation. I created FLINK-21931 [1] to
cover this.

Thanks,
Matthias

[1] https://issues.apache.org/jira/browse/FLINK-21931

Re: pipeline.auto-watermark-interval vs setAutoWatermarkInterval

Posted by Aeden Jameson <ae...@gmail.com>.
Correction: The first link was supposed to be,

1. pipeline.auto-watermark-interval
https://ci.apache.org/projects/flink/flink-docs-release-1.12/deployment/config.html#pipeline-auto-watermark-interval
