You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@flink.apache.org by cy <ca...@126.com> on 2021/12/23 01:51:34 UTC

Window Aggregation and Window Join ability not work properly

Hi
Flink 1.14.2 Scala 2.12


I'm using flink sql window aggregation and window join ability,  I write the sql as documentation said but is not work. Here is my schema and sql
schema:


sql;
SELECT window_start, window_end, COUNT(*)
FROM TABLE(
    TUMBLE(TABLE `origin`.`queue_3_ads_ccops_sdn_vrs_result`, DESCRIPTOR(datatime), INTERVAL '5' MINUTES))
GROUP BY window_start, window_end;


result:


data:


Similar with window join ability.


So is anything wrong with this two abilities?


Need your help, thank you.








 





 





 

Re:Re: Re: Re: Re: Re:Re: Window Aggregation and Window Join ability not work properly

Posted by cy <ca...@126.com>.
Hi
Yes, data income from a topic every five minutes with same value of datatime like this.













At 2021-12-24 12:22:11, "Martijn Visser" <ma...@ververica.com> wrote:

I agree that it is suspicious, but I would like to understand better how your incoming data looks like, especially the values for datatime. Do these values change or do they always stay the same value? 


On Fri, 24 Dec 2021 at 02:44, cy <ca...@126.com> wrote:

Hi
picture1: my schema
sql:
SELECT window_start, window_end, COUNT(*)
FROM TABLE(
    TUMBLE(TABLE `origin`.`queue_3_ads_ccops_sdn_vrs_result`, DESCRIPTOR(datatime), INTERVAL '5' MINUTES)
) GROUP BY window_start, window_end;
picture2: result on 1.13.5
picure3: result on 1.14.2




Need your help, thank you.







At 2021-12-23 18:27:32, "Martijn Visser" <ma...@ververica.com> wrote:

Hi,


Based on the screenshot of your source data, all events have the same value for `datatime`. Is that indeed correct? 


Best regards,


Martijn


On Thu, 23 Dec 2021 at 10:16, cy <ca...@126.com> wrote:

Hi,


I execute the sql on 1.13.5 and the result is correct but no result show on 1.14.0 and 1.14.2. 
sql:
SELECT window_start, window_end, COUNT(*)
FROM TABLE(
    TUMBLE(TABLE `origin`.`queue_3_ads_ccops_sdn_vrs_result`, DESCRIPTOR(datatime), INTERVAL '5' MINUTES)
) GROUP BY window_start, window_end;


watermark definition is
watermark for `datatime` as `datatime` - interval '5' minutes


so I don't know what's wrong about it.







At 2021-12-23 17:06:35, "Yun Gao" <yu...@aliyun.com> wrote:

Sorry I mean 16:00:05, but it should be similar.


------------------Original Mail ------------------

Sender:Yun Gao <yu...@aliyun.com>
Send Date:Thu Dec 23 17:05:33 2021
Recipients:cy <ca...@126.com>
CC:'user@flink.apache.org' <us...@flink.apache.org>
Subject:Re: Re:Re: Window Aggregation and Window Join ability not work properly
Hi Caiyi,


I think if the image shows all the records, after the change we should only have
the watermark at 16:05, which is still not be able to trigger the window of 5 minutes?


Best,
Yun




------------------Original Mail ------------------
Sender:cy <ca...@126.com>
Send Date:Thu Dec 23 15:44:23 2021
Recipients:Yun Gao <yu...@aliyun.com>
CC:'user@flink.apache.org' <us...@flink.apache.org>
Subject:Re:Re: Window Aggregation and Window Join ability not work properly
I change to 
watermark for `datatime` as `datatime` - interval '1' second
or
watermark for `datatime` as `datatime`
but is still not work.
















At 2021-12-23 15:16:20, "Yun Gao" <yu...@aliyun.com> wrote:

Hi Caiyi,


The window need to be finalized with watermark[1]. I noticed that the watermark defined
is `datatime` - INTERVAL '5' MINUTE,  it means the watermark emitted would be the 
maximum observed timestamp so far minus 5 minutes [1]. Therefore, if we want to 
trigger the window of 16:00 ~ 16:05, there should be one record with datetime >= 16:10 
get processed, then the source would emit the watermark >= 16:05 to finalize the window,
otherwise there would be no output due to no window is finalized.


Best,
Yun




[1] https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/table/sql/create/#watermark
------------------Original Mail ------------------
Sender:cy <ca...@126.com>
Send Date:Thu Dec 23 09:52:40 2021
Recipients:'user@flink.apache.org' <us...@flink.apache.org>
Subject:Window Aggregation and Window Join ability not work properly

Hi
Flink 1.14.2Scala 2.12


I'm using flink sql window aggregation and window join ability,  I write the sql as documentation said but is not work. Here is my schema and sql
schema:


sql;
SELECT window_start, window_end, COUNT(*)
FROM TABLE(
    TUMBLE(TABLE `origin`.`queue_3_ads_ccops_sdn_vrs_result`, DESCRIPTOR(datatime), INTERVAL '5' MINUTES))
GROUP BY window_start, window_end;


result:


data:


Similar with window join ability.


So is anything wrong with this two abilities?


Need your help, thank you.








 





 





 





 





 





 





 

Re: Re: Re: Re: Re: Re: Re:Re: Window Aggregation and Window Join ability not work properly

Posted by Martijn Visser <ma...@ververica.com>.
Hi Cy,

1. Yes, that should be the case
2. Have/can you checked the Web UI if the watermark has progressed after
that time?
3. If I understand you correctly, you have elements arriving every 5
minutes. Let's say you have elements arriving with datatime 16:00:05 and 5
minutes later with datatime 16:05:05. Given that you've set the INTERVAL to
5 minutes and no new elements will arrive until another 5 minutes have
passed with datatime 16:10:05, only then Flink knows that the window has
passed and it should start outputting data. From your screenshots, I can't
see how long you've left the query running. Did enough time pass?

Best regards,

Martijn

On Mon, 27 Dec 2021 at 04:03, cy <ca...@126.com> wrote:

> Hi
> I agree that when watermark will not progress if datatime value stays the
> same, but I'm not sure and confused that
> 1. Are elements allocated to the window? For example, when two elements of
> datatime 16:00:05 arrives, I think window 16:00~16:05 shoud have two
> elements?
> 2. Why watermark still not progress when elements of datatime 16:15:05 or
> later arrive?
> 3. If watermark is progress when elements of datatime 16:15:05 or later
> arrive, why there is no output?
>
> Need your help, thank you.
>
>
>
>
>
> At 2021-12-24 15:18:27, "Martijn Visser" <ma...@ververica.com> wrote:
>
> Hi,
>
> If the datatime value stays the same, the watermark doesn't progress. See
> [1] specifically "If the current watermark is still identical to the
> previous one, or is null, or the value of the returned watermark is smaller
> than that of the last emitted one, then no new watermark will be emitted."
>
> I'm wondering if in your previous Flink version you've changed settings to
> either auto-generate watermark in case there's no progression or some of
> the other settings which might explain the difference in behaviour between
> the two Flink versions.
>
> Best regards,
>
> Martijn
>
> [1]
> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/create/#watermark
>
>
> On Fri, 24 Dec 2021 at 07:10, cy <ca...@126.com> wrote:
>
>> Hi
>> Yes, data income from a topic every five minutes with same value of
>> datatime like this and value changed in next five minute period. In each
>> period datatime stays same value.
>>
>>
>>
>>
>>
>> At 2021-12-24 12:22:11, "Martijn Visser" <ma...@ververica.com> wrote:
>>
>> I agree that it is suspicious, but I would like to understand better how
>> your incoming data looks like, especially the values for datatime. Do these
>> values change or do they always stay the same value?
>>
>> On Fri, 24 Dec 2021 at 02:44, cy <ca...@126.com> wrote:
>>
>>> Hi
>>> picture1: my schema
>>> sql:
>>> SELECT window_start, window_end, COUNT(*)
>>> FROM TABLE(
>>>     TUMBLE(TABLE `origin`.`queue_3_ads_ccops_sdn_vrs_result`,
>>> DESCRIPTOR(datatime), INTERVAL '5' MINUTES)
>>> ) GROUP BY window_start, window_end;
>>> picture2: result on 1.13.5
>>> picure3: result on 1.14.2
>>>
>>>
>>> Need your help, thank you.
>>>
>>>
>>>
>>> At 2021-12-23 18:27:32, "Martijn Visser" <ma...@ververica.com> wrote:
>>>
>>> Hi,
>>>
>>> Based on the screenshot of your source data, all events have the same
>>> value for `datatime`. Is that indeed correct?
>>>
>>> Best regards,
>>>
>>> Martijn
>>>
>>> On Thu, 23 Dec 2021 at 10:16, cy <ca...@126.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I execute the sql on 1.13.5 and the result is correct but no result
>>>> show on 1.14.0 and 1.14.2.
>>>> sql:
>>>> SELECT window_start, window_end, COUNT(*)
>>>> FROM TABLE(
>>>>     TUMBLE(TABLE `origin`.`queue_3_ads_ccops_sdn_vrs_result`,
>>>> DESCRIPTOR(datatime), INTERVAL '5' MINUTES)
>>>> ) GROUP BY window_start, window_end;
>>>>
>>>> watermark definition is
>>>> watermark for `datatime` as `datatime` - interval '5' minutes
>>>>
>>>> so I don't know what's wrong about it.
>>>>
>>>>
>>>>
>>>> At 2021-12-23 17:06:35, "Yun Gao" <yu...@aliyun.com> wrote:
>>>>
>>>> Sorry I mean 16:00:05, but it should be similar.
>>>>
>>>> ------------------Original Mail ------------------
>>>>
>>>> *Sender:*Yun Gao <yu...@aliyun.com>
>>>> *Send Date:*Thu Dec 23 17:05:33 2021
>>>> *Recipients:*cy <ca...@126.com>
>>>> *CC:*'user@flink.apache.org' <us...@flink.apache.org>
>>>> *Subject:*Re: Re:Re: Window Aggregation and Window Join ability not
>>>> work properly
>>>>
>>>>> Hi Caiyi,
>>>>>
>>>>> I think if the image shows all the records, after the change we should
>>>>> only have
>>>>> the watermark at 16:05, which is still not be able to trigger the
>>>>> window of 5 minutes?
>>>>>
>>>>> Best,
>>>>> Yun
>>>>>
>>>>>
>>>>> ------------------Original Mail ------------------
>>>>> *Sender:*cy <ca...@126.com>
>>>>> *Send Date:*Thu Dec 23 15:44:23 2021
>>>>> *Recipients:*Yun Gao <yu...@aliyun.com>
>>>>> *CC:*'user@flink.apache.org' <us...@flink.apache.org>
>>>>> *Subject:*Re:Re: Window Aggregation and Window Join ability not work
>>>>> properly
>>>>>
>>>>>> I change to
>>>>>> watermark for `datatime` as `datatime` - interval '1' second
>>>>>> or
>>>>>> watermark for `datatime` as `datatime`
>>>>>> but is still not work.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> At 2021-12-23 15:16:20, "Yun Gao" <yu...@aliyun.com> wrote:
>>>>>>
>>>>>> Hi Caiyi,
>>>>>>
>>>>>> The window need to be finalized with watermark[1]. I noticed that the
>>>>>> watermark defined
>>>>>> is `datatime` - INTERVAL '5' MINUTE,  it means the watermark emitted
>>>>>> would be the
>>>>>> maximum observed timestamp so far minus 5 minutes [1]. Therefore, if
>>>>>> we want to
>>>>>> trigger the window of 16:00 ~ 16:05, there should be one record with
>>>>>> datetime >= 16:10
>>>>>> get processed, then the source would emit the watermark >= 16:05 to
>>>>>> finalize the window,
>>>>>> otherwise there would be no output due to no window is finalized.
>>>>>>
>>>>>> Best,
>>>>>> Yun
>>>>>>
>>>>>>
>>>>>> [1]
>>>>>> https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/table/sql/create/#watermark
>>>>>>
>>>>>> ------------------Original Mail ------------------
>>>>>> *Sender:*cy <ca...@126.com>
>>>>>> *Send Date:*Thu Dec 23 09:52:40 2021
>>>>>> *Recipients:*'user@flink.apache.org' <us...@flink.apache.org>
>>>>>> *Subject:*Window Aggregation and Window Join ability not work
>>>>>> properly
>>>>>>
>>>>>>> Hi
>>>>>>> Flink 1.14.2Scala 2.12
>>>>>>>
>>>>>>> I'm using flink sql window aggregation and window join ability,  I
>>>>>>> write the sql as documentation said but is not work. Here is my schema and
>>>>>>> sql
>>>>>>> schema:
>>>>>>>
>>>>>>> sql;
>>>>>>> SELECT window_start, window_end, COUNT(*)
>>>>>>> FROM TABLE(
>>>>>>>     TUMBLE(TABLE `origin`.`queue_3_ads_ccops_sdn_vrs_result`,
>>>>>>> DESCRIPTOR(datatime), INTERVAL '5' MINUTES))
>>>>>>> GROUP BY window_start, window_end;
>>>>>>>
>>>>>>> result:
>>>>>>>
>>>>>>> data:
>>>>>>>
>>>>>>> Similar with window join ability.
>>>>>>>
>>>>>>> So is anything wrong with this two abilities?
>>>>>>>
>>>>>>> Need your help, thank you.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>>
>>
>>
>>
>>
>>
>>
>>
>
>
>
>

Re:Re: Re: Re: Re: Re: Re:Re: Window Aggregation and Window Join ability not work properly

Posted by cy <ca...@126.com>.
Hi
I agree that when watermark will not progress if datatime value stays the same, but I'm not sure and confused that
1. Are elements allocated to the window? For example, when two elements of datatime 16:00:05 arrives, I think window 16:00~16:05 shoud have two elements?
2. Why watermark still not progress when elements of datatime 16:15:05 or later arrive?
3. If watermark is progress when elements of datatime 16:15:05 or later arrive, why there is no output?


Need your help, thank you.













At 2021-12-24 15:18:27, "Martijn Visser" <ma...@ververica.com> wrote:

Hi,


If the datatime value stays the same, the watermark doesn't progress. See [1] specifically "If the current watermark is still identical to the previous one, or is null, or the value of the returned watermark is smaller than that of the last emitted one, then no new watermark will be emitted." 


I'm wondering if in your previous Flink version you've changed settings to either auto-generate watermark in case there's no progression or some of the other settings which might explain the difference in behaviour between the two Flink versions.


Best regards,


Martijn



[1] https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/create/#watermark




On Fri, 24 Dec 2021 at 07:10, cy <ca...@126.com> wrote:

Hi
Yes, data income from a topic every five minutes with same value of datatime like this and value changed in next five minute period. In each period datatime stays same value.













At 2021-12-24 12:22:11, "Martijn Visser" <ma...@ververica.com> wrote:

I agree that it is suspicious, but I would like to understand better how your incoming data looks like, especially the values for datatime. Do these values change or do they always stay the same value? 


On Fri, 24 Dec 2021 at 02:44, cy <ca...@126.com> wrote:

Hi
picture1: my schema
sql:
SELECT window_start, window_end, COUNT(*)
FROM TABLE(
    TUMBLE(TABLE `origin`.`queue_3_ads_ccops_sdn_vrs_result`, DESCRIPTOR(datatime), INTERVAL '5' MINUTES)
) GROUP BY window_start, window_end;
picture2: result on 1.13.5
picure3: result on 1.14.2




Need your help, thank you.







At 2021-12-23 18:27:32, "Martijn Visser" <ma...@ververica.com> wrote:

Hi,


Based on the screenshot of your source data, all events have the same value for `datatime`. Is that indeed correct? 


Best regards,


Martijn


On Thu, 23 Dec 2021 at 10:16, cy <ca...@126.com> wrote:

Hi,


I execute the sql on 1.13.5 and the result is correct but no result show on 1.14.0 and 1.14.2. 
sql:
SELECT window_start, window_end, COUNT(*)
FROM TABLE(
    TUMBLE(TABLE `origin`.`queue_3_ads_ccops_sdn_vrs_result`, DESCRIPTOR(datatime), INTERVAL '5' MINUTES)
) GROUP BY window_start, window_end;


watermark definition is
watermark for `datatime` as `datatime` - interval '5' minutes


so I don't know what's wrong about it.







At 2021-12-23 17:06:35, "Yun Gao" <yu...@aliyun.com> wrote:

Sorry I mean 16:00:05, but it should be similar.


------------------Original Mail ------------------

Sender:Yun Gao <yu...@aliyun.com>
Send Date:Thu Dec 23 17:05:33 2021
Recipients:cy <ca...@126.com>
CC:'user@flink.apache.org' <us...@flink.apache.org>
Subject:Re: Re:Re: Window Aggregation and Window Join ability not work properly
Hi Caiyi,


I think if the image shows all the records, after the change we should only have
the watermark at 16:05, which is still not be able to trigger the window of 5 minutes?


Best,
Yun




------------------Original Mail ------------------
Sender:cy <ca...@126.com>
Send Date:Thu Dec 23 15:44:23 2021
Recipients:Yun Gao <yu...@aliyun.com>
CC:'user@flink.apache.org' <us...@flink.apache.org>
Subject:Re:Re: Window Aggregation and Window Join ability not work properly
I change to 
watermark for `datatime` as `datatime` - interval '1' second
or
watermark for `datatime` as `datatime`
but is still not work.
















At 2021-12-23 15:16:20, "Yun Gao" <yu...@aliyun.com> wrote:

Hi Caiyi,


The window need to be finalized with watermark[1]. I noticed that the watermark defined
is `datatime` - INTERVAL '5' MINUTE,  it means the watermark emitted would be the 
maximum observed timestamp so far minus 5 minutes [1]. Therefore, if we want to 
trigger the window of 16:00 ~ 16:05, there should be one record with datetime >= 16:10 
get processed, then the source would emit the watermark >= 16:05 to finalize the window,
otherwise there would be no output due to no window is finalized.


Best,
Yun




[1] https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/table/sql/create/#watermark
------------------Original Mail ------------------
Sender:cy <ca...@126.com>
Send Date:Thu Dec 23 09:52:40 2021
Recipients:'user@flink.apache.org' <us...@flink.apache.org>
Subject:Window Aggregation and Window Join ability not work properly

Hi
Flink 1.14.2Scala 2.12


I'm using flink sql window aggregation and window join ability,  I write the sql as documentation said but is not work. Here is my schema and sql
schema:


sql;
SELECT window_start, window_end, COUNT(*)
FROM TABLE(
    TUMBLE(TABLE `origin`.`queue_3_ads_ccops_sdn_vrs_result`, DESCRIPTOR(datatime), INTERVAL '5' MINUTES))
GROUP BY window_start, window_end;


result:


data:


Similar with window join ability.


So is anything wrong with this two abilities?


Need your help, thank you.








 





 





 





 





 





 





 





 





 

Re: Re: Re: Re: Re: Re:Re: Window Aggregation and Window Join ability not work properly

Posted by Martijn Visser <ma...@ververica.com>.
Hi,

If the datatime value stays the same, the watermark doesn't progress. See
[1] specifically "If the current watermark is still identical to the
previous one, or is null, or the value of the returned watermark is smaller
than that of the last emitted one, then no new watermark will be emitted."

I'm wondering if in your previous Flink version you've changed settings to
either auto-generate watermark in case there's no progression or some of
the other settings which might explain the difference in behaviour between
the two Flink versions.

Best regards,

Martijn

[1]
https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/create/#watermark


On Fri, 24 Dec 2021 at 07:10, cy <ca...@126.com> wrote:

> Hi
> Yes, data income from a topic every five minutes with same value of
> datatime like this and value changed in next five minute period. In each
> period datatime stays same value.
>
>
>
>
>
> At 2021-12-24 12:22:11, "Martijn Visser" <ma...@ververica.com> wrote:
>
> I agree that it is suspicious, but I would like to understand better how
> your incoming data looks like, especially the values for datatime. Do these
> values change or do they always stay the same value?
>
> On Fri, 24 Dec 2021 at 02:44, cy <ca...@126.com> wrote:
>
>> Hi
>> picture1: my schema
>> sql:
>> SELECT window_start, window_end, COUNT(*)
>> FROM TABLE(
>>     TUMBLE(TABLE `origin`.`queue_3_ads_ccops_sdn_vrs_result`,
>> DESCRIPTOR(datatime), INTERVAL '5' MINUTES)
>> ) GROUP BY window_start, window_end;
>> picture2: result on 1.13.5
>> picure3: result on 1.14.2
>>
>>
>> Need your help, thank you.
>>
>>
>>
>> At 2021-12-23 18:27:32, "Martijn Visser" <ma...@ververica.com> wrote:
>>
>> Hi,
>>
>> Based on the screenshot of your source data, all events have the same
>> value for `datatime`. Is that indeed correct?
>>
>> Best regards,
>>
>> Martijn
>>
>> On Thu, 23 Dec 2021 at 10:16, cy <ca...@126.com> wrote:
>>
>>> Hi,
>>>
>>> I execute the sql on 1.13.5 and the result is correct but no result show
>>> on 1.14.0 and 1.14.2.
>>> sql:
>>> SELECT window_start, window_end, COUNT(*)
>>> FROM TABLE(
>>>     TUMBLE(TABLE `origin`.`queue_3_ads_ccops_sdn_vrs_result`,
>>> DESCRIPTOR(datatime), INTERVAL '5' MINUTES)
>>> ) GROUP BY window_start, window_end;
>>>
>>> watermark definition is
>>> watermark for `datatime` as `datatime` - interval '5' minutes
>>>
>>> so I don't know what's wrong about it.
>>>
>>>
>>>
>>> At 2021-12-23 17:06:35, "Yun Gao" <yu...@aliyun.com> wrote:
>>>
>>> Sorry I mean 16:00:05, but it should be similar.
>>>
>>> ------------------Original Mail ------------------
>>>
>>> *Sender:*Yun Gao <yu...@aliyun.com>
>>> *Send Date:*Thu Dec 23 17:05:33 2021
>>> *Recipients:*cy <ca...@126.com>
>>> *CC:*'user@flink.apache.org' <us...@flink.apache.org>
>>> *Subject:*Re: Re:Re: Window Aggregation and Window Join ability not
>>> work properly
>>>
>>>> Hi Caiyi,
>>>>
>>>> I think if the image shows all the records, after the change we should
>>>> only have
>>>> the watermark at 16:05, which is still not be able to trigger the
>>>> window of 5 minutes?
>>>>
>>>> Best,
>>>> Yun
>>>>
>>>>
>>>> ------------------Original Mail ------------------
>>>> *Sender:*cy <ca...@126.com>
>>>> *Send Date:*Thu Dec 23 15:44:23 2021
>>>> *Recipients:*Yun Gao <yu...@aliyun.com>
>>>> *CC:*'user@flink.apache.org' <us...@flink.apache.org>
>>>> *Subject:*Re:Re: Window Aggregation and Window Join ability not work
>>>> properly
>>>>
>>>>> I change to
>>>>> watermark for `datatime` as `datatime` - interval '1' second
>>>>> or
>>>>> watermark for `datatime` as `datatime`
>>>>> but is still not work.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> At 2021-12-23 15:16:20, "Yun Gao" <yu...@aliyun.com> wrote:
>>>>>
>>>>> Hi Caiyi,
>>>>>
>>>>> The window need to be finalized with watermark[1]. I noticed that the
>>>>> watermark defined
>>>>> is `datatime` - INTERVAL '5' MINUTE,  it means the watermark emitted
>>>>> would be the
>>>>> maximum observed timestamp so far minus 5 minutes [1]. Therefore, if
>>>>> we want to
>>>>> trigger the window of 16:00 ~ 16:05, there should be one record with
>>>>> datetime >= 16:10
>>>>> get processed, then the source would emit the watermark >= 16:05 to
>>>>> finalize the window,
>>>>> otherwise there would be no output due to no window is finalized.
>>>>>
>>>>> Best,
>>>>> Yun
>>>>>
>>>>>
>>>>> [1]
>>>>> https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/table/sql/create/#watermark
>>>>>
>>>>> ------------------Original Mail ------------------
>>>>> *Sender:*cy <ca...@126.com>
>>>>> *Send Date:*Thu Dec 23 09:52:40 2021
>>>>> *Recipients:*'user@flink.apache.org' <us...@flink.apache.org>
>>>>> *Subject:*Window Aggregation and Window Join ability not work properly
>>>>>
>>>>>> Hi
>>>>>> Flink 1.14.2Scala 2.12
>>>>>>
>>>>>> I'm using flink sql window aggregation and window join ability,  I
>>>>>> write the sql as documentation said but is not work. Here is my schema and
>>>>>> sql
>>>>>> schema:
>>>>>>
>>>>>> sql;
>>>>>> SELECT window_start, window_end, COUNT(*)
>>>>>> FROM TABLE(
>>>>>>     TUMBLE(TABLE `origin`.`queue_3_ads_ccops_sdn_vrs_result`,
>>>>>> DESCRIPTOR(datatime), INTERVAL '5' MINUTES))
>>>>>> GROUP BY window_start, window_end;
>>>>>>
>>>>>> result:
>>>>>>
>>>>>> data:
>>>>>>
>>>>>> Similar with window join ability.
>>>>>>
>>>>>> So is anything wrong with this two abilities?
>>>>>>
>>>>>> Need your help, thank you.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>>
>>
>>
>>
>>
>
>
>
>
>
>
>

Re:Re: Re: Re: Re: Re:Re: Window Aggregation and Window Join ability not work properly

Posted by cy <ca...@126.com>.
Hi
Yes, data income from a topic every five minutes with same value of datatime like this and value changed in next five minute period. In each period datatime stays same value.













At 2021-12-24 12:22:11, "Martijn Visser" <ma...@ververica.com> wrote:

I agree that it is suspicious, but I would like to understand better how your incoming data looks like, especially the values for datatime. Do these values change or do they always stay the same value? 


On Fri, 24 Dec 2021 at 02:44, cy <ca...@126.com> wrote:

Hi
picture1: my schema
sql:
SELECT window_start, window_end, COUNT(*)
FROM TABLE(
    TUMBLE(TABLE `origin`.`queue_3_ads_ccops_sdn_vrs_result`, DESCRIPTOR(datatime), INTERVAL '5' MINUTES)
) GROUP BY window_start, window_end;
picture2: result on 1.13.5
picure3: result on 1.14.2




Need your help, thank you.







At 2021-12-23 18:27:32, "Martijn Visser" <ma...@ververica.com> wrote:

Hi,


Based on the screenshot of your source data, all events have the same value for `datatime`. Is that indeed correct? 


Best regards,


Martijn


On Thu, 23 Dec 2021 at 10:16, cy <ca...@126.com> wrote:

Hi,


I execute the sql on 1.13.5 and the result is correct but no result show on 1.14.0 and 1.14.2. 
sql:
SELECT window_start, window_end, COUNT(*)
FROM TABLE(
    TUMBLE(TABLE `origin`.`queue_3_ads_ccops_sdn_vrs_result`, DESCRIPTOR(datatime), INTERVAL '5' MINUTES)
) GROUP BY window_start, window_end;


watermark definition is
watermark for `datatime` as `datatime` - interval '5' minutes


so I don't know what's wrong about it.







At 2021-12-23 17:06:35, "Yun Gao" <yu...@aliyun.com> wrote:

Sorry I mean 16:00:05, but it should be similar.


------------------Original Mail ------------------

Sender:Yun Gao <yu...@aliyun.com>
Send Date:Thu Dec 23 17:05:33 2021
Recipients:cy <ca...@126.com>
CC:'user@flink.apache.org' <us...@flink.apache.org>
Subject:Re: Re:Re: Window Aggregation and Window Join ability not work properly
Hi Caiyi,


I think if the image shows all the records, after the change we should only have
the watermark at 16:05, which is still not be able to trigger the window of 5 minutes?


Best,
Yun




------------------Original Mail ------------------
Sender:cy <ca...@126.com>
Send Date:Thu Dec 23 15:44:23 2021
Recipients:Yun Gao <yu...@aliyun.com>
CC:'user@flink.apache.org' <us...@flink.apache.org>
Subject:Re:Re: Window Aggregation and Window Join ability not work properly
I change to 
watermark for `datatime` as `datatime` - interval '1' second
or
watermark for `datatime` as `datatime`
but is still not work.
















At 2021-12-23 15:16:20, "Yun Gao" <yu...@aliyun.com> wrote:

Hi Caiyi,


The window need to be finalized with watermark[1]. I noticed that the watermark defined
is `datatime` - INTERVAL '5' MINUTE,  it means the watermark emitted would be the 
maximum observed timestamp so far minus 5 minutes [1]. Therefore, if we want to 
trigger the window of 16:00 ~ 16:05, there should be one record with datetime >= 16:10 
get processed, then the source would emit the watermark >= 16:05 to finalize the window,
otherwise there would be no output due to no window is finalized.


Best,
Yun




[1] https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/table/sql/create/#watermark
------------------Original Mail ------------------
Sender:cy <ca...@126.com>
Send Date:Thu Dec 23 09:52:40 2021
Recipients:'user@flink.apache.org' <us...@flink.apache.org>
Subject:Window Aggregation and Window Join ability not work properly

Hi
Flink 1.14.2Scala 2.12


I'm using flink sql window aggregation and window join ability,  I write the sql as documentation said but is not work. Here is my schema and sql
schema:


sql;
SELECT window_start, window_end, COUNT(*)
FROM TABLE(
    TUMBLE(TABLE `origin`.`queue_3_ads_ccops_sdn_vrs_result`, DESCRIPTOR(datatime), INTERVAL '5' MINUTES))
GROUP BY window_start, window_end;


result:


data:


Similar with window join ability.


So is anything wrong with this two abilities?


Need your help, thank you.








 





 





 





 





 





 





 





 

Re: Re: Re: Re: Re:Re: Window Aggregation and Window Join ability not work properly

Posted by Martijn Visser <ma...@ververica.com>.
I agree that it is suspicious, but I would like to understand better how
your incoming data looks like, especially the values for datatime. Do these
values change or do they always stay the same value?

On Fri, 24 Dec 2021 at 02:44, cy <ca...@126.com> wrote:

> Hi
> picture1: my schema
> sql:
> SELECT window_start, window_end, COUNT(*)
> FROM TABLE(
>     TUMBLE(TABLE `origin`.`queue_3_ads_ccops_sdn_vrs_result`,
> DESCRIPTOR(datatime), INTERVAL '5' MINUTES)
> ) GROUP BY window_start, window_end;
> picture2: result on 1.13.5
> picure3: result on 1.14.2
>
>
> Need your help, thank you.
>
>
>
> At 2021-12-23 18:27:32, "Martijn Visser" <ma...@ververica.com> wrote:
>
> Hi,
>
> Based on the screenshot of your source data, all events have the same
> value for `datatime`. Is that indeed correct?
>
> Best regards,
>
> Martijn
>
> On Thu, 23 Dec 2021 at 10:16, cy <ca...@126.com> wrote:
>
>> Hi,
>>
>> I execute the sql on 1.13.5 and the result is correct but no result show
>> on 1.14.0 and 1.14.2.
>> sql:
>> SELECT window_start, window_end, COUNT(*)
>> FROM TABLE(
>>     TUMBLE(TABLE `origin`.`queue_3_ads_ccops_sdn_vrs_result`,
>> DESCRIPTOR(datatime), INTERVAL '5' MINUTES)
>> ) GROUP BY window_start, window_end;
>>
>> watermark definition is
>> watermark for `datatime` as `datatime` - interval '5' minutes
>>
>> so I don't know what's wrong about it.
>>
>>
>>
>> At 2021-12-23 17:06:35, "Yun Gao" <yu...@aliyun.com> wrote:
>>
>> Sorry I mean 16:00:05, but it should be similar.
>>
>> ------------------Original Mail ------------------
>>
>> *Sender:*Yun Gao <yu...@aliyun.com>
>> *Send Date:*Thu Dec 23 17:05:33 2021
>> *Recipients:*cy <ca...@126.com>
>> *CC:*'user@flink.apache.org' <us...@flink.apache.org>
>> *Subject:*Re: Re:Re: Window Aggregation and Window Join ability not work
>> properly
>>
>>> Hi Caiyi,
>>>
>>> I think if the image shows all the records, after the change we should
>>> only have
>>> the watermark at 16:05, which is still not be able to trigger the window
>>> of 5 minutes?
>>>
>>> Best,
>>> Yun
>>>
>>>
>>> ------------------Original Mail ------------------
>>> *Sender:*cy <ca...@126.com>
>>> *Send Date:*Thu Dec 23 15:44:23 2021
>>> *Recipients:*Yun Gao <yu...@aliyun.com>
>>> *CC:*'user@flink.apache.org' <us...@flink.apache.org>
>>> *Subject:*Re:Re: Window Aggregation and Window Join ability not work
>>> properly
>>>
>>>> I change to
>>>> watermark for `datatime` as `datatime` - interval '1' second
>>>> or
>>>> watermark for `datatime` as `datatime`
>>>> but is still not work.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> At 2021-12-23 15:16:20, "Yun Gao" <yu...@aliyun.com> wrote:
>>>>
>>>> Hi Caiyi,
>>>>
>>>> The window need to be finalized with watermark[1]. I noticed that the
>>>> watermark defined
>>>> is `datatime` - INTERVAL '5' MINUTE,  it means the watermark emitted
>>>> would be the
>>>> maximum observed timestamp so far minus 5 minutes [1]. Therefore, if we
>>>> want to
>>>> trigger the window of 16:00 ~ 16:05, there should be one record with
>>>> datetime >= 16:10
>>>> get processed, then the source would emit the watermark >= 16:05 to
>>>> finalize the window,
>>>> otherwise there would be no output due to no window is finalized.
>>>>
>>>> Best,
>>>> Yun
>>>>
>>>>
>>>> [1]
>>>> https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/table/sql/create/#watermark
>>>>
>>>> ------------------Original Mail ------------------
>>>> *Sender:*cy <ca...@126.com>
>>>> *Send Date:*Thu Dec 23 09:52:40 2021
>>>> *Recipients:*'user@flink.apache.org' <us...@flink.apache.org>
>>>> *Subject:*Window Aggregation and Window Join ability not work properly
>>>>
>>>>> Hi
>>>>> Flink 1.14.2Scala 2.12
>>>>>
>>>>> I'm using flink sql window aggregation and window join ability,  I
>>>>> write the sql as documentation said but is not work. Here is my schema and
>>>>> sql
>>>>> schema:
>>>>>
>>>>> sql;
>>>>> SELECT window_start, window_end, COUNT(*)
>>>>> FROM TABLE(
>>>>>     TUMBLE(TABLE `origin`.`queue_3_ads_ccops_sdn_vrs_result`,
>>>>> DESCRIPTOR(datatime), INTERVAL '5' MINUTES))
>>>>> GROUP BY window_start, window_end;
>>>>>
>>>>> result:
>>>>>
>>>>> data:
>>>>>
>>>>> Similar with window join ability.
>>>>>
>>>>> So is anything wrong with this two abilities?
>>>>>
>>>>> Need your help, thank you.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>
>>
>>
>
>
>
>

Re:Re: Re: Re: Re:Re: Window Aggregation and Window Join ability not work properly

Posted by cy <ca...@126.com>.
Hi
picture1: my schema
sql:
SELECT window_start, window_end, COUNT(*)
FROM TABLE(
    TUMBLE(TABLE `origin`.`queue_3_ads_ccops_sdn_vrs_result`, DESCRIPTOR(datatime), INTERVAL '5' MINUTES)
) GROUP BY window_start, window_end;
picture2: result on 1.13.5
picure3: result on 1.14.2




Need your help, thank you.







At 2021-12-23 18:27:32, "Martijn Visser" <ma...@ververica.com> wrote:

Hi,


Based on the screenshot of your source data, all events have the same value for `datatime`. Is that indeed correct? 


Best regards,


Martijn


On Thu, 23 Dec 2021 at 10:16, cy <ca...@126.com> wrote:

Hi,


I execute the sql on 1.13.5 and the result is correct but no result show on 1.14.0 and 1.14.2. 
sql:
SELECT window_start, window_end, COUNT(*)
FROM TABLE(
    TUMBLE(TABLE `origin`.`queue_3_ads_ccops_sdn_vrs_result`, DESCRIPTOR(datatime), INTERVAL '5' MINUTES)
) GROUP BY window_start, window_end;


watermark definition is
watermark for `datatime` as `datatime` - interval '5' minutes


so I don't know what's wrong about it.







At 2021-12-23 17:06:35, "Yun Gao" <yu...@aliyun.com> wrote:

Sorry I mean 16:00:05, but it should be similar.


------------------Original Mail ------------------

Sender:Yun Gao <yu...@aliyun.com>
Send Date:Thu Dec 23 17:05:33 2021
Recipients:cy <ca...@126.com>
CC:'user@flink.apache.org' <us...@flink.apache.org>
Subject:Re: Re:Re: Window Aggregation and Window Join ability not work properly
Hi Caiyi,


I think if the image shows all the records, after the change we should only have
the watermark at 16:05, which is still not be able to trigger the window of 5 minutes?


Best,
Yun




------------------Original Mail ------------------
Sender:cy <ca...@126.com>
Send Date:Thu Dec 23 15:44:23 2021
Recipients:Yun Gao <yu...@aliyun.com>
CC:'user@flink.apache.org' <us...@flink.apache.org>
Subject:Re:Re: Window Aggregation and Window Join ability not work properly
I change to 
watermark for `datatime` as `datatime` - interval '1' second
or
watermark for `datatime` as `datatime`
but is still not work.
















At 2021-12-23 15:16:20, "Yun Gao" <yu...@aliyun.com> wrote:

Hi Caiyi,


The window need to be finalized with watermark[1]. I noticed that the watermark defined
is `datatime` - INTERVAL '5' MINUTE,  it means the watermark emitted would be the 
maximum observed timestamp so far minus 5 minutes [1]. Therefore, if we want to 
trigger the window of 16:00 ~ 16:05, there should be one record with datetime >= 16:10 
get processed, then the source would emit the watermark >= 16:05 to finalize the window,
otherwise there would be no output due to no window is finalized.


Best,
Yun




[1] https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/table/sql/create/#watermark
------------------Original Mail ------------------
Sender:cy <ca...@126.com>
Send Date:Thu Dec 23 09:52:40 2021
Recipients:'user@flink.apache.org' <us...@flink.apache.org>
Subject:Window Aggregation and Window Join ability not work properly

Hi
Flink 1.14.2Scala 2.12


I'm using flink sql window aggregation and window join ability,  I write the sql as documentation said but is not work. Here is my schema and sql
schema:


sql;
SELECT window_start, window_end, COUNT(*)
FROM TABLE(
    TUMBLE(TABLE `origin`.`queue_3_ads_ccops_sdn_vrs_result`, DESCRIPTOR(datatime), INTERVAL '5' MINUTES))
GROUP BY window_start, window_end;


result:


data:


Similar with window join ability.


So is anything wrong with this two abilities?


Need your help, thank you.








 





 





 





 





 





 

Re: Re: Re: Re:Re: Window Aggregation and Window Join ability not work properly

Posted by Martijn Visser <ma...@ververica.com>.
Hi,

Based on the screenshot of your source data, all events have the same value
for `datatime`. Is that indeed correct?

Best regards,

Martijn

On Thu, 23 Dec 2021 at 10:16, cy <ca...@126.com> wrote:

> Hi,
>
> I execute the sql on 1.13.5 and the result is correct but no result show
> on 1.14.0 and 1.14.2.
> sql:
> SELECT window_start, window_end, COUNT(*)
> FROM TABLE(
>     TUMBLE(TABLE `origin`.`queue_3_ads_ccops_sdn_vrs_result`,
> DESCRIPTOR(datatime), INTERVAL '5' MINUTES)
> ) GROUP BY window_start, window_end;
>
> watermark definition is
> watermark for `datatime` as `datatime` - interval '5' minutes
>
> so I don't know what's wrong about it.
>
>
>
> At 2021-12-23 17:06:35, "Yun Gao" <yu...@aliyun.com> wrote:
>
> Sorry I mean 16:00:05, but it should be similar.
>
> ------------------Original Mail ------------------
>
> *Sender:*Yun Gao <yu...@aliyun.com>
> *Send Date:*Thu Dec 23 17:05:33 2021
> *Recipients:*cy <ca...@126.com>
> *CC:*'user@flink.apache.org' <us...@flink.apache.org>
> *Subject:*Re: Re:Re: Window Aggregation and Window Join ability not work
> properly
>
>> Hi Caiyi,
>>
>> I think if the image shows all the records, after the change we should
>> only have
>> the watermark at 16:05, which is still not be able to trigger the window
>> of 5 minutes?
>>
>> Best,
>> Yun
>>
>>
>> ------------------Original Mail ------------------
>> *Sender:*cy <ca...@126.com>
>> *Send Date:*Thu Dec 23 15:44:23 2021
>> *Recipients:*Yun Gao <yu...@aliyun.com>
>> *CC:*'user@flink.apache.org' <us...@flink.apache.org>
>> *Subject:*Re:Re: Window Aggregation and Window Join ability not work
>> properly
>>
>>> I change to
>>> watermark for `datatime` as `datatime` - interval '1' second
>>> or
>>> watermark for `datatime` as `datatime`
>>> but is still not work.
>>>
>>>
>>>
>>>
>>>
>>>
>>> At 2021-12-23 15:16:20, "Yun Gao" <yu...@aliyun.com> wrote:
>>>
>>> Hi Caiyi,
>>>
>>> The window need to be finalized with watermark[1]. I noticed that the
>>> watermark defined
>>> is `datatime` - INTERVAL '5' MINUTE,  it means the watermark emitted
>>> would be the
>>> maximum observed timestamp so far minus 5 minutes [1]. Therefore, if we
>>> want to
>>> trigger the window of 16:00 ~ 16:05, there should be one record with
>>> datetime >= 16:10
>>> get processed, then the source would emit the watermark >= 16:05 to
>>> finalize the window,
>>> otherwise there would be no output due to no window is finalized.
>>>
>>> Best,
>>> Yun
>>>
>>>
>>> [1]
>>> https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/table/sql/create/#watermark
>>>
>>> ------------------Original Mail ------------------
>>> *Sender:*cy <ca...@126.com>
>>> *Send Date:*Thu Dec 23 09:52:40 2021
>>> *Recipients:*'user@flink.apache.org' <us...@flink.apache.org>
>>> *Subject:*Window Aggregation and Window Join ability not work properly
>>>
>>>> Hi
>>>> Flink 1.14.2Scala 2.12
>>>>
>>>> I'm using flink sql window aggregation and window join ability,  I
>>>> write the sql as documentation said but is not work. Here is my schema and
>>>> sql
>>>> schema:
>>>>
>>>> sql;
>>>> SELECT window_start, window_end, COUNT(*)
>>>> FROM TABLE(
>>>>     TUMBLE(TABLE `origin`.`queue_3_ads_ccops_sdn_vrs_result`,
>>>> DESCRIPTOR(datatime), INTERVAL '5' MINUTES))
>>>> GROUP BY window_start, window_end;
>>>>
>>>> result:
>>>>
>>>> data:
>>>>
>>>> Similar with window join ability.
>>>>
>>>> So is anything wrong with this two abilities?
>>>>
>>>> Need your help, thank you.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>>
>>
>
>
>

Re:Re: Re: Re:Re: Window Aggregation and Window Join ability not work properly

Posted by cy <ca...@126.com>.
Hi,


I execute the sql on 1.13.5 and the result is correct but no result show on 1.14.0 and 1.14.2. 
sql:
SELECT window_start, window_end, COUNT(*)
FROM TABLE(
    TUMBLE(TABLE `origin`.`queue_3_ads_ccops_sdn_vrs_result`, DESCRIPTOR(datatime), INTERVAL '5' MINUTES)
) GROUP BY window_start, window_end;


watermark definition is
watermark for `datatime` as `datatime` - interval '5' minutes


so I don't know what's wrong about it.







At 2021-12-23 17:06:35, "Yun Gao" <yu...@aliyun.com> wrote:

Sorry I mean 16:00:05, but it should be similar.


------------------Original Mail ------------------

Sender:Yun Gao <yu...@aliyun.com>
Send Date:Thu Dec 23 17:05:33 2021
Recipients:cy <ca...@126.com>
CC:'user@flink.apache.org' <us...@flink.apache.org>
Subject:Re: Re:Re: Window Aggregation and Window Join ability not work properly
Hi Caiyi,


I think if the image shows all the records, after the change we should only have
the watermark at 16:05, which is still not be able to trigger the window of 5 minutes?


Best,
Yun




------------------Original Mail ------------------
Sender:cy <ca...@126.com>
Send Date:Thu Dec 23 15:44:23 2021
Recipients:Yun Gao <yu...@aliyun.com>
CC:'user@flink.apache.org' <us...@flink.apache.org>
Subject:Re:Re: Window Aggregation and Window Join ability not work properly
I change to 
watermark for `datatime` as `datatime` - interval '1' second
or
watermark for `datatime` as `datatime`
but is still not work.
















At 2021-12-23 15:16:20, "Yun Gao" <yu...@aliyun.com> wrote:

Hi Caiyi,


The window need to be finalized with watermark[1]. I noticed that the watermark defined
is `datatime` - INTERVAL '5' MINUTE,  it means the watermark emitted would be the 
maximum observed timestamp so far minus 5 minutes [1]. Therefore, if we want to 
trigger the window of 16:00 ~ 16:05, there should be one record with datetime >= 16:10 
get processed, then the source would emit the watermark >= 16:05 to finalize the window,
otherwise there would be no output due to no window is finalized.


Best,
Yun




[1] https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/table/sql/create/#watermark
------------------Original Mail ------------------
Sender:cy <ca...@126.com>
Send Date:Thu Dec 23 09:52:40 2021
Recipients:'user@flink.apache.org' <us...@flink.apache.org>
Subject:Window Aggregation and Window Join ability not work properly

Hi
Flink 1.14.2Scala 2.12


I'm using flink sql window aggregation and window join ability,  I write the sql as documentation said but is not work. Here is my schema and sql
schema:


sql;
SELECT window_start, window_end, COUNT(*)
FROM TABLE(
    TUMBLE(TABLE `origin`.`queue_3_ads_ccops_sdn_vrs_result`, DESCRIPTOR(datatime), INTERVAL '5' MINUTES))
GROUP BY window_start, window_end;


result:


data:


Similar with window join ability.


So is anything wrong with this two abilities?


Need your help, thank you.








 





 





 





 





 

Re: Re: Re:Re: Window Aggregation and Window Join ability not work properly

Posted by Yun Gao <yu...@aliyun.com>.
Sorry I mean 16:00:05, but it should be similar.

------------------Original Mail ------------------

Sender:Yun Gao <yu...@aliyun.com>
Send Date:Thu Dec 23 17:05:33 2021
Recipients:cy <ca...@126.com>
CC:'user@flink.apache.org' <us...@flink.apache.org>
Subject:Re: Re:Re: Window Aggregation and Window Join ability not work properly

Hi Caiyi,

I think if the image shows all the records, after the change we should only have
the watermark at 16:05, which is still not be able to trigger the window of 5 minutes?

Best,
Yun



 ------------------Original Mail ------------------
Sender:cy <ca...@126.com>
Send Date:Thu Dec 23 15:44:23 2021
Recipients:Yun Gao <yu...@aliyun.com>
CC:'user@flink.apache.org' <us...@flink.apache.org>
Subject:Re:Re: Window Aggregation and Window Join ability not work properly

I change to 
watermark for `datatime` as `datatime` - interval '1' second
or
watermark for `datatime` as `datatime`
but is still not work.





At 2021-12-23 15:16:20, "Yun Gao" <yu...@aliyun.com> wrote:
Hi Caiyi,

The window need to be finalized with watermark[1]. I noticed that the watermark defined
 is `datatime` - INTERVAL '5' MINUTE,  it means the watermark emitted would be the 
maximum observed timestamp so far minus 5 minutes [1]. Therefore, if we want to 
trigger the window of 16:00 ~ 16:05, there should be one record with datetime >= 16:10 
get processed, then the source would emit the watermark >= 16:05 to finalize the window,
otherwise there would be no output due to no window is finalized.

Best,
Yun


[1] https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/table/sql/create/#watermark
 ------------------Original Mail ------------------
Sender:cy <ca...@126.com>
Send Date:Thu Dec 23 09:52:40 2021
Recipients:'user@flink.apache.org' <us...@flink.apache.org>
Subject:Window Aggregation and Window Join ability not work properly

Hi
Flink 1.14.2Scala 2.12

I'm using flink sql window aggregation and window join ability,  I write the sql as documentation said but is not work. Here is my schema and sql
schema:


sql;
SELECT window_start, window_end, COUNT(*)
FROM TABLE(
    TUMBLE(TABLE `origin`.`queue_3_ads_ccops_sdn_vrs_result`, DESCRIPTOR(datatime), INTERVAL '5' MINUTES))
GROUP BY window_start, window_end;

result:


data:



Similar with window join ability.

So is anything wrong with this two abilities?

Need your help, thank you.













Re: Re:Re: Window Aggregation and Window Join ability not work properly

Posted by Yun Gao <yu...@aliyun.com>.
Hi Caiyi,

I think if the image shows all the records, after the change we should only have
the watermark at 16:05, which is still not be able to trigger the window of 5 minutes?

Best,
Yun



 ------------------Original Mail ------------------
Sender:cy <ca...@126.com>
Send Date:Thu Dec 23 15:44:23 2021
Recipients:Yun Gao <yu...@aliyun.com>
CC:'user@flink.apache.org' <us...@flink.apache.org>
Subject:Re:Re: Window Aggregation and Window Join ability not work properly

I change to 
watermark for `datatime` as `datatime` - interval '1' second
or
watermark for `datatime` as `datatime`
but is still not work.





At 2021-12-23 15:16:20, "Yun Gao" <yu...@aliyun.com> wrote:
Hi Caiyi,

The window need to be finalized with watermark[1]. I noticed that the watermark defined
 is `datatime` - INTERVAL '5' MINUTE,  it means the watermark emitted would be the 
maximum observed timestamp so far minus 5 minutes [1]. Therefore, if we want to 
trigger the window of 16:00 ~ 16:05, there should be one record with datetime >= 16:10 
get processed, then the source would emit the watermark >= 16:05 to finalize the window,
otherwise there would be no output due to no window is finalized.

Best,
Yun


[1] https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/table/sql/create/#watermark
 ------------------Original Mail ------------------
Sender:cy <ca...@126.com>
Send Date:Thu Dec 23 09:52:40 2021
Recipients:'user@flink.apache.org' <us...@flink.apache.org>
Subject:Window Aggregation and Window Join ability not work properly

Hi
Flink 1.14.2Scala 2.12

I'm using flink sql window aggregation and window join ability,  I write the sql as documentation said but is not work. Here is my schema and sql
schema:


sql;
SELECT window_start, window_end, COUNT(*)
FROM TABLE(
    TUMBLE(TABLE `origin`.`queue_3_ads_ccops_sdn_vrs_result`, DESCRIPTOR(datatime), INTERVAL '5' MINUTES))
GROUP BY window_start, window_end;

result:


data:



Similar with window join ability.

So is anything wrong with this two abilities?

Need your help, thank you.













Re:Re: Window Aggregation and Window Join ability not work properly

Posted by cy <ca...@126.com>.
I change to 
watermark for `datatime` as `datatime` - interval '1' second
or
watermark for `datatime` as `datatime`
but is still not work.
















At 2021-12-23 15:16:20, "Yun Gao" <yu...@aliyun.com> wrote:

Hi Caiyi,


The window need to be finalized with watermark[1]. I noticed that the watermark defined
is `datatime` - INTERVAL '5' MINUTE,  it means the watermark emitted would be the 
maximum observed timestamp so far minus 5 minutes [1]. Therefore, if we want to 
trigger the window of 16:00 ~ 16:05, there should be one record with datetime >= 16:10 
get processed, then the source would emit the watermark >= 16:05 to finalize the window,
otherwise there would be no output due to no window is finalized.


Best,
Yun




[1] https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/table/sql/create/#watermark
------------------Original Mail ------------------
Sender:cy <ca...@126.com>
Send Date:Thu Dec 23 09:52:40 2021
Recipients:'user@flink.apache.org' <us...@flink.apache.org>
Subject:Window Aggregation and Window Join ability not work properly

Hi
Flink 1.14.2Scala 2.12


I'm using flink sql window aggregation and window join ability,  I write the sql as documentation said but is not work. Here is my schema and sql
schema:


sql;
SELECT window_start, window_end, COUNT(*)
FROM TABLE(
    TUMBLE(TABLE `origin`.`queue_3_ads_ccops_sdn_vrs_result`, DESCRIPTOR(datatime), INTERVAL '5' MINUTES))
GROUP BY window_start, window_end;


result:


data:


Similar with window join ability.


So is anything wrong with this two abilities?


Need your help, thank you.








 





 





 





 

Re: Window Aggregation and Window Join ability not work properly

Posted by Yun Gao <yu...@aliyun.com>.
Hi Caiyi,

The window need to be finalized with watermark[1]. I noticed that the watermark defined
 is `datatime` - INTERVAL '5' MINUTE,  it means the watermark emitted would be the 
maximum observed timestamp so far minus 5 minutes [1]. Therefore, if we want to 
trigger the window of 16:00 ~ 16:05, there should be one record with datetime >= 16:10 
get processed, then the source would emit the watermark >= 16:05 to finalize the window,
otherwise there would be no output due to no window is finalized.

Best,
Yun


[1] https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/table/sql/create/#watermark
 ------------------Original Mail ------------------
Sender:cy <ca...@126.com>
Send Date:Thu Dec 23 09:52:40 2021
Recipients:'user@flink.apache.org' <us...@flink.apache.org>
Subject:Window Aggregation and Window Join ability not work properly

Hi
Flink 1.14.2Scala 2.12

I'm using flink sql window aggregation and window join ability,  I write the sql as documentation said but is not work. Here is my schema and sql
schema:


sql;
SELECT window_start, window_end, COUNT(*)
FROM TABLE(
    TUMBLE(TABLE `origin`.`queue_3_ads_ccops_sdn_vrs_result`, DESCRIPTOR(datatime), INTERVAL '5' MINUTES))
GROUP BY window_start, window_end;

result:


data:



Similar with window join ability.

So is anything wrong with this two abilities?

Need your help, thank you.