Posted to issues@flink.apache.org by "Shawn Liu (Jira)" <ji...@apache.org> on 2022/04/28 23:00:00 UTC

[jira] [Comment Edited] (FLINK-21301) Decouple window aggregate allow lateness with state ttl configuration

    [ https://issues.apache.org/jira/browse/FLINK-21301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17529678#comment-17529678 ] 

Shawn Liu edited comment on FLINK-21301 at 4/28/22 10:59 PM:
-------------------------------------------------------------

[~lzljs3620320] -What is the table config for late-fire? I didn't find it in the documentation. [https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/config/]-

-I also find it useful to have this allow-lateness config separate from the state TTL. For example, when I have a join and windowing in the same SQL script, I only want to set the state TTL for the join, not the allowed lateness for the windowing. I haven't found another way to do so without table.exec.emit.allow-lateness.-

I misunderstood the current solution. [https://www.mail-archive.com/issues@flink.apache.org/msg498605.html] provides a good summary.


was (Author: xiangcaohello):
[~lzljs3620320] What is the table config for late-fire? I didn't find it in the documentation. [https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/config/]

I also find it useful to have this allow-lateness config separate from the state TTL. For example, when I have a join and windowing in the same SQL script, I only want to set the state TTL for the join, not the allowed lateness for the windowing. I haven't found another way to do so without table.exec.emit.allow-lateness.

 

> Decouple window aggregate allow lateness with state ttl configuration
> ---------------------------------------------------------------------
>
>                 Key: FLINK-21301
>                 URL: https://issues.apache.org/jira/browse/FLINK-21301
>             Project: Flink
>          Issue Type: Bug
>          Components: Table SQL / API
>            Reporter: Jing Zhang
>            Assignee: Jing Zhang
>            Priority: Major
>              Labels: auto-unassigned, pull-request-available
>             Fix For: 1.14.0
>
>
> Currently, the state retention time config also affects the state cleanup behavior of Window Aggregate, which is unexpected for most users.
> For example, in the following query, a user would set `MinIdleStateRetentionTime` to 1 day to clean up state in `deduplicate`. However, this also affects the cleanup behavior of the window aggregate: data for 2021-01-04 would be cleaned up at 2021-01-06 instead of 2021-01-05.
> {code:sql}
> SELECT
>   DATE_FORMAT(TUMBLE_END(ROWTIME, INTERVAL '1' DAY), 'yyyy-MM-dd') AS stat_time,
>   COUNT(1) AS first_phone_num
> FROM (
>   SELECT
>     ROWTIME,
>     user_id,
>     ROW_NUMBER() OVER (PARTITION BY user_id, pdate ORDER BY ROWTIME) AS rn
>   FROM source_kafka_biz_shuidi_sdb_crm_call_record
> ) cal
> WHERE rn = 1
> GROUP BY TUMBLE(ROWTIME, INTERVAL '1' DAY);{code}
> It would be better to decouple the window aggregate's allowed lateness from `MinIdleStateRetentionTime`.
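
The cleanup dates in the description can be checked with simple date arithmetic. The following is a minimal Python sketch, not Flink code: it assumes that window state is cleaned up at window end plus allowed lateness, and that the coupled behavior amounts to reusing the 1-day retention time as the allowed lateness. The function names `tumble_window_end` and `cleanup_time` are hypothetical, and the 10:30 event time is only an illustrative value.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def tumble_window_end(event_time: datetime, size: timedelta) -> datetime:
    """Return the end of the epoch-aligned tumbling window containing event_time."""
    # Number of whole windows that fit between the epoch and the event.
    windows_elapsed = (event_time - EPOCH) // size
    window_start = EPOCH + windows_elapsed * size
    return window_start + size


def cleanup_time(event_time: datetime, size: timedelta,
                 allowed_lateness: timedelta) -> datetime:
    """Assumed model: state is dropped at window end plus allowed lateness."""
    return tumble_window_end(event_time, size) + allowed_lateness


one_day = timedelta(days=1)
event = datetime(2021, 1, 4, 10, 30)  # illustrative record from 2021-01-04

# Coupled: the 1-day state retention time is reused as allowed lateness.
coupled = cleanup_time(event, one_day, allowed_lateness=one_day)

# Decoupled: the window aggregate has no allowed lateness of its own.
decoupled = cleanup_time(event, one_day, allowed_lateness=timedelta(0))

print(coupled.date(), decoupled.date())
```

Under these assumptions, the 2021-01-04 window is cleaned up at 2021-01-06 when the retention time doubles as allowed lateness, and at 2021-01-05 when the two are decoupled, which matches the dates given in the description.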


