Posted to dev@flink.apache.org by Hannan Kan <ha...@foxmail.com> on 2022/08/02 15:41:09 UTC

Re: [Phishing Risk] [External] [DISCUSS] FLIP-255 Introduce pre-aggregated merge to Table Store

Thank you very much for reviewing FLIP-255.


## Aggregate functions
I partly refer to Apache Druid [1]. In Druid, each aggregate column is designated by a JSON/map string, such as '{"type": "count", "name": <output_name>}'. The "type" is the aggregate function and the "name" is the column name. In FLIP-255, I plan to use a map that holds every column whose aggregate function is given by the user, together with that function. I also considered an approach like `'fields.sum_field1.function'='sum'`, but I think it is less neat than a map when the number of aggregate columns gets larger.
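
To make the comparison concrete, here is a minimal Python sketch (not Table Store code) showing that the two option styles carry the same information. The function names `parse_map_option` and `parse_per_field_options` are hypothetical, and the map syntax is assumed to be exactly the `'{sum_field1:sum,max_field2:max}'` form quoted in this thread.

```python
def parse_map_option(value):
    """Parse a map-style option such as '{sum_field1:sum,max_field2:max}'."""
    body = value.strip()
    if body.startswith("{") and body.endswith("}"):
        body = body[1:-1]
    mapping = {}
    for entry in body.split(","):
        if not entry.strip():
            continue  # tolerate empty maps and trailing commas
        column, function = entry.split(":", 1)
        mapping[column.strip()] = function.strip()
    return mapping


def parse_per_field_options(options):
    """Collect options shaped like 'fields.<column>.function' = '<function>'."""
    prefix, suffix = "fields.", ".function"
    mapping = {}
    for key, function in options.items():
        has_column = len(key) > len(prefix) + len(suffix)
        if has_column and key.startswith(prefix) and key.endswith(suffix):
            column = key[len(prefix):-len(suffix)]
            mapping[column] = function
    return mapping


# Both styles describe the same column-to-function mapping.
map_style = parse_map_option("{sum_field1:sum,max_field2:max}")
per_field_style = parse_per_field_options({
    "fields.sum_field1.function": "sum",
    "fields.max_field2.function": "max",
})
```

The map style keeps everything in one option value, while the per-field style adds one option per aggregated column, which is the growth in option count mentioned above.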


## Default function
I do mean `replace`.
The other systems do not have exactly the same case.
In Doris, columns are either aggregate keys (like the primary key in our case) or columns to be aggregated [2].
In Druid, aggregate results are put in a new table, which means all columns have aggregate functions.
I think it is possible for a column to have no aggregate function, so I use `replace` as the default function.
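
The default behaviour can be sketched as follows. This is only an illustration in Python under assumed names (`AGGREGATORS`, `merge_rows`, and the sample columns are hypothetical), not the FLIP-255 implementation: two rows with the same primary key are merged column by column, and any column without a designated function falls back to `replace`.

```python
# Hypothetical registry of merge functions: (old value, new value) -> merged value.
AGGREGATORS = {
    "sum": lambda old, new: old + new,
    "max": lambda old, new: max(old, new),
    "replace": lambda old, new: new,  # the newest value overwrites the old one
}


def merge_rows(old_row, new_row, primary_keys, column_functions, default="replace"):
    """Merge two rows that share the same primary key, column by column."""
    merged = {}
    for column, new_value in new_row.items():
        if column in primary_keys:
            merged[column] = new_value  # key columns are identical in both rows
            continue
        # Columns without a designated function use the default.
        function = column_functions.get(column, default)
        merged[column] = AGGREGATORS[function](old_row[column], new_value)
    return merged


old = {"id": 1, "sum_field1": 10, "note": "first"}
new = {"id": 1, "sum_field1": 5, "note": "second"}
result = merge_rows(old, new, {"id"}, {"sum_field1": "sum"})
```

Here `sum_field1` is summed to 15, while `note`, which has no designated function, simply takes the newest value.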


## Supported functions
`replace` follows Doris [2]. To be honest, `replace_if_not_null` and `concatenate` are proposed based on imagined scenarios, and I have no other systems to refer to for them.
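
For discussion, one possible reading of these three functions is sketched below in Python. The null handling and the `","` delimiter used by `concatenate` are my assumptions for illustration only; the thread does not fix them.

```python
def replace(old, new):
    """Always take the newest value, even when it is null."""
    return new


def replace_if_not_null(old, new):
    """Take the newest value unless it is null, in which case keep the old one."""
    return old if new is None else new


def concatenate(old, new, delimiter=","):
    """Append the new value to the old one (the delimiter is an assumed default)."""
    if old is None:
        return new
    if new is None:
        return old
    return f"{old}{delimiter}{new}"
```

The difference between the first two only shows up when the incoming value is null: `replace("a", None)` yields `None`, while `replace_if_not_null("a", None)` keeps `"a"`.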


The responses above are my personal opinions. I look forward to your valuable advice.


[1] https://druid.apache.org/docs/latest/querying/aggregations.html
[2] https://doris.apache.org/zh-CN/docs/data-table/data-model/


------------------ Original Message ------------------
From: "dev" <jingsonglee0@gmail.com>;
Sent: Tuesday, August 2, 2022, 12:23 PM
To: "dev"<dev@flink.apache.org>;

Subject: Re: [Phishing Risk] [External] [DISCUSS] FLIP-255 Introduce pre-aggregated merge to Table Store



Thanks Nathan for starting this discussion.

This [1] is a very good requirement to build the materialized view on Flink
Table Store.

## Aggregate Functions
For 'aggregate-function' = '{sum_field1:sum,max_field2:max}'.
Do you refer to any other systems? For us at Flink, a viable approach is
something like the Datagen connector [2]. Something like
`'fields.sum_field1.function'='sum'`.

## Default function
>> Tips: Columns which do not have designated aggregate functions using
newest value to overwrite old value.
Do you mean `replace`?
Is there anything about default functions that other systems can refer to?

## Supported functions
I'm not quite sure that the names of these functions are standard enough:
`replace_if_not_null/replace/concatenate`. Can you look at other systems?
You can also specify whether they support retraction messages.

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-255+Introduce+pre-aggregated+merge+to+Table+Store
[2]
https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/connectors/table/datagen/#connector-options

Best,
Jingsong

On Tue, Aug 2, 2022 at 11:09 AM 李国君 <liguojun@bytedance.com> wrote:

> Hi Nathan,
>
> Seems a great proposal for table store aggregation.
> In the example, I think the 'max_field1' should be 1 instead of 2 after
> the max aggregation in the output result.
> And there may be a minor typo in the WITH clause, 'max_field2' ->
> 'max_field1'.
>
> Best,
> Guojun
>
> From: "Hannan Kan"<hannankan@foxmail.com>
> Date: Mon, Aug 1, 2022, 11:35 PM
> Subject: [Phishing Risk] [External] [DISCUSS] FLIP-255 Introduce
> pre-aggregated merge to Table Store
> To: "dev"<dev@flink.apache.org>
> Cc: "lzljs3620320"<lzljs3620320@apache.org>
> Hi everyone, I would like to open a discussion on FLIP-255 Introduce
> pre-aggregated merge to table store [1]. Pre-aggregation mechanism has
> been adopted by many big data systems (such as Apache Doris,
> Apache Kylin, Druid etc.) to save storage and accelerate the
> aggregate query. FLIP-255 proposes to introduce pre-aggregated merge into
> Flink Table Store to acquire the same benefit. Supported aggregate
> functions include sum, max/min, count, replace_if_not_null,
> replace, concatenate, or/and. Looking forward to your feedback.
> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-255+Introduce+pre-aggregated+merge+to+table+store
> Best, Nathan Kan (Hongnan Gan)
>