You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2022/05/17 10:32:00 UTC
[jira] [Updated] (FLINK-27626) Introduce pre-aggregated merge to table store
[ https://issues.apache.org/jira/browse/FLINK-27626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated FLINK-27626:
-----------------------------------
Labels: pull-request-available (was: )
> Introduce pre-aggregated merge to table store
> ---------------------------------------------
>
> Key: FLINK-27626
> URL: https://issues.apache.org/jira/browse/FLINK-27626
> Project: Flink
> Issue Type: New Feature
> Components: Table Store
> Reporter: Jingsong Lee
> Priority: Major
> Labels: pull-request-available
>
> We can introduce richer merge strategies, one of which is already introduced is PartialUpdateMergeFunction, which completes non-NULL fields when merging. We can introduce more powerful merge strategies, such as support for pre-aggregated merges.
> Usage 1:
> CREATE TABLE T (
> pk STRING PRIMARY KEY NOT ENFOCED,
> sum_field1 BIGINT,
> sum_field1 BIGINT
> ) WITH (
> 'merge-engine' = 'aggregation',
> 'sum_field1.aggregate-function' = 'sum',
> 'sum_field2.aggregate-function' = 'sum'
> );
> INSERT INTO T VALUES ('pk1', 1, 1);
> INSERT INTO T VALUES ('pk1', 1, 1);
> SELECT * FROM T;
> => output 'pk1', 2, 2
> Usage 2:
> CREATE MATERIALIZED VIEW T
> with (
> 'merge-engine' = 'aggregation'
> ) AS SELECT
> pk,
> SUM(field1) AS sum_field1,
> SUM(field2) AS sum_field1
> FROM source_t
> GROUP BY pk ;
> This will start a stream job to synchronize data, consume source data, and write incrementally to T. This data synchronization job has no state.
--
This message was sent by Atlassian Jira
(v8.20.7#820007)