Posted to issues@flink.apache.org by "Jingsong Lee (Jira)" <ji...@apache.org> on 2022/05/16 03:41:00 UTC

[jira] [Created] (FLINK-27626) Introduce pre-aggregated merge to table store

Jingsong Lee created FLINK-27626:
------------------------------------

             Summary: Introduce pre-aggregated merge to table store
                 Key: FLINK-27626
                 URL: https://issues.apache.org/jira/browse/FLINK-27626
             Project: Flink
          Issue Type: New Feature
          Components: Table Store
            Reporter: Jingsong Lee


We can introduce richer merge strategies. One already exists: PartialUpdateMergeFunction, which fills in non-NULL fields when merging. We can add more powerful merge strategies, such as support for pre-aggregated merges.

Usage 1:
CREATE TABLE T (
    pk STRING PRIMARY KEY NOT ENFORCED,
    sum_field1 BIGINT,
    sum_field2 BIGINT
) WITH (
     'merge-engine' = 'aggregation',
     'sum_field1.aggregate-function' = 'sum',
     'sum_field2.aggregate-function' = 'sum'
);

INSERT INTO T VALUES ('pk1', 1, 1);
INSERT INTO T VALUES ('pk1', 1, 1);
SELECT * FROM T;
=> output 'pk1', 2, 2
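
The merge behind Usage 1 can be sketched outside of Flink. This is only an illustration of the intended semantics, not the table store's implementation: the names `AGGREGATES` and `merge_rows` are hypothetical, and only the 'sum' function named in this issue is modeled.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical registry of aggregate functions; only 'sum' appears in the issue.
AGGREGATES: Dict[str, Callable[[int, int], int]] = {
    "sum": lambda acc, value: acc + value,
}


def merge_rows(rows: List[Tuple], field_aggregates: List[str]) -> List[Tuple]:
    """Merge rows that share a primary key by folding each value field
    with the aggregate function configured for that field."""
    merged: Dict[str, list] = {}
    for pk, *values in rows:
        if pk not in merged:
            # First row seen for this key becomes the initial accumulator.
            merged[pk] = list(values)
            continue
        acc = merged[pk]
        for i, (name, value) in enumerate(zip(field_aggregates, values)):
            acc[i] = AGGREGATES[name](acc[i], value)
    return [(pk, *vals) for pk, vals in merged.items()]


# Mirrors the two INSERT statements above.
rows = [("pk1", 1, 1), ("pk1", 1, 1)]
print(merge_rows(rows, ["sum", "sum"]))  # [('pk1', 2, 2)]
```

The per-field list `["sum", "sum"]` plays the role of the 'sum_field1.aggregate-function' and 'sum_field2.aggregate-function' options.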

Usage 2:
CREATE MATERIALIZED VIEW T
with (
    'merge-engine' = 'aggregation'
) AS SELECT
    pk,
    SUM(field1) AS sum_field1,
    SUM(field2) AS sum_field2
FROM source_t
GROUP BY pk ;

This starts a streaming job that consumes the source data and writes it incrementally to T. The synchronization job itself keeps no state.
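
A rough sketch of why such a job can be stateless, assuming the store applies the 'sum' merge to every incoming row: the job only forwards each source record, and the running totals live in the store. `AggregatingStore` and `stateless_sync` are hypothetical names, and the store here merges eagerly on write for simplicity, which may differ from how the table store actually merges.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


class AggregatingStore:
    """Hypothetical stand-in for a table with 'merge-engine' = 'aggregation'
    and a 'sum' aggregate on both value fields."""

    def __init__(self) -> None:
        self._rows = defaultdict(lambda: [0, 0])

    def write(self, pk: str, delta1: int, delta2: int) -> None:
        # The store, not the job, folds the incoming row into the total.
        row = self._rows[pk]
        row[0] += delta1
        row[1] += delta2

    def scan(self) -> Dict[str, Tuple[int, int]]:
        return {pk: (v[0], v[1]) for pk, v in self._rows.items()}


def stateless_sync(source: Iterable[Tuple[str, int, int]],
                   store: AggregatingStore) -> None:
    # The job keeps no per-key accumulator: each record is written as-is.
    for pk, field1, field2 in source:
        store.write(pk, field1, field2)


store = AggregatingStore()
stateless_sync([("a", 1, 10), ("b", 2, 20), ("a", 3, 30)], store)
print(store.scan())  # {'a': (4, 40), 'b': (2, 20)}
```

The result matches what `SELECT pk, SUM(field1), SUM(field2) ... GROUP BY pk` would produce over the same records, without the job holding any aggregation state.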



--
This message was sent by Atlassian Jira
(v8.20.7#820007)