Posted to commits@flink.apache.org by lz...@apache.org on 2023/01/13 15:33:15 UTC

[flink-web] branch asf-site updated: Table Store Release 0.3.0

This is an automated email from the ASF dual-hosted git repository.

lzljs3620320 pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/flink-web.git


The following commit(s) were added to refs/heads/asf-site by this push:
     new 310e8d920 Table Store Release 0.3.0
310e8d920 is described below

commit 310e8d920b384fbfd6e4ae956f0a03d44eb83d2b
Author: Jingsong Lee <ji...@gmail.com>
AuthorDate: Fri Jan 13 23:33:09 2023 +0800

    Table Store Release 0.3.0
    
    This closes #601
---
 _config.yml                                        |  23 +-
 _posts/2023-01-13-release-table-store-0.3.0.md     | 240 +++++++++++++++++++++
 .../changelog-producer-full-compaction.png         | Bin 0 -> 103043 bytes
 3 files changed, 261 insertions(+), 2 deletions(-)

diff --git a/_config.yml b/_config.yml
index be9c82f6f..da3fba2b9 100644
--- a/_config.yml
+++ b/_config.yml
@@ -35,8 +35,8 @@ FLINK_KUBERNETES_OPERATOR_STABLE_SHORT: "1.3"
 FLINK_KUBERNETES_OPERATOR_URL: https://github.com/apache/flink-kubernetes-operator
 FLINK_KUBERNETES_OPERATOR_GITHUB_REPO_NAME: flink-kubernetes-operator
 
-FLINK_TABLE_STORE_VERSION_STABLE: 0.2.0
-FLINK_TABLE_STORE_STABLE_SHORT: "0.2"
+FLINK_TABLE_STORE_VERSION_STABLE: 0.3.0
+FLINK_TABLE_STORE_STABLE_SHORT: "0.3"
 
 FLINK_TABLE_STORE_GITHUB_URL: https://github.com/apache/flink-table-store
 FLINK_TABLE_STORE_GITHUB_REPO_NAME: flink-table-store
@@ -251,6 +251,22 @@ flink_kubernetes_operator_releases:
         sha512_url: "https://downloads.apache.org/flink/flink-kubernetes-operator-1.2.0/flink-kubernetes-operator-1.2.0-helm.tgz.sha512"
 
 flink_table_store_releases:
+  -
+    version_short: "0.3"
+    source_release:
+      name: "Apache Flink Table Store 0.3.0"
+      id: "030-table-store-download-source"
+      flink_version: "1.16.0"
+      url: "https://www.apache.org/dyn/closer.lua/flink/flink-table-store-0.3.0/flink-table-store-0.3.0-src.tgz"
+      asc_url: "https://downloads.apache.org/flink/flink-table-store-0.3.0/flink-table-store-0.3.0-src.tgz.asc"
+      sha512_url: "https://downloads.apache.org/flink/flink-table-store-0.3.0/flink-table-store-0.3.0-src.tgz.sha512"
+    binaries_release:
+      name: "Apache Flink Table Store Binaries 0.3.0"
+      id: "030-table-store-download-binaries"
+      flink_version: "1.16.0"
+      url: "https://www.apache.org/dyn/closer.lua/flink/flink-table-store-0.3.0/flink-table-store-dist-0.3.0.jar"
+      asc_url: "https://downloads.apache.org/flink/flink-table-store-0.3.0/flink-table-store-dist-0.3.0.jar.asc"
+      sha512_url: "https://downloads.apache.org/flink/flink-table-store-0.3.0/flink-table-store-dist-0.3.0.jar.sha512"
   -
     version_short: "0.2"
     source_release:
@@ -814,6 +830,9 @@ release_archive:
         release_date: 2022-04-02
 
     flink_table_store:
+      - version_short: 0.3
+        version_long: 0.3.0
+        release_date: 2023-01-13
       - version_short: 0.2
         version_long: 0.2.0
         release_date: 2022-08-29
diff --git a/_posts/2023-01-13-release-table-store-0.3.0.md b/_posts/2023-01-13-release-table-store-0.3.0.md
new file mode 100644
index 000000000..c118618de
--- /dev/null
+++ b/_posts/2023-01-13-release-table-store-0.3.0.md
@@ -0,0 +1,240 @@
+---
+layout: post
+title:  "Apache Flink Table Store 0.3.0 Release Announcement"
+date: 2023-01-13T08:00:00.000Z
+categories: news
+authors:
+- JingsongLi:
+  name: "Jingsong Lee"
+
+excerpt:  The Apache Flink Community is pleased to announce the release of Flink Table Store 0.3.0!
+
+---
+
+The Apache Flink community is pleased to announce the release of the
+[Apache Flink Table Store](https://github.com/apache/flink-table-store) (0.3.0).
+
+We highly recommend all users upgrade to Flink Table Store 0.3.0. This release resolves 150+ issues, completed by nearly 30 contributors.
+
+Please check out the full [documentation]({{site.DOCS_BASE_URL}}flink-table-store-docs-release-0.3/) for detailed information and user guides.
+
+<br/>
+
+Flink Table Store 0.3 completes many exciting features, strengthening its capabilities as data lake storage and greatly
+improving the usability of its streaming pipelines. Some important features are described below.
+
+## Changelog Producer: Full-Compaction
+
+Consider using the [Full compaction changelog producer]({{site.DOCS_BASE_URL}}flink-table-store-docs-release-0.3/docs/features/table-types/#full-compaction) if:
+
+- You are using a `partial-update` or `aggregation` table. At write time, Table Store cannot know the
+  merged result, so it cannot generate the corresponding changelog.
+- Your input cannot produce a complete changelog, but you still want to get rid of the costly normalize operator.
+
+By specifying `'changelog-producer' = 'full-compaction'`, Table Store will compare the results of successive full
+compactions and produce the differences as the changelog. The latency of the changelog is determined by the frequency
+of full compactions. By specifying the `changelog-producer.compaction-interval` table property (default value 30min),
+users can define the maximum interval between two full compactions, putting an upper bound on that latency.
+
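+A minimal sketch of enabling this on a table (the two properties come from the docs above; the table
+definition itself is illustrative):
+
+```
+CREATE TABLE Orders (
+    order_id BIGINT,
+    amount DOUBLE,
+    PRIMARY KEY (order_id) NOT ENFORCED
+) WITH (
+    -- produce changelog by comparing the results of successive full compactions
+    'changelog-producer' = 'full-compaction',
+    -- trigger a full compaction at least every 30 minutes (the default)
+    'changelog-producer.compaction-interval' = '30 min'
+);
+```
+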
+<center>
+<img src="{{site.baseurl}}/img/blog/table-store/changelog-producer-full-compaction.png" width="100%"/>
+</center>
+
+<br/>
+
+## Dedicated Compaction Job and Multiple Writers
+
+By default, Table Store writers will perform compaction as needed when writing records. This is sufficient for most use cases, but there are two downsides:
+
+* This may result in unstable write throughput because throughput might temporarily drop when performing a compaction.
+* Compaction will mark some data files as "deleted". If multiple writers mark the same file, a conflict will occur when
+  committing the changes. Table Store will automatically resolve the conflict, but this may result in job restarts.
+
+To avoid these downsides, users can also choose to skip compactions in writers, and run a
+[dedicated job only for compaction]({{site.DOCS_BASE_URL}}flink-table-store-docs-release-0.3/docs/maintenance/write-performance/#dedicated-compaction-job).
+As compactions are performed only by the dedicated job, writers can continuously write records without pausing and no conflicts will ever occur.
+
+To skip compactions in writers, set `write-only` to `true`.
+
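+As a minimal sketch, the option can be set when creating the table (the `write-only` option is from the
+docs above; the table definition is illustrative):
+
+```
+CREATE TABLE Events (
+    event_id BIGINT,
+    payload STRING,
+    PRIMARY KEY (event_id) NOT ENFORCED
+) WITH (
+    -- writers of this table skip compactions entirely
+    'write-only' = 'true'
+);
+```
+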
+To run a dedicated job for compaction, follow these instructions.
+
+Flink SQL currently does not support statements related to compactions, so we have to submit the compaction job through `flink run`.
+
+Run the following command to submit a compaction job for the table.
+
+```bash
+<FLINK_HOME>/bin/flink run \
+    -c org.apache.flink.table.store.connector.action.FlinkActions \
+    /path/to/flink-table-store-dist-{{< version >}}.jar \
+    compact \
+    --warehouse <warehouse-path> \
+    --database <database-name> \
+    --table <table-name>
+```
+
+## Aggregation Table
+
+Sometimes users only care about aggregated results. The
+[aggregation merge engine]({{site.DOCS_BASE_URL}}flink-table-store-docs-release-0.3/docs/features/table-types/#aggregation)
+aggregates each value field one by one with the latest data under the same primary key, according to its aggregate function.
+
+Each field that is not part of the primary keys must be given an aggregate function, specified by the
+`fields.<field-name>.aggregate-function` table property.
+
+For example:
+
+```
+CREATE TABLE MyTable (
+    product_id BIGINT,
+    price DOUBLE,
+    sales BIGINT,
+    PRIMARY KEY (product_id) NOT ENFORCED
+) WITH (
+    'merge-engine' = 'aggregation',
+    'fields.price.aggregate-function' = 'max',
+    'fields.sales.aggregate-function' = 'sum'
+);
+```
+
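+With the settings above, rows sharing the same `product_id` are merged field by field: `price` keeps its
+maximum and `sales` accumulates. A hypothetical sequence:
+
+```
+INSERT INTO MyTable VALUES (1, 10.0, 100);
+INSERT INTO MyTable VALUES (1, 12.0, 200);
+
+SELECT * FROM MyTable;
+-- outputs (1, 12.0, 300): max(price) and sum(sales) under the same key
+```
+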
+## Schema Evolution
+
+Research and development on Schema Evolution began in version 0.2, and some of its capabilities have been
+completed in version 0.3. You can use the operations below in Spark SQL (Flink SQL will support this syntax in 1.17):
+
+- Adding New Columns
+- Renaming Columns
+- Dropping Columns
+
+Flink Table Store ensures that the above operations are safe, and the old data will automatically adapt to
+the new schema when it is read.
+
+For example:
+
+```
+CREATE TABLE T (i INT, j INT);
+
+INSERT INTO T VALUES (1, 1);
+
+ALTER TABLE T ADD COLUMN k INT;
+
+ALTER TABLE T RENAME COLUMN i TO a;
+
+INSERT INTO T VALUES (2, 2, 2);
+
+SELECT * FROM T;
+-- outputs (1, 1, NULL) and (2, 2, 2)
+```
+
+## Flink Lookup Join
+
+[Lookup Joins]({{site.DOCS_BASE_URL}}flink-table-store-docs-release-0.3/docs/features/lookup-joins/) are a type of join
+in streaming queries, used to enrich a table with data queried from Table Store. The join requires one
+table to have a processing time attribute and the other table to be backed by a lookup source connector.
+
+Table Store supports lookup joins on unpartitioned tables with primary keys in Flink.
+
+The lookup join operator maintains a local RocksDB cache and pulls the latest updates of the table in real time.
+It only pulls the necessary data, so your filter conditions are very important for performance.
+
+This feature is only suitable for tables containing at most tens of millions of records to avoid excessive use of local disks.
+
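+A sketch of such a query, using Flink's standard lookup join syntax (`Customers` is assumed to be an
+unpartitioned Table Store table with primary key `id`; `Orders` is a streaming table with a processing
+time attribute `proc_time`; all names are illustrative):
+
+```
+-- enrich each order with the latest customer data from Table Store
+SELECT o.order_id, o.total, c.country
+FROM Orders AS o
+JOIN Customers FOR SYSTEM_TIME AS OF o.proc_time AS c
+ON o.customer_id = c.id;
+```
+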
+## Time Traveling
+
+You can use [Snapshots Table]({{site.DOCS_BASE_URL}}flink-table-store-docs-release-0.3/docs/how-to/querying-tables/#snapshots-table)
+and [Scan Mode]({{site.DOCS_BASE_URL}}flink-table-store-docs-release-0.3/docs/how-to/querying-tables/#scan-mode) for time traveling.
+
+- You can view all current snapshots by reading the snapshots table.
+- You can travel back in time through the `from-timestamp` and `from-snapshot` scan options.
+
+For example:
+
+```
+SELECT * FROM T$snapshots;
+
+/*
++--------------+------------+--------------+-------------------------+
+|  snapshot_id |  schema_id |  commit_kind |             commit_time |
++--------------+------------+--------------+-------------------------+
+|            2 |          0 |       APPEND | 2022-10-26 11:44:15.600 |
+|            1 |          0 |       APPEND | 2022-10-26 11:44:15.148 |
++--------------+------------+--------------+-------------------------+
+2 rows in set
+*/
+
+SELECT * FROM T /*+ OPTIONS('from-snapshot'='1') */;
+```
+
+## Audit Log Table
+
+If you need to audit the changelog of the table, you can use the
+[audit_log]({{site.DOCS_BASE_URL}}flink-table-store-docs-release-0.3/docs/how-to/querying-tables/#audit-log-table)
+system table. Through the audit_log table, you get a `rowkind` column along with the incremental data of the table.
+You can use this column for filtering and other operations to complete the audit.
+
+There are four values for `rowkind`:
+
+- +I: Insertion operation.
+- -U: Update operation with the previous content of the updated row.
+- +U: Update operation with new content of the updated row.
+- -D: Deletion operation.
+
+For example:
+
+```
+SELECT * FROM MyTable$audit_log;
+
+/*
++------------------+-----------------+-----------------+
+|     rowkind      |     column_0    |     column_1    |
++------------------+-----------------+-----------------+
+|        +I        |      ...        |      ...        |
++------------------+-----------------+-----------------+
+|        -U        |      ...        |      ...        |
++------------------+-----------------+-----------------+
+|        +U        |      ...        |      ...        |
++------------------+-----------------+-----------------+
+3 rows in set
+*/
+```
+
+## Ecosystem
+
+Flink Table Store continues to strengthen its ecosystem, gradually covering reading and writing
+across all engines. In 0.3, each of the engines below has been enhanced.
+
+- Spark write is now supported, but `INSERT OVERWRITE` and streaming write are still unsupported.
+- [S3]({{site.DOCS_BASE_URL}}flink-table-store-docs-release-0.3/docs/filesystems/s3/) and
+  [OSS]({{site.DOCS_BASE_URL}}flink-table-store-docs-release-0.3/docs/filesystems/oss/)
+  are supported by all computing engines.
+- Hive 3.1 is supported.
+- The latest version of Trino (JDK 17) is supported.
+
+## Getting started
+
+Please refer to the [getting started guide]({{site.DOCS_BASE_URL}}flink-table-store-docs-release-0.3/docs/try-table-store/quick-start/) for more details.
+
+<br/>
+
+## What's Next?
+
+In the upcoming 0.4.0 release you can expect the following additional features:
+
+* Independent Java APIs decoupled from Flink
+* Spark: enhanced batch write, plus streaming write and streaming read
+* Flink: complete DDL & DML, providing more management operations
+* Changelog producer: lookup, bringing the stream-reading latency of every scenario under one minute
+* Multi-table consistent materialized views in real time
+* Data integration: Schema Evolution integration and whole-database integration.
+
+Please give the release a try, share your feedback on the Flink mailing list and contribute to the project!
+
+<br/>
+
+## List of Contributors
+
+The Apache Flink community would like to thank every one of the contributors that have made this release possible:
+
+Feng Wang, Hannankan, Jane Chan, Jia Liu, Jingsong Lee, Jonathan Leitschuh, JunZhang, Kirill Listopad, Liwei Li,
+MOBIN-F, Nicholas Jiang, Wang Luning, WencongLiu, Yubin Li, gongzhongqiang, houhang1005, liuzhuang2017, openinx,
+tsreaper, wuyouwuyoulian, zhuangchong, zjureel (shammon), 吴祥平
diff --git a/img/blog/table-store/changelog-producer-full-compaction.png b/img/blog/table-store/changelog-producer-full-compaction.png
new file mode 100644
index 000000000..72639de76
Binary files /dev/null and b/img/blog/table-store/changelog-producer-full-compaction.png differ