Posted to commits@doris.apache.org by da...@apache.org on 2022/10/07 08:18:13 UTC

[doris] branch master updated: [docs](unique-key-merge-on-write) add document for unique key merge o… (#13068)

This is an automated email from the ASF dual-hosted git repository.

dataroaring pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git


The following commit(s) were added to refs/heads/master by this push:
     new d902e80d6d [docs](unique-key-merge-on-write) add document for unique key merge o… (#13068)
d902e80d6d is described below

commit d902e80d6dd640798d7833b28921d985cac097e9
Author: zhannngchen <48...@users.noreply.github.com>
AuthorDate: Fri Oct 7 16:18:04 2022 +0800

    [docs](unique-key-merge-on-write) add document for unique key merge o… (#13068)
---
 docs/en/docs/data-table/best-practice.md    |  4 +-
 docs/en/docs/data-table/data-model.md       | 98 +++++++++++++++++++++++++++--
 docs/zh-CN/docs/data-table/best-practice.md |  2 +-
 docs/zh-CN/docs/data-table/data-model.md    | 95 ++++++++++++++++++++++++++--
 4 files changed, 188 insertions(+), 11 deletions(-)

diff --git a/docs/en/docs/data-table/best-practice.md b/docs/en/docs/data-table/best-practice.md
index ca6d87c44b..7b1a2de51f 100644
--- a/docs/en/docs/data-table/best-practice.md
+++ b/docs/en/docs/data-table/best-practice.md
@@ -51,9 +51,9 @@ AGGREGATE KEY(siteid, city, username)
 DISTRIBUTED BY HASH(siteid) BUCKETS 10;
 ```
 
-1.1.2. KEY UNIQUE
+1.1.2. UNIQUE KEY
 
-When UNIQUE KEY is the same, the new record covers the old record. At present, UNIQUE KEY implements the same REPLACE aggregation method as AGGREGATE KEY, and they are essentially the same. Suitable for analytical business with updated requirements.
+When UNIQUE KEY values are the same, the new record replaces the old one. Before version 1.2, UNIQUE KEY was implemented with the same REPLACE aggregation method as AGGREGATE KEY, so the two were essentially the same. Since version 1.2, we have introduced a new merge-on-write implementation for UNIQUE KEY, which has better performance in many scenarios. This model is suitable for analytical business with update requirements.
 
 ```
 CREATE TABLE sales_order
diff --git a/docs/en/docs/data-table/data-model.md b/docs/en/docs/data-table/data-model.md
index d27e6f6bce..287a3c48b9 100644
--- a/docs/en/docs/data-table/data-model.md
+++ b/docs/en/docs/data-table/data-model.md
@@ -236,7 +236,9 @@ Data may be aggregated to varying degrees at different times. For example, when
 
 ## Uniq Model
 
-In some multi-dimensional analysis scenarios, users are more concerned with how to ensure the uniqueness of Key, that is, how to obtain the Primary Key uniqueness constraint. Therefore, we introduce Uniq's data model. This model is essentially a special case of aggregation model and a simplified representation of table structure. Let's give an example.
+In some multi-dimensional analysis scenarios, users are more concerned about how to ensure the uniqueness of the Key, that is, how to obtain the uniqueness constraint of the Primary Key. Therefore, we introduced the Unique data model. Prior to version 1.2, this model was essentially a special case of the Aggregate model and a simplified representation of the table structure. Since the Aggregate model is implemented by merge on read, it performs poorly on some aggregation queries (see [Limitations of aggregation model](#limitations-of-aggregation-model) below). In version 1.2, we introduced a new merge-on-write implementation for the Unique model, which achieves the best query performance by doing some extra work at write time. Merge on write will replace merge on read as the default implementation of the Unique model in the future; the two will coexist for a short time. The two implementations are illustrated with examples below.
+
+### Merge on read (same implementation as the aggregate model)
 
 |ColumnName|Type|IsKey|Comment|
 |---|---|---|---|
@@ -303,7 +305,60 @@ PROPERTIES (
 );
 ```
 
-That is to say, Uniq model can be completely replaced by REPLACE in aggregation model. Its internal implementation and data storage are exactly the same. No further examples will be given here.
+That is to say, the merge-on-read implementation of the Unique model can be completely replaced with the REPLACE method of the Aggregate model. Its internal implementation and data storage are exactly the same, so no further examples are given here.
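As an illustration of these merge-on-read REPLACE semantics, here is a minimal sketch in Python (hypothetical data and names; this is not Doris code, just the last-write-wins merge performed at query time):

```python
# Minimal sketch of merge-on-read REPLACE semantics (illustrative only,
# not the actual Doris implementation). Each load batch is kept as written;
# duplicates on the unique key are resolved at query time.
def merge_on_read(batches):
    """Merge batches at read time: later batches replace earlier rows
    that share the same unique key (REPLACE, i.e. last write wins)."""
    merged = {}
    for batch in batches:        # batches are iterated in load order
        for key, value in batch:
            merged[key] = value
    return merged

batch1 = [(10001, 50), (10002, 39)]
batch2 = [(10001, 1), (10003, 22)]

result = merge_on_read([batch1, batch2])
print(result)  # {10001: 1, 10002: 39, 10003: 22}
```

The cost this sketch makes visible is that every query must touch all batches and merge on the key before returning a row, which is the overhead merge on write removes.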
+
+### Merge on write (introduced in version 1.2)
+
+The merge-on-write implementation of the Unique model is completely different from the aggregation model: its query performance is closer to that of the Duplicate model, and in scenarios that require a primary key constraint it has a considerable query performance advantage over the aggregation model, especially for aggregation queries.
+
+In version 1.2, as a new feature, merge on write is disabled by default; users can enable it by adding the following property:
+
+```
+"enable_unique_key_merge_on_write" = "true"
+```
+
+Let's continue to use the previous table as an example; the create table statement is:
+
+```
+CREATE TABLE IF NOT EXISTS example_db.example_tbl
+(
+    `user_id` LARGEINT NOT NULL COMMENT "user id",
+    `username` VARCHAR(50) NOT NULL COMMENT "username",
+    `city` VARCHAR(20) COMMENT "user city",
+    `age` SMALLINT COMMENT "user age",
+    `sex` TINYINT COMMENT "user gender",
+    `phone` LARGEINT COMMENT "user phone",
+    `address` VARCHAR(500) COMMENT "user address",
+    `register_time` DATETIME COMMENT "register time"
+)
+UNIQUE KEY(`user_id`, `username`)
+DISTRIBUTED BY HASH(`user_id`) BUCKETS 1
+PROPERTIES (
+"replication_allocation" = "tag.location.default: 1",
+"enable_unique_key_merge_on_write" = "true"
+);
+```
+
+In this implementation, the table structure is completely different from that of the aggregate model:
+
+
+|ColumnName|Type|AggregationType|Comment|
+|---|---|---|---|
+| user_id | LARGEINT | | user id|
+| username | VARCHAR(50) | | user nickname|
+| city | VARCHAR(20) | NONE | user city|
+| age | SMALLINT | NONE | user age|
+| sex | TINYINT | NONE | user gender|
+| phone | LARGEINT | NONE | user phone|
+| address | VARCHAR(500) | NONE | user address|
+| register_time | DATETIME | NONE | user registration time|
+
+On a Unique table with the merge-on-write option enabled, data that is overwritten or updated is marked for deletion during the load job, while the new data is written to new files. At query time, all data marked for deletion is filtered out at the file level, so only the latest data is read. This eliminates the data aggregation cost of merge on read and, in many cases, supports pushdown of multiple predicates. It can therefore bring a relatively large performance improvement in many scenarios, especially for aggregation queries.
+
+[NOTE]
+1. The new merge-on-write implementation is disabled by default, and can only be enabled by specifying the property when creating a new table.
+2. The old merge-on-read implementation cannot be seamlessly upgraded to the new one (their data organization is completely different). If you want to switch to the merge-on-write implementation, you need to manually load the data into a new table, e.g. `insert into unique-mow-table select * from source table`.
+3. The delete sign and sequence column features specific to the Unique model can still be used normally in the new implementation; their usage is unchanged.
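The load-time marking and filtered read path described above can be sketched as follows (a simplified in-memory model with hypothetical names; real Doris maintains per-rowset delete bitmaps, not Python lists):

```python
# Simplified sketch of merge on write: at load time, rows in earlier files
# that are overwritten by the new batch are marked as deleted; queries then
# read only unmarked rows, with no merge/aggregation step.
def load_batch(files, new_rows):
    new_keys = {key for key, _ in new_rows}
    for f in files:                       # mark overwritten rows as deleted
        for row in f:
            if row["key"] in new_keys:
                row["deleted"] = True
    # the new batch is written to a new "file"
    files.append([{"key": k, "value": v, "deleted": False} for k, v in new_rows])

def query(files):
    # Read path: filter by the delete mark at the file level; no merging needed.
    return {r["key"]: r["value"] for f in files for r in f if not r["deleted"]}

files = []
load_batch(files, [(10001, 50), (10002, 39)])
load_batch(files, [(10001, 1), (10003, 22)])
print(query(files))  # {10002: 39, 10001: 1, 10003: 22}
```

Note how the extra work happens once at load time (scanning earlier files for overwritten keys), which is the trade that buys the cheaper read path.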
 
 ## Duplicate Model
 
@@ -343,7 +398,7 @@ This data model is suitable for storing raw data without aggregation requirement
 
 ## Limitations of aggregation model
 
-Here we introduce the limitations of Aggregate model (including Uniq model).
+Here we introduce the limitations of the Aggregate model.
 
 In the aggregation model, what the model presents is the aggregated data. That is to say, any data that has not yet been aggregated (for example, two different imported batches) must be presented in some way to ensure consistency. Let's give an example.
 
@@ -440,6 +495,40 @@ Add a count column and import the data with the column value **equal to 1**. The
 
 Another way is to **change the aggregation type of the count column above to REPLACE, keeping the value fixed at 1**. Then `select sum(count) from table;` and `select count(*) from table;` will give consistent results. Moreover, this way there is no restriction on importing duplicate rows.
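A minimal sketch of why this workaround yields consistent counts (illustrative Python, not Doris internals): with the count column always loaded as 1 and aggregated by REPLACE, every surviving key contributes exactly 1, so sum(count) equals the number of rows left after aggregation:

```python
# Sketch of the REPLACE-count workaround: the count column is always
# loaded as 1, and REPLACE keeps the last value per key, which is also 1.
rows = [(10001, 1), (10002, 1), (10001, 1)]  # includes a duplicate import of key 10001

aggregated = {}
for key, cnt in rows:
    aggregated[key] = cnt  # REPLACE semantics: last value wins (still 1)

# sum(count) and count(*) now agree, even though a duplicate row was imported.
assert sum(aggregated.values()) == len(aggregated) == 2
```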
 
+### Merge-on-write implementation of Unique model
+
+In the merge-on-write implementation, a delete bitmap is added to each rowset during loading to mark data that has been overwritten or deleted. Using the previous example, after the first batch of data is loaded, the status is as follows:
+
+**batch 1**
+
+| user_id | date       | cost | delete bit |
+| ------- | ---------- | ---- | ---------- |
+| 10001   | 2017-11-20 | 50   | false      |
+| 10002   | 2017-11-21 | 39   | false      |
+
+After batch 2 is loaded, the duplicate rows in the first batch are marked as deleted, and the status of the two batches of data is as follows:
+
+**batch 1**
+
+| user_id | date       | cost | delete bit |
+| ------- | ---------- | ---- | ---------- |
+| 10001   | 2017-11-20 | 50   | **true**   |
+| 10002   | 2017-11-21 | 39   | false      |
+
+**batch 2**
+
+| user\_id | date       | cost | delete bit |
+| -------- | ---------- | ---- | ---------- |
+| 10001    | 2017-11-20 | 1    | false      |
+| 10001    | 2017-11-21 | 5    | false      |
+| 10003    | 2017-11-22 | 22   | false      |
+
+At query time, all data marked for deletion in the delete bitmap is skipped, so no data aggregation is needed. There are 4 valid rows in the data above, and the query result should also be 4 rows. The result can therefore be obtained in the least expensive way, namely the "scan only one column of data to obtain the count value" approach mentioned above.
+
+In our test environment, count(*) queries on the merge-on-write implementation of the Unique model run more than 10 times faster than on the aggregation model.
+
 ### Duplicate Model
 
 The Duplicate model does not have this limitation of the aggregation model. Because the model involves no aggregation semantics, a count(*) query can arbitrarily choose any one column to scan and still obtain a semantically correct result.
@@ -453,5 +542,6 @@ Because the data model was established when the table was built, and **could not
 
 1. Aggregate model can greatly reduce the amount of data scanned and the amount of query computation by pre-aggregation. It is very suitable for report query scenarios with fixed patterns. But this model is not very friendly for count (*) queries. At the same time, because the aggregation method on the Value column is fixed, semantic correctness should be considered in other types of aggregation queries.
 2. Uniq model guarantees the uniqueness of primary key for scenarios requiring unique primary key constraints. However, the query advantage brought by pre-aggregation such as ROLLUP cannot be exploited (because the essence is REPLACE, there is no such aggregation as SUM).
-   - \[Note\] The Unique model only supports the entire row update. If the user needs unique key with partial update (such as loading multiple source tables into one doris table), you can consider using the Aggregate model, setting the aggregate type of the non-primary key columns to REPLACE_IF_NOT_NULL. For detail, please refer to [CREATE TABLE Manual](../sql-manual/sql-reference/Data-Definition-Statements/Create/CREATE-TABLE.md)
+   1. For users with high performance requirements for aggregation queries, we recommend the merge-on-write implementation introduced in version 1.2.
+   2. \[Note\] The Unique model only supports whole-row updates. If you need both a unique key constraint and partial-column updates (for example, loading multiple source tables into one doris table), you can consider using the Aggregate model and setting the aggregate type of the non-primary key columns to REPLACE_IF_NOT_NULL. For details, please refer to the [CREATE TABLE Manual](../sql-manual/sql-reference/Data-Definition-Statements/Create/CREATE-TABLE.md).
 3. Duplicate is suitable for ad-hoc queries of any dimension. Although it also cannot take advantage of pre-aggregation, it is not constrained by the aggregation model and can exploit the advantages of the column-store format (reading only the relevant columns rather than all Key columns).
diff --git a/docs/zh-CN/docs/data-table/best-practice.md b/docs/zh-CN/docs/data-table/best-practice.md
index 55c9a39617..f0e16a5d8c 100644
--- a/docs/zh-CN/docs/data-table/best-practice.md
+++ b/docs/zh-CN/docs/data-table/best-practice.md
@@ -52,7 +52,7 @@ DISTRIBUTED BY HASH(siteid) BUCKETS 10;
 
 #### UNIQUE KEY
 
-When UNIQUE KEY values are the same, the new record overwrites the old record. Currently, UNIQUE KEY is implemented in the same way as the REPLACE aggregation method of AGGREGATE KEY, and the two are essentially the same. Suitable for analytical business with update requirements.
+When UNIQUE KEY values are the same, the new record overwrites the old record. Before version 1.2, UNIQUE KEY was implemented in the same way as the REPLACE aggregation method of AGGREGATE KEY, and the two were essentially the same. Since version 1.2, we have introduced a merge-on-write implementation for UNIQUE KEY, which has better aggregation query performance. Suitable for analytical business with update requirements.
 
 ```sql
 CREATE TABLE sales_order
diff --git a/docs/zh-CN/docs/data-table/data-model.md b/docs/zh-CN/docs/data-table/data-model.md
index 3242fe9b78..b5a992e2f5 100644
--- a/docs/zh-CN/docs/data-table/data-model.md
+++ b/docs/zh-CN/docs/data-table/data-model.md
@@ -240,7 +240,9 @@ PROPERTIES (
 
 ## Unique Model
 
-In some multi-dimensional analysis scenarios, users are more concerned about how to ensure the uniqueness of the Key, that is, how to obtain a Primary Key uniqueness constraint. Therefore, we introduced the Unique data model. This model is essentially a special case of the aggregate model and a simplified representation of the table structure. An example follows.
+In some multi-dimensional analysis scenarios, users are more concerned about how to ensure the uniqueness of the Key, that is, how to obtain a Primary Key uniqueness constraint. Therefore, we introduced the Unique data model. Before version 1.2, this model was essentially a special case of the aggregate model and a simplified representation of the table structure. Since the aggregate model is implemented by merge on read, it performs poorly on some aggregation queries (see the later section [Limitations of the aggregation model](#聚合模型的局限性)). In version 1.2, we introduced a new merge-on-write implementation for the Unique model, which achieves the best query performance by doing some extra work at write time. Merge on write will replace merge on read as the default implementation of the Unique model in the future; the two will coexist for a short time. The two implementations are illustrated with examples below.
+
+### Merge on read (same implementation as the aggregate model)
 
 | ColumnName    | Type         | IsKey | Comment      |
 | ------------- | ------------ | ----- | ------------ |
@@ -308,7 +310,59 @@ PROPERTIES (
 );
 ```
 
-That is, the Unique model can be completely replaced with the REPLACE method of the aggregate model. Its internal implementation and data storage are exactly the same. No further examples are given here.
+That is, the merge-on-read implementation of the Unique model can be completely replaced with the REPLACE method of the aggregate model. Its internal implementation and data storage are exactly the same. No further examples are given here.
+
+### Merge on write (new in version 1.2)
+
+The merge-on-write implementation of the Unique model is a completely different model from the aggregate one: its query performance is closer to the Duplicate model, and in scenarios requiring a primary key constraint it has a considerable query performance advantage over the aggregate model, especially for aggregation queries.
+
+In version 1.2, as a new feature, merge on write is disabled by default; users can enable it by adding the following property:
+
+```
+"enable_unique_key_merge_on_write" = "true"
+```
+
+Still taking the table above as an example, the create table statement is:
+
+```sql
+CREATE TABLE IF NOT EXISTS example_db.example_tbl
+(
+    `user_id` LARGEINT NOT NULL COMMENT "user id",
+    `username` VARCHAR(50) NOT NULL COMMENT "user nickname",
+    `city` VARCHAR(20) COMMENT "user city",
+    `age` SMALLINT COMMENT "user age",
+    `sex` TINYINT COMMENT "user gender",
+    `phone` LARGEINT COMMENT "user phone",
+    `address` VARCHAR(500) COMMENT "user address",
+    `register_time` DATETIME COMMENT "user registration time"
+)
+UNIQUE KEY(`user_id`, `username`)
+DISTRIBUTED BY HASH(`user_id`) BUCKETS 1
+PROPERTIES (
+"replication_allocation" = "tag.location.default: 1",
+"enable_unique_key_merge_on_write" = "true"
+);
+```
+
+The table structure created by this statement is completely different from that of the aggregate model:
+
+| ColumnName    | Type         | AggregationType | Comment                |
+| ------------- | ------------ | --------------- | ---------------------- |
+| user_id       | LARGEINT     |                 | user id                |
+| username      | VARCHAR(50)  |                 | user nickname          |
+| city          | VARCHAR(20)  | NONE            | user city              |
+| age           | SMALLINT     | NONE            | user age               |
+| sex           | TINYINT      | NONE            | user gender            |
+| phone         | LARGEINT     | NONE            | user phone             |
+| address       | VARCHAR(500) | NONE            | user address           |
+| register_time | DATETIME     | NONE            | user registration time |
+
+On a Unique table with the merge-on-write option enabled, data that is overwritten or updated is marked for deletion during the load phase, while the new data is written to new files. At query time, all data marked for deletion is filtered out at the file level, so the data read is always the latest. This eliminates the data aggregation step of merge on read and, in many cases, supports pushdown of multiple predicates, so it can bring a relatively large performance improvement in many scenarios, especially for aggregation queries.
+
+[NOTE]
+1. The new merge-on-write implementation is disabled by default, and can only be enabled by specifying the property when creating a table.
+2. The old merge-on-read implementation cannot be seamlessly upgraded to the new one (their data organization is completely different). If you want to switch to the merge-on-write implementation, you need to manually load the data into a new table, e.g. `insert into unique-mow-table select * from source table`.
+3. The delete sign and sequence column features specific to the Unique model can still be used normally in the new implementation; their usage is unchanged.
 
 ## Duplicate Model
 
@@ -348,7 +402,7 @@ PROPERTIES (
 
 ## Limitations of the aggregation model
 
-Here we introduce the limitations of the aggregation model, with respect to the Aggregate model (including the Unique model).
+Here we introduce the limitations of the aggregation model, with respect to the Aggregate model.
 
 In the aggregation model, what the model presents externally is the **finally aggregated** data. That is to say, any data that has not yet been aggregated (for example, data from two different load batches) must be reconciled in some way to ensure external consistency. An example follows.
 
@@ -449,6 +503,38 @@ SELECT COUNT(*) FROM table;
 
 The other way is to **change the aggregation type of the `count` column above to REPLACE, with its value still fixed at 1**. Then the results of `select sum(count) from table;` and `select count(*) from table;` will be consistent. Moreover, this way there is no restriction on importing duplicate rows.
 
+### Merge-on-write implementation of the Unique model
+
+Still using the data above as an example, merge on write adds a corresponding delete bitmap for each loaded rowset to mark which data is overwritten. After the first batch of data is loaded, the status is as follows:
+
+**batch 1**
+
+| user_id | date       | cost | delete bit |
+| ------- | ---------- | ---- | ---------- |
+| 10001   | 2017-11-20 | 50   | false      |
+| 10002   | 2017-11-21 | 39   | false      |
+
+After the second batch of data is loaded, the duplicate rows in the first batch are marked as deleted, and the status of the two batches of data is as follows:
+
+**batch 1**
+
+| user_id | date       | cost | delete bit |
+| ------- | ---------- | ---- | ---------- |
+| 10001   | 2017-11-20 | 50   | **true**   |
+| 10002   | 2017-11-21 | 39   | false      |
+
+**batch 2**
+
+| user_id | date       | cost | delete bit |
+| ------- | ---------- | ---- | ---------- |
+| 10001   | 2017-11-20 | 1    | false      |
+| 10001   | 2017-11-21 | 5    | false      |
+| 10003   | 2017-11-22 | 22   | false      |
+
+At query time, all data marked as deleted in the delete bitmap is not read, so no data aggregation is needed. There are 4 valid rows in the data above, and the query result should also be 4 rows. The result can therefore be obtained in the least expensive way, namely the "scan only one column of data to obtain the count value" approach mentioned earlier.
+
+In our test environment, count(*) queries on the merge-on-write implementation of the Unique model run more than 10 times faster than on the aggregation model.
+
 ### Duplicate Model
 
 The Duplicate model does not have this limitation of the aggregation model. Because the model involves no aggregation semantics, a count(*) query can arbitrarily choose any one column to scan and still obtain a semantically correct result.
@@ -462,5 +548,6 @@ Duplicate、Aggregate、Unique 模型,都会在建表指定 key 列,然而
 
 1. The Aggregate model can greatly reduce the amount of data scanned and the amount of computation in aggregation queries through pre-aggregation, which makes it very suitable for report-style queries with fixed patterns. However, this model is unfriendly to count(*) queries. Also, since the aggregation method on the Value columns is fixed, semantic correctness needs to be considered for other types of aggregation queries.
 2. The Unique model guarantees primary key uniqueness for scenarios that require a unique primary key constraint. However, it cannot exploit the query advantages of pre-aggregation such as ROLLUP (because its essence is REPLACE, and there is no aggregation such as SUM).
-   - [Note] The Unique model only supports whole-row updates. If you need both a unique primary key constraint and partial-column updates (for example, loading multiple source tables into one doris table), you can consider using the Aggregate model and setting the aggregate type of the non-primary key columns to REPLACE_IF_NOT_NULL. For details, please refer to the [syntax manual](../sql-manual/sql-reference/Data-Definition-Statements/Create/CREATE-TABLE.md)
+   1. For users with high performance requirements for aggregation queries, we recommend the merge-on-write implementation introduced in version 1.2.
+   2. [Note] The Unique model only supports whole-row updates. If you need both a unique primary key constraint and partial-column updates (for example, loading multiple source tables into one doris table), you can consider using the Aggregate model and setting the aggregate type of the non-primary key columns to REPLACE_IF_NOT_NULL. For details, please refer to the [syntax manual](../sql-manual/sql-reference/Data-Definition-Statements/Create/CREATE-TABLE.md)
 3. Duplicate is suitable for ad-hoc queries of any dimension. Although it likewise cannot exploit pre-aggregation, it is not constrained by the aggregation model and can exploit the advantages of the column-store format (reading only the relevant columns rather than all Key columns).

