You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@doris.apache.org by ka...@apache.org on 2020/04/18 14:48:26 UTC

[incubator-doris] branch master updated: [ISSUE #3190]Add documents for delete simplifly (#3335)

This is an automated email from the ASF dual-hosted git repository.

kangkaisen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git


The following commit(s) were added to refs/heads/master by this push:
     new 31ebb24  [ISSUE #3190]Add documents for delete simplifly (#3335)
31ebb24 is described below

commit 31ebb2496d479e5ed20b8c90c47fafdd417dd1eb
Author: xy720 <22...@users.noreply.github.com>
AuthorDate: Sat Apr 18 22:48:18 2020 +0800

    [ISSUE #3190]Add documents for delete simplifly (#3335)
---
 .../administrator-guide/load-data/delete-manual.md | 178 ++++++++++++++++++++
 .../load-data/delete-manual_EN.md                  | 184 +++++++++++++++++++++
 2 files changed, 362 insertions(+)

diff --git a/docs/documentation/cn/administrator-guide/load-data/delete-manual.md b/docs/documentation/cn/administrator-guide/load-data/delete-manual.md
new file mode 100644
index 0000000..c3da18d
--- /dev/null
+++ b/docs/documentation/cn/administrator-guide/load-data/delete-manual.md
@@ -0,0 +1,178 @@
+<!-- 
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+#Delete
+
+Delete不同于其他导入方式,它是一个同步过程。和Insert into相似,所有的Delete操作在Doris中是一个独立的导入作业,一般Delete语句需要指定表和分区以及删除的条件来筛选要删除的数据,并将会同时删除base表和rollup表的数据。
+
+##语法
+
+主要的Delete语法如下:
+
+```
+DELETE FROM table_name [PARTITION partition_name]
+        WHERE
+        column_name1 op value[ AND column_name2 op value ...];
+```
+
+示例1:
+
+```
+DELETE FROM my_table PARTITION p1
+        WHERE k1 = 3;
+```
+
+示例2:
+
+```
+DELETE FROM my_table PARTITION p1
+        WHERE k1 < 3 AND k2 = "abc";
+```
+
+下面介绍删除语句中使用到的参数:
+
+* PARTITION
+	
+	Delete语句的目标分区,若未指定,则此表必须为单分区表,否则无法delete
+
+* WHERE
+	
+	Delete语句的条件语句,所有删除语句都必须指定WHERE语句
+
+说明:
+
+1. `Where`语句中的op的类型可包括`=,>,<,>=,<=,!=`,目前暂时不支持 where key in (value1, value2, value3) 的方式选定范围,后续将加上此功能。
+2. `Where`语句中的列只能是`key`列
+3.  当选定的`key`列不存在某个rollup表内时,无法进行delete
+4.  条件语句中各个条件只能是`and`关系,如希望达成`or`可将条件分别写入两个delete语句中
+5.  如果指定表为RANGE分区表,则必须指定 `PARTITION`。如果是单分区表,可以不指定。
+6.  不同于Insert into命令,delete不能手动指定`label`,有关label的概念可以查看[Insert Into文档] (./insert-into-manual.md)
+
+##返回结果
+
+Delete命令是一个SQL命令,返回结果是同步的,分为以下几种:
+
+1. 执行成功
+
+	如果Delete顺利执行完成并可见,将返回下列结果,`Query OK`表示成功
+	
+	```
+	mysql> delete from test_tbl PARTITION p1 where k1 = 1;
+Query OK, 0 rows affected (0.04 sec)
+{'label':'delete_e7830c72-eb14-4cb9-bbb6-eebd4511d251', 'status':'VISIBLE', 'txnId':'4005'}
+	```
+	
+2. 提交成功,但未可见
+
+    Doris的事务提交分为两步:提交和发布版本,只有完成了发布版本步骤,结果才对用户是可见的。若已经提交成功了,那么就可以认为最终一定会发布成功,Doris会尝试在提交完后等待发布一段时间,如果超时后即使发布版本还未完成也会优先返回给用户,提示用户提交已经完成。若如果Delete已经提交并执行,但是仍未发布版本和可见,将返回下列结果
+    
+    ```
+	mysql> delete from test_tbl PARTITION p1 where k1 = 1;
+Query OK, 0 rows affected (0.04 sec)
+{'label':'delete_e7830c72-eb14-4cb9-bbb6-eebd4511d251', 'status':'VISIBLE', 'txnId':'4005', 'err':'delete job is committed but may be taking effect later' }
+	```
+	
+      结果会同时返回一个json字符串:
+	
+    `affected rows`表示此次删除影响的行,由于Doris的删除目前是逻辑删除,因此对于这个值是恒为0。
+    
+    `label`为自动生成的 label,是该导入作业的标识。每个导入作业,都有一个在单 database 内部唯一的 Label。
+    
+    `status`表示数据删除是否可见,如果可见,显示`VISIBLE`,如果不可见,显示`COMMITTED`。
+    
+    `txnId`为这个Delete job对应的事务id
+    
+    `err`字段会显示一些本次删除的详细信息
+	
+3. 提交失败,事务取消
+
+    如果Delete语句没有提交成功,将会被Doris自动中止,返回下列结果
+    
+    ```
+	mysql> delete from test_tbl partition p1 where k1 > 80;
+ERROR 1064 (HY000): errCode = 2, detailMessage = {错误原因}
+	```
+	
+    示例:
+    
+    比如说一个超时的删除,将会返回timeout时间和未完成的`(tablet=replica)`
+
+    ```
+	mysql> delete from test_tbl partition p1 where k1 > 80;
+ERROR 1064 (HY000): errCode = 2, detailMessage = failed to delete replicas from job: 4005, Unfinished replicas:10000=60000, 10001=60000, 10002=60000
+	```
+	
+    **综上,对于Delete操作返回结果的正确处理逻辑为:**
+    
+    1. 如果返回结果为`ERROR 1064 (HY000)`,则表示删除失败
+    
+    2. 如果返回结果为`Query OK`,则表示删除执行成功
+
+    	1. 如果`status`为`COMMITTED`,表示数据仍不可见,用户可以稍等一段时间再用`show delete`命令查看结果
+    	2. 如果`status`为`VISIBLE`,表示数据删除成功。
+
+##可配置项
+
+###FE配置
+
+**TIMEOUT配置**
+
+总体来说,Doris的删除作业的超时时间限制在30秒到5分钟时间内,具体时间可通过下面配置项调整
+
+* tablet\_delete\_timeout\_second
+
+   delete自身的超时时间是可受指定分区下tablet的数量弹性改变的,此项配置为平均一个tablet所贡献的timeout时间,默认值为2。
+   
+   假设此次删除所指定分区下有5个tablet,那么可提供给delete的timeout时间为10秒,由于低于最低超时时间30秒,因此最终超时时间为30秒。
+   
+* load\_straggler\_wait\_second
+
+  如果用户预估的数据量确实比较大,使得5分钟的上限不足时,用户可以通过此项调整timeout上限,默认值为300。
+  
+   **TIMEOUT的具体计算规则为(秒)**
+  
+  `TIMEOUT = MIN(load_straggler_wait_second, MAX(30, tablet_delete_timeout_second * tablet_num))`
+  
+* query_timeout
+  
+  因为delete本身是一个SQL命令,因此删除语句也会受session限制,timeout还受Session中的`query_timeout`值影响,可以通过`SET query_timeout = xxx`来增加超时时间,单位是秒。
+  
+##查看历史记录
+	
+1. 用户可以通过show delete语句查看历史上已执行完成的删除记录
+
+	###语法
+
+	```
+	SHOW DELETE [FROM db_name]
+	```
+	
+	示例
+	
+	```
+	mysql> show delete from test_db;
+	+-----------+---------------+---------------------+-----------------+----------+
+	| TableName | PartitionName | CreateTime          | DeleteCondition | State    |
+	+-----------+---------------+---------------------+-----------------+----------+
+	| empty_tbl | p3            | 2020-04-15 23:09:35 | k1 EQ "1"       | FINISHED |
+	| test_tbl  | p4            | 2020-04-15 23:09:53 | k1 GT "80"      | FINISHED |
+	+-----------+---------------+---------------------+-----------------+----------+
+	2 rows in set (0.00 sec)
+	```
+	
diff --git a/docs/documentation/en/administrator-guide/load-data/delete-manual_EN.md b/docs/documentation/en/administrator-guide/load-data/delete-manual_EN.md
new file mode 100644
index 0000000..bfd782d
--- /dev/null
+++ b/docs/documentation/en/administrator-guide/load-data/delete-manual_EN.md
@@ -0,0 +1,184 @@
+<!-- 
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+#Delete
+
+Unlike other import methods, delete is a synchronization process. Similar to insert into, all delete operations are an independent import job in Doris. Generally, delete statements need to specify tables, partitions and delete conditions to tell which data to be deleted, and the data on base index and rollup index will be deleted at the same time.
+
+
+##Syntax
+
+The delete statement's syntax is as follows:
+
+```
+DELETE FROM table_name [PARTITION partition_name]
+        WHERE
+        column_name1 op value[ AND column_name2 op value ...];
+```
+
+example 1:
+
+```
+DELETE FROM my_table PARTITION p1
+        WHERE k1 = 3;
+```
+
+example 2:
+
+```
+DELETE FROM my_table PARTITION p1
+        WHERE k1 < 3 AND k2 = "abc";
+```
+
+The following describes the parameters used in the delete statement:
+
+* PARTITION
+	
+	The target partition of the delete statement. If not specified, the table must be a single partition table, otherwise it cannot be deleted
+
+
+* WHERE
+	
+	The conditiona of the delete statement. All delete statements must specify a where condition.
+
+说明:
+
+1. The type of `OP` in the WHERE condition can only include `=, >, <, > =, < =,!=`. Currently, where key in (value1, Value2, value3) mode is not supported yet, may be added this support later.
+2. The column in the WHERE condition can only be the `key` column.
+3. Cannot delete when the `key` column does not exist in any rollup table.
+4. Each condition in WHERE condition can only be realated by `and`. If you want `or`, you are suggested to write these conditions into two delete statements.
+5. If the specified table is a range partitioned table, `PARTITION` must be specified unless the table is a single partition table,.
+6. Unlike the insert into command, delete statement cannot specify `label` manually. You can view the concept of `label` in [Insert Into] (./insert-into-manual.md)
+
+##Delete Result
+
+The delete command is an SQL command, and the returned results are synchronous. It can be divided into the following types:
+
+1. Successful visible
+
+	If delete completes successfully and is visible, the following results will be returned, `query OK` indicates success.
+	
+	```
+	mysql> delete from test_tbl PARTITION p1 where k1 = 1;
+Query OK, 0 rows affected (0.04 sec)
+{'label':'delete_e7830c72-eb14-4cb9-bbb6-eebd4511d251', 'status':'VISIBLE', 'txnId':'4005'}
+	```
+	
+2. Submitted successfully, but not visible
+
+
+    The transaction submission of Doris is divided into two steps: submission and publish version. Only after the publish version step is completed, the result will be visible to the user. If it has been submitted successfully, then it can be considered that the publish version step will eventually success. Doris will try to wait for publishing for a period of time after submitting. If it has timed out, even if the publishing version has not been completed, it will return to the user in  [...]
+    
+    ```
+	mysql> delete from test_tbl PARTITION p1 where k1 = 1;
+Query OK, 0 rows affected (0.04 sec)
+{'label':'delete_e7830c72-eb14-4cb9-bbb6-eebd4511d251', 'status':'VISIBLE', 'txnId':'4005', 'err':'delete job is committed but may be taking effect later' }
+	```
+	
+     The result will return a JSON string at the same time:
+	
+    `affected rows`: Indicates the row affected by this deletion. Since the deletion of Doris is currently a logical deletion, the value is always 0.
+    
+    `label`: The label generated automatically to be the signature of the delete jobs. Each job has a unique label within a single database.
+    
+    `status`: Indicates whether the data deletion is visible. If it is visible, `visible` will be displayed. If it is not visible, `committed` will be displayed.
+
+    
+    `txnId`: The transaction ID corresponding to the delete job
+    
+    `err`: Field will display some details of this deletion
+	
+3. Commit failed, transaction cancelled
+
+    If the delete statement is not submitted successfully, it will be automatically aborted by Doris and the following results will be returned
+
+    
+    ```
+	mysql> delete from test_tbl partition p1 where k1 > 80;
+ERROR 1064 (HY000): errCode = 2, detailMessage = {错误原因}
+	```
+	
+    example:
+    
+    A timeout deletion will return the timeout and unfinished replicas displayed as ` (tablet = replica)`
+    
+
+    ```
+	mysql> delete from test_tbl partition p1 where k1 > 80;
+ERROR 1064 (HY000): errCode = 2, detailMessage = failed to delete replicas from job: 4005, Unfinished replicas:10000=60000, 10001=60000, 10002=60000
+	```
+	
+    **The correct processing logic for the returned results of the delete operation is as follows:**
+    
+    1. If `Error 1064 (HY000)` is returned, deletion fails
+    
+    2. If the returned result is `Query OK`, the deletion is successful
+
+    	1. If `status` is `committed`, the data deletion is committed and will be eventually invisible. Users can wait for a while and then use the `show delete` command to view the results.
+    	2. If `status` is `visible`, the data have been deleted successfully.
+
+##Relevant Configuration
+
+###FE configuration
+
+**TIMEOUT configuration**
+
+In general, Doris's deletion timeout is limited from 30 seconds to 5 minutes. The specific time can be adjusted through the following configuration items
+
+* tablet\_delete\_timeout\_second
+
+   The timeout of delete itself can be elastically changed by the number of tablets in the specified partition. This configuration represents the average timeout contributed by a tablet. The default value is 2.
+   
+   Assuming that there are 5 tablets under the specified partition for this deletion, the timeout time available for the deletion is 10 seconds. Because the minimum timeout is 30 seconds which is higher than former timeout time, the final timeout is 30 seconds.
+   
+* load\_straggler\_wait\_second
+
+  If the user estimates a large amount of data, so that the upper limit of 5 minutes is insufficient, the user can adjust the upper limit of timeout through this item, and the default value is 300.
+  
+   **The specific calculation rule of timeout(seconds)**
+  
+  `TIMEOUT = MIN(load_straggler_wait_second, MAX(30, tablet_delete_timeout_second * tablet_num))`
+  
+* query_timeout
+  
+  Because delete itself is an SQL command, the deletion statement is also limited by the session variables, and the timeout is also affected by the session value `query'timeout`. You can increase the value by `set query'timeout = xxx`.
+  
+##Show delete history
+	
+1. The user can view the deletion completed in history through the show delete statement.
+
+	###Syntax
+
+	```
+	SHOW DELETE [FROM db_name]
+	```
+	
+	###example
+	
+	```
+	mysql> show delete from test_db;
+	+-----------+---------------+---------------------+-----------------+----------+
+	| TableName | PartitionName | CreateTime          | DeleteCondition | State    |
+	+-----------+---------------+---------------------+-----------------+----------+
+	| empty_tbl | p3            | 2020-04-15 23:09:35 | k1 EQ "1"       | FINISHED |
+	| test_tbl  | p4            | 2020-04-15 23:09:53 | k1 GT "80"      | FINISHED |
+	+-----------+---------------+---------------------+-----------------+----------+
+	2 rows in set (0.00 sec)
+	```
+	


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org