You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2021/01/20 07:15:49 UTC

[GitHub] [incubator-doris] yangzhg opened a new issue #5270: [Proposal]Doris UNIQUE table supports UPDATE syntax

yangzhg opened a new issue #5270:
URL: https://github.com/apache/incubator-doris/issues/5270


   ## Background
   
   The current Doris table does not support the update operation, but there are many scenarios where the data needs to be updated. Due to the batch delete function we implemented before, it paved the way for the update.
   
   ## Syntax
   
   We only support single table update
   
   ```
   UPDATE table_name
       SET assignment_list
       [WHERE where_condition]
       [ORDER BY...]
       [LIMIT row_count]
   
   value:
       {expr | DEFAULT}
   
   assignment:
       col_name = value
   
   assignment_list:
       assignment [, assignment].
   ```
   
   ## Design
   
   There are two kinds of updates involved here, one that only contains the value column, and the other that contains the key column.
   
   * Include and update of value column
   
     This method is relatively simple. You only need to plan the update statement into a query plan like `select * from table insert into table` during query planning, and you need to add an update node to modify the data to the updated value, or plan Into a statement similar to `select a, b,'xx','xx1' from table insert into table` and hand it directly to the tableSink node
   
   * Contains the update of the key column
   
     This kind of relatively complicated, we need to divide the query into two parts, and need to use the MERGE semantics in the batch delete, first plan it into a query plan similar to `select * from table insert into table`, and then generate it first for the modification of the key column One piece of data that needs to be deleted, and then replace the value of the key column to generate new data that needs to be added, mainly generating two pieces of imported data from one piece of original data. Need update node to have the ability to repeat data
   
   ## Subtasks
   
   1. Add index to unique table value column to speed up data scanning
   2. FE supports update statements and generates corresponding query plans. Be implements update node, which can be divided into two parts here
      1. Support updates that only contain the value column
      2. Support the update of the key column
   
   ## 背景
   
   当前 Doris 表不支持 update 操作,但是有很多场景需要对数据进行更新,由于我们之前实现的批量删除功能,为实现update 铺平了道路。
   
   ## 语法
   
   我们只支持单表更新
   
   ```
   UPDATE table_name
       SET assignment_list
       [WHERE where_condition]
       [ORDER BY ...]
       [LIMIT row_count]
   
   value:
       {expr | DEFAULT}
   
   assignment:
       col_name = value
   
   assignment_list:
       assignment [, assignment] .
   ```
   
   ## 设计方案
   
   这里涉及两种更新,一种使只包含value 列的更新, 一种是包含key列的更新
   
   * 包含及value 列的更新
   
     这种方式比较简单,只需要查询规划时将update 语句规划成类似 `select * from table insert into table`  的查询规划,并且需新增一个update node,用于修改数据为更新后的值, 或者 规划成 类似 `select a, b, 'xx', 'xx1' from table insert into table` 的语句,直接交给 tableSink 节点
   
   * 包含key列的更新
   
     这种相对复杂一点,我们需要将查询分成两部,并且需要用到 批量删除里的MERGE 语义,先规划成类似 `select * from table insert into table`  查询规划, 之后针对key列的修改,先生成一条需要删除的数据,再替换key列值后生成需要新增的数据,主要将一条原始数据生成两条导入数据。需要update node 有数据重复的能力
   
   ## 子任务
   
   1. unique 表 value 列加索引,加快数据扫描速度
   2. FE 支持 update 语句,并生成相应的查询规划,be实现update node, 这里可以分为两部
      1. 支持只包含value 列的更新
      2. 支持包含key列的更新


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] blackfox1983 commented on issue #5270: [Proposal]Doris UNIQUE table supports UPDATE syntax

Posted by GitBox <gi...@apache.org>.
blackfox1983 commented on issue #5270:
URL: https://github.com/apache/incubator-doris/issues/5270#issuecomment-764199935


   Great features。
   
   There is another scenario. For example, when using bitmap to calculate UV, it is only cumulative at present. For example, if 100 people have paid the order, UV should be equal to 100. But if 10 people cancel the order, UV should be equal to 90. In this case, bitmap needs to support the reset operation of a bit. This kind of scenario can be considered later to design.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] blackfox1983 commented on issue #5270: [Proposal]Doris UNIQUE table supports UPDATE syntax

Posted by GitBox <gi...@apache.org>.
blackfox1983 commented on issue #5270:
URL: https://github.com/apache/incubator-doris/issues/5270#issuecomment-764199935


   Great features。
   
   There is another scenario. For example, when using bitmap to calculate UV, it is only cumulative at present. For example, if 100 people have paid the order, UV should be equal to 100. But if 10 people cancel the order, UV should be equal to 90. In this case, bitmap needs to support the reset operation of a bit. This kind of scenario can be considered later to design.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] EmmyMiao87 commented on issue #5270: [Proposal]Doris UNIQUE table supports UPDATE syntax

Posted by GitBox <gi...@apache.org>.
EmmyMiao87 commented on issue #5270:
URL: https://github.com/apache/incubator-doris/issues/5270#issuecomment-764448529


   > Great features。
   > 
   > There is another scenario. For example, when using bitmap to calculate UV, it is only cumulative at present. For example, if 100 people have paid the order, UV should be equal to 100. But if 10 people cancel the order, UV should be equal to 90. In this case, bitmap needs to support the reset operation of a bit. This kind of scenario can be considered later to design.
   
   The need for bitmap update does exist, but what you said may not be a simple syntax like update set bitmap=xxx where xxx.
   Is that true?
   
   bitmap 的更新需求确实是一个很重要的场景。不过你说的这类场景并不是 update 语句可以支持的。因为update支持的是列整体的更新,而不是列的数据的局部更新。
   或者你知道有哪些其他的系统是支持这个功能的吗?我可以调研一下


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] EmmyMiao87 commented on issue #5270: [Proposal]Doris UNIQUE table supports UPDATE syntax

Posted by GitBox <gi...@apache.org>.
EmmyMiao87 commented on issue #5270:
URL: https://github.com/apache/incubator-doris/issues/5270#issuecomment-764448529


   > Great features。
   > 
   > There is another scenario. For example, when using bitmap to calculate UV, it is only cumulative at present. For example, if 100 people have paid the order, UV should be equal to 100. But if 10 people cancel the order, UV should be equal to 90. In this case, bitmap needs to support the reset operation of a bit. This kind of scenario can be considered later to design.
   
   The need for bitmap update does exist, but what you said may not be a simple syntax like update set bitmap=xxx where xxx.
   Is that true?
   
   bitmap 的更新需求确实是一个很重要的场景。不过你说的这类场景并不是 update 语句可以支持的。因为update支持的是列整体的更新,而不是列的数据的局部更新。
   或者你知道有哪些其他的系统是支持这个功能的吗?我可以调研一下


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org