You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@doris.apache.org by 王博 <50...@qq.com> on 2021/01/20 09:57:38 UTC

回复:Doris UNIQUE table supports UPDATE syntax

I think the desgin ideas is weird;
Why update statement should be converted into a query and insert statement?
As far as I know, Kudu support update api.
Why not research Kudu's design first?
I prefer to design from the perspective of the storage engine




------------------&nbsp;原始邮件&nbsp;------------------
发件人:                                                                                                                        "dev"                                                                                    <yangzhgg@gmail.com&gt;;
发送时间:&nbsp;2021年1月20日(星期三) 下午3:14
收件人:&nbsp;"dev"<dev@doris.apache.org&gt;;

主题:&nbsp;Doris UNIQUE table supports UPDATE syntax



## Background

The current Doris table does not support the update operation, but there
are many scenarios where the data needs to be updated. Due to the batch
delete function we implemented before, it paved the way for the update.

## Syntax

We only support single table update

```
UPDATE table_name
&nbsp;&nbsp;&nbsp; SET assignment_list
&nbsp;&nbsp;&nbsp; [WHERE where_condition]
&nbsp;&nbsp;&nbsp; [ORDER BY...]
&nbsp;&nbsp;&nbsp; [LIMIT row_count]

value:
&nbsp;&nbsp;&nbsp; {expr | DEFAULT}

assignment:
&nbsp;&nbsp;&nbsp; col_name = value

assignment_list:
&nbsp;&nbsp;&nbsp; assignment [, assignment].
```

## Design

There are two kinds of updates involved here, one that only contains the
value column, and the other that contains the key column.

* Include and update of value column

&nbsp; This method is relatively simple. You only need to plan the update
statement into a query plan like `select * from table insert into table`
during query planning, and you need to add an update node to modify the
data to the updated value, or plan Into a statement similar to `select a,
b,'xx','xx1' from table insert into table` and hand it directly to the
tableSink node

* Contains the update of the key column

&nbsp; This kind of relatively complicated, we need to divide the query into two
parts, and need to use the MERGE semantics in the batch delete, first plan
it into a query plan similar to `select * from table insert into table`,
and then generate it first for the modification of the key column One piece
of data that needs to be deleted, and then replace the value of the key
column to generate new data that needs to be added, mainly generating two
pieces of imported data from one piece of original data. Need update node
to have the ability to repeat data

## Subtasks

1. Add index to unique table value column to speed up data scanning
2. FE supports update statements and generates corresponding query plans.
Be implements update node, which can be divided into two parts here
&nbsp;&nbsp; 1. Support updates that only contain the value column
&nbsp;&nbsp; 2. Support the update of the key column

Re: Doris UNIQUE table supports UPDATE syntax

Posted by Murugan Muthusamy <mm...@gmail.com>.
+1 This is a valuable and most required feature. It makes Doris a fully
qualified SQL supported data warehouse DB.

On Wed, Jan 20, 2021 at 1:57 AM 王博 <50...@qq.com> wrote:

> I think the desgin ideas is weird;
> Why update statement should be converted into a query and insert statement?
> As far as I know, Kudu support update api.
> Why not research Kudu's design first?
> I prefer to design from the perspective of the storage engine
>
>
>
>
> ------------------&nbsp;原始邮件&nbsp;------------------
> 发件人:
>                                                   "dev"
>                                                                 <
> yangzhgg@gmail.com&gt;;
> 发送时间:&nbsp;2021年1月20日(星期三) 下午3:14
> 收件人:&nbsp;"dev"<dev@doris.apache.org&gt;;
>
> 主题:&nbsp;Doris UNIQUE table supports UPDATE syntax
>
>
>
> ## Background
>
> The current Doris table does not support the update operation, but there
> are many scenarios where the data needs to be updated. Due to the batch
> delete function we implemented before, it paved the way for the update.
>
> ## Syntax
>
> We only support single table update
>
> ```
> UPDATE table_name
> &nbsp;&nbsp;&nbsp; SET assignment_list
> &nbsp;&nbsp;&nbsp; [WHERE where_condition]
> &nbsp;&nbsp;&nbsp; [ORDER BY...]
> &nbsp;&nbsp;&nbsp; [LIMIT row_count]
>
> value:
> &nbsp;&nbsp;&nbsp; {expr | DEFAULT}
>
> assignment:
> &nbsp;&nbsp;&nbsp; col_name = value
>
> assignment_list:
> &nbsp;&nbsp;&nbsp; assignment [, assignment].
> ```
>
> ## Design
>
> There are two kinds of updates involved here, one that only contains the
> value column, and the other that contains the key column.
>
> * Include and update of value column
>
> &nbsp; This method is relatively simple. You only need to plan the update
> statement into a query plan like `select * from table insert into table`
> during query planning, and you need to add an update node to modify the
> data to the updated value, or plan Into a statement similar to `select a,
> b,'xx','xx1' from table insert into table` and hand it directly to the
> tableSink node
>
> * Contains the update of the key column
>
> &nbsp; This kind of relatively complicated, we need to divide the query
> into two
> parts, and need to use the MERGE semantics in the batch delete, first plan
> it into a query plan similar to `select * from table insert into table`,
> and then generate it first for the modification of the key column One piece
> of data that needs to be deleted, and then replace the value of the key
> column to generate new data that needs to be added, mainly generating two
> pieces of imported data from one piece of original data. Need update node
> to have the ability to repeat data
>
> ## Subtasks
>
> 1. Add index to unique table value column to speed up data scanning
> 2. FE supports update statements and generates corresponding query plans.
> Be implements update node, which can be divided into two parts here
> &nbsp;&nbsp; 1. Support updates that only contain the value column
> &nbsp;&nbsp; 2. Support the update of the key column

Re: Doris UNIQUE table supports UPDATE syntax

Posted by ling miao <li...@apache.org>.
Hi Wangbo,

In terms of design, I think it is not syntactically rewritten as select +
insert, but should use the query engine to plan read and write operators
(the current operators can be reused) to meet the update requirements.

In addition, Doris itself supports updates from the storage engine. It just
lacks the stage of querying the data to be updated.

Ling Miao

王博 <50...@qq.com> 于2021年1月20日周三 下午5:57写道:

> I think the desgin ideas is weird;
> Why update statement should be converted into a query and insert statement?
> As far as I know, Kudu support update api.
> Why not research Kudu's design first?
> I prefer to design from the perspective of the storage engine
>
>
>
>
> ------------------&nbsp;原始邮件&nbsp;------------------
> 发件人:
>                                                   "dev"
>                                                                 <
> yangzhgg@gmail.com&gt;;
> 发送时间:&nbsp;2021年1月20日(星期三) 下午3:14
> 收件人:&nbsp;"dev"<dev@doris.apache.org&gt;;
>
> 主题:&nbsp;Doris UNIQUE table supports UPDATE syntax
>
>
>
> ## Background
>
> The current Doris table does not support the update operation, but there
> are many scenarios where the data needs to be updated. Due to the batch
> delete function we implemented before, it paved the way for the update.
>
> ## Syntax
>
> We only support single table update
>
> ```
> UPDATE table_name
> &nbsp;&nbsp;&nbsp; SET assignment_list
> &nbsp;&nbsp;&nbsp; [WHERE where_condition]
> &nbsp;&nbsp;&nbsp; [ORDER BY...]
> &nbsp;&nbsp;&nbsp; [LIMIT row_count]
>
> value:
> &nbsp;&nbsp;&nbsp; {expr | DEFAULT}
>
> assignment:
> &nbsp;&nbsp;&nbsp; col_name = value
>
> assignment_list:
> &nbsp;&nbsp;&nbsp; assignment [, assignment].
> ```
>
> ## Design
>
> There are two kinds of updates involved here, one that only contains the
> value column, and the other that contains the key column.
>
> * Include and update of value column
>
> &nbsp; This method is relatively simple. You only need to plan the update
> statement into a query plan like `select * from table insert into table`
> during query planning, and you need to add an update node to modify the
> data to the updated value, or plan Into a statement similar to `select a,
> b,'xx','xx1' from table insert into table` and hand it directly to the
> tableSink node
>
> * Contains the update of the key column
>
> &nbsp; This kind of relatively complicated, we need to divide the query
> into two
> parts, and need to use the MERGE semantics in the batch delete, first plan
> it into a query plan similar to `select * from table insert into table`,
> and then generate it first for the modification of the key column One piece
> of data that needs to be deleted, and then replace the value of the key
> column to generate new data that needs to be added, mainly generating two
> pieces of imported data from one piece of original data. Need update node
> to have the ability to repeat data
>
> ## Subtasks
>
> 1. Add index to unique table value column to speed up data scanning
> 2. FE supports update statements and generates corresponding query plans.
> Be implements update node, which can be divided into two parts here
> &nbsp;&nbsp; 1. Support updates that only contain the value column
> &nbsp;&nbsp; 2. Support the update of the key column