Posted to dev@doris.apache.org by "Zhang,Linfeng" <zh...@baidu.com> on 2021/02/22 10:40:30 UTC

Question about buckets of Doris

Hi,

Question:
Excuse me, I have another question. Does Doris support sub-tables? I could not find any documentation on a sub-table feature. If not, can a Doris bucket redefine the table structure? Suppose an ES table has two sub-tables that correspond to two Doris buckets, but the two ES sub-tables have different structures. Can the two Doris buckets also define different table structures? Or is there another solution for this case?


Re: Question about buckets of Doris

Posted by "Zhang,Linfeng" <zh...@baidu.com>.
Hi Ling Miao,

Thank you for your reply.

Let me answer from the perspective of our business scenario. The reason two separate tables cannot replace the sub-tables is that, after a table is created on ES and then split, it can still be queried directly through the main table. For example, create a table called DorisTable. ES can split it automatically by month into DorisTable202101 and DorisTable202102. To retrieve the February data, I can use either "select * from DorisTable202102" or "select * from DorisTable". External users only need one fixed SQL statement, "select * from DorisTable", and do not have to pay attention to the sub-tables.

The problem is that the structure of a sub-table in ES has changed (say a new field x was added), while the structure of the main table has not. Because of how ES works, the main table can still query the new field. This scenario does not work in Doris: the index on Doris is still associated only with the structure of the old main table, which has no field x. If I create two new tables on Doris that correspond to the ES sub-tables, can I still use one fixed SQL statement to query the data in both tables, like "select * from DorisTable" in the example above?

Linfeng Zhang

From: ling miao <li...@apache.org>
Date: Wednesday, February 24, 2021, 11:40
To: "Zhang,Linfeng" <zh...@baidu.com>
Cc: "dev@doris.apache.org" <de...@doris.apache.org>, "Yang,Dan(R&D QED)" <ya...@baidu.com>
Subject: Re: Question about buckets of Doris

Hi LinFeng,

Or can you explain the difference between having two ES sub-tables and creating two separate tables?

Ling Miao


Fwd: Question about buckets of Doris

Posted by ling miao <li...@apache.org>.
---------- Forwarded message ---------
From: ling miao <li...@apache.org>
Date: Wed, Feb 24, 2021 at 5:27 PM
Subject: Re: Question about buckets of Doris
To: Zhang,Linfeng <zh...@baidu.com>
Cc: dev@doris.apache.org <de...@doris.apache.org>, Yang,Dan(R&D QED) <yangdan@baidu.com>, Zhang,Linfeng <zh...@baidu.com>, ci-dev <ci-dev@baidu.com>


Hi LinFeng,

I understand.

The fundamental issue is that ES supports semi-structured data, so columns
in a sub-table that are inconsistent with the base table are not validated.
Doris, however, is a strictly structured product, and on this point it
cannot match ES.

My suggestion is to create only one table in Doris whose schema is the
union of the columns of all ES sub-tables.
Then, to query any ES sub-table, you can use the main table registered in
Doris directly, and no column validation failure will occur.

For example:
There is one ES table named ESTable.
It has two sub-tables, ESTable202101 and ESTable202102, with different
schemas: (k1, k2) and (k1, k3).
You can then create a Doris external table named ESTable with schema
(k1, k2, k3).
Queries on Doris:
select k1, k2 from ESTable == select k1, k2 from ESTable of ES
select k1, k3 from ESTable == select k1, k3 from ESTable of ES
These queries will not cause any problems.

Ling Miao



Zhang,Linfeng <zh...@baidu.com> wrote on Wed, Feb 24, 2021 at 12:25 PM:

> Hi Ling Miao,
>
> Thank you for your reply.
>
> Let me answer from the perspective of our business scenario. The reason
> two separate tables cannot replace the sub-tables is that, after a table
> is created on ES and then split, it can still be queried directly through
> the main table. For example, create a table called DorisTable. ES can
> split it automatically by month into DorisTable202101 and
> DorisTable202102. To retrieve the February data, I can use either
> "select * from DorisTable202102" or "select * from DorisTable". External
> users only need one fixed SQL statement, "select * from DorisTable", and
> do not have to pay attention to the sub-tables.
>
> The problem is that the structure of a sub-table in ES has changed (say
> a new field x was added), while the structure of the main table has not.
> Because of how ES works, the main table can still query the new field.
> This scenario does not work in Doris: the index on Doris is still
> associated only with the structure of the old main table, which has no
> field x. If I create two new tables on Doris that correspond to the ES
> sub-tables, can I still use one fixed SQL statement to query the data in
> both tables, like "select * from DorisTable" in the example above?
>
> Linfeng Zhang

Re: Question about buckets of Doris

Posted by ling miao <li...@apache.org>.
Hi LinFeng,

I understand.

The fundamental issue is that ES supports semi-structured data, so columns
in a sub-table that are inconsistent with the base table are not validated.
Doris, however, is a strictly structured product, and on this point it
cannot match ES.

My suggestion is to create only one table in Doris whose schema is the
union of the columns of all ES sub-tables.
Then, to query any ES sub-table, you can use the main table registered in
Doris directly, and no column validation failure will occur.

For example:
There is one ES table named ESTable.
It has two sub-tables, ESTable202101 and ESTable202102, with different
schemas: (k1, k2) and (k1, k3).
You can then create a Doris external table named ESTable with schema
(k1, k2, k3).
Queries on Doris:
select k1, k2 from ESTable == select k1, k2 from ESTable of ES
select k1, k3 from ESTable == select k1, k3 from ESTable of ES
These queries will not cause any problems.

Ling Miao
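
A minimal sketch of the union-schema suggestion above, written in Python. It is only an illustration and not part of Doris or ES: the helper name union_schema and the column types (BIGINT, VARCHAR(64)) are assumptions made for the example, since the thread gives only the column names k1, k2 and k3.

```python
def union_schema(sub_schemas):
    """Merge the column lists of several sub-tables into one ordered schema.

    sub_schemas maps a sub-table name to a list of (column, type) pairs.
    Columns keep the order in which they are first seen. A column that
    appears with two different types raises ValueError, because a single
    strictly typed table cannot hold both definitions.
    """
    merged = {}
    for table_name, columns in sub_schemas.items():
        for name, col_type in columns:
            if name in merged and merged[name] != col_type:
                raise ValueError(
                    f"column {name!r} in {table_name} has type {col_type}, "
                    f"but was already seen as {merged[name]}"
                )
            merged.setdefault(name, col_type)
    return list(merged.items())


# Hypothetical types; only the column names come from the example above.
sub_schemas = {
    "ESTable202101": [("k1", "BIGINT"), ("k2", "VARCHAR(64)")],
    "ESTable202102": [("k1", "BIGINT"), ("k3", "VARCHAR(64)")],
}

columns = union_schema(sub_schemas)
print([name for name, _ in columns])  # prints ['k1', 'k2', 'k3']
```

The resulting column list is what the single external table would need to declare. Rejecting a column whose type differs between sub-tables is a design choice of this sketch, since the thread does not say how such conflicts should be handled.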




Re: Question about buckets of Doris

Posted by ling miao <li...@apache.org>.
Hi LinFeng,

Or can you explain the difference between having two ES sub-tables and
creating two separate tables?

Ling Miao

Zhang,Linfeng <zh...@baidu.com> wrote on Wed, Feb 24, 2021 at 11:25 AM:

> Hi Ling Miao,
>
> Thank you for your reply.
>
> When using Doris on ES, there is one ES table whose two sub-tables have
> different table structures. This can be solved by building a separate
> Doris external table for each ES sub-table, but then the number of Doris
> external tables grows geometrically and becomes hard to manage. So I
> want to ask whether there is a better way. For example, can partitioning
> and buckets solve this problem? Or is there some other approach?
>
> Zhang,Linfeng

Re: Question about buckets of Doris

Posted by "Zhang,Linfeng" <zh...@baidu.com>.
Hi Ling Miao

Thank you for your reply.

When using Doris on ES, there is one ES table whose two sub-tables have different table structures. This can be solved by building a separate Doris external table for each ES sub-table, but then the number of Doris external tables grows geometrically and becomes hard to manage. So I want to ask whether there is a better way. For example, can partitioning and buckets solve this problem? Or is there some other approach?

Zhang,Linfeng

From: ling miao <em...@gmail.com>
Date: Tuesday, February 23, 2021, 10:35
To: "dev@doris.apache.org" <de...@doris.apache.org>
Cc: "Yang,Dan(R&D QED)" <ya...@baidu.com>, "Zhang,Linfeng" <zh...@baidu.com>
Subject: Re: Question about buckets of Doris

Hi LinFeng,

Doris does not support the concept of sub-tables. A table can only have one table structure at a time.

What puzzles me more is under what circumstances business requirements would call for sub-tables.
Does it mean that ES supports one table that has two different table structures?
Would it be possible to treat these two sub-tables directly as two Doris external tables?

Ling Miao


Re: Question about buckets of Doris

Posted by ling miao <em...@gmail.com>.
Hi LinFeng,

Doris does not support the concept of sub-tables. A table can only have one
table structure at a time.

What puzzles me more is under what circumstances business requirements
would call for sub-tables.
Does it mean that ES supports one table that has two different table
structures?
Would it be possible to treat these two sub-tables directly as two Doris
external tables?

Ling Miao

Zhang,Linfeng <zh...@baidu.com> wrote on Tue, Feb 23, 2021 at 2:26 AM:

> Hi,
>
> Question:
> Excuse me, I have another question. Does Doris support sub-tables? I
> could not find any documentation on a sub-table feature. If not, can a
> Doris bucket redefine the table structure? Suppose an ES table has two
> sub-tables that correspond to two Doris buckets, but the two ES
> sub-tables have different structures. Can the two Doris buckets also
> define different table structures? Or is there another solution for
> this case?