Posted to dev@doris.apache.org by 陈明雨 <mo...@163.com> on 2022/05/27 10:09:53 UTC

[Discuss][DSIP] Support Multi Catalog

Hi all,
I plan to support multiple catalogs in Doris to manage all external data sources, such as Hive, Iceberg, Hudi, ES, ODBC, etc.
I have created a DSIP[1] for this.
And here is the first PR, which adds some new interfaces and classes[2].


Please feel free to discuss.




[1] https://cwiki.apache.org/confluence/display/DORIS/DSIP-014%3A+Multi+Catalog+Support

[2] https://github.com/apache/incubator-doris/pull/9812




--

Best Regards
陈明雨 Mingyu Chen

Email:
chenmingyu@apache.org

Re:Re: Re: [Discuss][DSIP] Support Multi Catalog

Posted by 陈明雨 <mo...@163.com>.
Actually, in my design, once a user has created an external data source, they can access its databases and tables directly; there is no need to create the metadata mapping manually.
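
To make the idea above concrete, here is a minimal sketch of a catalog that resolves databases and tables from the external source on demand, so nothing has to be declared by hand. All names here (FakeMetastoreClient, ExternalCatalog, show_tables, describe) are hypothetical and are not the interfaces from the DSIP or the PR; the column cache is also just an assumption for illustration.

```python
class FakeMetastoreClient:
    """Hypothetical stand-in for an external metadata service (e.g. a Hive metastore)."""

    def __init__(self, schemas):
        # schemas has the shape {database: {table: [column names]}}
        self._schemas = schemas

    def list_databases(self):
        return sorted(self._schemas)

    def list_tables(self, database):
        return sorted(self._schemas[database])

    def get_columns(self, database, table):
        return list(self._schemas[database][table])


class ExternalCatalog:
    """Looks up metadata in the external source instead of a hand-written mapping."""

    def __init__(self, name, client):
        self.name = name
        self._client = client
        self._column_cache = {}  # assumption: cache columns after the first lookup

    def show_databases(self):
        return self._client.list_databases()

    def show_tables(self, database):
        return self._client.list_tables(database)

    def describe(self, database, table):
        key = (database, table)
        if key not in self._column_cache:
            self._column_cache[key] = self._client.get_columns(database, table)
        return self._column_cache[key]


if __name__ == "__main__":
    client = FakeMetastoreClient({"sales": {"orders": ["id", "amount"]}})
    catalog = ExternalCatalog("hive_prod", client)
    print(catalog.show_databases())
    print(catalog.show_tables("sales"))
    print(catalog.describe("sales", "orders"))
```

The user only registers the data source once; every database, table, and column list is then read from the source itself.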




--

Best Regards
陈明雨 Mingyu Chen

Email:
chenmingyu@apache.org






Re: Re: [Discuss][DSIP] Support Multi Catalog

Posted by 张家峰 <zh...@gmail.com>.
How to create external tables:

      When external tables are managed in this way, it is best not to make
users manually create the mapping for each table, especially the column
mapping, so that the mapping is transparent to them. This can be simplified
to the point where the user only needs to create an external data source
(resource) and then create an external table. There is no need to specify
the order of the columns; the user only needs to specify the resource and
the properties that identify which table in the data source this table
corresponds to. Alternatively, if an external data source can be created in
Doris, the user could create a database by specifying the external data
source as a property, and then view the external tables through SHOW TABLES
under this database.



jiafeng.Zhang




-- 
张家峰

Re:Re: [Discuss][DSIP] Support Multi Catalog

Posted by 陈明雨 <mo...@163.com>.
1. Permission


In my design, the permission system of the external data source is decoupled from Doris' own permission system.
First, when creating an external data source, the user specifies an account of the external data source to connect with (the "proxy account"). The permissions of this account on the external data source are managed by the external data source itself.
Inside Doris, we still use the current permission management mechanism to manage read and write permissions on the databases and tables of the external data source.


For example, if I grant read permission on table A on the Doris side, but the proxy account does not have read permission on table A, an error will be reported when the table is actually accessed. The two checks are independent and do not affect each other.
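
The two independent checks described above can be sketched roughly as follows. This is only an illustration under assumptions: ExternalSource, DorisSide, and AccessDenied are hypothetical names, not Doris classes, and the real checks happen in Doris' privilege system and in the external system respectively.

```python
from dataclasses import dataclass, field


class AccessDenied(Exception):
    """Raised when either permission layer rejects a read."""


@dataclass
class ExternalSource:
    """Hypothetical external system that enforces its own ACLs."""
    readable_by: dict = field(default_factory=dict)  # table -> set of external accounts

    def read(self, account, table):
        if account not in self.readable_by.get(table, set()):
            raise AccessDenied(f"external source: {account} cannot read {table}")
        return f"rows of {table}"


@dataclass
class DorisSide:
    """Hypothetical Doris side: its own grants plus one proxy account per source."""
    source: ExternalSource
    proxy_account: str
    grants: dict = field(default_factory=dict)  # Doris user -> set of readable tables

    def select(self, user, table):
        # Check 1: Doris' own grant for the Doris user.
        if table not in self.grants.get(user, set()):
            raise AccessDenied(f"doris: {user} has no read grant on {table}")
        # Check 2: done by the external source against the proxy account,
        # and only reached at actual access time.
        return self.source.read(self.proxy_account, table)


if __name__ == "__main__":
    source = ExternalSource(readable_by={"B": {"proxy"}})
    doris = DorisSide(source, "proxy", grants={"alice": {"A", "B"}})
    print(doris.select("alice", "B"))
    try:
        doris.select("alice", "A")  # granted in Doris, but not to the proxy account
    except AccessDenied as err:
        print(err)
```

Here the Doris grant on table A passes, and the error comes from the external source only when the read is attempted, which matches the table A example.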


2. Direction


Many users store their data in external data sources but want a single data access point for offline and online analysis, federated queries, and other operations. The purpose of this feature is to give Doris the ability to serve as a "unified SQL entry point".
At this stage, we need to solve metadata mapping and unified data access for external data sources such as Hive, Iceberg, and Hudi.



--

Best Regards
陈明雨 Mingyu Chen

Email:
chenmingyu@apache.org






Re: [Discuss][DSIP] Support Multi Catalog

Posted by ling miao <li...@apache.org>.
Regarding permissions, Doris's permission system currently differs from
the external permission systems. For external data sources, should we
still use Doris's permissions, or the permissions of the external data
sources?

As far as the current architecture is concerned, Doris is still a system
built around querying its own tables, and many optimizations have been
made on that basis.
If we support this kind of functionality at this stage, in what *direction*
do you hope Doris will develop in the future? What *other features* are
planned besides this?






-- 
Ling Miao | Apache Doris