Posted to user@spark.apache.org by Ankit Gupta <in...@gmail.com> on 2023/04/17 05:35:30 UTC

Re: Spark Multiple Hive Metastore Catalog Support

Adding the user mailing list.

Just a reminder: can anyone help with this?

Thanks a lot!

Ankit Prakash Gupta

On Wed, Apr 12, 2023 at 8:22 AM Ankit Gupta <in...@gmail.com> wrote:

> Hi All
>
> My question is about support for multiple remote Hive Metastore catalogs in
> Spark. Multiple catalog support was added in Spark 3, but has any
> CatalogPlugin been implemented that lets us configure multiple remote Hive
> Metastore catalogs? If yes, can anyone share the fully qualified class name
> that I can use to configure a Hive Metastore catalog? If not, I would like
> to work on implementing a CatalogPlugin that can be used to configure
> multiple Hive Metastore servers.
>
> Thanks and Regards.
>
> Ankit Prakash Gupta
> +91 8750101321
> info.ankitp@gmail.com
>
>

Re: Spark Multiple Hive Metastore Catalog Support

Posted by Cheng Pan <ch...@apache.org>.
There is a DSv2-based Hive connector in Apache Kyuubi[1] that supports
connecting to multiple HMS instances in a single Spark application.

Some limitations:

- currently only supports Spark 3.3
- has a known issue when used with `spark-sql`, but works with spark-shell and
normal jar-based Spark applications.

[1]
https://github.com/apache/kyuubi/tree/master/extensions/spark/kyuubi-spark-connector-hive
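
For illustration only, here is a rough sketch of how two metastores might be
registered as separate catalogs. Spark registers a catalog under
`spark.sql.catalog.<name>` and passes any options below that prefix to the
plugin. The connector class name is an assumption based on the Kyuubi
connector documentation and should be verified against [1]; the catalog names
and metastore hosts are hypothetical.

```python
# Assumed class name of the Kyuubi Spark Hive connector catalog; verify against [1].
KYUUBI_HIVE_CATALOG = "org.apache.kyuubi.spark.connector.hive.HiveTableCatalog"


def hms_catalog_conf(catalogs):
    """Build Spark conf entries registering one DSv2 catalog per Hive Metastore.

    `catalogs` maps a catalog name (chosen by you) to the thrift URI of the
    metastore that catalog should talk to.
    """
    conf = {}
    for name, uri in catalogs.items():
        prefix = f"spark.sql.catalog.{name}"
        conf[prefix] = KYUUBI_HIVE_CATALOG
        # Options under the catalog prefix are handed to that catalog only.
        conf[f"{prefix}.hive.metastore.uris"] = uri
    return conf


# Hypothetical catalog names and metastore hosts, for illustration only.
conf = hms_catalog_conf({
    "hms_a": "thrift://metastore-a.example.com:9083",
    "hms_b": "thrift://metastore-b.example.com:9083",
})

# Print the entries as spark-submit / spark-shell arguments.
for key, value in sorted(conf.items()):
    print(f"--conf {key}={value}")
```

With catalogs registered this way, tables would be addressed with three-part
names such as `hms_a.some_db.some_table`, so one query can read from both
metastores.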

Thanks,
Cheng Pan


On Apr 18, 2023 at 00:38:23, Elliot West <te...@gmail.com> wrote:

> Hi Ankit,
>
> While not a part of Spark, there is a project called 'WaggleDance' that
> can federate multiple Hive metastores so that they are accessible via a
> single URI: https://github.com/ExpediaGroup/waggle-dance
>
> This may be useful or perhaps serve as inspiration.
>
> Thanks,
>
> Elliot.
>
> On Mon, 17 Apr 2023 at 16:38, Ankit Gupta <in...@gmail.com> wrote:
>
>> Adding the user mailing list.
>>
>> Just a reminder: can anyone help with this?
>>
>> Thanks a lot!
>>
>> Ankit Prakash Gupta
>>
>> On Wed, Apr 12, 2023 at 8:22 AM Ankit Gupta <in...@gmail.com>
>> wrote:
>>
>>> Hi All
>>>
>>> My question is about support for multiple remote Hive Metastore catalogs in
>>> Spark. Multiple catalog support was added in Spark 3, but has any
>>> CatalogPlugin been implemented that lets us configure multiple remote Hive
>>> Metastore catalogs? If yes, can anyone share the fully qualified class name
>>> that I can use to configure a Hive Metastore catalog? If not, I would like
>>> to work on implementing a CatalogPlugin that can be used to configure
>>> multiple Hive Metastore servers.
>>>
>>> Thanks and Regards.
>>>
>>> Ankit Prakash Gupta
>>> +91 8750101321
>>> info.ankitp@gmail.com
>>>
>>>

Re: Spark Multiple Hive Metastore Catalog Support

Posted by Ankit Gupta <in...@gmail.com>.
Thanks, Elliot! Let me check it out!

On Mon, 17 Apr, 2023, 10:08 pm Elliot West, <te...@gmail.com> wrote:

> Hi Ankit,
>
> While not a part of Spark, there is a project called 'WaggleDance' that
> can federate multiple Hive metastores so that they are accessible via a
> single URI: https://github.com/ExpediaGroup/waggle-dance
>
> This may be useful or perhaps serve as inspiration.
>
> Thanks,
>
> Elliot.
>
> On Mon, 17 Apr 2023 at 16:38, Ankit Gupta <in...@gmail.com> wrote:
>
>> Adding the user mailing list.
>>
>> Just a reminder: can anyone help with this?
>>
>> Thanks a lot!
>>
>> Ankit Prakash Gupta
>>
>> On Wed, Apr 12, 2023 at 8:22 AM Ankit Gupta <in...@gmail.com>
>> wrote:
>>
>>> Hi All
>>>
>>> My question is about support for multiple remote Hive Metastore catalogs in
>>> Spark. Multiple catalog support was added in Spark 3, but has any
>>> CatalogPlugin been implemented that lets us configure multiple remote Hive
>>> Metastore catalogs? If yes, can anyone share the fully qualified class name
>>> that I can use to configure a Hive Metastore catalog? If not, I would like
>>> to work on implementing a CatalogPlugin that can be used to configure
>>> multiple Hive Metastore servers.
>>>
>>> Thanks and Regards.
>>>
>>> Ankit Prakash Gupta
>>> +91 8750101321
>>> info.ankitp@gmail.com
>>>
>>>

Re: Spark Multiple Hive Metastore Catalog Support

Posted by Elliot West <te...@gmail.com>.
Hi Ankit,

While not a part of Spark, there is a project called 'WaggleDance' that can
federate multiple Hive metastores so that they are accessible via a single
URI: https://github.com/ExpediaGroup/waggle-dance

This may be useful or perhaps serve as inspiration.
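
A minimal sketch of what the single-URI approach could look like from the
Spark side, assuming a Waggle Dance instance is already running and federating
the metastores. The host and port below are hypothetical, and the sketch only
covers pointing Spark's built-in Hive catalog at one thrift endpoint; the
federation itself is configured in Waggle Dance, not in Spark.

```python
def federated_hms_conf(federation_uri):
    """Spark conf entries pointing the built-in Hive catalog at one metastore URI."""
    return {
        # Use the Hive-backed session catalog (what enableHiveSupport() selects).
        "spark.sql.catalogImplementation": "hive",
        # The spark.hadoop. prefix forwards the setting to the Hadoop/Hive configuration.
        "spark.hadoop.hive.metastore.uris": federation_uri,
    }


# Hypothetical endpoint of the federating proxy, for illustration only.
conf = federated_hms_conf("thrift://waggle-dance.example.com:9083")

for key, value in sorted(conf.items()):
    print(f"--conf {key}={value}")
```

From Spark's point of view this is a single metastore, so no CatalogPlugin is
needed; how databases from each underlying metastore are named is decided by
the federation layer.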

Thanks,

Elliot.

On Mon, 17 Apr 2023 at 16:38, Ankit Gupta <in...@gmail.com> wrote:

> Adding the user mailing list.
>
> Just a reminder: can anyone help with this?
>
> Thanks a lot!
>
> Ankit Prakash Gupta
>
> On Wed, Apr 12, 2023 at 8:22 AM Ankit Gupta <in...@gmail.com> wrote:
>
>> Hi All
>>
>> My question is about support for multiple remote Hive Metastore catalogs in
>> Spark. Multiple catalog support was added in Spark 3, but has any
>> CatalogPlugin been implemented that lets us configure multiple remote Hive
>> Metastore catalogs? If yes, can anyone share the fully qualified class name
>> that I can use to configure a Hive Metastore catalog? If not, I would like
>> to work on implementing a CatalogPlugin that can be used to configure
>> multiple Hive Metastore servers.
>>
>> Thanks and Regards.
>>
>> Ankit Prakash Gupta
>> +91 8750101321
>> info.ankitp@gmail.com
>>
>>
