Posted to user@hive.apache.org by Feng Lu <fe...@google.com> on 2020/07/31 20:57:17 UTC

Enabling gRPC support in Hive Metastore

Hi all,

Several of us from Google and Cloudera explored the possibility of adding
gRPC support in Apache Hive. The detailed design proposal can be found
here:
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=158869886.

*TL;DR:*
*Why?*
- modernize Hive Metastore's Thrift-based interface by moving it to gRPC, for
enhanced and more flexible authentication/authorization support.
- leverage the gRPC streaming capability to reduce response latency and
Hive Metastore resource footprint.

*How?*
- initially, we provide a thin translation layer that converts Thrift
messages to gRPC messages (and vice versa) in memory.
- over time, Hive Metastore methods (e.g., getTable) will be implemented
directly on gRPC so no gRPC->Thrift conversion is required.
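
The translation-layer idea above can be illustrated with a minimal sketch.
Everything in it is hypothetical: ThriftTable, GrpcTable, ThriftHandler and
GrpcFacade are plain-Java stand-ins for generated Thrift/protobuf classes
and for the existing Metastore handler, not names from the design proposal
or from Hive. It only shows the shape of an in-memory conversion in front
of an unchanged Thrift-facing handler.

```java
public class TranslationSketch {

    // Hypothetical stand-in for a Thrift-generated struct: fields may be unset (null).
    static final class ThriftTable {
        String dbName;
        String tableName;
        String owner;

        ThriftTable(String dbName, String tableName, String owner) {
            this.dbName = dbName;
            this.tableName = tableName;
            this.owner = owner;
        }
    }

    // Hypothetical stand-in for a protobuf-generated message: immutable, strings never null.
    static final class GrpcTable {
        final String dbName;
        final String tableName;
        final String owner;

        GrpcTable(String dbName, String tableName, String owner) {
            this.dbName = dbName;
            this.tableName = tableName;
            this.owner = owner;
        }
    }

    // Thrift -> gRPC: unset Thrift fields become empty strings, as proto3 strings cannot be null.
    static GrpcTable toGrpc(ThriftTable t) {
        return new GrpcTable(nullToEmpty(t.dbName), nullToEmpty(t.tableName), nullToEmpty(t.owner));
    }

    // gRPC -> Thrift: a plain field copy; the null-vs-empty distinction is not recovered here.
    static ThriftTable toThrift(GrpcTable g) {
        return new ThriftTable(g.dbName, g.tableName, g.owner);
    }

    private static String nullToEmpty(String s) {
        return s == null ? "" : s;
    }

    // Hypothetical stand-in for the existing Thrift-facing handler.
    interface ThriftHandler {
        ThriftTable getTable(String dbName, String tableName);
    }

    // gRPC-facing facade: delegates to the Thrift handler and converts the result in memory.
    static final class GrpcFacade {
        private final ThriftHandler delegate;

        GrpcFacade(ThriftHandler delegate) {
            this.delegate = delegate;
        }

        GrpcTable getTable(String dbName, String tableName) {
            return toGrpc(delegate.getTable(dbName, tableName));
        }
    }

    public static void main(String[] args) {
        // The "existing" handler here just echoes its arguments with no owner set.
        ThriftHandler existing = (db, tbl) -> new ThriftTable(db, tbl, null);
        GrpcTable result = new GrpcFacade(existing).getTable("default", "orders");
        System.out.println(result.dbName + "." + result.tableName + " owner='" + result.owner + "'");
    }
}
```

In this sketch, implementing a method "directly on gRPC" would amount to
replacing the delegate call inside GrpcFacade.getTable with native logic,
at which point the conversion step disappears for that method.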

Thank you and any feedback is welcome. Have a great weekend!

p.s., If we don't see much concern from the community, the next step is to
get the dev work started in the next couple of months.

Feng

Re: Enabling gRPC support in Hive Metastore

Posted by Feng Lu <fe...@google.com>.
Hi Jan,

Thanks for your feedback, very helpful!
You made a great point about potential dependency conflicts; we'll make sure
our libraries coexist well with shared Hive dependencies.
I am not sure SASL is necessarily superior to gRPC auth. It depends a lot on
the platform on which we run these services. For example, gRPC (and its auth)
is well supported in k8s/Istio with baked-in mutual TLS support.
That said, there's an existing proposal
<https://github.com/grpc/proposal/pull/101> to add GSSAPI support in gRPC
when Kerberos is used. We are happy to revisit this issue later if there's
a strong demand to run Kerberos + gRPC.

Feng

On Thu, Aug 20, 2020 at 11:27 PM Jan Fili <ja...@gmail.com> wrote:

> Just as a side note, it appears that SASL is the superior auth mechanism
> compared to gRPC auth.
> Will the gRPC protocol contain the usual SASL messages, or are you
> opting for gRPC auth?
>

Re: Enabling gRPC support in Hive Metastore

Posted by Jan Fili <ja...@gmail.com>.
Just as a side note, it appears that SASL is the superior auth mechanism
compared to gRPC auth.
Will the gRPC protocol contain the usual SASL messages, or are you
opting for gRPC auth?

Re: Enabling gRPC support in Hive Metastore

Posted by Jan Fili <ja...@gmail.com>.
If you take a look at

<dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-exec</artifactId>
    <version>3.1.2</version>
</dependency>

you will find com.google.protobuf as a package contained in it. That
occasionally creates conflicts, and I wanted to know if you could
relocate/shade the protobuf dependency into something under
org.apache.hive. The same is already done for kryo and a few others.
My thinking is that it would be easy to also shade the protobuf
libraries while working on a big chunk of gRPC.
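
For reference, relocating a package with the maven-shade-plugin generally
looks like the fragment below. This is only a sketch: the shadedPattern
prefix is a hypothetical name chosen for illustration, and the actual
hive-exec build configuration may be organized differently.

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <relocations>
          <relocation>
            <!-- Package to move out of the way of user classpaths -->
            <pattern>com.google.protobuf</pattern>
            <!-- Hypothetical target prefix; not taken from the Hive build -->
            <shadedPattern>org.apache.hive.shaded.com.google.protobuf</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```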

Hope it makes sense

Jan

Re: Enabling gRPC support in Hive Metastore

Posted by Feng Lu <fe...@google.com>.
Yes, we'll share the Thrift-equivalent protobuf spec and check it in to the
Apache Hive repo.
As for the proto libraries, we'll open source and distribute them under the
Apache license.
Based on our preliminary evaluation, it might be better to load these proto
libraries at run time and not bloat the Hive codebase.
Would that work for you, Jan?

Feng

On Tue, Aug 18, 2020 at 10:12 AM Jan Fili <ja...@gmail.com> wrote:

> Would you mind shading the proto-library in hive-exec-core along the way?

Re: Enabling gRPC support in Hive Metastore

Posted by Jan Fili <ja...@gmail.com>.
Would you mind shading the proto-library in hive-exec-core along the way?

On Mon, Aug 17, 2020 at 08:31, Feng Lu <fe...@google.com> wrote:
>
> It has been a while since we shared this design proposal. If there are no further concerns, we'll go ahead with the implementation work soon.
> Thank you!

Re: Enabling gRPC support in Hive Metastore

Posted by Feng Lu <fe...@google.com>.
It has been a while since we shared this design proposal. If there are no
further concerns, we'll go ahead with the implementation work soon.
Thank you!