Posted to dev@hive.apache.org by Zoltan Haindrich <zh...@hortonworks.com> on 2017/03/27 14:33:33 UTC

[DISCUSS] split metastore and service

Hello,

Currently the jdbc driver contains a lot of hive code that is not needed for the driver to function properly - jdbc-standalone is currently a 60M binary! :)

I've opened a ticket to explore what can be done to reduce jdbc's dependencies.

I was able to remove most of the service and metastore dependencies by introducing 2 new modules: I called them metastore-api and service-client.
As a change like this would mean that the names and purposes of the released jars would change, I didn't want to just file a jira about it :)

So... I would like to ask for opinions on, or any concerns about, doing the following:

1) Splitting the metastore module; the new module would be named metastore-X (my proposals for X are: client, rpc, if or api).
  * the new module would contain the thrift interface
  * and possibly a few other source files which are needed to use it.

2) Splitting the service module; the new module would be named service-X (my proposal for X would be client)
  * the module would contain auth-related classes
  * some other basic things like RowSet
  * connected change: the jdbc driver would support embedded mode only if 'service' is loaded onto the classpath
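
To illustrate the connected change in 2), here is a minimal sketch of how a driver could check at runtime whether an optional module is on the classpath before allowing embedded mode. This is only an assumption about how such a check might look, not the actual patch: the class name EmbeddedModeCheck, its methods, and the idea of passing in the name of a class from 'service' are all hypothetical.

```java
// Hypothetical helper, not part of Hive: shows one way to make a feature
// usable only when an optional module is present on the classpath.
final class EmbeddedModeCheck {

    private EmbeddedModeCheck() {
    }

    // Returns true if the named class can be located by this class's loader.
    // The class is looked up without being initialized.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, EmbeddedModeCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // Fails with a readable message when the class that embedded mode needs
    // (assumed here to live in the 'service' module) cannot be found.
    static void requireEmbeddedSupport(String serviceClassName) {
        if (!isOnClasspath(serviceClassName)) {
            throw new IllegalStateException(
                "Embedded mode requires the 'service' module on the classpath; missing class: "
                    + serviceClassName);
        }
    }
}
```

With something like this, a driver that depends only on service-client would call requireEmbeddedSupport(...) with the name of a class from 'service' when an embedded connection is requested, and remote connections would never touch that code path.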

With these two modules available, the size of the jdbc driver has dropped to about 21M.

more info:
https://issues.apache.org/jira/browse/HIVE-16214

regards,
Zoltan

Re: [DISCUSS] split metastore and service

Posted by Carl Steinbach <cw...@gmail.com>.
+1!

On Tue, Mar 28, 2017 at 6:36 PM, Thejas Nair <th...@gmail.com> wrote:

> Also, thanks for the email thread to bring people's attention to this
> change.

Re: [DISCUSS] split metastore and service

Posted by Thejas Nair <th...@gmail.com>.
Also, thanks for the email thread to bring people's attention to this change.

On Tue, Mar 28, 2017 at 6:35 PM, Thejas Nair <th...@gmail.com> wrote:

> +1
> Thanks for looking into this!

Re: [DISCUSS] split metastore and service

Posted by Thejas Nair <th...@gmail.com>.
+1
Thanks for looking into this!


On Tue, Mar 28, 2017 at 11:26 AM, Eugene Koifman <ek...@hortonworks.com>
wrote:

> +1 reduce the number of uber jars

Re: [DISCUSS] split metastore and service

Posted by Eugene Koifman <ek...@hortonworks.com>.
+1 reduce the number of uber jars


On 3/27/17, 1:05 PM, "Sergey Shelukhin" <se...@hortonworks.com> wrote:

    Splitting the metastore would also allow us to get rid of compile time
    dependencies that are resolved via reflection right now.
    +1 on the feature
    

Re: [DISCUSS] split metastore and service

Posted by Sergey Shelukhin <se...@hortonworks.com>.
Splitting the metastore would also allow us to get rid of compile time
dependencies that are resolved via reflection right now.
+1 on the feature
