Posted to dev@bigtop.apache.org by Roman Shaposhnik <rv...@apache.org> on 2012/07/01 05:31:46 UTC

Re: Optional dependencies

Hi!

First of all, let me articulate the particular concern that I have for Hive
(and it is Hive-specific, as in Pig, for example, doesn't suffer from
the same issue). I apologize for not clearly articulating it earlier.

Basically with Hive the fundamental problem is that it relies on the
presence of the symlinks to hbase.jar under /usr/lib/hive/lib in
order for the Hive-HBase integration to work. Unlike Pig, even when
HBase is installed on the system under a well-known location,
the symlinks have to be there in order for Hive to be able to interface
with HBase.
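
To make the symlink requirement concrete, here is a minimal Python sketch of
a check for it. It only illustrates the idea and is not how the Hive package
or launcher actually works: the helper name `missing_hbase_links`, the jar
name `hbase.jar` taken literally, and the temporary directories standing in
for /usr/lib/hive/lib and the HBase install location are all assumptions.

```python
import os
import tempfile


def missing_hbase_links(hive_lib_dir, jar_names):
    """Return the jar names that are not usable symlinks in hive_lib_dir.

    Hypothetical helper for illustration; not part of Hive or Bigtop.
    """
    missing = []
    for name in jar_names:
        path = os.path.join(hive_lib_dir, name)
        # islink() checks the entry itself; exists() follows the link,
        # so a dangling symlink is reported as missing too.
        if not (os.path.islink(path) and os.path.exists(path)):
            missing.append(name)
    return missing


def demo():
    """Simulate the layout in a temp dir instead of touching /usr/lib."""
    with tempfile.TemporaryDirectory() as root:
        hbase_dir = os.path.join(root, "hbase")       # stand-in for the HBase install
        hive_lib = os.path.join(root, "hive", "lib")  # stand-in for /usr/lib/hive/lib
        os.makedirs(hbase_dir)
        os.makedirs(hive_lib)
        jar = os.path.join(hbase_dir, "hbase.jar")
        open(jar, "w").close()

        # HBase is "installed", but no symlink exists yet.
        before = missing_hbase_links(hive_lib, ["hbase.jar"])
        # This is the step a package (hive or hive-hbase) would perform.
        os.symlink(jar, os.path.join(hive_lib, "hbase.jar"))
        after = missing_hbase_links(hive_lib, ["hbase.jar"])
        return before, after


if __name__ == "__main__":
    print(demo())
```

Whichever package owns the `os.symlink` step decides whether an existing
HBase install is visible to Hive.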

Currently these symlinks get installed by the Hive package. With
your proposed change they will be installed by hive-hbase package.
This means a change in behaviour on systems where HBase is
present, with or without an hbase dependency coming from the Hive
package.

Now, historically Bigtop hasn't really paid much attention to being
backwards compatible between releases. Hence this is not a -1,
but rather food for thought.

Now to comment on your points:

On Thu, Jun 28, 2012 at 10:01 PM, Bruno Mahé <bm...@apache.org> wrote:
> On 06/28/2012 09:51 PM, Roman Shaposhnik wrote:
>> Personally, I think I'm reasonably fine with #3 (after all it kind of combines
>> #1 with an extra package) with the only concern that I remember
>> being a potential combinatorial explosion of these helper packages
>> (e.g. hive-hbase, hive-cassandra, hive-hbase-cassandra, etc).
>>
>> Thanks,
>> Roman.
>
> I am not sure I follow why you would have a combinatorial explosion.
> Following your example you would have the following packages:
> * hive which provides the main non-optional features
> * hive-hbase which provides and pulls everything necessary for an
> integration with hbase. hive-hbase depending on hive
> * hive-cassandra which provides and pulls everything necessary for an
> integration with cassandra. hive-cassandra depending on hive
>
> So then depending on your needs you could install (hive), (hive and
> hive-hbase), (hive and hive-cassandra) or (hive and hive-hbase and
> hive-cassandra) if you need both.
> You want subpackages as orthogonal as possible rather than per use cases.

I suppose this could work for as long as the interaction between these types
of subpackages is indeed orthogonal (as in -- presence or absence of
hive-hbase integration doesn't affect hive-cassandra integration).
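
As a rough illustration of why orthogonal subpackages avoid the explosion,
the sketch below enumerates the install sets Bruno lists. The function name
`install_sets` is made up for this example; the package names are the ones
from this thread.

```python
from itertools import combinations


def install_sets(base, addons):
    """List every valid install set: the base package plus any subset
    of independent add-on packages (hypothetical helper)."""
    sets = []
    for k in range(len(addons) + 1):
        for chosen in combinations(addons, k):
            sets.append((base,) + chosen)
    return sets


if __name__ == "__main__":
    for s in install_sets("hive", ["hive-hbase", "hive-cassandra"]):
        print(" + ".join(s))
```

With n independent add-ons this covers 2**n setups while only n subpackages
have to be built, which holds only as long as the add-ons really do not
interact.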

At this point I can't think of a case where it should be a problem so I'm +0 on
the approach.

That said -- what really would make me an enthusiastic +1 is if we could
make Hive behave more like Pig when it comes to HBase integration. Let
me take a look at the launcher script and see whether there's any low-hanging
fruit in there.

Thanks,
Roman.

Re: Optional dependencies

Posted by Konstantin Boudnik <co...@apache.org>.
Wrong op, pal ;)

On Wed, Jul 04, 2012 at 10:16AM, Roman Shaposhnik wrote:
> On Wed, Jul 4, 2012 at 10:12 AM, Konstantin Boudnik <co...@apache.org> wrote:
> > I won't Bruno ;) I was replying to the conclusion from Roman's email ;)
> 
> Aha! -1 x -1 == +1  ;-)
> 
> Thanks,
> Roman.

Re: Optional dependencies

Posted by Roman Shaposhnik <rv...@apache.org>.
On Wed, Jul 4, 2012 at 10:12 AM, Konstantin Boudnik <co...@apache.org> wrote:
> I won't Bruno ;) I was replying to the conclusion from Roman's email ;)

Aha! -1 x -1 == +1  ;-)

Thanks,
Roman.

Re: Optional dependencies

Posted by Konstantin Boudnik <co...@apache.org>.
I won't Bruno ;) I was replying to the conclusion from Roman's email ;)

Cos

On Wed, Jul 04, 2012 at 01:54AM, Bruno Mahé wrote:
> On 07/03/2012 08:49 PM, Konstantin Boudnik wrote:
> > From the component purist perspective and my limited understanding of the
> > matter it seems that hbase doesn't really care about what hive is and whether it is
> > present on the system or not.
> > 
> > On the other hand, hive has this ability to use hbase as a "storage handler",
> > thus it has to treat hbase as a real dependency if the need to interface
> > between the two components arises.
> > 
> > What I am saying, is that it seems pretty awkward to load an hbase package
> > with such an alien functionality as providing links for hive.
> > 
> > I think this is -1 from me, then.
> > 
> > Cos
> > 
> 
> Why would you want hbase to contain that feature for hive?
> I agree that would be pretty awkward and would warrant a -1 from myself
> as well.
> 
> Thanks,
> Bruno

Re: Optional dependencies

Posted by Bruno Mahé <bm...@apache.org>.
On 07/03/2012 08:49 PM, Konstantin Boudnik wrote:
> From the component purist perspective and my limited understanding of the
> matter it seems that hbase doesn't really care about what hive is and whether it is
> present on the system or not.
> 
> On the other hand, hive has this ability to use hbase as a "storage handler",
> thus it has to treat hbase as a real dependency if the need to interface
> between the two components arises.
> 
> What I am saying, is that it seems pretty awkward to load an hbase package
> with such an alien functionality as providing links for hive.
> 
> I think this is -1 from me, then.
> 
> Cos
> 

Why would you want hbase to contain that feature for hive?
I agree that would be pretty awkward and would warrant a -1 from myself
as well.

Thanks,
Bruno

Re: Optional dependencies

Posted by Konstantin Boudnik <co...@apache.org>.
From the component purist perspective and my limited understanding of the
matter it seems that hbase doesn't really care about what hive is and whether it is
present on the system or not.

On the other hand, hive has this ability to use hbase as a "storage handler",
thus it has to treat hbase as a real dependency if the need to interface
between the two components arises.

What I am saying, is that it seems pretty awkward to load an hbase package
with such an alien functionality as providing links for hive.

I think this is -1 from me, then.

Cos

On Sat, Jun 30, 2012 at 08:31PM, Roman Shaposhnik wrote:
> Hi!
> 
> First of all, let me articulate the particular concern that I have for Hive
> (and it is Hive-specific, as in Pig, for example, doesn't suffer from
> the same issue). I apologize for not clearly articulating it earlier.
> 
> Basically with Hive the fundamental problem is that it relies on the
> presence of the symlinks to hbase.jar under /usr/lib/hive/lib in
> order for the Hive-HBase integration to work. Unlike Pig, even when
> HBase is installed on the system under a well-known location,
> the symlinks have to be there in order for Hive to be able to interface
> with HBase.
> 
> Currently these symlinks get installed by the Hive package. With
> your proposed change they will be installed by hive-hbase package.
> This means a change in behaviour on systems where HBase is
> present, with or without an hbase dependency coming from the Hive
> package.
> 
> Now, historically Bigtop hasn't really paid much attention to being
> backwards compatible between releases. Hence this is not a -1,
> but rather food for thought.
> 
> Now to comment on your points:
> 
> On Thu, Jun 28, 2012 at 10:01 PM, Bruno Mahé <bm...@apache.org> wrote:
> > On 06/28/2012 09:51 PM, Roman Shaposhnik wrote:
> >> Personally, I think I'm reasonably fine with #3 (after all it kind of combines
> >> #1 with an extra package) with the only concern that I remember
> >> being a potential combinatorial explosion of these helper packages
> >> (e.g. hive-hbase, hive-cassandra, hive-hbase-cassandra, etc).
> >>
> >> Thanks,
> >> Roman.
> >
> > I am not sure I follow why you would have a combinatorial explosion.
> > Following your example you would have the following packages:
> > * hive which provides the main non-optional features
> > * hive-hbase which provides and pulls everything necessary for an
> > integration with hbase. hive-hbase depending on hive
> > * hive-cassandra which provides and pulls everything necessary for an
> > integration with cassandra. hive-cassandra depending on hive
> >
> > So then depending on your needs you could install (hive), (hive and
> > hive-hbase), (hive and hive-cassandra) or (hive and hive-hbase and
> > hive-cassandra) if you need both.
> > You want subpackages as orthogonal as possible rather than per use cases.
> 
> I suppose this could work for as long as the interaction between these types
> of subpackages is indeed orthogonal (as in -- presence or absence of
> hive-hbase integration doesn't affect hive-cassandra integration).
> 
> At this point I can't think of a case where it should be a problem so I'm +0 on
> the approach.
> 
> That said -- what really would make me an enthusiastic +1 is if we could
> make Hive behave more like Pig when it comes to HBase integration. Let
> me take a look at the launcher script and see whether there's any low-hanging
> fruit in there.
> 
> Thanks,
> Roman.