Posted to dev@hbase.apache.org by Stack <st...@duboce.net> on 2010/02/04 22:50:09 UTC

Re: Discussion: Move contribs out of hbase?

Trying to summarize the above back and forth, how about going forward
we do something like the following:

+ We keep contrib
+ Look into having build NOT go down to contrib dirs by default so
broken contrib doesn't hold up core
+ Core devs are no longer required to chase changes all the way down
into each of the contrib tributaries; onus on rectifying contrib with
core changes falls on contrib owner.
+ Allow that if a contrib is not fixed promptly, it will be dropped from a release
+ Move those contribs we consider core -- stargate for example -- up
into core and out of contrib (thrift is a little more involved;
depends on how Lars Francke wants to run it)
+ Raise the barrier when it comes to taking on new contribs; encourage
satellite projects to use other repos

To be clear, current contrib owners are: andrew for ec2 and stargate,
clint morgan for THBase ('transactional') and I'm covering IHBase
('indexed').

If the above is not objectionable, I'll stick it up on the wiki as a sort
of contrib 'policy'.

Here are some other comments on items raised over the course of this discussion:

On Thu, Jan 28, 2010 at 4:09 PM, Andrew Purtell <ap...@apache.org> wrote:
> What do you want to do about the EC2 scripts? They make no sense as a
> standalone project in my opinion. Could move into bin/ec2/ ? What happens
> with those when they generalize to other cloud providers like the Hadoop
> cloud scripts are doing for 0.21? bin/cloud/ ? That would be fine with me.
>

IMO bin/ec2 seems grand but maybe just wait till after 0.21 because
when the generalized scripts become available, it might dictate they
belong somewhere else -- a bin/cloud or somewhere else altogether.

On Sun, Jan 31, 2010 at 5:18 AM, Bruce Williams
<wi...@gmail.com> wrote:
>
> Providing an interface for contrib to code to is the ultimate solution.
>
Interestingly, contribs have been instrumental here.  Much of hbase
core is subclassable and you can configure it to use alternate
implementations of Master, RegionServer, Region, etc.
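
A minimal sketch of the "configure it to use an alternate implementation"
idea described above, written without any HBase dependency. The interface,
the two classes and the configuration key example.region.impl are all
hypothetical names made up for illustration; they are not HBase's actual
classes or property names.

```java
import java.util.Properties;

public class PluggableImplSketch {

    /** Hypothetical stand-in for a core component that a contrib may replace. */
    public interface RegionLike {
        String describe();
    }

    /** Default implementation, playing the role of the class shipped in core. */
    public static class DefaultRegion implements RegionLike {
        public DefaultRegion() {}

        @Override
        public String describe() {
            return "default region";
        }
    }

    /** Contrib-style subclass that overrides core behaviour. */
    public static class IndexedRegion extends DefaultRegion {
        public IndexedRegion() {}

        @Override
        public String describe() {
            return "indexed region";
        }
    }

    // Hypothetical configuration key; not a real HBase property.
    public static final String REGION_IMPL_KEY = "example.region.impl";

    /** Instantiates whichever implementation class the configuration names. */
    public static RegionLike newRegion(Properties conf) {
        String className =
            conf.getProperty(REGION_IMPL_KEY, DefaultRegion.class.getName());
        try {
            Class<? extends RegionLike> clazz =
                Class.forName(className).asSubclass(RegionLike.class);
            return clazz.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot load " + className, e);
        }
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        // No key set: the default implementation is used.
        System.out.println(newRegion(conf).describe());

        // A contrib swaps in its subclass purely through configuration.
        conf.setProperty(REGION_IMPL_KEY, IndexedRegion.class.getName());
        System.out.println(newRegion(conf).describe());
    }
}
```

The point of the pattern is that core only depends on the interface and a
class name read from configuration, so a contrib can live in its own tree
and still plug in.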

Thanks,
St.Ack

Re: Discussion: Move contribs out of hbase?

Posted by Kay Kay <ka...@gmail.com>.
On 02/04/2010 10:47 PM, Stack wrote:
> Go for it Kay-Kay.
>
> It's an old contribution by Ning Li.  If you do move it, we can remove
> the lucene jar at least from lib directory and perhaps a few more?
>    

We can definitely remove lucene from the ivy list of dependencies, and
hence from the classpath, making it leaner.

> There is also a pretty heavy-duty "unit test" that fires up a mini
> hdfs, mapreduce, and hbase cluster running mapreduce tasks and then
> queries all to verify stuff works.  Taking that out will cut build
> test times in half (smile).
>
>    

Will look into this, as well.
I was thinking of github for hosting, and eventually publishing artifacts
to the central maven repository through Sonatype Nexus, so that integrating
should be easy.

That brings us back to the original question on HBASE-1933, as motivation
to get started independently.

As far as thrift is concerned, I believe we can take the option of
publishing the artifacts ourselves, as a libthrift artifact under
org.apache.hadoop.hbase, for 0.2.0.  zk seems to be wrapping up for a
3.3.0 release, so if we wait a month from now, that might become
possible too.






Re: Discussion: Move contribs out of hbase?

Posted by Stack <st...@duboce.net>.
Go for it Kay-Kay.

It's an old contribution by Ning Li.  If you do move it, we can remove
the lucene jar at least from lib directory and perhaps a few more?
There is also a pretty heavy-duty "unit test" that fires up a mini
hdfs, mapreduce, and hbase cluster running mapreduce tasks and then
queries all to verify stuff works.  Taking that out will cut build
test times in half (smile).

St.Ack


On Thu, Feb 4, 2010 at 10:28 PM, Kay Kay <ka...@gmail.com> wrote:
> While we are looking at refactoring this: there is a rudimentary lucene
> indexer, available at o.a.h.h.mapreduce.IndexTableMapper/Reducer, that
> goes through a given table and column names and creates an index. While I
> can definitely see how valuable this is, I do not see it as core to hbase
> functionality, but rather as a distraction. I can volunteer to move this
> to github with appropriate documentation and artifacts published to mvn
> central, but am curious to hear the opinions of people and possible
> stakeholders on this list.  Thanks.

Re: Discussion: Move contribs out of hbase?

Posted by Kay Kay <ka...@gmail.com>.
On 02/04/2010 06:32 PM, Stack wrote:
> On Thu, Feb 4, 2010 at 6:24 PM, Lars Francke<la...@gmail.com>  wrote:
>    
>> I've just seen your comment on HBASE-1402. It'll be no problem to keep
>> the two different API versions in the core. I'll just use two
>> different package names and rewrite the ThriftServer to accept a
>> parameter to choose between the old and new API. We've already brought
>> up the old API to the new Thrift version so there shouldn't be
>> conflicts.
>>
>>      
> Excellent.
> St.Ack
>    
While we are looking at refactoring this: there is a rudimentary lucene
indexer, available at o.a.h.h.mapreduce.IndexTableMapper/Reducer, that
goes through a given table and column names and creates an index. While
I can definitely see how valuable this is, I do not see it as core to
hbase functionality, but rather as a distraction. I can volunteer to move
this to github with appropriate documentation and artifacts published to
mvn central, but am curious to hear the opinions of people and possible
stakeholders on this list.  Thanks.





Re: Discussion: Move contribs out of hbase?

Posted by Stack <st...@duboce.net>.
On Thu, Feb 4, 2010 at 6:24 PM, Lars Francke <la...@gmail.com> wrote:
> I've just seen your comment on HBASE-1402. It'll be no problem to keep
> the two different API versions in the core. I'll just use two
> different package names and rewrite the ThriftServer to accept a
> parameter to choose between the old and new API. We've already brought
> up the old API to the new Thrift version so there shouldn't be
> conflicts.
>
Excellent.
St.Ack

Re: Discussion: Move contribs out of hbase?

Posted by Lars Francke <la...@gmail.com>.
> In which way is Thrift more involved than Stargate?

I've just seen your comment on HBASE-1402. It'll be no problem to keep
the two different API versions in the core. I'll just use two
different package names and rewrite the ThriftServer to accept a
parameter to choose between the old and new API. We've already brought
up the old API to the new Thrift version so there shouldn't be
conflicts.

Lars
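
For illustration only, here is one way the "two package names plus a
parameter" approach could be sketched. No Thrift or HBase code is used: the
handler classes stand in for implementations living in an old-API package
and a new-API package, and the --api flag, its values and the choice of
"old" as the default are all assumptions made for this sketch, not the
actual ThriftServer options.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

public class ThriftApiSelectorSketch {

    /** Hypothetical common entry point for either API version. */
    public interface ApiHandler {
        String name();
    }

    /** Stand-in for the handler that would live in the old API's package. */
    public static class OldApiHandler implements ApiHandler {
        @Override
        public String name() {
            return "old API";
        }
    }

    /** Stand-in for the handler that would live in the new API's package. */
    public static class NewApiHandler implements ApiHandler {
        @Override
        public String name() {
            return "new API";
        }
    }

    // Registry from the (hypothetical) --api value to a handler factory.
    private static final Map<String, Supplier<ApiHandler>> HANDLERS =
        new LinkedHashMap<>();

    static {
        HANDLERS.put("old", OldApiHandler::new);
        HANDLERS.put("new", NewApiHandler::new);
    }

    /** Picks a handler from "--api <value>"; defaults to "old" (an assumption). */
    public static ApiHandler select(String[] args) {
        String choice = "old";
        for (int i = 0; i < args.length - 1; i++) {
            if ("--api".equals(args[i])) {
                choice = args[i + 1];
            }
        }
        Supplier<ApiHandler> factory = HANDLERS.get(choice);
        if (factory == null) {
            throw new IllegalArgumentException(
                "Unknown API '" + choice + "', expected one of " + HANDLERS.keySet());
        }
        return factory.get();
    }

    public static void main(String[] args) {
        System.out.println("Serving " + select(args).name());
    }
}
```

Because the two handlers sit behind separate classes (and, in the real
layout, separate packages), both can ship in the same jar without their
generated types clashing.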

Re: Discussion: Move contribs out of hbase?

Posted by Lars Francke <la...@gmail.com>.
> + Move those contribs we consider core -- stargate for example -- up
> into core and out of contrib (thrift is a little more involved;
> depends on how Lars Francke wants to run it)

I've kept quiet so far as I am neither a committer nor a big-time
contributor nor a long-standing member of the HBase community. I also
have no strong opinion about contrib vs. core in general and thrift
contrib in particular.

In which way is Thrift more involved than Stargate? The code itself is
easily decoupled from the core as it uses only the public API via
interfaces. The only slightly complex thing at the moment is that
we'll keep the old API around for now so that we'll probably need a
"thrift2" package or something like that. It was decided earlier (in
one of the tickets) that the old API should stay in core for 0.21
anyway.

Either way: if Stargate gets pulled into core, I think thrift should
stay too, but as I said before I'm fine either way.

Cheers,
Lars

Re: Discussion: Move contribs out of hbase?

Posted by Stack <st...@duboce.net>.
IMO, replication is a core feature.

I agree that contribs don't need the usual review and that owners, if
they are committers, should go ahead and commit.

St.Ack

On Thu, Feb 4, 2010 at 2:42 PM, Jean-Daniel Cryans <jd...@apache.org> wrote:
> I'd like to add a rule for contrib, that it doesn't need +1 from others for
> the owner to commit changes. I think it was already a non-verbal rule as
> most of the ec2 and stargate stuff was committed directly after the issue
> was opened and it's good because it gives more agility and it doesn't break
> core.
>
> J-D
>
> On Thu, Feb 4, 2010 at 1:57 PM, Jean-Daniel Cryans <jd...@apache.org>wrote:
>
>> What about replication? I think it is a core feature but the reason we
>> wanted to put it first in contrib was because of stability issues it could
>> generate. 2 other options:
>>
>> - github, means no visibility from hbase and more steps to get it
>> - directly into core, with a configuration variable that works the same way
>> as dfs.append.enable
>>
>> J-D
>>
>>

Re: Discussion: Move contribs out of hbase?

Posted by Jean-Daniel Cryans <jd...@apache.org>.
I'd like to add a rule for contrib: the owner doesn't need a +1 from others
to commit changes. I think it was already an unspoken rule, as most of the
ec2 and stargate stuff was committed directly after the issue was opened.
It's good because it gives more agility and it doesn't break core.

J-D

On Thu, Feb 4, 2010 at 1:57 PM, Jean-Daniel Cryans <jd...@apache.org>wrote:

> What about replication? I think it is a core feature but the reason we
> wanted to put it first in contrib was because of stability issues it could
> generate. 2 other options:
>
> - github, means no visibility from hbase and more steps to get it
> - directly into core, with a configuration variable that works the same way
> as dfs.append.enable
>
> J-D
>
>

Re: Discussion: Move contribs out of hbase?

Posted by Jean-Daniel Cryans <jd...@apache.org>.
What about replication? I think it is a core feature, but the reason we
wanted to put it in contrib first was the stability issues it could
cause. Two other options:

- github, which means no visibility from hbase and more steps to get it
- directly into core, with a configuration variable that works the same way
as dfs.append.enable

J-D
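
To make the second option concrete, here is a rough sketch of gating a
feature that lives in core behind a boolean configuration variable that is
off by default. The key name example.replication.enable and the class are
hypothetical, chosen only for this illustration; the sketch uses plain
java.util.Properties rather than any Hadoop or HBase configuration class.

```java
import java.util.Properties;

public class ReplicationFlagSketch {

    // Hypothetical key; off by default so core stays stable unless opted in.
    public static final String REPLICATION_ENABLED_KEY =
        "example.replication.enable";

    /** Reads the flag, treating a missing value as "disabled". */
    public static boolean replicationEnabled(Properties conf) {
        return Boolean.parseBoolean(
            conf.getProperty(REPLICATION_ENABLED_KEY, "false"));
    }

    /** Returns which subsystems would be started for this configuration. */
    public static String startServer(Properties conf) {
        StringBuilder started = new StringBuilder("core");
        if (replicationEnabled(conf)) {
            // Only reached when the operator explicitly enables the feature.
            started.append("+replication");
        }
        return started.toString();
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(startServer(conf));

        conf.setProperty(REPLICATION_ENABLED_KEY, "true");
        System.out.println(startServer(conf));
    }
}
```

With this shape the code ships with core and is visible to everyone, while
the stability risk is limited to deployments that turn the flag on.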

On Thu, Feb 4, 2010 at 1:50 PM, Stack <st...@duboce.net> wrote:

> Trying to summarize the above back and forth, how about going forward
> we do something like the following:
>
> + We keep contrib
> + Look into having build NOT go down to contrib dirs by default so
> broken contrib doesn't hold up core
> + Core devs no longer are required chase changes all the ways down
> into each of the contrib tributaries; onus on rectifying contrib with
> core changes falls on contrib owner.
> + Allow that if a contrib is not fixed promptly, it will dropped from a
> release
> + Move those contribs we consider core -- stargate for example -- up
> into core and out of contrib (thrift is a little more involved;
> depends on how Lars Francke wants to run it)
> + Raise the barrier when it comes to taking on new contribs; encourage
> satellite projects to use other repos
>
> To be clear, current contrib owners are: andrew for ec2 and stargate,
> clint morgan for THBase ('transactional') and I'm covering IHBase
> ('indexed').
>
> If above is not objectionable, I'll stick it up on the wiki as a sort
> of contrib 'policy'.
>
> Here's some other comments on items raised over course of this discussion:
>
> On Thu, Jan 28, 2010 at 4:09 PM, Andrew Purtell <ap...@apache.org>
> wrote:
> > What do you want to do about the EC2 scripts? They make no sense as a
> > standalone project in my opinion. Could move into bin/ec2/ ? What happens
> > with those when they generalize to other cloud providers like the Hadoop
> > cloud scripts are doing for 0.21? bin/cloud/ ? That would be fine with
> me.
> >
>
> IMO bin/ec2 seems grand but maybe just wait till after 0.21 because
> when the generalized scripts become available, it might dictate they
> belong somewhere else -- a bin/cloud or somewhere else altogether.
>
> On Sun, Jan 31, 2010 at 5:18 AM, Bruce Williams
> <wi...@gmail.com> wrote:
> >
> > Providing an interface for contrib to code to is the ultimate solution.
> >
> Interestingly, contribs have been instrumental here.  Much of hbase
> core is subclassable and you can configure it to use alternate
> implementations of Master, RegionServer, Region, etc.
>
> Thanks,
> St.Ack
>