Posted to dev@hbase.apache.org by Ted Yu <yu...@gmail.com> on 2011/12/13 22:01:16 UTC

Feature branch Was: Code review request for hbase-4120 table priority

I was thinking about using a feature branch as well.
May I get clarification on the following issues first?

1. Would there ever be competing feature branches? E.g., there are two
implementations for online schema update. Would there be two branches, one
for each implementation?
If the answer is yes, I want to know how we decide which one to pursue and
which one to abandon.
If the answer is no, we need to decide what criteria should be used in
deciding whether to graduate or abandon the development on that branch.

2. I suppose there would be some committers acting as sponsors for the
branch. How often should they refresh the branch to keep it in sync with TRUNK?

3. If the development on that branch takes longer than the cycle between
major releases, e.g. a branch is made off of 0.94 but by the time the feature
is complete, the major release has moved beyond 0.96,
what should we do?

Thanks

On Tue, Dec 13, 2011 at 11:57 AM, Jonathan Hsieh <jo...@cloudera.com> wrote:

> Note: I've only done a quick look at the jira and the code.  The high level
> design document/approach seems reasonable and I think most agree that this
> is a useful feature and that a lot of effort has gone into it.
>
> The feature is off by default -- I can see one main difference in this
> situation compared to other major newish generally-considered experimental
> or incomplete features (replication, off-heap slab cache, online schema
> changes).  This feature doesn't have one of the current HBase committers
> using/testing it in their production  environments or in their test
> environment.
>
> This seems perfect for *a feature branch* as we talked briefly about at the
> Pow-wow.  There seem to be some problems identified that will result in
> follow on issues (races mentioned).  Using a branch would:
> * make it available at Apache, which allows devs to test it,
> * allow a committer who is championing this to test it by using it more
> and to iron out glaring problems in their environment,
> * encourage and shepherd the contributor, allowing them to justify
> continued effort,
> * allow all of us to defer the decision to fold the feature into 0.94 (or
> 0.96, or later) until more folks are familiar or comfortable with it.
>
> Who knows, maybe some of the TaoBao folks will eventually become
> committers.
>
> Jon.
>
> On Tue, Dec 13, 2011 at 12:42 AM, <yu...@gmail.com> wrote:
>
> > Thanks for the suggestion, Lars.
> > The original scope for 4120 is bigger than the latest patch which only
> > covers table priorities.
> >
> > Let's perform more reviews for the current patch. We can create more
> > subtasks for the umbrella feature.
> >
> > Cheers
> >
> >
> >
> > On Dec 12, 2011, at 11:23 PM, lars hofhansl <lh...@yahoo.com> wrote:
> >
> > > While I haven't looked (in depth) at the patch, yet, this is definitely
> > > a feature that will be extremely helpful for Salesforce's multitenant
> > > architecture to isolate tenants and services from each other.
> > >
> > > While we don't have HBase in our production data centers, yet (working
> > > on it), I am certain that we will use this feature eventually.
> > >
> > > Would it help to break the patch into multiple smaller patches?
> > >
> > > Off the bat I think of:
> > > 1. the grouping logic
> > > 2. regionserver configuration (caching, etc) per group
> > > 3. table priorities
> > > 4. etc... (folks who have actually looked at the patch can probably
> > > identify better demarcations between the aspects of this change.)
> > >
> > > That would certainly make it more manageable for me - personally - to
> > > review the code.
> > >
> > > -- Lars
> > >
> > >
> > > ----- Original Message -----
> > > From: Todd Lipcon <to...@cloudera.com>
> > > To: dev@hbase.apache.org; Andrew Purtell <ap...@apache.org>
> > > Cc:
> > > Sent: Monday, December 12, 2011 4:55 PM
> > > Subject: Re: Code review request for hbase-4120 table priority
> > >
> > > On Mon, Dec 12, 2011 at 4:36 PM, Andrew Purtell <ap...@apache.org> wrote:
> > >>
> > >> HBase as a project should not have as a criterion for inclusion of some
> > >> feature that Cloudera and SU and Facebook run it. Core managed to escape
> > >> Yahoo. Let's not run history in reverse here in HBase land. And, actually,
> > >> this makes it worse, because the occurrence that a number of core HBase
> > >> users (multiple) will all need something is substantially less likely than
> > >> if one might find it useful; or, maybe, only users outside of those with
> > >> such self-appointed attitude, yet perhaps a community multiples in size of
> > >> "core users".
> > >
> > > It's not about Cloudera/SU/FB - it's about code that will be supported
> > > by people who are committed to the project. TrendMicro certainly fits
> > > the bill. I of course mean no offense to Lu Jia, but neither he nor
> > > Taobao has made continued contributions in the past - just one other
> > > bug fix beyond the HBASE-4120 project.
> > >
> > > If we have a few of the core people committed to running this in
> > > production and supporting it in the future, I'm all for it (just like
> > > I am +1 on security). I just want to avoid repeating mistakes like the
> > > Avro server which isn't really supported despite being in our
> > > codebase. (You'll note this was a Cloudera contribution but from a
> > > contributor who was doing this in his spare time rather than part of
> > > job responsibilities, and we have never run it in production
> > > scenarios)
> > >
> > > I am consistently conservative on what goes into the project because
> > > we have to stand behind what we release. I certainly don't think _all_
> > > core people should find every feature useful (eg REST and Thrift are
> > > examples of some things which are useless to many but I think make
> > > sense). But if _no_ core people see a feature as a requirement then
> > > I'd rather let it bake until we have many people requesting it.
> > > Otherwise people download HBase, try out these "fringe" features, and
> > > get a bad taste in their mouth when they've bit-rot across several
> > > versions of little usage.
> > >
> > > -Todd
> > > --
> > > Todd Lipcon
> > > Software Engineer, Cloudera
> > >
> >
>
>
>
> --
> // Jonathan Hsieh (shay)
> // Software Engineer, Cloudera
> // jon@cloudera.com
>

Re: Feature branch Was: Code review request for hbase-4120 table priority

Posted by Todd Lipcon <to...@cloudera.com>.
On Tue, Dec 13, 2011 at 1:01 PM, Ted Yu <yu...@gmail.com> wrote:
> I was thinking about using a feature branch as well.
> May I get clarification on the following issues first?
>
> 1. Would there ever be competing feature branches? E.g., there are two
> implementations for online schema update. Would there be two branches, one
> for each implementation?

IMO yes, it's possible. It's better if the two people or teams
implementing them can work together on the project, but if they have
substantially different approaches, it makes sense to pursue them on
separate branches.

> If the answer is yes, I want to know how we decide which one to pursue and
> which one to abandon.

I think this document is a good reference here:
http://incubator.apache.org/learn/rules-for-revolutionaries.html

The short answer is that whichever one gets to a committable state
first wins. It's up to the committers to vote to merge a branch, based
on its merits for inclusion.

> If the answer is no, we need to decide what criteria should be used in
> deciding whether to graduate or abandon the development on that branch.

Same criteria as for any patch, IMO:
- well tested, including unit tests
- coding style matches the rest of the project
- the feature fits with what we think HBase should be as a
project/architecture/etc

Of course the above are subjective, which is why we get to vote
instead of having some dictator decide :)

>
> 2. I suppose there would be some committers acting as sponsors for the
> branch. How often should they refresh the branch to keep it in sync with TRUNK?
>

It's up to them - they'll need it to be up to date with TRUNK before
they can merge. If it were me I'd probably merge every day or two
rather than put off the conflict resolution until the end. At least
that's how we're currently operating on the HDFS HA feature branch,
and how I operated on the HDFS-1073 branch as well.

> 3. If the development on that branch takes longer than the cycle between
> major releases, e.g. a branch is made off of 0.94 but by the time the feature
> is complete, the major release has moved beyond 0.96,
> what should we do?

Feature branches should always merge into trunk, whatever version
trunk happens to be at the time. So if it misses 0.94, then it goes to
0.96; if it misses that, it goes to whatever's out next.

-Todd




-- 
Todd Lipcon
Software Engineer, Cloudera