Posted to dev@hbase.apache.org by Ted Yu <yu...@gmail.com> on 2011/08/19 01:23:53 UTC

HBASE-1730 and HBASE-4213

Hi,
Due to lack of coordination, HBASE-1730 and HBASE-4213 try to implement the
same feature at roughly the same pace.

I want to hear your opinion on how we should plan to move forward with these
two JIRAs.
One possibility is to provide two policies, one accommodating each JIRA. But
that requires even more work.

It would be nice if we can have some performance numbers for both
implementations on comparable cluster(s).

Cheers

Re: HBASE-1730 and HBASE-4213

Posted by Andrew Purtell <ap...@apache.org>.

> Using zookeeper to record transient state is Andy's favorite choice.


Ted, thank you for the consideration!
 
Best regards,


- Andy


Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)


----- Original Message -----
> From: Ted Yu <yu...@gmail.com>
> To: dev@hbase.apache.org
> Cc: Subbu M Iyer <ms...@gmail.com>; nileema.shingte@gmail.com
> Sent: Thursday, August 18, 2011 5:53 PM
> Subject: Re: HBASE-1730 and HBASE-4213
> 
> I prefer choice A below.
> 
> Let's vote on which implementation is the better approach.
> 
> My vote is for 4213. Subbu implemented HBASE-451 and has a deep understanding
> of the related code.
> Using zookeeper to record transient state is Andy's favorite choice.
> 
> Cheers
> 
> On Thu, Aug 18, 2011 at 5:29 PM, Todd Lipcon <to...@cloudera.com> wrote:
> 
>>  In my opinion we have three options:
>> 
>>  (a) have the two contributors work together on a single JIRA
>>  (b) factor out what's common between their approaches into a new JIRA,
>>  then let them proceed independently
>>  or (c) let them proceed independently, and whichever one reaches a
>>  suitable committable state first, we go with
>> 
>>  If they both become committable around the same time, then we should
>>  go to benchmarks as well as comparisons of which codebase seems more
>>  maintainable.
>> 
>>  -Todd
>> 
>>  On Thu, Aug 18, 2011 at 4:23 PM, Ted Yu <yu...@gmail.com> wrote:
>>  > Hi,
>>  > Due to lack of coordination, HBASE-1730 and HBASE-4213 try to implement the
>>  > same feature at roughly the same pace.
>>  >
>>  > I want to hear your opinion on how we should plan to move forward with these
>>  > two JIRAs.
>>  > One possibility is to provide two policies, one accommodating each JIRA. But
>>  > that requires even more work.
>>  >
>>  > It would be nice if we can have some performance numbers for both
>>  > implementations on comparable cluster(s).
>>  >
>>  > Cheers
>>  >
>> 
>> 
>> 
>>  --
>>  Todd Lipcon
>>  Software Engineer, Cloudera
>> 
>

Re: HBASE-1730 and HBASE-4213

Posted by Andrew Purtell <ap...@apache.org>.

Nileema,


> PS: I am new to HBase and was working on this during my internship. I wasn't
> aware of other efforts towards this. Apologies if I created any
> inconvenience.


Your contributed efforts can only bring a benefit of some kind!
 
Best regards,


   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)


----- Original Message -----
> From: nileema shingte <ni...@gmail.com>
> To: Stack <st...@duboce.net>
> Cc: dev@hbase.apache.org; Subbu M Iyer <ms...@gmail.com>
> Sent: Thursday, August 18, 2011 9:03 PM
> Subject: Re: HBASE-1730 and HBASE-4213
> 
> Hi Guys,
> 
> Thank you for starting a thread to bring this to closure.
> 
> I have only read Subbu's description of the approach. I believe the main
> difference is that 4213 uses ZooKeeper to track whether all regions have
> the updated schema, and reopening the regions is the RegionServer's
> responsibility, whereas in 1730 this tracking happens in memory in the
> Master, and reopens are issued by the Master.
> 
> I have not read through the patch submitted on 4213, so am not sure if it
> allows you to configure the number of regions that reopen at a time. Apart
> from this I think that 1730 is a subset of 4213 and if 4213 agrees more with
> the HBase ideology, we should definitely use that as a patch.
> 
> In either case, I will respond to the comments on the Jira and upload
> detailed description for the same.
> 
> Thanks,
> Nileema
> 
> PS: I am new to HBase and was working on this during my internship. I wasn't
> aware of other efforts towards this. Apologies if I created any
> inconvenience.
> 
> 
> 
> On Thu, Aug 18, 2011 at 8:26 PM, Stack <st...@duboce.net> wrote:
> 
>>  On Fri, Aug 19, 2011 at 3:04 AM, Ted Yu <yu...@gmail.com> wrote:
>>  > Surely they do.
>>  > If they summarize their plan, it would be easier for others to make a decision.
>>  >
>>  >
>> 
>>  I'd tend to think it's their decision to make.  We are just here to
>>  cheer them on (smile).
>> 
>>  Subbu you going to come to the hackathon
>>  (http://www.meetup.com/hackathon/) or meetup on Monday
>>  (http://www.meetup.com/hbaseusergroup/events/28518471/)?  You Nileema?
>> 
>>  Good stuff,
>>  St.Ack
>> 
>

Re: HBASE-1730 and HBASE-4213

Posted by Subramanian Iyer <ms...@gmail.com>.

Hi Nileema,

Absolutely no need for apologies. Actually, I did not realize that you were also working on the same feature until I saw your patch when I was about to commit mine.

I created a new JIRA (4213) just so that I could bring an additional option to the table without polluting or hijacking your efforts on 1730.

Thanks a lot for your help on this.

Subbu

On Aug 18, 2011, at 9:03 PM, nileema shingte wrote:

> Hi Guys, 
> 
> Thank you for starting a thread to bring this to closure. 
> 
> I have only read Subbu's description of the approach. I believe the main difference is that 4213 uses ZooKeeper to track whether all regions have the updated schema, and reopening the regions is the RegionServer's responsibility, whereas in 1730 this tracking happens in memory in the Master, and reopens are issued by the Master. 
> 
> I have not read through the patch submitted on 4213, so am not sure if it allows you to configure the number of regions that reopen at a time. Apart from this I think that 1730 is a subset of 4213 and if 4213 agrees more with the HBase ideology, we should definitely use that as a patch. 
> 
> In either case, I will respond to the comments on the Jira and upload detailed description for the same. 
> 
> Thanks,
> Nileema 
> 
> PS: I am new to HBase and was working on this during my internship. I wasn't aware of other efforts towards this. Apologies if I created any inconvenience. 
> 
> 
> 
> On Thu, Aug 18, 2011 at 8:26 PM, Stack <st...@duboce.net> wrote:
> On Fri, Aug 19, 2011 at 3:04 AM, Ted Yu <yu...@gmail.com> wrote:
> > Surely they do.
> > If they summarize their plan, it would be easier for others to make a decision.
> >
> >
> 
> I'd tend to think it's their decision to make.  We are just here to
> cheer them on (smile).
> 
> Subbu you going to come to the hackathon
> (http://www.meetup.com/hackathon/) or meetup on Monday
> (http://www.meetup.com/hbaseusergroup/events/28518471/)?  You Nileema?
> 
> Good stuff,
> St.Ack
>

Re: HBASE-1730 and HBASE-4213

Posted by Ted Yu <yu...@gmail.com>.

Nileema:
I should apologize for trying to converge two very different designs at such a late stage.

I feel you have expressed the approach of 4213. Thanks for your understanding. 

There are definitely pearls in your patch, e.g. the Ruby script enhancements. If you and Subbu can work together, that would be great.

Feel free to provide constructive ideas so that this feature is implemented nicely. 

Cheers 



On Aug 18, 2011, at 9:03 PM, nileema shingte <ni...@gmail.com> wrote:

> Hi Guys,
> 
> Thank you for starting a thread to bring this to closure.
> 
> I have only read Subbu's description of the approach. I believe the main
> difference is that 4213 uses ZooKeeper to track whether all regions have
> the updated schema, and reopening the regions is the RegionServer's
> responsibility, whereas in 1730 this tracking happens in memory in the
> Master, and reopens are issued by the Master.
> 
> I have not read through the patch submitted on 4213, so am not sure if it
> allows you to configure the number of regions that reopen at a time. Apart
> from this I think that 1730 is a subset of 4213 and if 4213 agrees more with
> the HBase ideology, we should definitely use that as a patch.
> 
> In either case, I will respond to the comments on the Jira and upload
> detailed description for the same.
> 
> Thanks,
> Nileema
> 
> PS: I am new to HBase and was working on this during my internship. I wasn't
> aware of other efforts towards this. Apologies if I created any
> inconvenience.
> 
> 
> 
> On Thu, Aug 18, 2011 at 8:26 PM, Stack <st...@duboce.net> wrote:
> 
>> On Fri, Aug 19, 2011 at 3:04 AM, Ted Yu <yu...@gmail.com> wrote:
>>> Surely they do.
>>> If they summarize their plan, it would be easier for others to make a decision.
>>> 
>>> 
>> 
>> I'd tend to think it's their decision to make.  We are just here to
>> cheer them on (smile).
>> 
>> Subbu you going to come to the hackathon
>> (http://www.meetup.com/hackathon/) or meetup on Monday
>> (http://www.meetup.com/hbaseusergroup/events/28518471/)?  You Nileema?
>> 
>> Good stuff,
>> St.Ack
>>

Re: HBASE-1730 and HBASE-4213

Posted by nileema shingte <ni...@gmail.com>.

Hi Guys,

Thank you for starting a thread to bring this to closure.

I have only read Subbu's description of the approach. I believe the main
difference is that 4213 uses ZooKeeper to track whether all regions have
the updated schema, and reopening the regions is the RegionServer's
responsibility, whereas in 1730 this tracking happens in memory in the
Master, and reopens are issued by the Master.

I have not read through the patch submitted on 4213, so am not sure if it
allows you to configure the number of regions that reopen at a time. Apart
from this I think that 1730 is a subset of 4213 and if 4213 agrees more with
the HBase ideology, we should definitely use that as a patch.
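
As an illustration only (this is not code from either patch), the
Master-driven variant described above, with an in-memory record of regions
still waiting for the new schema and a configurable number of reopens per
batch, might be sketched roughly as follows. The function names, the
`reopen` callback, and the `batch_size` parameter are all hypothetical and
do not correspond to any HBase API.

```python
from typing import Callable, List, Sequence, Set


def split_into_batches(regions: Sequence[str], batch_size: int) -> List[List[str]]:
    """Split the region list into consecutive batches of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [list(regions[i:i + batch_size]) for i in range(0, len(regions), batch_size)]


def reopen_in_batches(
    regions: Sequence[str],
    reopen: Callable[[str], bool],
    batch_size: int,
) -> Set[str]:
    """Reopen regions batch by batch and return those that did not update.

    `reopen` is a stand-in (an assumption) for whatever call actually asks a
    region to reopen; it returns True when the region picked up the new schema.
    """
    # In-memory record of regions that have not yet picked up the new schema.
    not_updated: Set[str] = set(regions)
    for batch in split_into_batches(regions, batch_size):
        # Only the regions in the current batch are touched at one time.
        for region in batch:
            if reopen(region):
                not_updated.discard(region)
    return not_updated
```

A ZooKeeper-based design would keep the equivalent of `not_updated` outside
the Master's memory, which is the difference discussed in this thread.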

In either case, I will respond to the comments on the Jira and upload
detailed description for the same.

Thanks,
Nileema

PS: I am new to HBase and was working on this during my internship. I wasn't
aware of other efforts towards this. Apologies if I created any
inconvenience.

On Thu, Aug 18, 2011 at 8:26 PM, Stack <st...@duboce.net> wrote:

> On Fri, Aug 19, 2011 at 3:04 AM, Ted Yu <yu...@gmail.com> wrote:
> > Surely they do.
> > If they summarize their plan, it would be easier for others to make a decision.
> >
> >
>
> I'd tend to think it's their decision to make.  We are just here to
> cheer them on (smile).
>
> Subbu you going to come to the hackathon
> (http://www.meetup.com/hackathon/) or meetup on Monday
> (http://www.meetup.com/hbaseusergroup/events/28518471/)?  You Nileema?
>
> Good stuff,
> St.Ack
>

Re: HBASE-1730 and HBASE-4213

Posted by Stack <st...@duboce.net>.

On Fri, Aug 19, 2011 at 3:04 AM, Ted Yu <yu...@gmail.com> wrote:
> Surely they do.
> If they summarize their plan, it would be easier for others to make a decision.
>
>

I'd tend to think it's their decision to make.  We are just here to
cheer them on (smile).

Subbu you going to come to the hackathon
(http://www.meetup.com/hackathon/) or meetup on Monday
(http://www.meetup.com/hbaseusergroup/events/28518471/)?  You Nileema?

Good stuff,
St.Ack

Re: HBASE-1730 and HBASE-4213

Posted by Ted Yu <yu...@gmail.com>.

Surely they do. 
If they summarize their plan, it would be easier for others to make a decision. 

On Aug 18, 2011, at 7:59 PM, Stack <st...@duboce.net> wrote:

> On Fri, Aug 19, 2011 at 12:53 AM, Ted Yu <yu...@gmail.com> wrote:
>> Let's vote on which implementation is the better approach.
>> 
> 
> Don't the authors of the patches get a say?
> St.Ack

Re: HBASE-1730 and HBASE-4213

Posted by Stack <st...@duboce.net>.

On Fri, Aug 19, 2011 at 12:53 AM, Ted Yu <yu...@gmail.com> wrote:
> Let's vote on which implementation is the better approach.
>

Don't the authors of the patches get a say?
St.Ack

Re: HBASE-1730 and HBASE-4213

Posted by Ted Yu <yu...@gmail.com>.

I prefer choice A below.

Let's vote on which implementation is the better approach.

My vote is for 4213. Subbu implemented HBASE-451 and has a deep understanding
of the related code.
Using zookeeper to record transient state is Andy's favorite choice.

Cheers

On Thu, Aug 18, 2011 at 5:29 PM, Todd Lipcon <to...@cloudera.com> wrote:

> In my opinion we have three options:
>
> (a) have the two contributors work together on a single JIRA
> (b) factor out what's common between their approaches into a new JIRA,
> then let them proceed independently
> or (c) let them proceed independently, and whichever one reaches a
> suitable committable state first, we go with
>
> If they both become committable around the same time, then we should
> go to benchmarks as well as comparisons of which codebase seems more
> maintainable.
>
> -Todd
>
> On Thu, Aug 18, 2011 at 4:23 PM, Ted Yu <yu...@gmail.com> wrote:
> > Hi,
> > Due to lack of coordination, HBASE-1730 and HBASE-4213 try to implement the
> > same feature at roughly the same pace.
> >
> > I want to hear your opinion on how we should plan to move forward with these
> > two JIRAs.
> > One possibility is to provide two policies, one accommodating each JIRA. But
> > that requires even more work.
> >
> > It would be nice if we can have some performance numbers for both
> > implementations on comparable cluster(s).
> >
> > Cheers
> >
>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>

Re: HBASE-1730 and HBASE-4213

Posted by Todd Lipcon <to...@cloudera.com>.

In my opinion we have three options:

(a) have the two contributors work together on a single JIRA
(b) factor out what's common between their approaches into a new JIRA,
then let them proceed independently
or (c) let them proceed independently, and whichever one reaches a
suitable committable state first, we go with

If they both become committable around the same time, then we should
go to benchmarks as well as comparisons of which codebase seems more
maintainable.

-Todd

On Thu, Aug 18, 2011 at 4:23 PM, Ted Yu <yu...@gmail.com> wrote:
> Hi,
> Due to lack of coordination, HBASE-1730 and HBASE-4213 try to implement the
> same feature at roughly the same pace.
>
> I want to hear your opinion on how we should plan to move forward with these
> two JIRAs.
> One possibility is to provide two policies, one accommodating each JIRA. But
> that requires even more work.
>
> It would be nice if we can have some performance numbers for both
> implementations on comparable cluster(s).
>
> Cheers
>



-- 
Todd Lipcon
Software Engineer, Cloudera