Posted to dev@hbase.apache.org by Stack <st...@duboce.net> on 2010/08/31 09:43:44 UTC

Heads-up: big commit in next day or so; "HBASE-2692 Master rewrite and cleanup for 0.90"

I just posted the patch to https://review.cloudera.org/r/750/.  It's a
little on the large side (1.5MB; sorry about that).

The bulk of the patch is by Karthik Ranganathan and Jon Gray.  They've
been working on it in the 0.90_master_rewrite branch for a good few
months now.  It's been reviewed pretty extensively, multiple times, but
it's too big for any one individual to review in anything but a cursory
manner in its current form (again, sorry about that).  Piece-mealing
the changes into the code base was tried, but getting all of the
stepped changes in was going to take eons to complete, and when we
tried it, it wasn't working well anyway -- reviewers had a hard time
getting their heads around partial feature implementations, and the
groundwork was baffling when the superstructure wasn't coming till a
later stage.

This patch addresses head on issues that have plagued us for what
seems like ages now -- troublesome assignment of regions in
particular.  In spite of its size and lack of review, unless there is
objection, I'm going to go ahead and commit this patch tomorrow or the
day after, after all tests pass.  We could let this monster stew out
on the branch for another couple of weeks or a month, but IMO it's
mature enough to be added to TRUNK so we can all work on the
stabilization that will get us to a 0.90.0 Release Candidate.

See the umbrella issue for all that's addressed -- about 11 or 12
issues in all, a few of them blockers -- but here is a synopsis of
what the patch includes:

+ The region-in-transition data structure is now kept out in zookeeper
to facilitate master failover and to do away with race conditions that
used to result in double assignment of regions.
+ Open and close of regions, as well as server shutdown handling and
table transitions, are now done in Executors; configuration says how
much parallelism to run with.  Default is 3 openers, 3 closers, with
designated handlers for meta and root opening, etc. (We used to be
single-threaded in master and regionserver doing opens/closes, etc.)
+ New load balancer; features include figuring out the plan on startup
and then assigning out all regions in the one assignment.  A new method
in the admin tool allows you to unload a region from one server and
assign it to another explicit server.
+ Most of what used to pass over the heartbeat mechanism now goes via
zk, or the master directly invokes an rpc to close/open regions rather
than wait on a heartbeat to come around.
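
As a rough illustration of the first item: keeping region-in-transition
state in a store that supports versioned, conditional updates means only
one party can win a given transition, which is what rules out double
assignment.  The sketch below is Python rather than HBase's Java and is
not taken from the patch.  FakeCoordinator is an in-memory stand-in, not
a ZooKeeper client, and the /unassigned path and the OFFLINE/OPENING
state names are assumptions made up for the example.

```python
import threading


class BadVersion(Exception):
    """Raised when a conditional update loses a race."""


class NodeExists(Exception):
    """Raised when a node is created twice."""


class FakeCoordinator:
    """In-memory stand-in for a store with versioned nodes (hypothetical)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._nodes = {}  # path -> (data, version)

    def create(self, path, data):
        with self._lock:
            if path in self._nodes:
                raise NodeExists(path)
            self._nodes[path] = (data, 0)

    def get(self, path):
        with self._lock:
            return self._nodes[path]

    def set(self, path, data, expected_version):
        # The write succeeds only if nobody changed the node since we read it.
        with self._lock:
            _, version = self._nodes[path]
            if version != expected_version:
                raise BadVersion(path)
            self._nodes[path] = (data, version + 1)


def try_claim(store, region, server):
    """Move a region from OFFLINE to OPENING; True only for the winner."""
    path = f"/unassigned/{region}"  # hypothetical layout
    state, version = store.get(path)
    if state != "OFFLINE":
        return False
    try:
        store.set(path, f"OPENING:{server}", version)
        return True
    except BadVersion:
        # Someone else transitioned the region between our read and write.
        return False


def demo():
    store = FakeCoordinator()
    store.create("/unassigned/region-a", "OFFLINE")
    first = try_claim(store, "region-a", "rs1")
    second = try_claim(store, "region-a", "rs2")
    return first, second, store.get("/unassigned/region-a")
```

In demo() the second claimant sees the region is no longer OFFLINE and
backs off; a claimant holding a stale version would be rejected by the
conditional set instead.  Because the state lives outside the master, a
newly elected master could read it back after a failover.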

There is more, including a bunch of cleanup and refactorings that in
particular facilitate testing, and this patch lays the groundwork for
new features coming down the pipeline (the same executor/handler
mechanism will get us parallel flushing, splitting and compacting).

Things will look different after this patch goes in: there are lots of
zk transitions in the logs now, and this patch is going to drum up new
kinds of bugs, but after a week of gung-ho bug bashing we should have
ourselves a more robust hbase.

St.Ack

Re: Heads-up: big commit in next day or so; "HBASE-2692 Master rewrite and cleanup for 0.90"

Posted by Jean-Daniel Cryans <jd...@apache.org>.
+1 Better now than when it's going to be an ever greater pain to merge.

J-D

Re: coprocessors (was Re: Heads-up: big commit in next day or so; "HBASE-2692 Master rewrite and cleanup for 0.90")

Posted by Andrew Purtell <ap...@apache.org>.
> From: Bradford Stephens
> P.S. Very interested in coprocessors ;)

We (Trend Micro) are working on coprocessors now, a piece of it anyway. 

This big master rewrite patch that went in tonight will impact that some... digging in tomorrow...

We still anticipate putting up an initial cut of RBAC implemented on top of a minimal coprocessor framework for review by end of this month. 

From there we have big plans. I hear others do as well. As I understand it, we will be getting together the week of 10/18 or 10/25 over at FB about coprocessors.

  - Andy

Re: Heads-up: big commit in next day or so; "HBASE-2692 Master rewrite and cleanup for 0.90"

Posted by Bradford Stephens <br...@gmail.com>.
Very cool work, folks. I'll try to spin up a cluster on EC2 with some
customer data and see if it stays alive :)

P.S. Very interested in coprocessors ;)

-B


Re: Heads-up: big commit in next day or so; "HBASE-2692 Master rewrite and cleanup for 0.90"

Posted by Stack <st...@duboce.net>.
More performance:

+ Splits should run faster now that the daughters are brought up
immediately on the parent's hosting server, rather than later after a
message to the master and after the master assigns the daughter regions
out for opening (the load balancer will rebalance off the parent's host
later if needed).
+ Smarter load balancer that is parsimonious about when it moves
regions.

The above should make it so regions are offline for shorter periods of
time.
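
The thread does not spell out how the new balancer decides what to move,
so the following is only one way to read "parsimonious": work out a
target region count per server and move just the surplus, leaving
servers that are already within one region of the average alone.  It is
a Python sketch under that assumption, not the algorithm in the patch,
and plan_moves and the server/region names are invented for the example.

```python
def plan_moves(load):
    """Return (region, source, target) moves that even out region counts.

    `load` maps a server name to the list of regions it hosts.  Only the
    regions a server holds above its target count are moved.
    """
    if not load:
        return []

    # Most loaded servers first; ties broken by name so the plan is stable.
    servers = sorted(load, key=lambda s: (-len(load[s]), s))
    total = sum(len(load[s]) for s in servers)
    base, extra = divmod(total, len(servers))

    # The `extra` most loaded servers may keep one region more than the rest.
    target = {s: base + (1 if i < extra else 0) for i, s in enumerate(servers)}

    # Collect every region that sits above its server's target.
    surplus = []
    for s in servers:
        for region in load[s][target[s]:]:
            surplus.append((region, s))

    # Hand surplus regions to the servers that are below target.
    moves = []
    for s in reversed(servers):  # least loaded first
        need = target[s] - len(load[s])
        for _ in range(max(need, 0)):
            region, source = surplus.pop()
            moves.append((region, source, s))
    return moves
```

For a cluster such as {"rs1": [a, b, c, d], "rs2": [e], "rs3": [f]} this
plans two moves off rs1 and nothing else; for a cluster that is already
within one region of even, it plans no moves at all.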

St.Ack



RE: Heads-up: big commit in next day or so; "HBASE-2692 Master rewrite and cleanup for 0.90"

Posted by Jonathan Gray <jg...@facebook.com>.
@Ted, what Ryan said.  Please don't keep asking for performance numbers after each change.  We are spending effort writing code and testing for correctness.  If numbers are available, we will not hide them.  Otherwise, it would be awesome if you wanted to lead an effort to do ongoing performance tests.

As far as what could have a performance impact...

- Cluster startup can be drastically faster, and retaining data locality across restarts will be fairly trivial to add after this goes in
- Enable/disable should be significantly faster
- Region assignment should be faster than current trunk, though the addition of ZK does add latency compared with an RPC-only design
- Multi-threading and priority abstractions added.  Already done for things like open/close, next up is flush/split/compact
- Removal of BaseScanner means we do not ever need to wait for a meta scan to trigger any master operations
- Removal of heartbeat piggybacking means we do not ever need to wait for a heartbeat to send an RS a message
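
The multi-threading item can be pictured as one bounded pool per kind of work, so that a backlog of one kind cannot starve another.  A minimal Python sketch follows (not HBase's Java, and none of these names are HBase APIs).  The pool sizes of 3 for opens and closes follow the defaults given in the original announcement; the single-threaded meta and root pools are an assumption.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical pool sizes: 3/3 mirrors the announced defaults,
# the dedicated single-thread meta/root pools are assumed.
POOL_SIZES = {
    "open_region": 3,
    "close_region": 3,
    "open_meta": 1,
    "open_root": 1,
}


class HandlerPools:
    """Routes each event type to its own bounded thread pool."""

    def __init__(self, pool_sizes=None):
        sizes = POOL_SIZES if pool_sizes is None else pool_sizes
        self._pools = {
            name: ThreadPoolExecutor(max_workers=size, thread_name_prefix=name)
            for name, size in sizes.items()
        }

    def submit(self, event_type, handler, *args):
        # Slow user-region opens never delay meta/root opens,
        # because the two kinds of work do not share threads.
        return self._pools[event_type].submit(handler, *args)

    def shutdown(self):
        for pool in self._pools.values():
            pool.shutdown(wait=True)


def open_region(region_name):
    # Stand-in for the real work of opening a region.
    return f"opened {region_name}"


def demo():
    pools = HandlerPools()
    try:
        futures = [
            pools.submit("open_region", open_region, f"r{i}") for i in range(5)
        ]
        futures.append(pools.submit("open_meta", open_region, ".META."))
        # Results are collected in submission order, not completion order.
        return [f.result() for f in futures]
    finally:
        pools.shutdown()
```

Adding flush, split or compact handling under this scheme would amount to adding another named pool and its handler.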

Other stuff like admin functions going straight to RS will open up the ability for us to make things that are only async today work in either a sync or async fashion.

Lastly, we are moving away from things like RetryableMetaOperations, which use a combination of maxRetries and delay when META is not available.  The wait is now bounded strictly by a single maxTimeout.
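
To make the difference concrete: with maxRetries and delay the total wait depends on how long each attempt itself takes, whereas a deadline gives one hard upper bound.  Below is a minimal Python sketch of the deadline style.  The function name wait_until and its parameters are hypothetical, not the names used in the patch, and the clock and sleep arguments exist only so the loop can be exercised without real waiting.

```python
import time


def wait_until(condition, max_timeout, poll_interval=0.1,
               clock=time.monotonic, sleep=time.sleep):
    """Poll `condition` until it is true or `max_timeout` seconds pass.

    Returns True if the condition became true in time, False otherwise.
    """
    deadline = clock() + max_timeout
    while True:
        if condition():
            return True
        remaining = deadline - clock()
        if remaining <= 0:
            return False
        # Never sleep past the deadline, so the bound stays strict.
        sleep(min(poll_interval, remaining))
```

A caller would pass something like a "is META available yet" check as the condition; however slow each check turns out to be, the loop gives up once the deadline has passed.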

JG



Re: Heads-up: big commit in next day or so; "HBASE-2692 Master rewrite and cleanup for 0.90"

Posted by Ryan Rawson <ry...@gmail.com>.
There might not be a straight-line performance gain, but there are
features to be enabled, and some things that are sped up, like region
assignment, don't show up in many standard performance tests (e.g. YCSB).

If you are serious, maybe you could help by running performance tests?
Running performance tests is not an easy thing, and can occupy a
senior engineer for an entire day running a series of tests just to
produce one spreadsheet.  In reality it's performance #s vs working
code.  I think you know the one we pick.

-ryan

On Tue, Aug 31, 2010 at 2:29 PM, Ted Yu <yu...@gmail.com> wrote:
> Jonathan:
> Can you publish performance metric (compared with current trunk) from
> cluster running the new master ?
>
> Thanks
>

Re: Heads-up: big commit in next day or so; "HBASE-2692 Master rewrite and cleanup for 0.90"

Posted by Ted Yu <yu...@gmail.com>.
Jonathan:
Can you publish performance metrics (compared with current trunk) from
a cluster running the new master?

Thanks

RE: Heads-up: big commit in next day or so; "HBASE-2692 Master rewrite and cleanup for 0.90"

Posted by Jonathan Gray <jg...@facebook.com>.
Though I'm sure my vote is clear, I'm +1 on this.

The plan at fb is to update our internal branch to (almost) the current head of trunk, before the commit of the master branch.  Ongoing testing will continue on this branch.

In parallel, testing will also begin here on the new master following the mega commit.

Hopefully we can transition everything to the new master sooner rather than later instead of splitting time.  I'd say shortly after initial testing is complete we should push for a new master 0.89 or 0.90RC and ask users to test as much as possible.


I did as much as possible to try and get reviews along the way, including several very early design discussions and group code review sessions, but this is a pretty radical change, so it has not been easy.  If you're familiar with the old BaseScanner, RegionManager, ZooKeeperWrapper, etc., this stuff has been cut and the replacements are much shorter/simpler.

Just need to find all the bugs and fill in the oversights :)

Stack, thanks for carrying this thing over the finish line.

JG

Re: Heads-up: big commit in next day or so; "HBASE-2692 Master rewrite and cleanup for 0.90"

Posted by Andrew Purtell <ap...@apache.org>.
> [...] unless objection, I'm going to go ahead and commit this patch
> tomorrow or the day after, after all tests pass.  We could let this
> monster stew out on the branch for another couple of weeks or a
> month but IMO, it's mature enough to be added to TRUNK so we can all
> work on the stabilization that will get us to 0.90.0 Release
> Candidate.

+1

This will also help us carry coprocessors over the line: we won't have to deal with a bunch of small changes, only one big set.

   - Andy