Posted to dev@hbase.apache.org by Andrew Purtell <ap...@apache.org> on 2009/08/01 00:02:02 UTC

Re: ANN: hbase 0.20.0 Release Candidate 1 available for download

My understanding is the region server is supposed to restart and check in
with the master as if newly launched. I could be wrong. I was away for a 
while. At least following the log messages this appears to be the intent. 

   - Andy




________________________________
From: Ryan Rawson <ry...@gmail.com>
To: hbase-dev@hadoop.apache.org
Sent: Friday, July 31, 2009 2:11:36 PM
Subject: Re: ANN: hbase 0.20.0 Release Candidate 1 available for download

It is not supposed to restart, you will need to use supervisor to
achieve that.

If the ZK session is timed out, then the RS has no idea if the master
has reassigned regions or not.  The RS then FATALS, the master
recovers the log, and all will be well(ish).

The zookeeper daemons also need supervising, since they might FATAL
but can be restarted to continue on later.

On Fri, Jul 31, 2009 at 2:07 PM, Andrew Purtell<ap...@apache.org> wrote:
> -1
>
> Region server did not restart after ZK timeout. Entered a high stress period
> while compacting under heavy write load, high RAM commitment, and
> concurrency.
>
> This is a stress test and I need to tune down vm.swappiness some more, but
> the region server shut down and did not restart.
>
> See attached.
>
>    - Andy
>
>
> ________________________________
> From: stack <st...@duboce.net>
> To: hbase-dev@hadoop.apache.org; hbase-user@hadoop.apache.org
> Sent: Wednesday, July 29, 2009 5:31:31 PM
> Subject: ANN: hbase 0.20.0 Release Candidate 1 available for download
>
> The first hbase 0.20.0 release candidate is available for download:
>
> http://people.apache.org/~stack/hbase-0.20.0-candidate-1/<http://people.apache.org/%7Estack/hbase-0.19.0-candidate-1/>
>
> More than 400 issues have been addressed.  The release notes are available
> here: http://su.pr/18zcEO <http://tinyurl.com/8xmyx9>.
>
> HBase 0.20.0 runs on Hadoop 0.20.0.  A lot has changed since 0.19.x including
> configuration fundamentals.  Be sure to read the 'Getting Started'
> documentation available here:
> http://su.pr/211OYP.<http://people.apache.org/%7Estack/hbase-0.19.0-candidate-1/>
>
> If you wish to bring your 0.19.x hbase data forward to 0.20.0, you will need
> to run a migration.  See http://wiki.apache.org/hadoop/Hbase/HowToMigrate.
> First read the overview and then go to the section, 'From 0.19.x to 0.20.x'.
>
> Should we release this candidate as hbase 0.20.0?  Please vote +1/-1 by
> Monday August 3rd.
>
> Yours,
> The HBasistas
>
> P.S. 0.20.0 Highlights include:
>
> + Much improved performance
> + Master is no longer SPOF
> + Rolling restarts -- no need to take down whole cluster updating config. or
> making minor upgrades
> + A new, more comprehensive API (The old API is still present but
> deprecated)
> + Improved mapreduce connectors
> + New contrib package with updated Transactional HBase (THBase) and Indexed
> HBase (ITHBase) as well as a new REST gateway called stargate
> + And, as they say on the radio, "much, much more".
>
>



      

Re: ANN: hbase 0.20.0 Release Candidate 1 available for download

Posted by Jean-Daniel Cryans <jd...@apache.org>.
Andrew is right that a RS is supposed to restart, but that doesn't always
work. The real problem, as it is most of the time, is GC pauses;
otherwise there would be no ZK timeouts.

J-D
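
One rough way to see the kind of pause described above is to sleep for a
short fixed interval and measure how late the wake-up is. The sketch below
is Python, not JVM code, and the names (detect_pauses, interval_s,
threshold_s) are made up for illustration; none of it is HBase or ZooKeeper
API. It only shows the idea: a sleep that overruns by several seconds means
the whole process was stalled (GC, swapping, CPU starvation), and a stall
close to the ZK session timeout puts the session at risk.

```python
import time
from typing import Callable, List


def detect_pauses(
    interval_s: float,
    threshold_s: float,
    iterations: int,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> List[float]:
    """Return the overrun, in seconds, of every sleep that woke up late.

    interval_s  -- how long each sleep is asked to last
    threshold_s -- overruns at or above this are reported as pauses
    iterations  -- number of sleeps to perform
    sleep/clock -- injectable so the logic can be exercised without waiting
    """
    pauses = []
    for _ in range(iterations):
        start = clock()
        sleep(interval_s)
        # Time actually spent minus time requested is the stall.
        overrun = (clock() - start) - interval_s
        if overrun >= threshold_s:
            pauses.append(overrun)
    return pauses
```

Run next to a stress test with a threshold of a few seconds, the reported
overruns can be compared against whatever session timeout the cluster is
configured with.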

On Fri, Jul 31, 2009 at 6:05 PM, Andrew Purtell<ap...@apache.org> wrote:
> Ok, then I vote +1 on the RC, but with the caveat that "restarting" should
> be stricken from HRS logging.
>
>   - Andy
>
>
>
>
> ________________________________
> From: Ryan Rawson <ry...@gmail.com>
> To: hbase-dev@hadoop.apache.org
> Sent: Friday, July 31, 2009 3:03:32 PM
> Subject: Re: ANN: hbase 0.20.0 Release Candidate 1 available for download
>
> The JVM must terminate because it is difficult to reset the state of a
> HRS.  Thus supervision is necessary.
>
> On Fri, Jul 31, 2009 at 3:02 PM, Andrew Purtell<ap...@apache.org> wrote:
>> My understanding is the region server is supposed to restart and check in
>> with the master as if newly launched. I could be wrong. I was away for a
>> while. At least following the log messages this appears to be the intent.
>>
>>   - Andy
>>
>>
>>
>>
>> ________________________________
>> From: Ryan Rawson <ry...@gmail.com>
>> To: hbase-dev@hadoop.apache.org
>> Sent: Friday, July 31, 2009 2:11:36 PM
>> Subject: Re: ANN: hbase 0.20.0 Release Candidate 1 available for download
>>
>> It is not supposed to restart, you will need to use supervisor to
>> achieve that.
>>
>> If the ZK session is timed out, then the RS has no idea if the master
>> has reassigned regions or not.  The RS then FATALS, the master
>> recovers the log, and all will be well(ish).
>>
>> The zookeeper daemons also need supervising, since they might FATAL
>> but can be restarted to continue on later.
>>
>> On Fri, Jul 31, 2009 at 2:07 PM, Andrew Purtell<ap...@apache.org> wrote:
>>> -1
>>>
>>> Region server did not restart after ZK timeout. Entered a high stress period
>>> while compacting under heavy write load, high RAM commitment, and
>>> concurrency.
>>>
>>> This is a stress test and I need to tune down vm.swappiness some more, but
>>> the region server shut down and did not restart.
>>>
>>> See attached.
>>>
>>>    - Andy

Re: ANN: hbase 0.20.0 Release Candidate 1 available for download

Posted by Andrew Purtell <ap...@apache.org>.
Ok, then I vote +1 on the RC, but with the caveat that "restarting" should
be stricken from HRS logging. 

   - Andy




________________________________
From: Ryan Rawson <ry...@gmail.com>
To: hbase-dev@hadoop.apache.org
Sent: Friday, July 31, 2009 3:03:32 PM
Subject: Re: ANN: hbase 0.20.0 Release Candidate 1 available for download

The JVM must terminate because it is difficult to reset the state of a
HRS.  Thus supervision is necessary.

On Fri, Jul 31, 2009 at 3:02 PM, Andrew Purtell<ap...@apache.org> wrote:
> My understanding is the region server is supposed to restart and check in
> with the master as if newly launched. I could be wrong. I was away for a
> while. At least following the log messages this appears to be the intent.
>
>   - Andy
>
>
>
>
> ________________________________
> From: Ryan Rawson <ry...@gmail.com>
> To: hbase-dev@hadoop.apache.org
> Sent: Friday, July 31, 2009 2:11:36 PM
> Subject: Re: ANN: hbase 0.20.0 Release Candidate 1 available for download
>
> It is not supposed to restart, you will need to use supervisor to
> achieve that.
>
> If the ZK session is timed out, then the RS has no idea if the master
> has reassigned regions or not.  The RS then FATALS, the master
> recovers the log, and all will be well(ish).
>
> The zookeeper daemons also need supervising, since they might FATAL
> but can be restarted to continue on later.
>
> On Fri, Jul 31, 2009 at 2:07 PM, Andrew Purtell<ap...@apache.org> wrote:
>> -1
>>
>> Region server did not restart after ZK timeout. Entered a high stress period
>> while compacting under heavy write load, high RAM commitment, and
>> concurrency.
>>
>> This is a stress test and I need to tune down vm.swappiness some more, but
>> the region server shut down and did not restart.
>>
>> See attached.
>>
>>    - Andy



      

Re: ANN: hbase 0.20.0 Release Candidate 1 available for download

Posted by Ryan Rawson <ry...@gmail.com>.
The JVM must terminate because it is difficult to reset the state of a
HRS.  Thus supervision is necessary.
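
As an illustration of what supervision means here, the following is a
minimal sketch, assuming the supervised process runs in the foreground and
exits non-zero when it aborts. It is written in Python for brevity; the
function name supervise and its parameters are hypothetical, and the
command used to launch a region server is deliberately left as a
placeholder.

```python
import subprocess
import time
from typing import List, Sequence


def supervise(
    cmd: Sequence[str],
    max_restarts: int = 5,
    backoff_s: float = 1.0,
) -> List[int]:
    """Run cmd, relaunching it each time it exits non-zero.

    Returns the exit codes seen, in order. Stops after a clean exit
    (code 0, assumed here to mean an intentional shutdown) or once
    max_restarts relaunches have been used up.
    """
    exit_codes = []
    restarts = 0
    while True:
        # Blocks until the child process terminates.
        code = subprocess.call(list(cmd))
        exit_codes.append(code)
        if code == 0 or restarts >= max_restarts:
            break
        restarts += 1
        # Brief delay so a crash loop does not spin the machine.
        time.sleep(backoff_s)
    return exit_codes


# Hypothetical usage; the path is a placeholder, not a real HBase script:
# supervise(["/path/to/foreground-regionserver-launcher"])
```

Each relaunch is a fresh JVM, so no stale in-memory state carries over. In
practice an existing supervisor such as daemontools or runit is the more
usual choice than a hand-rolled loop.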

On Fri, Jul 31, 2009 at 3:02 PM, Andrew Purtell<ap...@apache.org> wrote:
> My understanding is the region server is supposed to restart and check in
> with the master as if newly launched. I could be wrong. I was away for a
> while. At least following the log messages this appears to be the intent.
>
>   - Andy
>
>
>
>
> ________________________________
> From: Ryan Rawson <ry...@gmail.com>
> To: hbase-dev@hadoop.apache.org
> Sent: Friday, July 31, 2009 2:11:36 PM
> Subject: Re: ANN: hbase 0.20.0 Release Candidate 1 available for download
>
> It is not supposed to restart, you will need to use supervisor to
> achieve that.
>
> If the ZK session is timed out, then the RS has no idea if the master
> has reassigned regions or not.  The RS then FATALS, the master
> recovers the log, and all will be well(ish).
>
> The zookeeper daemons also need supervising, since they might FATAL
> but can be restarted to continue on later.
>
> On Fri, Jul 31, 2009 at 2:07 PM, Andrew Purtell<ap...@apache.org> wrote:
>> -1
>>
>> Region server did not restart after ZK timeout. Entered a high stress period
>> while compacting under heavy write load, high RAM commitment, and
>> concurrency.
>>
>> This is a stress test and I need to tune down vm.swappiness some more, but
>> the region server shut down and did not restart.
>>
>> See attached.
>>
>>    - Andy

Re: ANN: hbase 0.20.0 Release Candidate 1 available for download

Posted by stack <st...@duboce.net>.
To be clear, release candidate 1 has been sunk.  We'll post a new candidate
after we chat at this week's meetup (
http://www.meetup.com/hbaseusergroup/calendar/10950511/).  It's looking like
there are a few more bug fixes that should go in, in particular HBASE-1738
and at least some kind of salve for HBASE-1750 and HBASE-1736.  Meantime,
please keep turning up the issues.

Thanks for your patience,
The HBase Team

Re: ANN: hbase 0.20.0 Release Candidate 1 available for download

Posted by stack <st...@duboce.net>.
Argument below for a new RC seems good to me.  Let me put up a new RC, one
that disables RS restart and that fixes the documentation issue found by Lei
Wang.
St.Ack


On Sat, Aug 1, 2009 at 12:02 PM, Andrew Purtell <ap...@apache.org> wrote:

> I don't have the log any more. I would have kept it if it was revealing.
> There were messages about ZK session expiration, followed by a message
> indicating the RS will restart, followed by warnings out of IPC as
> clients were querying but the server was shutting down, followed by
> thread stopping/terminated messages, and then nothing. I didn't see any
> ERRORs related to potential problems with restarting... it just didn't
> happen.
>
> I did switch my vote to +1 for the RC because this is not a situation that
> leads to data loss -- the master splits the log and reassigns regions
> as expected -- but the messages about restarting in the RS log are
> misleading if restart doesn't happen. Sounds like I'm not the only one
> having this trouble. Could come up over and over on hbase-user@. On
> 1732 I suggest putting in the associated patch and making abort the
> default behavior instead of restart until this can be sorted out. That
> would require rolling a new RC. I haven't changed my vote but I do
> recommend that to avoid confusion. At the very least, and ideally in
> addition to that, we should put something up on the troubleshooting page
> of the wiki.
>
>   - Andy
>
>
>
>
> ________________________________
> From: stack <st...@duboce.net>
> To: hbase-dev@hadoop.apache.org
> Sent: Saturday, August 1, 2009 9:26:56 AM
> Subject: Re: ANN: hbase 0.20.0 Release Candidate 1 available for download
>
> Yes, it's supposed to restart itself and check in with the master as though
> it were a new server (as J-D notes).
>
> Do you have the log from the incident?  Open an issue if it's broken?
>
> I see you flipped your vote from -1 to +1.  I think we should fix this
> failed restart but in the scheme of things, IMO, I don't think it is a
> showstopper sufficient to sink the RC.  We can make a 0.20.1 to follow
> close on 0.20.0 with fixes for the likes of this and for the documentation
> issue noted by Lei Wang up on the list.
>
> St.Ack
>
>
>
> On Fri, Jul 31, 2009 at 3:02 PM, Andrew Purtell <ap...@apache.org>
> wrote:
>
> > My understanding is the region server is supposed to restart and check in
> > with the master as if newly launched. I could be wrong. I was away for a
> > while. At least following the log messages this appears to be the intent.
> >
> >   - Andy
> >
> >
> >
> >
> > ________________________________
> > From: Ryan Rawson <ry...@gmail.com>
> > To: hbase-dev@hadoop.apache.org
> > Sent: Friday, July 31, 2009 2:11:36 PM
> > Subject: Re: ANN: hbase 0.20.0 Release Candidate 1 available for download
> >
> > It is not supposed to restart, you will need to use supervisor to
> > achieve that.
> >
> > If the ZK session is timed out, then the RS has no idea if the master
> > has reassigned regions or not.  The RS then FATALS, the master
> > recovers the log, and all will be well(ish).
> >
> > The zookeeper daemons also need supervising, since they might FATAL
> > but can be restarted to continue on later.
> >
> > On Fri, Jul 31, 2009 at 2:07 PM, Andrew Purtell<ap...@apache.org>
> > wrote:
> > > -1
> > >
> > > Region server did not restart after ZK timeout. Entered a high stress
> > period
> > > while compacting under heavy write load, high RAM commitment, and
> > > concurrency.
> > >
> > > This is a stress test and I need to tune down vm.swappiness some more,
> > but
> > > the region server shut down and did not restart.
> > >
> > > See attached.
> > >
> > >    - Andy

Re: ANN: hbase 0.20.0 Release Candidate 1 available for download

Posted by Andrew Purtell <ap...@apache.org>.
I don't have the log any more. I would have kept it if it was revealing.
There were messages about ZK session expiration, followed by a message
indicating the RS will restart, followed by warnings out of IPC as 
clients were querying but the server was shutting down, followed by 
thread stopping/terminated messages, and then nothing. I didn't see any
ERRORs related to potential problems with restarting... it just didn't
happen.

I did switch my vote to +1 for the RC because this is not a situation that
leads to data loss -- the master splits the log and reassigns regions
as expected -- but the messages about restarting in the RS log are
misleading if restart doesn't happen. Sounds like I'm not the only one
having this trouble. Could come up over and over on hbase-user@. On
1732 I suggest putting in the associated patch and making abort the
default behavior instead of restart until this can be sorted out. That
would require rolling a new RC. I haven't changed my vote but I do
recommend that to avoid confusion. At the very least, and ideally in
addition to that, we should put something up on the troubleshooting page
of the wiki.

   - Andy




________________________________
From: stack <st...@duboce.net>
To: hbase-dev@hadoop.apache.org
Sent: Saturday, August 1, 2009 9:26:56 AM
Subject: Re: ANN: hbase 0.20.0 Release Candidate 1 available for download

Yes, it's supposed to restart itself and check in with the master as though
it were a new server (as J-D notes).

Do you have the log from the incident?  Open an issue if it's broken?

I see you flipped your vote from -1 to +1.  I think we should fix this
failed restart but in the scheme of things, IMO, I don't think it is a
showstopper sufficient to sink the RC.  We can make a 0.20.1 to follow close
on 0.20.0 with fixes for the likes of this and for the documentation issue
noted by Lei Wang up on the list.

St.Ack



On Fri, Jul 31, 2009 at 3:02 PM, Andrew Purtell <ap...@apache.org> wrote:

> My understanding is the region server is supposed to restart and check in
> with the master as if newly launched. I could be wrong. I was away for a
> while. At least following the log messages this appears to be the intent.
>
>   - Andy
>
>
>
>
> ________________________________
> From: Ryan Rawson <ry...@gmail.com>
> To: hbase-dev@hadoop.apache.org
> Sent: Friday, July 31, 2009 2:11:36 PM
> Subject: Re: ANN: hbase 0.20.0 Release Candidate 1 available for download
>
> It is not supposed to restart, you will need to use supervisor to
> achieve that.
>
> If the ZK session is timed out, then the RS has no idea if the master
> has reassigned regions or not.  The RS then FATALS, the master
> recovers the log, and all will be well(ish).
>
> The zookeeper daemons also need supervising, since they might FATAL
> but can be restarted to continue on later.
>
> On Fri, Jul 31, 2009 at 2:07 PM, Andrew Purtell<ap...@apache.org>
> wrote:
> > -1
> >
> > Region server did not restart after ZK timeout. Entered a high stress
> period
> > while compacting under heavy write load, high RAM commitment, and
> > concurrency.
> >
> > This is a stress test and I need to tune down vm.swappiness some more,
> but
> > the region server shut down and did not restart.
> >
> > See attached.
> >
> >    - Andy



      

Re: ANN: hbase 0.20.0 Release Candidate 1 available for download

Posted by Jean-Daniel Cryans <jd...@apache.org>.
There are many small issues with the current RC, but I agree with
Stack that we should get it out ASAP even if it's followed by a minor
rev. It is stable and very very usable. Also peeps are usually aware
that a .0 is always a bit rough around the edges.

So +1

J-D

On Sat, Aug 1, 2009 at 12:26 PM, stack<st...@duboce.net> wrote:
> Yes, it's supposed to restart itself and check in with the master as though
> it were a new server (as J-D notes).
>
> Do you have the log from the incident?  Open an issue if it's broken?
>
> I see you flipped your vote from -1 to +1.  I think we should fix this
> failed restart but in the scheme of things, IMO, I don't think it is a
> showstopper sufficient to sink the RC.  We can make a 0.20.1 to follow close
> on 0.20.0 with fixes for the likes of this and for the documentation issue
> noted by Lei Wang up on the list.
>
> St.Ack
>
>
>
> On Fri, Jul 31, 2009 at 3:02 PM, Andrew Purtell <ap...@apache.org> wrote:
>
>> My understanding is the region server is supposed to restart and check in
>> with the master as if newly launched. I could be wrong. I was away for a
>> while. At least following the log messages this appears to be the intent.
>>
>>   - Andy
>>
>>
>>
>>
>> ________________________________
>> From: Ryan Rawson <ry...@gmail.com>
>> To: hbase-dev@hadoop.apache.org
>> Sent: Friday, July 31, 2009 2:11:36 PM
>> Subject: Re: ANN: hbase 0.20.0 Release Candidate 1 available for download
>>
>> It is not supposed to restart, you will need to use supervisor to
>> achieve that.
>>
>> If the ZK session is timed out, then the RS has no idea if the master
>> has reassigned regions or not.  The RS then FATALS, the master
>> recovers the log, and all will be well(ish).
>>
>> The zookeeper daemons also need supervising, since they might FATAL
>> but can be restarted to continue on later.
>>
>> On Fri, Jul 31, 2009 at 2:07 PM, Andrew Purtell<ap...@apache.org>
>> wrote:
>> > -1
>> >
>> > Region server did not restart after ZK timeout. Entered a high stress
>> period
>> > while compacting under heavy write load, high RAM commitment, and
>> > concurrency.
>> >
>> > This is a stress test and I need to tune down vm.swappiness some more,
>> but
>> > the region server shut down and did not restart.
>> >
>> > See attached.
>> >
>> >    - Andy

Re: ANN: hbase 0.20.0 Release Candidate 1 available for download

Posted by stack <st...@duboce.net>.
Yes, it's supposed to restart itself and check in with the master as though
it were a new server (as J-D notes).

Do you have the log from the incident?  Open an issue if it's broken?

I see you flipped your vote from -1 to +1.  I think we should fix this
failed restart but in the scheme of things, IMO, I don't think it is a
showstopper sufficient to sink the RC.  We can make a 0.20.1 to follow close
on 0.20.0 with fixes for the likes of this and for the documentation issue
noted by Lei Wang up on the list.

St.Ack



On Fri, Jul 31, 2009 at 3:02 PM, Andrew Purtell <ap...@apache.org> wrote:

> My understanding is the region server is supposed to restart and check in
> with the master as if newly launched. I could be wrong. I was away for a
> while. At least following the log messages this appears to be the intent.
>
>   - Andy
>
>
>
>
> ________________________________
> From: Ryan Rawson <ry...@gmail.com>
> To: hbase-dev@hadoop.apache.org
> Sent: Friday, July 31, 2009 2:11:36 PM
> Subject: Re: ANN: hbase 0.20.0 Release Candidate 1 available for download
>
> It is not supposed to restart, you will need to use supervisor to
> achieve that.
>
> If the ZK session is timed out, then the RS has no idea if the master
> has reassigned regions or not.  The RS then FATALS, the master
> recovers the log, and all will be well(ish).
>
> The zookeeper daemons also need supervising, since they might FATAL
> but can be restarted to continue on later.
>
> On Fri, Jul 31, 2009 at 2:07 PM, Andrew Purtell<ap...@apache.org>
> wrote:
> > -1
> >
> > Region server did not restart after ZK timeout. Entered a high stress
> period
> > while compacting under heavy write load, high RAM commitment, and
> > concurrency.
> >
> > This is a stress test and I need to tune down vm.swappiness some more,
> but
> > the region server shut down and did not restart.
> >
> > See attached.
> >
> >    - Andy