Posted to user@hbase.apache.org by an...@nokia.com on 2009/06/01 16:59:06 UTC

State of HA

Hello,

I have been looking at Jira and trying to get a current snapshot of the state of HA for HBase/Hadoop. I know that the zookeeper integration is the core of the HA story, but when is that slated for a "stable" debut? Is there anything that is currently in svn that we can pull and test?

TIA,

Andrew

Re: State of HA

Posted by Jean-Daniel Cryans <jd...@apache.org>.

This is not exactly the situation. Currently the code to have master
failover is committed in trunk but it is rough, e.g. it requires some
manual modifications that aren't "fun" to do to start masters on other
nodes. HBASE-1357 and HBASE-1445 are about fixing that. HBASE-1448 is
about making the shutdown of multiple masters easier. This is due for
0.20.

J-D

On Mon, Jun 1, 2009 at 12:53 PM, Billy Pearson
<sa...@pearsonwholesale.com> wrote:
> 0.20.0 / trunk is the first release of hbase that will work with zookeeper
> so likely not going to have the HA stuff in there just getting it to work
> with zookeeper will likely be targeted in this release
> I do know there is a whole lot of rework on hbase in this release that
> should make big improvements over 0.19.0 in a few different areas.
> I do not thank we have any open issues targeting the HA yet just some people
> with ideas in our heads on how it would work.
>
> Billy
>
>
> <an...@nokia.com> wrote in message
> news:0E94BEEABCAE4C4EAC18B13A7A5C24563A691BDE55@NOK-EUMSG-01.mgdnok.nokia.com...
> Hello,
>
> I have been looking at Jira and trying to get a current snapshot of the
> state of HA for HBase/Hadoop? I know that the zookeeper integration is the
> core of the HA story, but when is that slated for a "stable" debut? Is there
> anything that is currently in svn that we can pull and test?
>
> TIA,
>
> Andrew
>
>
>
>

Re: State of HA

Posted by Billy Pearson <sa...@pearsonwholesale.com>.

0.20.0 / trunk is the first release of hbase that will work with zookeeper,
so it is likely not going to have the HA stuff in there; just getting it to
work with zookeeper will likely be the target for this release.
I do know there is a whole lot of rework on hbase in this release that
should make big improvements over 0.19.0 in a few different areas.
I do not think we have any open issues targeting HA yet, just some people
with ideas in our heads on how it would work.

Billy


<an...@nokia.com> wrote in message 
news:0E94BEEABCAE4C4EAC18B13A7A5C24563A691BDE55@NOK-EUMSG-01.mgdnok.nokia.com...
Hello,

I have been looking at Jira and trying to get a current snapshot of the 
state of HA for HBase/Hadoop? I know that the zookeeper integration is the 
core of the HA story, but when is that slated for a "stable" debut? Is there 
anything that is currently in svn that we can pull and test?

TIA,

Andrew

Re: State of HA

Posted by Andrew Purtell <ap...@apache.org>.

At least there are known infrastructure solutions -- read: more work than just out of the box deployment -- to the Hadoop Namenode SPOF. If the failure is something a heartbeat will catch, then HBase can survive a namenode crash if such a solution is in place. Apply the patch for HADOOP-4681 to ensure recovery (see https://issues.apache.org/jira/browse/HADOOP-4681). On Linux you can consider using Watchdog (http://linux.die.net/man/8/watchdog) or the Red Hat Cluster Suite to force that kind of failure if the namenode process goes away and fence at the same time.

   - Andy

________________________________
From: Ryan Rawson <ry...@gmail.com>
To: hbase-user@hadoop.apache.org
Sent: Monday, June 1, 2009 1:56:34 PM
Subject: Re: State of HA

Hey,

Stack is saying that for HADOOP-4379, it fails 1/5th of the time - recovery
takes more than 15 minutes, aka potentially unlimited amount of time.  That
patch relies on lease recovery it seems, so it may not be the final answer
for us.

Now, on the subject of the rest of things, under Zookeeper we are doing a
much better job at HA.  Regionserver crashes are detect significantly faster
than the 2 minute lease timeout, with my fixes you can take down any
regionserver without getting 'stuck' with an unassigned ROOT/META
(previously a problem).

I have noticed on trunk I can kill and restart the master w/o taking down
the cluster.  During master start-up it does a fairly good job at detecting
node status and otherwise recovering.  I can't say about master elections
exactly yet.

The HA story is shaping up nicely.

To end on a sour note, HDFS Namenode is still a SPOF.  When we're done with
HBase 0.20 it should be the only SPOF.

-ryan

On Mon, Jun 1, 2009 at 1:50 PM, <an...@nokia.com> wrote:

> I am trying to parse this: are you implying that I can expect a 20% ("1 out
> of 5 or so") success getting HA to work with this code?
>
> -----Original Message-----
> From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf Of ext
> stack
> Sent: 01 June, 2009 13:27
> To: hbase-user@hadoop.apache.org
> Subject: Re: State of HA
>
> You can pull TRUNK and try it with HADOOP-4379.
>
> The master failover works as J-D suggests.  It needs some polish but thats
> on its way.  The HADOOP-4379 will get you a sync that works most of the
> time
> (1 out of 5 or so in my testing) but hopefully that'll be addressed soon
> too.  You'll also need HBASE-1470.   Its the bit of code that exploits
> HADOOP-4379 when configuration is set right).
>
> If you need help setting up stuff, you know where to find us.  Issues we
> want to hear about because we're hoping to tell the above as part of our
> 0.20.0 release story.
>
> Yours,
> St.Ack
>
> On Mon, Jun 1, 2009 at 7:59 AM, <an...@nokia.com> wrote:
>
> > Hello,
> >
> > I have been looking at Jira and trying to get a current snapshot of the
> > state of HA for HBase/Hadoop? I know that the zookeeper integration is
> the
> > core of the HA story, but when is that slated for a "stable" debut? Is
> there
> > anything that is currently in svn that we can pull and test?
> >
> > TIA,
> >
> > Andrew
> >
> >
>

RE: State of HA

Posted by "Jim Kellerman (POWERSET)" <Ji...@microsoft.com>.

Andrew,

For questions about the name node, you should ask on hadoop-user@hadoop.apache.org


> -----Original Message-----
> From: Andrew Wharton [mailto:andrew.wharton@nokia.com]
> Sent: Friday, June 19, 2009 11:40 AM
> To: hbase-user@hadoop.apache.org
> Subject: Re: State of HA
>
> Does anybody know the state of the Backupnamenode scheme that is
> referenced here? I am looking for a Jira ticket that might give me some
> more insight into the timeline for release. The hadoop-general email
> list doesn't seem to have any information about this issue, which is
> kind of worrying...
>
> TIA.
>
> -- Andrew
>
> On Tue, 2009-06-02 at 16:57 +0200, ext Jean-Daniel Cryans wrote:
> > Andrew,
> >
> > I think you are confusing some components of the whole stack here. The
> > Namenode is the master for HDFS just like the HMaster is the master
> > for HBase. Hadoop is 2 things : HDFS and an implementation of
> > MapReduce which also has a master, the JobTracker. HBase sits on all
> > that.
> >
> > So with regards with what's fixed, the HMaster SPOF is fixed for 0.20.
> > The Namenode in 0.20 is still a SPOF. That means, if you want HA, you
> > should get a really reliable machine for the Namenode but you can put
> > the HMaster on any nodes you want.
> > AFAIK, there is a BackupNamenode in Hadoop 0.21 that serves as a
> > Namenode failover.
>
>

Re: State of HA

Posted by Jean-Daniel Cryans <jd...@apache.org>.

Here it is https://issues.apache.org/jira/browse/HADOOP-4539. Konstantin
talked about it in a May 27 mail in the thread "Setting up another machine
as secondary node".

J-D

On Fri, Jun 19, 2009 at 2:39 PM, Andrew Wharton <an...@nokia.com> wrote:

> Does anybody know the state of the Backupnamenode scheme that is
> referenced here? I am looking for a Jira ticket that might give me some
> more insight into the timeline for release. The hadoop-general email
> list doesn't seem to have any information about this issue, which is
> kind of worrying...
>
> TIA.
>
> -- Andrew
>
> On Tue, 2009-06-02 at 16:57 +0200, ext Jean-Daniel Cryans wrote:
> > Andrew,
> >
> > I think you are confusing some components of the whole stack here. The
> > Namenode is the master for HDFS just like the HMaster is the master
> > for HBase. Hadoop is 2 things : HDFS and an implementation of
> > MapReduce which also has a master, the JobTracker. HBase sits on all
> > that.
> >
> > So with regards with what's fixed, the HMaster SPOF is fixed for 0.20.
> > The Namenode in 0.20 is still a SPOF. That means, if you want HA, you
> > should get a really reliable machine for the Namenode but you can put
> > the HMaster on any nodes you want.
> > AFAIK, there is a BackupNamenode in Hadoop 0.21 that serves as a
> > Namenode failover.
>
>
>

Re: State of HA

Posted by Andrew Wharton <an...@nokia.com>.

Does anybody know the state of the Backupnamenode scheme that is
referenced here? I am looking for a Jira ticket that might give me some
more insight into the timeline for release. The hadoop-general email
list doesn't seem to have any information about this issue, which is
kind of worrying...

TIA.

-- Andrew

On Tue, 2009-06-02 at 16:57 +0200, ext Jean-Daniel Cryans wrote:
> Andrew,
> 
> I think you are confusing some components of the whole stack here. The
> Namenode is the master for HDFS just like the HMaster is the master
> for HBase. Hadoop is 2 things : HDFS and an implementation of
> MapReduce which also has a master, the JobTracker. HBase sits on all
> that.
> 
> So with regards with what's fixed, the HMaster SPOF is fixed for 0.20.
> The Namenode in 0.20 is still a SPOF. That means, if you want HA, you
> should get a really reliable machine for the Namenode but you can put
> the HMaster on any nodes you want.
> AFAIK, there is a BackupNamenode in Hadoop 0.21 that serves as a
> Namenode failover.

Re: State of HA

Posted by Jean-Daniel Cryans <jd...@apache.org>.

Andrew,

I think you are confusing some components of the whole stack here. The
Namenode is the master for HDFS just like the HMaster is the master
for HBase. Hadoop is 2 things: HDFS and an implementation of
MapReduce which also has a master, the JobTracker. HBase sits on all
that.

So with regard to what's fixed, the HMaster SPOF is fixed for 0.20.
The Namenode in 0.20 is still a SPOF. That means, if you want HA, you
should get a really reliable machine for the Namenode but you can put
the HMaster on any nodes you want.
AFAIK, there is a BackupNamenode in Hadoop 0.21 that serves as a
Namenode failover.

J-D

On Tue, Jun 2, 2009 at 10:49 AM,  <an...@nokia.com> wrote:
> Occasionally, I think that I am getting all of this, but then a statement like this appears:
>
> "To end on a sour note, HDFS Namenode is still a SPOF.  When we're done with HBase 0.20 it should be the only SPOF."
>
> So now I am confused all over again. I thought that any namenode SPOF that was fixed in Hadoop would also imply that it was fixed in HDFS. Doesn't HDFS use Hadoop in some form to M/R the reads/writes? If that is not the case and HDFS is going to suffer from a namenode SPOF in the near-term, are there plans in the works to remedy that too?
>
> -----Original Message-----
> From: ext Ryan Rawson [mailto:ryanobjc@gmail.com]
> Sent: 01 June, 2009 16:57
> To: hbase-user@hadoop.apache.org
> Subject: Re: State of HA
>
> Hey,
>
> Stack is saying that for HADOOP-4379, it fails 1/5th of the time - recovery
> takes more than 15 minutes, aka potentially unlimited amount of time.  That
> patch relies on lease recovery it seems, so it may not be the final answer
> for us.
>
> Now, on the subject of the rest of things, under Zookeeper we are doing a
> much better job at HA.  Regionserver crashes are detect significantly faster
> than the 2 minute lease timeout, with my fixes you can take down any
> regionserver without getting 'stuck' with an unassigned ROOT/META
> (previously a problem).
>
> I have noticed on trunk I can kill and restart the master w/o taking down
> the cluster.  During master start-up it does a fairly good job at detecting
> node status and otherwise recovering.  I can't say about master elections
> exactly yet.
>
> The HA story is shaping up nicely.
>
> To end on a sour note, HDFS Namenode is still a SPOF.  When we're done with
> HBase 0.20 it should be the only SPOF.
>
> -ryan
>
> On Mon, Jun 1, 2009 at 1:50 PM, <an...@nokia.com> wrote:
>
>> I am trying to parse this: are you implying that I can expect a 20% ("1 out
>> of 5 or so") success getting HA to work with this code?
>>
>> -----Original Message-----
>> From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf Of ext
>> stack
>> Sent: 01 June, 2009 13:27
>> To: hbase-user@hadoop.apache.org
>> Subject: Re: State of HA
>>
>> You can pull TRUNK and try it with HADOOP-4379.
>>
>> The master failover works as J-D suggests.  It needs some polish but thats
>> on its way.  The HADOOP-4379 will get you a sync that works most of the
>> time
>> (1 out of 5 or so in my testing) but hopefully that'll be addressed soon
>> too.  You'll also need HBASE-1470.   Its the bit of code that exploits
>> HADOOP-4379 when configuration is set right).
>>
>> If you need help setting up stuff, you know where to find us.  Issues we
>> want to hear about because we're hoping to tell the above as part of our
>> 0.20.0 release story.
>>
>> Yours,
>> St.Ack
>>
>> On Mon, Jun 1, 2009 at 7:59 AM, <an...@nokia.com> wrote:
>>
>> > Hello,
>> >
>> > I have been looking at Jira and trying to get a current snapshot of the
>> > state of HA for HBase/Hadoop? I know that the zookeeper integration is
>> the
>> > core of the HA story, but when is that slated for a "stable" debut? Is
>> there
>> > anything that is currently in svn that we can pull and test?
>> >
>> > TIA,
>> >
>> > Andrew
>> >
>> >
>>
>

RE: State of HA

Posted by an...@nokia.com.

Occasionally, I think that I am getting all of this, but then a statement like this appears:

"To end on a sour note, HDFS Namenode is still a SPOF.  When we're done with HBase 0.20 it should be the only SPOF."

So now I am confused all over again. I thought that any namenode SPOF that was fixed in Hadoop would also imply that it was fixed in HDFS. Doesn't HDFS use Hadoop in some form to M/R the reads/writes? If that is not the case and HDFS is going to suffer from a namenode SPOF in the near-term, are there plans in the works to remedy that too?

-----Original Message-----
From: ext Ryan Rawson [mailto:ryanobjc@gmail.com] 
Sent: 01 June, 2009 16:57
To: hbase-user@hadoop.apache.org
Subject: Re: State of HA

Hey,

Stack is saying that for HADOOP-4379, it fails 1/5th of the time - recovery
takes more than 15 minutes, aka potentially unlimited amount of time.  That
patch relies on lease recovery it seems, so it may not be the final answer
for us.

Now, on the subject of the rest of things, under Zookeeper we are doing a
much better job at HA.  Regionserver crashes are detect significantly faster
than the 2 minute lease timeout, with my fixes you can take down any
regionserver without getting 'stuck' with an unassigned ROOT/META
(previously a problem).

I have noticed on trunk I can kill and restart the master w/o taking down
the cluster.  During master start-up it does a fairly good job at detecting
node status and otherwise recovering.  I can't say about master elections
exactly yet.

The HA story is shaping up nicely.

To end on a sour note, HDFS Namenode is still a SPOF.  When we're done with
HBase 0.20 it should be the only SPOF.

-ryan

On Mon, Jun 1, 2009 at 1:50 PM, <an...@nokia.com> wrote:

> I am trying to parse this: are you implying that I can expect a 20% ("1 out
> of 5 or so") success getting HA to work with this code?
>
> -----Original Message-----
> From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf Of ext
> stack
> Sent: 01 June, 2009 13:27
> To: hbase-user@hadoop.apache.org
> Subject: Re: State of HA
>
> You can pull TRUNK and try it with HADOOP-4379.
>
> The master failover works as J-D suggests.  It needs some polish but thats
> on its way.  The HADOOP-4379 will get you a sync that works most of the
> time
> (1 out of 5 or so in my testing) but hopefully that'll be addressed soon
> too.  You'll also need HBASE-1470.   Its the bit of code that exploits
> HADOOP-4379 when configuration is set right).
>
> If you need help setting up stuff, you know where to find us.  Issues we
> want to hear about because we're hoping to tell the above as part of our
> 0.20.0 release story.
>
> Yours,
> St.Ack
>
> On Mon, Jun 1, 2009 at 7:59 AM, <an...@nokia.com> wrote:
>
> > Hello,
> >
> > I have been looking at Jira and trying to get a current snapshot of the
> > state of HA for HBase/Hadoop? I know that the zookeeper integration is
> the
> > core of the HA story, but when is that slated for a "stable" debut? Is
> there
> > anything that is currently in svn that we can pull and test?
> >
> > TIA,
> >
> > Andrew
> >
> >
>

Re: State of HA

Posted by Ryan Rawson <ry...@gmail.com>.

Hey,

Stack is saying that for HADOOP-4379, it fails 1/5th of the time - recovery
takes more than 15 minutes, aka potentially unlimited amount of time.  That
patch relies on lease recovery it seems, so it may not be the final answer
for us.

Now, on the subject of the rest of things, under Zookeeper we are doing a
much better job at HA.  Regionserver crashes are detected significantly faster
than the 2 minute lease timeout, and with my fixes you can take down any
regionserver without getting 'stuck' with an unassigned ROOT/META
(previously a problem).

I have noticed on trunk I can kill and restart the master w/o taking down
the cluster.  During master start-up it does a fairly good job at detecting
node status and otherwise recovering.  I can't say about master elections
exactly yet.

The HA story is shaping up nicely.

To end on a sour note, HDFS Namenode is still a SPOF.  When we're done with
HBase 0.20 it should be the only SPOF.

-ryan

On Mon, Jun 1, 2009 at 1:50 PM, <an...@nokia.com> wrote:

> I am trying to parse this: are you implying that I can expect a 20% ("1 out
> of 5 or so") success getting HA to work with this code?
>
> -----Original Message-----
> From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf Of ext
> stack
> Sent: 01 June, 2009 13:27
> To: hbase-user@hadoop.apache.org
> Subject: Re: State of HA
>
> You can pull TRUNK and try it with HADOOP-4379.
>
> The master failover works as J-D suggests.  It needs some polish but thats
> on its way.  The HADOOP-4379 will get you a sync that works most of the
> time
> (1 out of 5 or so in my testing) but hopefully that'll be addressed soon
> too.  You'll also need HBASE-1470.   Its the bit of code that exploits
> HADOOP-4379 when configuration is set right).
>
> If you need help setting up stuff, you know where to find us.  Issues we
> want to hear about because we're hoping to tell the above as part of our
> 0.20.0 release story.
>
> Yours,
> St.Ack
>
> On Mon, Jun 1, 2009 at 7:59 AM, <an...@nokia.com> wrote:
>
> > Hello,
> >
> > I have been looking at Jira and trying to get a current snapshot of the
> > state of HA for HBase/Hadoop? I know that the zookeeper integration is
> the
> > core of the HA story, but when is that slated for a "stable" debut? Is
> there
> > anything that is currently in svn that we can pull and test?
> >
> > TIA,
> >
> > Andrew
> >
> >
>

Re: State of HA

Posted by stack <st...@duboce.net>.

On Mon, Jun 1, 2009 at 1:50 PM, <an...@nokia.com> wrote:

> I am trying to parse this: are you implying that I can expect a 20% ("1 out
> of 5 or so") success getting HA to work with this code?

Please don't get me wrong.  I wasn't trying to imply that HA means we work
80% of the time (Thanks Ryan for clarifying what I was trying to convey).
You were asking for something to try before release.  I was just trying to
point you at current state giving a rough evaluation of current state of
things.  Pardon my rushed explanation.

St.Ack

RE: State of HA

Posted by an...@nokia.com.

I am trying to parse this: are you implying that I can expect a 20% ("1 out of 5 or so") success getting HA to work with this code?

-----Original Message-----
From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf Of ext stack
Sent: 01 June, 2009 13:27
To: hbase-user@hadoop.apache.org
Subject: Re: State of HA

You can pull TRUNK and try it with HADOOP-4379.

The master failover works as J-D suggests.  It needs some polish but thats
on its way.  The HADOOP-4379 will get you a sync that works most of the time
(1 out of 5 or so in my testing) but hopefully that'll be addressed soon
too.  You'll also need HBASE-1470.   Its the bit of code that exploits
HADOOP-4379 when configuration is set right).

If you need help setting up stuff, you know where to find us.  Issues we
want to hear about because we're hoping to tell the above as part of our
0.20.0 release story.

Yours,
St.Ack

On Mon, Jun 1, 2009 at 7:59 AM, <an...@nokia.com> wrote:

> Hello,
>
> I have been looking at Jira and trying to get a current snapshot of the
> state of HA for HBase/Hadoop? I know that the zookeeper integration is the
> core of the HA story, but when is that slated for a "stable" debut? Is there
> anything that is currently in svn that we can pull and test?
>
> TIA,
>
> Andrew
>
>

Re: State of HA

Posted by stack <st...@duboce.net>.

You can pull TRUNK and try it with HADOOP-4379.

The master failover works as J-D suggests.  It needs some polish but that's
on its way.  The HADOOP-4379 will get you a sync that works most of the time
(1 out of 5 or so in my testing) but hopefully that'll be addressed soon
too.  You'll also need HBASE-1470 (it's the bit of code that exploits
HADOOP-4379 when configuration is set right).

If you need help setting up stuff, you know where to find us.  Issues we
want to hear about because we're hoping to tell the above as part of our
0.20.0 release story.

Yours,
St.Ack

On Mon, Jun 1, 2009 at 7:59 AM, <an...@nokia.com> wrote:

> Hello,
>
> I have been looking at Jira and trying to get a current snapshot of the
> state of HA for HBase/Hadoop? I know that the zookeeper integration is the
> core of the HA story, but when is that slated for a "stable" debut? Is there
> anything that is currently in svn that we can pull and test?
>
> TIA,
>
> Andrew
>
>