Posted to user@hbase.apache.org by Weishung Chung <we...@gmail.com> on 2011/01/28 17:10:51 UTC

multiple masters

Is zookeeper responsible for the backup/replication of -ROOT- and .META.
files? It looks like I need a multiple-master HBase setup to achieve high
availability. In a multiple-master setup, would there be any data loss during
the switchover after the first master becomes unavailable?

Re: multiple masters

Posted by Bill Graham <bi...@gmail.com>.
Thanks Stack, this is really helpful.

On Fri, Jan 28, 2011 at 2:06 PM, Stack <st...@duboce.net> wrote:
> On Fri, Jan 28, 2011 at 1:15 PM, Bill Graham <bi...@gmail.com> wrote:
>> I also don't have a solid understanding of the responsibilities of
>> master, but it seems like its job is really about managing regions
>> (i.e., coordinating splits and compactions, etc.) and updating ROOT
>> and META. Is that correct?
>>
>>
>
> Yes.  It hosts the balancer and does bootstrapping on cluster startup,
> doing the bulk initial assignment.  On server crash, it runs the recovery,
> splitting WAL logs and getting regions back online again.
>
> It does not run splits.  That is done by the regionservers themselves.
> Regionservers inform the master of the split when done so it can take
> account of the new state when running the balancer.
>
> We should do a write up on this.  Let me put this on the doc queue.
>
> St.Ack
>

Re: multiple masters

Posted by Stack <st...@duboce.net>.
On Fri, Jan 28, 2011 at 1:15 PM, Bill Graham <bi...@gmail.com> wrote:
> I also don't have a solid understanding of the responsibilities of
> master, but it seems like its job is really about managing regions
> (i.e., coordinating splits and compactions, etc.) and updating ROOT
> and META. Is that correct?
>
>

Yes.  It hosts the balancer and does bootstrapping on cluster startup,
doing the bulk initial assignment.  On server crash, it runs the recovery,
splitting WAL logs and getting regions back online again.

It does not run splits.  That is done by the regionservers themselves.
Regionservers inform the master of the split when done so it can take
account of the new state when running the balancer.

We should do a write up on this.  Let me put this on the doc queue.

St.Ack
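
The division of labor described above can be sketched as a toy model. This
is only an illustration in plain Python, not HBase code: the class names
ToyMaster and ToyRegionServer, the method names, and the daughter-region
naming are all hypothetical. It shows a regionserver carrying out a split on
its own and then reporting it, so the master's view is current the next time
it counts regions per server for balancing.

```python
from collections import Counter


class ToyMaster:
    """Hypothetical stand-in for the master's bookkeeping, not an HBase API."""

    def __init__(self, assignments):
        # region name -> name of the regionserver hosting it
        self.assignments = dict(assignments)

    def report_split(self, server, parent, daughters):
        # The master only records the outcome; it did not perform the split.
        del self.assignments[parent]
        for daughter in daughters:
            self.assignments[daughter] = server

    def region_counts(self):
        # The per-server load a balancer would look at.
        return Counter(self.assignments.values())


class ToyRegionServer:
    """Hypothetical stand-in for a regionserver, not an HBase API."""

    def __init__(self, name, master):
        self.name = name
        self.master = master

    def split(self, parent):
        # The regionserver does the split itself, then informs the master.
        daughters = (parent + "-a", parent + "-b")
        self.master.report_split(self.name, parent, daughters)
        return daughters


master = ToyMaster({"t1": "rs1", "t2": "rs2"})
rs1 = ToyRegionServer("rs1", master)
rs1.split("t1")
print(master.region_counts())
```

After the split, the toy master sees two regions on rs1 and one on rs2, which
is the kind of updated state a balancer would act on.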

Re: multiple masters

Posted by Bill Graham <bi...@gmail.com>.
I also don't have a solid understanding of the responsibilities of
master, but it seems like its job is really about managing regions
(i.e., coordinating splits and compactions, etc.) and updating ROOT
and META. Is that correct?


On Fri, Jan 28, 2011 at 9:31 AM, Weishung Chung <we...@gmail.com> wrote:
> Great, thank you :D
> I guess I need to read up more on zookeeper.
>
> On Fri, Jan 28, 2011 at 10:56 AM, Stack <st...@duboce.net> wrote:
>
>> On Fri, Jan 28, 2011 at 8:52 AM, Weishung Chung <we...@gmail.com>
>> wrote:
>> > Correct me if I am wrong :)
>> > In HConnectionManager, it seems to me that a zookeeper instance is used to
>> > get to the HBase master for META and ROOT info. What would happen if the
>> > HBase master became unavailable? Would zookeeper be able to get the ROOT
>> > and META info from another backup/replicated master? Sorry, I haven't had
>> > a chance to browse deeper into the zookeeper code yet :(
>> >
>>
>> The location of root is kept in zk and that of meta in the root
>> region, not in master.  If master goes away, the cluster continues to run.
>> St.Ack
>>
>

Re: multiple masters

Posted by Weishung Chung <we...@gmail.com>.
Great, thank you :D
I guess I need to read up more on zookeeper.

On Fri, Jan 28, 2011 at 10:56 AM, Stack <st...@duboce.net> wrote:

> On Fri, Jan 28, 2011 at 8:52 AM, Weishung Chung <we...@gmail.com>
> wrote:
> > Correct me if I am wrong :)
> > In HConnectionManager, it seems to me that a zookeeper instance is used to
> > get to the HBase master for META and ROOT info. What would happen if the
> > HBase master became unavailable? Would zookeeper be able to get the ROOT
> > and META info from another backup/replicated master? Sorry, I haven't had
> > a chance to browse deeper into the zookeeper code yet :(
> >
>
> The location of root is kept in zk and that of meta in the root
> region, not in master.  If master goes away, the cluster continues to run.
> St.Ack
>

Re: multiple masters

Posted by Stack <st...@duboce.net>.
On Fri, Jan 28, 2011 at 8:52 AM, Weishung Chung <we...@gmail.com> wrote:
> Correct me if I am wrong :)
> In HConnectionManager, it seems to me that a zookeeper instance is used to
> get to the HBase master for META and ROOT info. What would happen if the HBase
> master became unavailable? Would zookeeper be able to get the ROOT and META
> info from another backup/replicated master? Sorry, I haven't had a
> chance to browse deeper into the zookeeper code yet :(
>

The location of root is kept in zk and that of meta in the root
region, not in master.  If master goes away, the cluster continues to run.
St.Ack
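
The lookup chain described above can be sketched as a toy model. This is a
plain-Python illustration, not the HBase client: the ToyCluster class, the
"root-location" key, the server names rs1 to rs3, and the region name
"users,row-a" are all made up for the example. The point it shows is that a
read resolves through zk, then the root region, then meta, and never consults
the master.

```python
class ToyCluster:
    """Hypothetical model of region lookup; dicts stand in for zk and regions."""

    def __init__(self):
        # zk holds only the location of the root region.
        self.zookeeper = {"root-location": "rs1"}
        # Each regionserver maps the regions it hosts to their contents.
        self.regionservers = {
            "rs1": {"-ROOT-": {".META.": "rs2"}},
            "rs2": {".META.": {"users,row-a": "rs3"}},
            "rs3": {"users,row-a": {"row-a": "value-a"}},
        }
        # Tracked only to show that the read path never checks it.
        self.master_alive = True

    def read(self, region, row):
        # Step 1: ask zk where the root region lives.
        root_host = self.zookeeper["root-location"]
        # Step 2: the root region says where meta lives.
        meta_host = self.regionservers[root_host]["-ROOT-"][".META."]
        # Step 3: meta says which regionserver hosts the user region.
        region_host = self.regionservers[meta_host][".META."][region]
        # Step 4: read the row directly from that regionserver.
        return self.regionservers[region_host][region][row]


cluster = ToyCluster()
cluster.master_alive = False  # simulate the master going away
print(cluster.read("users,row-a", "row-a"))
```

Because read() never looks at master_alive, the toy lookup succeeds with the
master marked as down, mirroring the statement that the cluster continues to
run.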

Re: multiple masters

Posted by Weishung Chung <we...@gmail.com>.
Correct me if I am wrong :)
In HConnectionManager, it seems to me that a zookeeper instance is used to
get to the HBase master for META and ROOT info. What would happen if the HBase
master became unavailable? Would zookeeper be able to get the ROOT and META
info from another backup/replicated master? Sorry, I haven't had a
chance to browse deeper into the zookeeper code yet :(

On Fri, Jan 28, 2011 at 10:35 AM, Stack <st...@duboce.net> wrote:

> On Fri, Jan 28, 2011 at 8:10 AM, Weishung Chung <we...@gmail.com>
> wrote:
> > Is zookeeper responsible for the backup/replication of -ROOT- and .META.
> > files?
>
> No.  These are kept in HDFS and rely on its replication.
>
> > It looks like I need a multiple-master HBase setup to achieve high
> > availability. In a multiple-master setup, would there be any data loss
> > during the switchover after the first master becomes unavailable?
> >
>
> No.  Master is not in the read/write path.  The cluster can continue
> responding to reads and writes even when Master(s) is (are) down.
> St.Ack
>

Re: multiple masters

Posted by Stack <st...@duboce.net>.
On Fri, Jan 28, 2011 at 8:10 AM, Weishung Chung <we...@gmail.com> wrote:
> Is zookeeper responsible for the backup/replication of -ROOT- and .META.
> files?

No.  These are kept in HDFS and rely on its replication.

> It looks like I need a multiple-master HBase setup to achieve high
> availability. In a multiple-master setup, would there be any data loss during
> the switchover after the first master becomes unavailable?
>

No.  Master is not in the read/write path.  The cluster can continue
responding to reads and writes even when Master(s) is (are) down.
St.Ack