Posted to user@hbase.apache.org by Jeff Storey <st...@gmail.com> on 2014/03/30 23:31:07 UTC

hbase replication for higher availability

In evaluating strategies for minimizing downtime when a region server
fails, in addition to the common approaches such as lowering the zookeeper
timeout, is it possible to use replication to improve availability (at the
cost of consistency) for reads?

I'm still getting familiar with the HBase API, but my thought would be
to do something like:

- attempt read from the primary cluster
- if read fails because of downed region server, read from slave cluster
(understanding that the read may be a little bit stale)

I wouldn't expect this to happen too frequently, but in a case where I
would rather return slightly stale data rather than no data, is this a
viable approach?

I'm not sure how the Java API deals with reading from a region server that
is in the process of failing over. Is there a way to detect that?

Thanks for the help.

Re: hbase replication for higher availability

Posted by Enis Söztutar <en...@apache.org>.

Hey,

Indeed it is a viable approach. Some HBase deployments use the
master-master replication model across DCs. The consistency semantics will
obviously depend on the application use case.

However, there is no out-of-the-box client for cross-DC requests. Wrapping
the HBase client in a higher-level layer should make this possible.
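
A minimal sketch of such a wrapper, in plain Java so it does not depend on
any particular HBase client version. The names ClusterReader, ReadResult and
FailoverReader are hypothetical and are not part of the HBase API. In a real
deployment each ClusterReader would presumably wrap a table handle connected
to one cluster; the sketch only shows the fallback logic and how a caller
can be told that a value may be stale.

```java
import java.io.IOException;

/** Hypothetical abstraction: read one value by key from a single cluster. */
interface ClusterReader<K, V> {
    V read(K key) throws IOException;
}

/** A value plus a flag telling the caller it came from the replica cluster. */
final class ReadResult<V> {
    final V value;
    final boolean possiblyStale;

    ReadResult(V value, boolean possiblyStale) {
        this.value = value;
        this.possiblyStale = possiblyStale;
    }
}

/** Tries the primary cluster first and falls back to the secondary on failure. */
final class FailoverReader<K, V> {
    private final ClusterReader<K, V> primary;
    private final ClusterReader<K, V> secondary;

    FailoverReader(ClusterReader<K, V> primary, ClusterReader<K, V> secondary) {
        this.primary = primary;
        this.secondary = secondary;
    }

    ReadResult<V> read(K key) throws IOException {
        try {
            // Normal path: the primary answered, so the value is not stale.
            return new ReadResult<>(primary.read(key), false);
        } catch (IOException primaryFailure) {
            try {
                // Fallback path: replication lag means this value may be stale.
                return new ReadResult<>(secondary.read(key), true);
            } catch (IOException secondaryFailure) {
                // Both clusters failed: report both causes to the caller.
                secondaryFailure.addSuppressed(primaryFailure);
                throw secondaryFailure;
            }
        }
    }
}
```

Returning the possiblyStale flag rather than a bare value lets the
application decide per use case whether stale data is acceptable.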

On the other hand, we are adding highly available reads within the DC in
issue HBASE-10070. You can track the development there.

Cheers,
Enis


Re: hbase replication for higher availability

Posted by Jeff Storey <st...@gmail.com>.

Thank you for the input.


RE: hbase replication for higher availability

Posted by Vladimir Rodionov <vr...@carrieriq.com>.

It can be a viable approach if you can keep replication lag under control.

> I'm not sure how the java api deals with reading from a region server that
> is in the process of failing over? Is there a way to detect that?

Do two reads in sequence:

1. Read the primary cluster.
2. Read the secondary cluster if step 1 exceeds your timeout.
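
The two steps above could be sketched roughly as follows. This is an
illustration under assumptions, not HBase API: TimedFallbackRead and
readWithFallback are hypothetical names, and the two Callable arguments
stand in for whatever code reads from each cluster. Note that the HBase
client also retries internally, so its own retry and timeout settings
affect how long a read against a failing region server blocks.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class TimedFallbackRead {
    private TimedFallbackRead() {}

    /**
     * Runs primaryRead on the pool and waits at most timeoutMillis for it.
     * If it times out or fails, runs secondaryRead on the calling thread.
     */
    static <V> V readWithFallback(ExecutorService pool,
                                  Callable<V> primaryRead,
                                  Callable<V> secondaryRead,
                                  long timeoutMillis) throws Exception {
        Future<V> attempt = pool.submit(primaryRead);
        try {
            // Step 1: read the primary cluster, bounded by the timeout.
            return attempt.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException | ExecutionException e) {
            // Ask the primary read to stop; it may not respond to interruption.
            attempt.cancel(true);
            // Step 2: read the secondary cluster; the result may be stale.
            return secondaryRead.call();
        }
    }
}
```

The timeout is the main tuning knob: too short and healthy reads are sent
to the secondary needlessly, too long and callers wait through most of a
region server failover before falling back.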

Best regards,
Vladimir Rodionov
Principal Platform Engineer
Carrier IQ, www.carrieriq.com
e-mail: vrodionov@carrieriq.com


Re: hbase replication for higher availability

Posted by Bharath Vissapragada <bh...@cloudera.com>.

Hey,

You might be interested in the "shadow replica" discussion on the dev list.
The aim is to lower the MTTR in case of a failure. Here are links to the
discussion [1] and the JIRA [2]. These are very relevant to what you are
looking for.

[1]
http://apache-hbase.679495.n3.nabble.com/Shadow-Regions-Read-Replicas-td4053313.html
[2] https://issues.apache.org/jira/browse/HBASE-10070

- Bharath




-- 
Bharath Vissapragada
<http://www.cloudera.com>