Posted to user@hbase.apache.org by jeff saremi <je...@hotmail.com> on 2017/02/11 22:06:19 UTC

On HBase Read Replicas

The first time I heard of replicas in HBase, the following thought immediately came to mind:
To alleviate the load in read-heavy clusters, one could assign RegionServers to be replicas of others, so that the load is distributed and there is less pressure on the main RS.

Just 2 days ago a colleague quoted a paragraph from the HBase manual that contradicted this completely. Apparently, the replicas do not help with the load; they actually add traffic on the network and on the underlying file system.

Would someone be able to give us some insight into why anyone would want replicas?

Also, could one easily change this behavior in the HBase native Java client to support what I had imagined replicas to be?


thanks

Re: On HBase Read Replicas

Posted by jeff saremi <je...@hotmail.com>.

Thank you Biju

________________________________
From: Biju N <bi...@gmail.com>
Sent: Wednesday, March 1, 2017 2:11:15 PM
To: user@hbase.apache.org
Subject: Re: On HBase Read Replicas

From the table definition, e.g.:
https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HTableDescriptor.html#getRegionReplication--

On Tue, Feb 28, 2017 at 3:30 PM, jeff saremi <je...@hotmail.com> wrote:

> Enis
>
> just one more question. How would I go about getting the count of
> replicas for a table or column family? thanks
>
> ________________________________
> From: Enis Söztutar <en...@gmail.com>
> Sent: Wednesday, February 22, 2017 1:38:41 PM
> To: hbase-user
> Subject: Re: On HBase Read Replicas
>
> If you are doing a get to a specific replica, it will execute as a read
> with retries to a single "copy". There will not be any backup / fallback
> RPCs to any other replica.
>
> Fallback RPCs happen only in timeline consistency mode.
>
> Enis
>
> On Sun, Feb 19, 2017 at 9:43 PM, Anoop John <an...@gmail.com> wrote:
>
> > Thanks Enis. I did not know you could set the replica id
> > explicitly. So what happens if that replica is down at
> > read time? Will the read go to another replica?
> >
> > -Anoop-
> >
> > On Sat, Feb 18, 2017 at 3:34 AM, Enis Söztutar <en...@gmail.com> wrote:
> > > You can do gets using two different "modes":
> > >  - Do a read with backup RPCs. In this case, the algorithm that I described
> > > earlier will be used: 1 RPC to the primary, and 2 more RPCs after the
> > > primary times out.
> > >  - Do a read to a single replica. In this case, only 1 RPC is sent, to that
> > > given replica.
> > >
> > > Enis
> > >
> > > On Fri, Feb 17, 2017 at 12:03 PM, jeff saremi <je...@hotmail.com> wrote:
> > >
> > >> Enis
> > >>
> > >> Thanks for taking the time to reply
> > >>
> > >> So I thought that a read request is sent to all replicas regardless. If we
> > >> have the option of sending to one, analyzing the response, and then sending
> > >> to another, this bodes well for our scenarios.
> > >>
> > >> Please confirm
> > >>
> > >> thanks
> > >>
> > >> ________________________________
> > >> From: Enis Söztutar <en...@gmail.com>
> > >> Sent: Friday, February 17, 2017 11:38:42 AM
> > >> To: hbase-user
> > >> Subject: Re: On HBase Read Replicas
> > >>
> > >> You can use read replicas to distribute the read load if you are fine with
> > >> stale reads. The read replicas normally have a "backup RPC" path, which
> > >> implements logic like this:
> > >>  - Send the RPC to the primary replica.
> > >>  - If there is no response for 100ms (or the configured timeout), send RPCs
> > >> to the other replicas.
> > >>  - Return the first non-exception response.
> > >>
> > >> However, there is also another feature for read replicas, where you can
> > >> indicate exactly which replica_id you want to read from when you are doing a
> > >> get. If you do this:
> > >> Get get = new Get(row);
> > >> get.setReplicaId(2);
> > >>
> > >> the Get RPC will only go to replica_id=2. Note that if you have region
> > >> replication = 3, then you will have regions with replica ids {0, 1, 2},
> > >> where replica_id=0 is the primary.
> > >>
> > >> So you can do load balancing with a get.setReplicaId(random() %
> > >> num_replicas) kind of pattern.
> > >>
> > >> Enis
> > >>
> > >>
> > >>
> > >> On Thu, Feb 16, 2017 at 9:41 AM, Anoop John <an...@gmail.com> wrote:
> > >>
> > >> > I have never seen this kind of discussion.
> > >> >
> > >> > -Anoop-
> > >> >
> > >> > On Thu, Feb 16, 2017 at 10:13 PM, jeff saremi <je...@hotmail.com> wrote:
> > >> > > Thanks Anoop.
> > >> > >
> > >> > > Understood.
> > >> > >
> > >> > > Have there been enhancement requests or discussions in the past on load
> > >> > > balancing by providing additional replicas? Has anyone else come up with
> > >> > > anything on this?
> > >> > > thanks
> > >> > >
> > >> > > ________________________________
> > >> > > From: Anoop John <an...@gmail.com>
> > >> > > Sent: Thursday, February 16, 2017 2:35:48 AM
> > >> > > To: user@hbase.apache.org
> > >> > > Subject: Re: On HBase Read Replicas
> > >> > >
> > >> > > The region replica feature came in to reduce the MTTR and so
> > >> > > increase data availability.  When the RS hosting the primary region
> > >> > > dies, clients can read from the secondary regions.  But keep in
> > >> > > mind that the data from secondary regions will be a bit
> > >> > > out of sync, as the replicas are eventually consistent.  For this
> > >> > > reason, changing the client to share the load across different RSs
> > >> > > might be tough.
> > >> > >
> > >> > > -Anoop-
> > >> > >
> > >> > > On Sun, Feb 12, 2017 at 8:13 AM, jeff saremi <je...@hotmail.com> wrote:
> > >> > >> Yes indeed. thank you very much Ted
> > >> > >>
> > >> > >> ________________________________
> > >> > >> From: Ted Yu <yu...@gmail.com>
> > >> > >> Sent: Saturday, February 11, 2017 3:40:50 PM
> > >> > >> To: user@hbase.apache.org
> > >> > >> Subject: Re: On HBase Read Replicas
> > >> > >>
> > >> > >> Please take a look at the design doc attached to
> > >> > >> https://issues.apache.org/jira/browse/HBASE-10070.
> > >> > >>
> > >> > >> Your first question would be answered by that document.
> > >> > >>
> > >> > >> Cheers
> > >> > >>

Re: On HBase Read Replicas

Posted by Biju N <bi...@gmail.com>.

From the table definition, e.g.:
https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HTableDescriptor.html#getRegionReplication--


Re: On HBase Read Replicas

Posted by jeff saremi <je...@hotmail.com>.

Enis

just one more question. How would I go about getting the count of replicas for a table or column family? thanks


Re: On HBase Read Replicas

Posted by Enis Söztutar <en...@gmail.com>.

If you are doing a get to a specific replica, it will execute as a read
with retries to a single "copy". There will not be any backup / fallback
RPCs to any other replica.

Fallback RPCs happen only in timeline consistency mode.

Enis
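
A rough sketch of what this implies for a client that pins reads to one replica: since there are no fallback RPCs in that mode, any fallback to another replica has to be done by the caller. The class, method names, and the `readFromReplica` hook below are hypothetical and not part of the HBase API; the hook stands in for a call that builds a Get, calls `get.setReplicaId(id)`, and runs it against the table (wrapping any IOException in an unchecked exception).

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.IntFunction;

/** Sketch only: spread single-replica reads and fall back manually. */
class ReplicaPinnedReads {

    /** Picks a replica id in [0, numReplicas); id 0 is the primary. */
    static int pickReplicaId(int numReplicas) {
        if (numReplicas < 1) {
            throw new IllegalArgumentException("numReplicas must be >= 1");
        }
        return ThreadLocalRandom.current().nextInt(numReplicas);
    }

    /**
     * Tries replica `first`, then the remaining replicas in id order
     * (wrapping around), and returns the first result that does not throw.
     * `readFromReplica` is a hypothetical hook; with the HBase client it
     * would wrap something like:
     *   Get get = new Get(row); get.setReplicaId(id); table.get(get);
     */
    static <T> T readWithManualFallback(int numReplicas, int first,
                                        IntFunction<T> readFromReplica) {
        RuntimeException last = null;
        for (int attempt = 0; attempt < numReplicas; attempt++) {
            int replicaId = Math.floorMod(first + attempt, numReplicas);
            try {
                return readFromReplica.apply(replicaId);
            } catch (RuntimeException e) {
                last = e; // this replica failed; try the next one
            }
        }
        throw new IllegalStateException("all replicas failed", last);
    }
}
```

A caller would combine the two, for example `readWithManualFallback(n, pickReplicaId(n), hook)`, keeping in mind that any replica other than id 0 may return stale data.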


Re: On HBase Read Replicas

Posted by Anoop John <an...@gmail.com>.

Thanks Enis. I did not know you could set the replica id
explicitly. So what happens if that replica is down at
read time? Will the read go to another replica?

-Anoop-


Re: On HBase Read Replicas

Posted by Enis Söztutar <en...@gmail.com>.

You can do gets using two different "modes":
 - Do a read with backup RPCs. In this case, the algorithm that I described
earlier will be used: 1 RPC to the primary, and 2 more RPCs after the primary
times out.
 - Do a read to a single replica. In this case, only 1 RPC is sent, to that
given replica.

Enis
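
The backup-RPC mode can be modeled roughly as below. This is a standalone sketch of the control flow only, not the HBase client's implementation: `BackupRpcSketch`, `read`, and the `sendRpc` hook are hypothetical names, and the hook stands in for an asynchronous read against the replica with the given id. One assumption beyond the description above is that a failed primary also triggers the secondary RPCs, not only a slow one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntFunction;

/** Sketch only: primary first, then secondaries, first success wins. */
class BackupRpcSketch {

    static <T> T read(int numReplicas, long primaryTimeoutMs,
                      IntFunction<CompletableFuture<T>> sendRpc) {
        // Step 1: send the RPC to the primary (replica id 0) and wait briefly.
        CompletableFuture<T> primary = sendRpc.apply(0);
        try {
            return primary.get(primaryTimeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException | ExecutionException slowOrFailedPrimary) {
            // Fall through to the secondaries (assumption: failure counts too).
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted waiting for primary", e);
        }

        // Step 2: send RPCs to the other replicas; the primary stays in the race.
        List<CompletableFuture<T>> calls = new ArrayList<>();
        calls.add(primary);
        for (int id = 1; id < numReplicas; id++) {
            calls.add(sendRpc.apply(id));
        }

        // Step 3: return the first non-exception response.
        CompletableFuture<T> winner = new CompletableFuture<>();
        AtomicInteger stillPossible = new AtomicInteger(calls.size());
        for (CompletableFuture<T> call : calls) {
            call.whenComplete((value, error) -> {
                if (error == null) {
                    winner.complete(value);
                } else if (stillPossible.decrementAndGet() == 0) {
                    winner.completeExceptionally(error); // every replica failed
                }
            });
        }
        return winner.join(); // blocks until one succeeds or all have failed
    }
}
```

With the real client none of this is written by hand: the backup-RPC path is what a Get uses once `get.setConsistency(Consistency.TIMELINE)` is set, and `Result.isStale()` reports whether the answer came from a secondary. The second mode is the plain `get.setReplicaId(...)` call, which sends a single RPC.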

Re: On HBase Read Replicas

Posted by jeff saremi <je...@hotmail.com>.

Enis

Thanks for taking the time to reply

So I thought that a read request was sent to all replicas regardless. If we have the option of sending to one, analyzing the response, and then sending to another, this bodes well for our scenarios.

Please confirm

thanks

Re: On HBase Read Replicas

Posted by Enis Söztutar <en...@gmail.com>.

You can use read replicas to distribute the read load if you are fine with
stale reads. Read replicas normally have a "backup RPC" path, which
implements logic like this:
 - Send the RPC to the primary replica.
 - If there is no response within 100 ms (or the configured timeout), send
RPCs to the other replicas.
 - Return the first non-exception response.
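
The three steps above can be modeled in plain Java. This is only a sketch
of the pattern, not the HBase client's actual implementation: the names
BackupRpcSketch, read, and launch are hypothetical, each Supplier stands in
for one blocking read RPC to one replica, and the timeout is passed in as
primaryTimeoutMs rather than read from configuration.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Illustrative model of a "backup RPC" read path; not HBase client code. */
final class BackupRpcSketch {

    // Daemon threads, so an abandoned slow "RPC" does not keep the JVM alive.
    private static final ExecutorService POOL = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "replica-rpc");
        t.setDaemon(true);
        return t;
    });

    /**
     * replicas.get(0) stands for the primary (replica_id=0); the remaining
     * entries stand for the secondaries. Each Supplier is a hypothetical
     * stand-in for one blocking read RPC.
     */
    static <T> T read(List<Supplier<T>> replicas, long primaryTimeoutMs) {
        CompletableFuture<T> first = new CompletableFuture<>();
        AtomicInteger failures = new AtomicInteger();
        int total = replicas.size();

        // Step 1: send the RPC to the primary replica only.
        launch(replicas.get(0), first, failures, total);
        try {
            return first.get(primaryTimeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Step 2: no successful response in time, so send RPCs to the
            // other replicas. The primary's RPC keeps running.
            for (Supplier<T> secondary : replicas.subList(1, total)) {
                launch(secondary, first, failures, total);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            // Only reachable when there is a single replica and it failed.
            throw new IllegalStateException("all replicas failed", e.getCause());
        }
        // Step 3: return the first non-exception response from any replica.
        return first.join();
    }

    private static <T> void launch(Supplier<T> rpc, CompletableFuture<T> first,
                                   AtomicInteger failures, int total) {
        CompletableFuture.supplyAsync(rpc, POOL).whenComplete((value, error) -> {
            if (error == null) {
                first.complete(value); // no effect if another replica already won
            } else if (failures.incrementAndGet() == total) {
                first.completeExceptionally(error); // every replica failed
            }
        });
    }
}
```

In this model a healthy primary answers within the timeout and no backup
RPC is ever sent, so the secondaries only see traffic when the primary is
slow or unavailable. A primary that fails quickly is treated like a slow
one here; that is a simplification of the sketch.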

However, there is also another feature for read replicas, where you can
indicate which exact replica_id you want to read from when you are doing a
get. If you do this:
Get get = new Get(row);
get.setReplicaId(2);

the Get RPC will go only to replica_id=2. Note that with region
replication = 3, each region has replica ids {0, 1, 2}, where replica_id=0
is the primary.

So you can do load balancing with a get.setReplicaId(random() %
num_replicas) kind of pattern.
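
One way to write the random choice in Java is sketched below. The class
and method names (ReplicaChooser, pickReplicaId) are hypothetical, and
regionReplication is assumed to be known to the caller, for example from
the table definition. A bounded nextInt is used instead of a literal
random() % num_replicas because the % operator in Java can return a
negative value when the random int is negative.

```java
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical helper for spreading gets across region replicas. */
final class ReplicaChooser {

    /**
     * Returns a uniformly random replica id in [0, regionReplication).
     * Replica id 0 is the primary, so the primary is chosen as often as
     * any single secondary.
     */
    static int pickReplicaId(int regionReplication) {
        if (regionReplication < 1) {
            throw new IllegalArgumentException(
                "regionReplication must be >= 1, got " + regionReplication);
        }
        return ThreadLocalRandom.current().nextInt(regionReplication);
    }
}
```

The returned value would then be passed to get.setReplicaId(...) as in the
snippet above. Reads served by ids other than 0 can be stale, so this only
fits workloads that tolerate stale reads.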

Enis



Re: On HBase Read Replicas

Posted by Anoop John <an...@gmail.com>.

Never saw this kind of discussion.

-Anoop-

Re: On HBase Read Replicas

Posted by jeff saremi <je...@hotmail.com>.

Thanks Anoop.

Understood.

Have there been enhancement requests or discussions on load balancing by providing additional replicas in the past? Has anyone else come up with anything on this?
thanks

Re: On HBase Read Replicas

Posted by Anoop John <an...@gmail.com>.

The region replica feature was introduced to reduce the MTTR and thereby
increase data availability.  When the RS hosting the primary region
dies, clients can read from the secondary regions.  Keep in mind,
though, that data from the secondary regions will be slightly out of
sync, because the replicas are eventually consistent.  For this reason,
changing the client to share the load across different RSs might be
tough.

-Anoop-

Re: On HBase Read Replicas

Posted by jeff saremi <je...@hotmail.com>.

Yes indeed. Thank you very much, Ted.

Re: On HBase Read Replicas

Posted by Ted Yu <yu...@gmail.com>.

Please take a look at the design doc attached to
https://issues.apache.org/jira/browse/HBASE-10070.

Your first question would be answered by that document.

Cheers
