Posted to dev@hbase.apache.org by Gabriel Reid <ga...@ngdata.com> on 2013/02/12 17:37:06 UTC

Replication sink selection strategy

Hi,

I was wondering if someone (perhaps Jean-Daniel, but anyone is welcome) could explain the reasoning for the current peer sink selection logic within replication.

As it currently stands, a percentage (by default 10%) of the slave cluster's region servers are randomly chosen by each region server in the master cluster as their replication pool. Each time a batch of edits is shipped to a peer, one region server is chosen from the pre-selected pool of slave region servers.

I was wondering what the advantage(s) of this approach are compared to each master region server simply randomly choosing a slave peer from the full set of slave region servers. In my (probably naive) view, this approach would provide a more even distribution of usage over the whole slave cluster, and I can't see any real advantages that the current approach has (although I assume there must be some).
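To make the comparison concrete, here is a minimal sketch (in Java, since that is HBase's language) of the two strategies as described above. This is illustrative only, not HBase's actual replication code; the class and method names are hypothetical, and the 10% ratio is the default mentioned earlier.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical sketch of the two sink selection strategies discussed
// in this thread. Not the real HBase implementation.
public class SinkSelectionSketch {
    static final Random RANDOM = new Random();

    // Current approach: each source region server pre-selects a random
    // pool of slave region servers (by default 10% of the slave cluster).
    static List<String> choosePool(List<String> slaveServers, double ratio) {
        List<String> shuffled = new ArrayList<>(slaveServers);
        Collections.shuffle(shuffled, RANDOM);
        int poolSize = (int) Math.ceil(slaveServers.size() * ratio);
        return new ArrayList<>(shuffled.subList(0, poolSize));
    }

    // Per-batch selection: pick one sink from the pre-chosen pool.
    static String pickFromPool(List<String> pool) {
        return pool.get(RANDOM.nextInt(pool.size()));
    }

    // Proposed alternative: pick a sink uniformly from the full slave
    // list each time a batch of edits is shipped, with no fixed pool.
    static String pickFromFullSet(List<String> slaveServers) {
        return slaveServers.get(RANDOM.nextInt(slaveServers.size()));
    }

    public static void main(String[] args) {
        List<String> slaves = new ArrayList<>();
        for (int i = 0; i < 20; i++) {
            slaves.add("slave-rs-" + i);
        }
        // 10% of 20 slaves -> a fixed pool of 2 sinks for this source.
        List<String> pool = choosePool(slaves, 0.1);
        System.out.println("pool size: " + pool.size());
        System.out.println("batch sink (pooled): " + pickFromPool(pool));
        System.out.println("batch sink (full set): " + pickFromFullSet(slaves));
    }
}
```

The difference in load distribution follows directly: with the pool approach, each source only ever ships to its own small fixed subset, so which slaves receive traffic depends on which pools happen to overlap; with the full-set approach, every batch is an independent uniform draw over all slaves.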

Could someone let me know what the reasoning is behind the current approach?

Thanks,

Gabriel

Re: Replication sink selection strategy

Posted by Gabriel Reid <ga...@ngdata.com>.
Hi J-D,

Thanks for the info -- I was wondering if I was missing something important there.

I submitted a patch (HBASE-7634) a while back to rework the peer selection a bit to improve responsiveness to changes in peer clusters, but I left the initial pool of slaves intact. If you think the patch is worth integrating, I'd be happy to update it to use the non-pool-based slave selection. Can you let me know what you think?

Thanks again,

Gabriel


On 12 Feb 2013, at 22:14, Jean-Daniel Cryans <jd...@apache.org> wrote:

> Hey Gabriel,
> 
> I think when I originally designed it I over-engineered it a bit. Just
> picking a random one should be enough and make the code simpler.
> 
> J-D
> 
> On Tue, Feb 12, 2013 at 8:37 AM, Gabriel Reid <ga...@ngdata.com> wrote:
>> Hi,
>> 
>> I was wondering if someone (perhaps Jean-Daniel, but anyone is welcome) could explain the reasoning for the current peer sink selection logic within replication.
>> 
>> As it currently stands, a percentage (by default 10%) of the slave cluster's region servers are randomly chosen by each region server in the master cluster as their replication pool. Each time a batch of edits is shipped to a peer, one region server is chosen from the pre-selected pool of slave region servers.
>> 
>> I was wondering what the advantage(s) of this approach are compared to each master region server simply randomly choosing a slave peer from the full set of slave region servers. In my (probably naive) view, this approach would provide a more even distribution of usage over the whole slave cluster, and I can't see any real advantages that the current approach has (although I assume there must be some).
>> 
>> Could someone let me know what the reasoning is behind the current approach?
>> 
>> Thanks,
>> 
>> Gabriel


Re: Replication sink selection strategy

Posted by Jean-Daniel Cryans <jd...@apache.org>.
Hey Gabriel,

I think when I originally designed it I over-engineered it a bit. Just
picking a random one should be enough and make the code simpler.

J-D

On Tue, Feb 12, 2013 at 8:37 AM, Gabriel Reid <ga...@ngdata.com> wrote:
> Hi,
>
> I was wondering if someone (perhaps Jean-Daniel, but anyone is welcome) could explain the reasoning for the current peer sink selection logic within replication.
>
> As it currently stands, a percentage (by default 10%) of the slave cluster's region servers are randomly chosen by each region server in the master cluster as their replication pool. Each time a batch of edits is shipped to a peer, one region server is chosen from the pre-selected pool of slave region servers.
>
> I was wondering what the advantage(s) of this approach are compared to each master region server simply randomly choosing a slave peer from the full set of slave region servers. In my (probably naive) view, this approach would provide a more even distribution of usage over the whole slave cluster, and I can't see any real advantages that the current approach has (although I assume there must be some).
>
> Could someone let me know what the reasoning is behind the current approach?
>
> Thanks,
>
> Gabriel