
Posted to user@hbase.apache.org by Ishan Chhabra <ic...@rocketfuel.com> on 2013/11/08 22:14:27 UTC

Setting up NxN replication

I want to set up NxN replication, i.e. N clusters each replicating to every
other cluster. N is expected to be around 10.

After some research, I realize this is possible after the HBASE-7709 fix,
but it would lead to much more data flowing through the system. For example:

Let's say we have 3 clusters: A, B and C.
A new write to A will go to B and then on to C, and will also go to C via
the direct path. This leads to unnecessary network usage and writes to the
WAL of B, which should be avoided. Now imagine this with 10 clusters; it
won't scale.

One option is to create a minimum spanning tree joining all the clusters
and have nodes replicate to their immediate peers in a master-master
fashion. This is much better than an NxN mesh, but still has extra network
and WAL usage. It also suffers from a failure scenario where a single
cluster going down will pause replication to the clusters downstream.

What I really want is for the ReplicationSource to forward only WALEdits
whose cluster-id is the same as the local cluster-id. This seems like a
straightforward patch to put in.
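
To make the difference concrete, here is a toy simulation; it is not HBase code. It assumes a simplified model in which every cluster re-ships each edit it receives to all peers whose id is not yet in the edit's cluster-id list (one reading of the behaviour described above), and compares that with the proposed rule of shipping only edits that originated locally. The function names and policy labels are made up for illustration.

```python
from collections import deque


def full_mesh(ids):
    """Every cluster lists every other cluster as a peer."""
    return {c: [p for p in ids if p != c] for c in ids}


def count_shipments(peers, origin, policy):
    """Count cluster-to-cluster shipments caused by one write at `origin`.

    peers:  dict mapping cluster id -> list of peer cluster ids
    policy: "skip_seen"   -> re-ship to every peer not in the edit's id list
            "origin_only" -> ship only edits that originated locally
    Both policies are assumptions of this sketch, not HBase settings.
    """
    shipments = 0
    # Each queue entry is (cluster holding the edit, ids the edit has visited).
    queue = deque([(origin, (origin,))])
    while queue:
        cluster, seen = queue.popleft()
        if policy == "origin_only" and cluster != origin:
            continue  # proposed rule: never forward a replicated edit
        for peer in peers[cluster]:
            if peer in seen:
                continue  # loop prevention via the cluster-id list
            shipments += 1
            queue.append((peer, seen + (peer,)))
    return shipments


if __name__ == "__main__":
    mesh = full_mesh(["A", "B", "C"])
    print("skip_seen:", count_shipments(mesh, "A", "skip_seen"))
    print("origin_only:", count_shipments(mesh, "A", "origin_only"))
```

In this model the 3-cluster mesh produces 4 shipments under the first policy (A->B, A->C, B->C, C->B) versus 2 under the second.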

Any thoughts on the suggested approach or alternatives?

-- 
*Ishan Chhabra *| Rocket Scientist | RocketFuel Inc.

Re: Setting up NxN replication

Posted by Demai Ni <ni...@gmail.com>.

Vlad, nice one. :-)

Very good point. Unless your company has data centers in 10 different
locations, I don't see a good use case for such a complex setup.
Also, please keep in mind that currently the only replication type is
'global'. Until we find a good way to specify peers at the table:family
level, it will be a nightmare to replicate in a setup of more than 3~4
clusters.

Demai


On Fri, Nov 8, 2013 at 1:20 PM, Vladimir Rodionov <vl...@gmail.com>wrote:

> *I want to setup NxN replication i.e. N clusters each replicating to each
> other. N is expected to be around 10.*
>
> Preparing for thermonuclear war?

Re: Setting up NxN replication

Posted by Demai Ni <ni...@gmail.com>.

Ted, good point on the 11th location. Thanks

One thing I didn't mention (clearly) is the limitation of 'global'
replication. Imagine all 10 clusters are set up well for the 1st
table:column family. Then, 6 months later, a 2nd table:column family enters
the picture. How do we limit the replication of the 2nd cf to fewer clusters
(let's say 3 of the 10)? There are at least two good reasons for such a use
case:
1) the 2nd cf is less important (or used by fewer workloads), so there is no
need to waste network/storage to keep all 10 copies (multiplied by the # of
replicas within each cluster)
2) the 2nd cf is so important that legal/business requires the data to be
kept within the U.S., but one of the 10 clusters is in Japan.

If we simply don't create the target table on 7 of the 10 clusters, the
queue will grow and hit its max capacity very quickly. I am not sure about
the consequences of such a situation, but my guess is it won't be pretty.

Basically, once the business requirements evolve a bit, the complexity gets
larger.

Demai




On Fri, Nov 8, 2013 at 3:47 PM, Ted Yu <yu...@gmail.com> wrote:

> bq. how about your company having a new office in an 11th location?
>
> With the minimum spanning tree approach, the increase in load wouldn't be
> exponential.
>
>
> On Fri, Nov 8, 2013 at 2:58 PM, Demai Ni <ni...@gmail.com> wrote:
>
> > Ishan,
> >
> > I have to admit that I am a bit surprised about the need to have data
> > centers in 10 different locations. Well, I guess I shouldn't be, as every
> > company is global now (anyone from Mars yet?)
> >
> > In your case, since there is only one column family, the headache is not
> > as bad. Let's call your clusters C1, C2, ... C10.
> >
> > The safest way for your most critical data is still to set up M-M
> > replication from each cluster to the other N-1. That is, every cluster
> > adds the rest of the clusters as its peers. For example, C1 will have
> > C2, C3 ... C10 as its peers; C2 will have C1, C3 ... C10. Well, that will
> > be a lot of data over the network, although it is the best/fastest way
> > to get all the clusters synced up. I don't like the idea at all (too
> > expensive, for one).
> >
> > Now, let's improve it a bit. C1 will set up M-M to 2 of the remaining 9,
> > with the distribution carefully planned so that all the clusters get
> > equal load. Well, a system administrator has to do it manually.
> >
> > Now, thinking about the headaches:
> > 1) what if your company (that is, your manager, who has no idea how
> > difficult it is) decides to have one more column family replicated? How
> > about two more? The load will grow exponentially
> > 2) how about your company having a new office in an 11th location? Again,
> > it grows exponentially
> > 3) let's say you are the best administrator and keep a nice record of
> > everything (unfortunately, HBase alone doesn't have a good way to
> > maintain a record of what is being replicated). And then the admin
> > leaves the company? Or this is a global company with 10 admins at
> > different locations. How do they communicate about the replication setup?
> >
> > :-) Well, 3) is not too bad. I just like to point it out, as it can be
> > quite true for a company large enough to have 10 locations.
> >
> > Demai
> >
> >
> >
> >
> > On Fri, Nov 8, 2013 at 2:42 PM, Ishan Chhabra <ichhabra@rocketfuel.com
> > >wrote:
> >
> > > Ted:
> > > Yes. It is the same table that is being written to from all locations.
> > > A single row could be updated from multiple locations, but our schema
> > > is designed in a manner that writes will be independent and not clobber
> > > each other.
> > >
> > >
> > > On Fri, Nov 8, 2013 at 2:33 PM, Ted Yu <yu...@gmail.com> wrote:
> > >
> > > > Ishan:
> > > > In your use case, the same table is written to in 10 clusters at
> > > > roughly the same time?
> > > >
> > > > Please clarify.
> > > >
> > > >
> > > > On Fri, Nov 8, 2013 at 2:29 PM, Ishan Chhabra
> > > > <ichhabra@rocketfuel.com> wrote:
> > > >
> > > > > @Demai,
> > > > > We actually have 10 clusters in different locations.
> > > > > The replication scope is not an issue for me since I have only one
> > > > > column family and we want it replicated to each location.
> > > > > Can you elaborate more on why a replication setup of more than 3-4
> > > > > clusters would be a headache in your opinion?
> > > > >
> > > > >
> > > > > On Fri, Nov 8, 2013 at 2:16 PM, Ishan Chhabra
> > > > > <ichhabra@rocketfuel.com> wrote:
> > > > >
> > > > > > @Demai,
> > > > > > Writes from B should also go to A and C. So, if I were to
> > > > > > continue with your suggestion, I would set up A-B master-master
> > > > > > and B-C master-master, which is what I was proposing in the 2nd
> > > > > > approach (MST based).
> > > > > >
> > > > > > @Vladimir
> > > > > > That is classified. :P
> > > > > >
> > > > > >
> > > > > > On Fri, Nov 8, 2013 at 1:20 PM, Vladimir Rodionov <
> > > > > vladrodionov@gmail.com>wrote:
> > > > > >
> > > > > >> *I want to setup NxN replication i.e. N clusters each
> replicating
> > to
> > > > > each
> > > > > >> other. N is expected to be around 10.*
> > > > > >>
> > > > > >> Preparing to thermonuclear war?
> > > > > >>
> > > > > >>
> > > > > >>
> > > > > >>
> > > > > >> On Fri, Nov 8, 2013 at 1:14 PM, Ishan Chhabra <
> > > > ichhabra@rocketfuel.com
> > > > > >> >wrote:
> > > > > >>
> > > > > >> > I want to setup NxN replication i.e. N clusters each
> replicating
> > > to
> > > > > each
> > > > > >> > other. N is expected to be around 10.
> > > > > >> >
> > > > > >> > On doing some research, I realize it is possible after
> > HBASE-7709
> > > > fix,
> > > > > >> but
> > > > > >> > it would lead to much more data flowing in the system. eg.
> > > > > >> >
> > > > > >> > Lets say we have 3 clusters: A,B and C.
> > > > > >> > A new write to A will go to B and then C, and also go to C
> > > directly
> > > > > via
> > > > > >> the
> > > > > >> > direct path. This leads to unnecessary network usage and
> writes
> > to
> > > > WAL
> > > > > >> of
> > > > > >> > B, that should be avoided. Now imagine this with 10 clusters,
> it
> > > > won’t
> > > > > >> > scale.
> > > > > >> >
> > > > > >> > One option is to create a minimum spanning tree joining all
> the
> > > > > clusters
> > > > > >> > and make nodes replicate to their immediate peers in a
> > > master-master
> > > > > >> > fashion. This is much better than NxN mesh, but still has
> extra
> > > > > network
> > > > > >> and
> > > > > >> > WAL usage. It also suffers from a failure scenarios where the
> a
> > > > single
> > > > > >> > cluster going down will pause replication to clusters
> > downstream.
> > > > > >> >
> > > > > >> > What I really want is that the ReplicationSource should only
> > > forward
> > > > > >> > WALEdits with cluster-id same as the local cluster-id. This
> > seems
> > > > > like a
> > > > > >> > straight forward patch to put in.
> > > > > >> >
> > > > > >> > Any thoughts on the suggested approach or alternatives?
> > > > > >> >
> > > > > >> > --
> > > > > >> > *Ishan Chhabra *| Rocket Scientist | RocketFuel Inc.
> > > > > >> >
> > > > > >>
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > *Ishan Chhabra *| Rocket Scientist | RocketFuel Inc.
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > *Ishan Chhabra *| Rocket Scientist | RocketFuel Inc.
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > *Ishan Chhabra *| Rocket Scientist | RocketFuel Inc.
> > >
> >
>

Re: Setting up NxN replication

Posted by Ishan Chhabra <ic...@rocketfuel.com>.

Demai,

I see. That is a good suggestion for adding redundancy, but it doubles the
network traffic and also doubles the WAL edits. Also, after HBASE-7709,
HBase stores a list of cluster-ids, and this list will grow very fat in this
case, possibly making WALEdits heavy.

I am now inclined to implement what I described in the first post, but am
not sure if it would be useful upstream. I'll file a JIRA and see.

In any case, thanks for the wonderful discussion. I'll report back here on
what I did and whether it worked.


On Fri, Nov 8, 2013 at 6:55 PM, Demai Ni <ni...@gmail.com> wrote:

> Ishan,
>
> "Coming to Demai’s suggestion of M-M to 2 instead of 9, I still want to
> have the data available from 1 to all clusters. How would I do it with
> your setup?".
>
> If I understand the requirement correctly, your setup is almost there:
> C1 <-> C2 <-> C3 <-> C4  and *C4<->C1*
> Basically, a doubly linked list forming a cycle. This way there is no
> single point of failure, and writes on any of the clusters will eventually
> be replicated to all the clusters. The good part is that, although the
> total # of writes is the same as NxN, each cluster will only need to
> handle at most 2. With that said, I have never set up more than 3
> clusters, and have to assume no other bugs similar to HBASE-7709 (loop in
> Master/Master Replication) come out of this.
>
> Still, I don't have a good solution for "..a row should be present in only
> 4/10 clusters..". One approach would use more than one column family, plus
> either HBASE-5002 (control replication peer per column family) or
> HBASE-8751. Unfortunately, neither of the JIRAs has been resolved yet. My
> 2 cents.
>
> Demai



-- 
*Ishan Chhabra *| Rocket Scientist | RocketFuel Inc.

Re: Setting up NxN replication

Posted by Demai Ni <ni...@gmail.com>.

Ishan,

"Coming to Demai’s suggestion of M-M to 2 instead of 9, I still want to have
the data available from 1 to all clusters. How would I do it with your
setup?".

If I understand the requirement correctly, your setup is almost there:
C1 <-> C2 <-> C3 <-> C4  and *C4<->C1*
Basically, a doubly linked list forming a cycle. This way there is no single
point of failure, and writes on any of the clusters will eventually be
replicated to all the clusters. The good part is that, although the total #
of writes is the same as NxN, each cluster will only need to handle at most
2. With that said, I have never set up more than 3 clusters, and have to
assume no other bugs similar to HBASE-7709 (loop in Master/Master
Replication) come out of this.
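
A small sketch of the ring idea, again as a toy model rather than HBase code. It assumes each cluster re-ships a received edit to every peer not yet in the edit's cluster-id list, and that a failed cluster neither receives nor forwards. The helper names `ring` and `propagate` are hypothetical.

```python
from collections import deque


def ring(ids):
    """Each cluster peers with its two neighbours; the last wraps to the first.

    Intended for three or more clusters.
    """
    n = len(ids)
    return {ids[i]: [ids[(i - 1) % n], ids[(i + 1) % n]] for i in range(n)}


def propagate(peers, origin, down=()):
    """Return (clusters that received the write, number of shipments)."""
    reached, shipments = {origin}, 0
    # Each queue entry is (cluster holding the edit, ids the edit has visited).
    queue = deque([(origin, (origin,))])
    while queue:
        cluster, seen = queue.popleft()
        for peer in peers[cluster]:
            if peer in seen or peer in down:
                continue  # already in the id list, or the peer is unavailable
            shipments += 1
            reached.add(peer)
            queue.append((peer, seen + (peer,)))
    return reached, shipments


if __name__ == "__main__":
    topology = ring(["C1", "C2", "C3", "C4"])
    print(propagate(topology, "C1"))
    print(propagate(topology, "C1", down=("C2",)))
```

For four clusters and a write at C1, this model gives 6 shipments with all clusters reached. With C2 down, C3 and C4 are still reached through the other direction.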

Still, I don't have a good solution for "..a row should be present in only
4/10 clusters..". One approach would use more than one column family, plus
either HBASE-5002 (control replication peer per column family) or
HBASE-8751. Unfortunately, neither of the JIRAs has been resolved yet. My 2
cents.

Demai


On Fri, Nov 8, 2013 at 4:38 PM, Ishan Chhabra <ic...@rocketfuel.com>wrote:

> Demai, Ted:
>
> Thanks for the detailed answer.
>
> I should add some more context here. The underlying network is an NxN
> mesh. The "cost" of each link is the same.
>
> Coming to Demai’s suggestion of M-M to 2 instead of 9, I still want to
> have the data available from 1 to all clusters. How would I do it with
> your setup?
>
> For the difference between MST and NxN:
> Consider the following example, with 4 clusters: C1, C2, C3, C4, and a
> write going to C1.
>
> In an NxN mesh, the write will be propagated as:
> C1 -> C2
> C1 -> C3
> C1 -> C4
>
> network cost: 3, writes to WAL: 3
>
> With an MST whose tree is C1 <-> C2 <-> C3 <-> C4, the write will be
> propagated as:
> C1 -> C2
> C2 -> C3
> C3 -> C4
>
> network cost: 3, writes to WAL: 3
>
> Both approaches have the same network and WAL cost. The only difference is
> that in the MST, if C2 fails, writes from C1 will not go to C3 and C4,
> whereas in the NxN case, the writes will still happen.
>
> Also, (1) and (3) are not an issue for us.
>
> Having said that, I do realize that adding more clusters increases the
> load quadratically, and that does worry me. Our actual use case is that a
> row should be present in only 4/10 clusters, but which ones varies by row
> and not by cluster. So I cannot come up with a static replication
> configuration that will handle that. I am looking into per-row
> replication, but will start a separate discussion for that and share my
> ideas there.
>
> I hope this makes more sense now.

Re: Setting up NxN replication

Posted by Ishan Chhabra <ic...@rocketfuel.com>.

Demai, Ted:

Thanks for the detailed answer.

I should add some more context here. The underlying network is a NxN mesh.
The “cost" for each link is same.

Coming to Demai’s suggestion of M-M to 2 instead of 9, i still want to have
the data available from 1 to all clusters. How would I do it with your
setup?

For the difference between MST and NxN:
Consider the following example, with 4 clusters: C1, C2, C3, C4, and write
going to C1.

In NxN mesh, the write will be propagated as:
C1 -> C2
C1 -> C3
C1 -> C4

network cost: 3, writes to WAL: 3

MST with tree as C1 <-> C2 <-> C3 <-> C4, the write will be propagated as:
C1 -> C2
C2 -> C3
C3 -> C4

network cost: 3, writes to WAL: 3

Both approaches have the same network and WAL cost. The only difference is
that in MST, if C2 fails, writes from C1 will not reach C3 and C4, whereas
in the NxN case they still will.
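
The comparison above can be checked with a small simulation. This is only a
sketch under simplifying assumptions: each cross-cluster shipment counts as
one unit of network cost and one WAL write on the receiver, and a failed
cluster neither receives nor forwards. The function name count_transfers and
the two forwarding modes are made up for illustration and are not HBase APIs.
With origin_only=True a cluster ships only edits that were written locally
(the behaviour proposed at the start of this thread); with origin_only=False
it re-ships a received edit to any peer not yet on that edit's path.

```python
from collections import deque


def count_transfers(peers, origin, origin_only, down=frozenset()):
    """Return (shipments, clusters reached) for one write made at `origin`.

    peers: dict mapping each cluster to the clusters it replicates to.
    origin_only: True  -> only locally written edits are shipped.
                 False -> received edits are re-shipped to peers that are
                          not already on the edit's path (simplified model).
    down: clusters that are unavailable (hypothetical failure set).
    """
    transfers = 0
    reached = {origin}
    queue = deque([(origin, frozenset([origin]))])
    while queue:
        cluster, seen = queue.popleft()
        if origin_only and cluster != origin:
            continue  # edit is applied here but not forwarded further
        for peer in peers[cluster]:
            if peer in seen or peer in down:
                continue
            transfers += 1
            reached.add(peer)
            queue.append((peer, seen | {peer}))
    return transfers, reached


names = ["C1", "C2", "C3", "C4"]
mesh = {c: [p for p in names if p != c] for c in names}
chain = {"C1": ["C2"], "C2": ["C1", "C3"], "C3": ["C2", "C4"], "C4": ["C3"]}

mesh_origin_only = count_transfers(mesh, "C1", origin_only=True)
chain_forwarding = count_transfers(chain, "C1", origin_only=False)
mesh_forwarding = count_transfers(mesh, "C1", origin_only=False)

# Same scenarios with C2 unavailable.
mesh_c2_down = count_transfers(mesh, "C1", origin_only=True, down={"C2"})
chain_c2_down = count_transfers(chain, "C1", origin_only=False, down={"C2"})

print("mesh, origin-only :", mesh_origin_only[0])   # 3 shipments
print("chain, forwarding :", chain_forwarding[0])   # 3 shipments
print("mesh, forwarding  :", mesh_forwarding[0])    # 15 shipments
print("mesh, C2 down     :", sorted(mesh_c2_down[1]))
print("chain, C2 down    :", sorted(chain_c2_down[1]))
```

In this model the mesh only matches the chain's cost of 3 when clusters
forward nothing but their own writes; a mesh that also re-ships received
edits does far more work for the same write.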

Also, (1) and (3) are not an issue for us.

Having said that, I do realize that adding more clusters is increasing the
load quadratically, and that does worry me. Our actual use case is that a
row should be present in only 4/10 clusters, but it varies based on the row
and not on the cluster. So I cannot come up with a static replication
configuration that will handle that. I am looking into per-row replication,
but will start a separate discussion for that and share my ideas there.

I hope this makes more sense now.


On Fri, Nov 8, 2013 at 3:47 PM, Ted Yu <yu...@gmail.com> wrote:

> bq. how about your company have a new office in the 11th locations?
>
> With minimum spanning tree approach, the increase in load wouldn't be
> exponential.
>
>
> On Fri, Nov 8, 2013 at 2:58 PM, Demai Ni <ni...@gmail.com> wrote:
>
> > Ishan,
> >
> > have to admit that I am a bit surprise about the need of have data center
> > in 10 different locations. Well, I guess I shouldn't be, as every company
> > is global now(anyone from Mars yet?)
> >
> > In your case, since there is only one column family. The headache is not
> as
> > bad. Let's call your clusters as C1, C2, ... C10
> >
> > The safest way for your most critical data is still have setup the M-M
> > replication by 1 to N-1. That is every cluster add the rest of clusters
> as
> > its peer. For example C1 will have C2, C3...C10 as its peers; C2 will
> have
> > C1, C3.. C10. Well, that will be a lot of data over the network. Although
> > it is the best/fast way to get all the cluster sync-up. I don't like the
> > idea at all(too expensive for one).
> >
> > Now, let's improve it a bit. C1 will setup M-M to 2 of the rest 9, and
> > carefully planned the distribution so that all the clusters will get
> equal
> > load. Well, a system administrator has to do it manually.
> >
> > Now, thinking about the headache:
> > 1) what if your company(that is your manager who has no idea how
> difficult
> > it is) decide to have one more column family to be replicated?  how about
> > two more? The load will grow exponentially
> > 2) how about your company have a new office in the 11th locations? again,
> > grow exponentially
> > 3) let's say you are the best administrator, and keep nice record of
> > everything (unforturnatly, Hbase alone doesn't have a good way to
> maintain
> > all the record of who is being replicated). And then, the admin left the
> > company? or this is a global company has 10 admin at different locations.
> > How do they communicate of the replication setup?
> >
> > :-) Well, the 3) is not too bad. I just like to point it out as it can be
> > quite true for a company large enough to have 10 locations
> >
> > Demai
> >
> >
> >
> >
> > On Fri, Nov 8, 2013 at 2:42 PM, Ishan Chhabra <ichhabra@rocketfuel.com
> > >wrote:
> >
> > > Ted:
> > > Yes. It is the same table that is being written to from all locations.
> A
> > > single row could be updated from multiple locations, but our schema is
> > > designed in a manner that writes will be independent and not clobber
> each
> > > other.
> > >
> > >
> > > On Fri, Nov 8, 2013 at 2:33 PM, Ted Yu <yu...@gmail.com> wrote:
> > >
> > > > Ishan:
> > > > In your use case, the same table is written to in 10 clusters at
> > roughly
> > > > the same time ?
> > > >
> > > > Please clarify.
> > > >
> > > >
> > > > On Fri, Nov 8, 2013 at 2:29 PM, Ishan Chhabra <
> ichhabra@rocketfuel.com
> > > > >wrote:
> > > >
> > > > > @Demai,
> > > > > We actually have 10 clusters in different locations.
> > > > > The replication scope is not an issue for me since I have only one
> > > column
> > > > > family and we want it replicated to each location.
> > > > > Can you elaborate more on why a replication setup of more than 3-4
> > > > clusters
> > > > > would be a headache in your opinion?
> > > > >
> > > > >
> > > > > On Fri, Nov 8, 2013 at 2:16 PM, Ishan Chhabra <
> > ichhabra@rocketfuel.com
> > > > > >wrote:
> > > > >
> > > > > > @Demai,
> > > > > > Writes from B should also go to A and C. So, if I were to
> continue
> > on
> > > > > your
> > > > > > suggestion, I would setup A-B master master and B-C
> master-master,
> > > > which
> > > > > is
> > > > > > what I was proposing in the 2nd approach (MST based).
> > > > > >
> > > > > > @Vladimir
> > > > > > That is classified. :P
> > > > > >
> > > > > >
> > > > > > On Fri, Nov 8, 2013 at 1:20 PM, Vladimir Rodionov <
> > > > > vladrodionov@gmail.com>wrote:
> > > > > >
> > > > > >> *I want to setup NxN replication i.e. N clusters each
> replicating
> > to
> > > > > each
> > > > > >> other. N is expected to be around 10.*
> > > > > >>
> > > > > >> Preparing to thermonuclear war?
> > > > > >>
> > > > > >>
> > > > > >>
> > > > > >>
> > > > > >> On Fri, Nov 8, 2013 at 1:14 PM, Ishan Chhabra <
> > > > ichhabra@rocketfuel.com
> > > > > >> >wrote:
> > > > > >>
> > > > > >> > I want to setup NxN replication i.e. N clusters each
> replicating
> > > to
> > > > > each
> > > > > >> > other. N is expected to be around 10.
> > > > > >> >
> > > > > >> > On doing some research, I realize it is possible after
> > HBASE-7709
> > > > fix,
> > > > > >> but
> > > > > >> > it would lead to much more data flowing in the system. eg.
> > > > > >> >
> > > > > >> > Lets say we have 3 clusters: A,B and C.
> > > > > >> > A new write to A will go to B and then C, and also go to C
> > > directly
> > > > > via
> > > > > >> the
> > > > > >> > direct path. This leads to unnecessary network usage and
> writes
> > to
> > > > WAL
> > > > > >> of
> > > > > >> > B, that should be avoided. Now imagine this with 10 clusters,
> it
> > > > won’t
> > > > > >> > scale.
> > > > > >> >
> > > > > >> > One option is to create a minimum spanning tree joining all
> the
> > > > > clusters
> > > > > >> > and make nodes replicate to their immediate peers in a
> > > master-master
> > > > > >> > fashion. This is much better than NxN mesh, but still has
> extra
> > > > > network
> > > > > >> and
> > > > > >> > WAL usage. It also suffers from a failure scenarios where the
> a
> > > > single
> > > > > >> > cluster going down will pause replication to clusters
> > downstream.
> > > > > >> >
> > > > > >> > What I really want is that the ReplicationSource should only
> > > forward
> > > > > >> > WALEdits with cluster-id same as the local cluster-id. This
> > seems
> > > > > like a
> > > > > >> > straight forward patch to put in.
> > > > > >> >
> > > > > >> > Any thoughts on the suggested approach or alternatives?
> > > > > >> >
> > > > > >> > --
> > > > > >> > *Ishan Chhabra *| Rocket Scientist | RocketFuel Inc.
> > > > > >> >
> > > > > >>
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > *Ishan Chhabra *| Rocket Scientist | RocketFuel Inc.
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > *Ishan Chhabra *| Rocket Scientist | RocketFuel Inc.
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > *Ishan Chhabra *| Rocket Scientist | RocketFuel Inc.
> > >
> >
>



-- 
*Ishan Chhabra *| Rocket Scientist | RocketFuel Inc.

Re: Setting up NxN replication

Posted by Ted Yu <yu...@gmail.com>.

bq. how about your company have a new office in the 11th locations?

With minimum spanning tree approach, the increase in load wouldn't be
exponential.
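
A rough way to see this is to count the directed peer entries each topology
needs. The sketch below assumes one peer entry per direction of a
master-master link; the helper names mesh_links and tree_links are made up
for illustration. A full mesh needs N*(N-1) entries, which is quadratic
rather than exponential, while a spanning tree needs 2*(N-1).

```python
def mesh_links(n):
    """Directed peer entries when every cluster replicates to every other."""
    return n * (n - 1)


def tree_links(n):
    """Directed peer entries for master-master links along a spanning tree."""
    return 2 * (n - 1)


for n in (3, 4, 10, 11):
    print(f"N={n:2d}  mesh={mesh_links(n):3d}  tree={tree_links(n):2d}")

# Cost of adding an 11th cluster to an existing 10-cluster setup.
added_mesh = mesh_links(11) - mesh_links(10)
added_tree = tree_links(11) - tree_links(10)
print("new peer entries for cluster 11: mesh", added_mesh, "tree", added_tree)
```

Going from 10 to 11 clusters adds 20 peer entries in the mesh and 2 in the
tree.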


On Fri, Nov 8, 2013 at 2:58 PM, Demai Ni <ni...@gmail.com> wrote:

> Ishan,
>
> have to admit that I am a bit surprise about the need of have data center
> in 10 different locations. Well, I guess I shouldn't be, as every company
> is global now(anyone from Mars yet?)
>
> In your case, since there is only one column family. The headache is not as
> bad. Let's call your clusters as C1, C2, ... C10
>
> The safest way for your most critical data is still have setup the M-M
> replication by 1 to N-1. That is every cluster add the rest of clusters as
> its peer. For example C1 will have C2, C3...C10 as its peers; C2 will have
> C1, C3.. C10. Well, that will be a lot of data over the network. Although
> it is the best/fast way to get all the cluster sync-up. I don't like the
> idea at all(too expensive for one).
>
> Now, let's improve it a bit. C1 will setup M-M to 2 of the rest 9, and
> carefully planned the distribution so that all the clusters will get equal
> load. Well, a system administrator has to do it manually.
>
> Now, thinking about the headache:
> 1) what if your company(that is your manager who has no idea how difficult
> it is) decide to have one more column family to be replicated?  how about
> two more? The load will grow exponentially
> 2) how about your company have a new office in the 11th locations? again,
> grow exponentially
> 3) let's say you are the best administrator, and keep nice record of
> everything (unforturnatly, Hbase alone doesn't have a good way to maintain
> all the record of who is being replicated). And then, the admin left the
> company? or this is a global company has 10 admin at different locations.
> How do they communicate of the replication setup?
>
> :-) Well, the 3) is not too bad. I just like to point it out as it can be
> quite true for a company large enough to have 10 locations
>
> Demai
>
>
>
>
> On Fri, Nov 8, 2013 at 2:42 PM, Ishan Chhabra <ichhabra@rocketfuel.com
> >wrote:
>
> > Ted:
> > Yes. It is the same table that is being written to from all locations. A
> > single row could be updated from multiple locations, but our schema is
> > designed in a manner that writes will be independent and not clobber each
> > other.
> >
> >
> > On Fri, Nov 8, 2013 at 2:33 PM, Ted Yu <yu...@gmail.com> wrote:
> >
> > > Ishan:
> > > In your use case, the same table is written to in 10 clusters at
> roughly
> > > the same time ?
> > >
> > > Please clarify.
> > >
> > >
> > > On Fri, Nov 8, 2013 at 2:29 PM, Ishan Chhabra <ichhabra@rocketfuel.com
> > > >wrote:
> > >
> > > > @Demai,
> > > > We actually have 10 clusters in different locations.
> > > > The replication scope is not an issue for me since I have only one
> > column
> > > > family and we want it replicated to each location.
> > > > Can you elaborate more on why a replication setup of more than 3-4
> > > clusters
> > > > would be a headache in your opinion?
> > > >
> > > >
> > > > On Fri, Nov 8, 2013 at 2:16 PM, Ishan Chhabra <
> ichhabra@rocketfuel.com
> > > > >wrote:
> > > >
> > > > > @Demai,
> > > > > Writes from B should also go to A and C. So, if I were to continue
> on
> > > > your
> > > > > suggestion, I would setup A-B master master and B-C master-master,
> > > which
> > > > is
> > > > > what I was proposing in the 2nd approach (MST based).
> > > > >
> > > > > @Vladimir
> > > > > That is classified. :P
> > > > >
> > > > >
> > > > > On Fri, Nov 8, 2013 at 1:20 PM, Vladimir Rodionov <
> > > > vladrodionov@gmail.com>wrote:
> > > > >
> > > > >> *I want to setup NxN replication i.e. N clusters each replicating
> to
> > > > each
> > > > >> other. N is expected to be around 10.*
> > > > >>
> > > > >> Preparing to thermonuclear war?
> > > > >>
> > > > >>
> > > > >>
> > > > >>
> > > > >> On Fri, Nov 8, 2013 at 1:14 PM, Ishan Chhabra <
> > > ichhabra@rocketfuel.com
> > > > >> >wrote:
> > > > >>
> > > > >> > I want to setup NxN replication i.e. N clusters each replicating
> > to
> > > > each
> > > > >> > other. N is expected to be around 10.
> > > > >> >
> > > > >> > On doing some research, I realize it is possible after
> HBASE-7709
> > > fix,
> > > > >> but
> > > > >> > it would lead to much more data flowing in the system. eg.
> > > > >> >
> > > > >> > Lets say we have 3 clusters: A,B and C.
> > > > >> > A new write to A will go to B and then C, and also go to C
> > directly
> > > > via
> > > > >> the
> > > > >> > direct path. This leads to unnecessary network usage and writes
> to
> > > WAL
> > > > >> of
> > > > >> > B, that should be avoided. Now imagine this with 10 clusters, it
> > > won’t
> > > > >> > scale.
> > > > >> >
> > > > >> > One option is to create a minimum spanning tree joining all the
> > > > clusters
> > > > >> > and make nodes replicate to their immediate peers in a
> > master-master
> > > > >> > fashion. This is much better than NxN mesh, but still has extra
> > > > network
> > > > >> and
> > > > >> > WAL usage. It also suffers from a failure scenarios where the a
> > > single
> > > > >> > cluster going down will pause replication to clusters
> downstream.
> > > > >> >
> > > > >> > What I really want is that the ReplicationSource should only
> > forward
> > > > >> > WALEdits with cluster-id same as the local cluster-id. This
> seems
> > > > like a
> > > > >> > straight forward patch to put in.
> > > > >> >
> > > > >> > Any thoughts on the suggested approach or alternatives?
> > > > >> >
> > > > >> > --
> > > > >> > *Ishan Chhabra *| Rocket Scientist | RocketFuel Inc.
> > > > >> >
> > > > >>
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > *Ishan Chhabra *| Rocket Scientist | RocketFuel Inc.
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > *Ishan Chhabra *| Rocket Scientist | RocketFuel Inc.
> > > >
> > >
> >
> >
> >
> > --
> > *Ishan Chhabra *| Rocket Scientist | RocketFuel Inc.
> >
>

Re: Setting up NxN replication

Posted by Demai Ni <ni...@gmail.com>.

Ishan,

I have to admit that I am a bit surprised by the need to have data centers
in 10 different locations. Well, I guess I shouldn't be, as every company
is global now (anyone from Mars yet?).

In your case, since there is only one column family, the headache is not as
bad. Let's call your clusters C1, C2, ... C10.

The safest way for your most critical data is still to set up M-M
replication from each cluster to the other N-1. That is, every cluster adds
the rest of the clusters as its peers. For example, C1 will have C2, C3, ...
C10 as its peers; C2 will have C1, C3, ... C10. Well, that will be a lot of
data over the network. Although it is the best/fastest way to get all the
clusters in sync, I don't like the idea at all (too expensive, for one).

Now, let's improve it a bit. C1 will set up M-M to 2 of the other 9, with
the distribution carefully planned so that all the clusters get equal
load. Well, a system administrator has to do it manually.
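
One possible reading of "M-M to 2 of the other 9" is a ring, where each
cluster peers with its two neighbours. The sketch below is an assumption
about what such a layout could look like, not something HBase generates; the
helper names ring_peers and hops_from are made up for illustration. It builds
the peer lists, then measures how many replication hops a write from C1 needs
to reach each cluster, with and without one cluster down.

```python
from collections import deque


def ring_peers(names):
    """Give every cluster exactly two peers: its ring neighbours."""
    n = len(names)
    return {
        names[i]: [names[(i - 1) % n], names[(i + 1) % n]]
        for i in range(n)
    }


def hops_from(peers, origin, down=frozenset()):
    """Breadth-first hop count from origin to every reachable cluster."""
    dist = {origin: 0}
    queue = deque([origin])
    while queue:
        cluster = queue.popleft()
        for peer in peers[cluster]:
            if peer not in dist and peer not in down:
                dist[peer] = dist[cluster] + 1
                queue.append(peer)
    return dist


clusters = [f"C{i}" for i in range(1, 11)]
peers = ring_peers(clusters)

healthy = hops_from(peers, "C1")
c2_down = hops_from(peers, "C1", down={"C2"})

print("peers of C1:", peers["C1"])
print("max hops, all up :", max(healthy.values()))
print("max hops, C2 down:", max(c2_down.values()))
print("clusters reached with C2 down:", len(c2_down))
```

Every cluster carries the same number of peers, and a single failure does
not cut anyone off, but a write may need several hops (and therefore several
WAL writes along the way) before it reaches the far side of the ring.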

Now, thinking about the headaches:
1) What if your company (that is, your manager, who has no idea how difficult
it is) decides to have one more column family replicated? How about
two more? The load will grow exponentially.
2) What if your company opens a new office in an 11th location? Again,
it grows exponentially.
3) Let's say you are the best administrator and keep a nice record of
everything (unfortunately, HBase alone doesn't have a good way to maintain
a record of what is being replicated where). And then the admin leaves the
company? Or this is a global company with 10 admins at different locations.
How do they communicate about the replication setup?

:-) Well, 3) is not too bad. I just like to point it out, as it can be
quite true for a company large enough to have 10 locations.

Demai

On Fri, Nov 8, 2013 at 2:42 PM, Ishan Chhabra <ic...@rocketfuel.com>wrote:

> Ted:
> Yes. It is the same table that is being written to from all locations. A
> single row could be updated from multiple locations, but our schema is
> designed in a manner that writes will be independent and not clobber each
> other.
>
>
> On Fri, Nov 8, 2013 at 2:33 PM, Ted Yu <yu...@gmail.com> wrote:
>
> > Ishan:
> > In your use case, the same table is written to in 10 clusters at roughly
> > the same time ?
> >
> > Please clarify.
> >
> >
> > On Fri, Nov 8, 2013 at 2:29 PM, Ishan Chhabra <ichhabra@rocketfuel.com
> > >wrote:
> >
> > > @Demai,
> > > We actually have 10 clusters in different locations.
> > > The replication scope is not an issue for me since I have only one
> column
> > > family and we want it replicated to each location.
> > > Can you elaborate more on why a replication setup of more than 3-4
> > clusters
> > > would be a headache in your opinion?
> > >
> > >
> > > On Fri, Nov 8, 2013 at 2:16 PM, Ishan Chhabra <ichhabra@rocketfuel.com
> > > >wrote:
> > >
> > > > @Demai,
> > > > Writes from B should also go to A and C. So, if I were to continue on
> > > your
> > > > suggestion, I would setup A-B master master and B-C master-master,
> > which
> > > is
> > > > what I was proposing in the 2nd approach (MST based).
> > > >
> > > > @Vladimir
> > > > That is classified. :P
> > > >
> > > >
> > > > On Fri, Nov 8, 2013 at 1:20 PM, Vladimir Rodionov <
> > > vladrodionov@gmail.com>wrote:
> > > >
> > > >> *I want to setup NxN replication i.e. N clusters each replicating to
> > > each
> > > >> other. N is expected to be around 10.*
> > > >>
> > > >> Preparing to thermonuclear war?
> > > >>
> > > >>
> > > >>
> > > >>
> > > >> On Fri, Nov 8, 2013 at 1:14 PM, Ishan Chhabra <
> > ichhabra@rocketfuel.com
> > > >> >wrote:
> > > >>
> > > >> > I want to setup NxN replication i.e. N clusters each replicating
> to
> > > each
> > > >> > other. N is expected to be around 10.
> > > >> >
> > > >> > On doing some research, I realize it is possible after HBASE-7709
> > fix,
> > > >> but
> > > >> > it would lead to much more data flowing in the system. eg.
> > > >> >
> > > >> > Lets say we have 3 clusters: A,B and C.
> > > >> > A new write to A will go to B and then C, and also go to C
> directly
> > > via
> > > >> the
> > > >> > direct path. This leads to unnecessary network usage and writes to
> > WAL
> > > >> of
> > > >> > B, that should be avoided. Now imagine this with 10 clusters, it
> > won’t
> > > >> > scale.
> > > >> >
> > > >> > One option is to create a minimum spanning tree joining all the
> > > clusters
> > > >> > and make nodes replicate to their immediate peers in a
> master-master
> > > >> > fashion. This is much better than NxN mesh, but still has extra
> > > network
> > > >> and
> > > >> > WAL usage. It also suffers from a failure scenarios where the a
> > single
> > > >> > cluster going down will pause replication to clusters downstream.
> > > >> >
> > > >> > What I really want is that the ReplicationSource should only
> forward
> > > >> > WALEdits with cluster-id same as the local cluster-id. This seems
> > > like a
> > > >> > straight forward patch to put in.
> > > >> >
> > > >> > Any thoughts on the suggested approach or alternatives?
> > > >> >
> > > >> > --
> > > >> > *Ishan Chhabra *| Rocket Scientist | RocketFuel Inc.
> > > >> >
> > > >>
> > > >
> > > >
> > > >
> > > > --
> > > > *Ishan Chhabra *| Rocket Scientist | RocketFuel Inc.
> > > >
> > >
> > >
> > >
> > > --
> > > *Ishan Chhabra *| Rocket Scientist | RocketFuel Inc.
> > >
> >
>
>
>
> --
> *Ishan Chhabra *| Rocket Scientist | RocketFuel Inc.
>

Re: Setting up NxN replication

Posted by Ishan Chhabra <ic...@rocketfuel.com>.

Ted:
Yes. It is the same table that is being written to from all locations. A
single row could be updated from multiple locations, but our schema is
designed in a manner that writes will be independent and not clobber each
other.


On Fri, Nov 8, 2013 at 2:33 PM, Ted Yu <yu...@gmail.com> wrote:

> Ishan:
> In your use case, the same table is written to in 10 clusters at roughly
> the same time ?
>
> Please clarify.
>
>
> On Fri, Nov 8, 2013 at 2:29 PM, Ishan Chhabra <ichhabra@rocketfuel.com
> >wrote:
>
> > @Demai,
> > We actually have 10 clusters in different locations.
> > The replication scope is not an issue for me since I have only one column
> > family and we want it replicated to each location.
> > Can you elaborate more on why a replication setup of more than 3-4
> clusters
> > would be a headache in your opinion?
> >
> >
> > On Fri, Nov 8, 2013 at 2:16 PM, Ishan Chhabra <ichhabra@rocketfuel.com
> > >wrote:
> >
> > > @Demai,
> > > Writes from B should also go to A and C. So, if I were to continue on
> > your
> > > suggestion, I would setup A-B master master and B-C master-master,
> which
> > is
> > > what I was proposing in the 2nd approach (MST based).
> > >
> > > @Vladimir
> > > That is classified. :P
> > >
> > >
> > > On Fri, Nov 8, 2013 at 1:20 PM, Vladimir Rodionov <
> > vladrodionov@gmail.com>wrote:
> > >
> > >> *I want to setup NxN replication i.e. N clusters each replicating to
> > each
> > >> other. N is expected to be around 10.*
> > >>
> > >> Preparing to thermonuclear war?
> > >>
> > >>
> > >>
> > >>
> > >> On Fri, Nov 8, 2013 at 1:14 PM, Ishan Chhabra <
> ichhabra@rocketfuel.com
> > >> >wrote:
> > >>
> > >> > I want to setup NxN replication i.e. N clusters each replicating to
> > each
> > >> > other. N is expected to be around 10.
> > >> >
> > >> > On doing some research, I realize it is possible after HBASE-7709
> fix,
> > >> but
> > >> > it would lead to much more data flowing in the system. eg.
> > >> >
> > >> > Lets say we have 3 clusters: A,B and C.
> > >> > A new write to A will go to B and then C, and also go to C directly
> > via
> > >> the
> > >> > direct path. This leads to unnecessary network usage and writes to
> WAL
> > >> of
> > >> > B, that should be avoided. Now imagine this with 10 clusters, it
> won’t
> > >> > scale.
> > >> >
> > >> > One option is to create a minimum spanning tree joining all the
> > clusters
> > >> > and make nodes replicate to their immediate peers in a master-master
> > >> > fashion. This is much better than NxN mesh, but still has extra
> > network
> > >> and
> > >> > WAL usage. It also suffers from a failure scenarios where the a
> single
> > >> > cluster going down will pause replication to clusters downstream.
> > >> >
> > >> > What I really want is that the ReplicationSource should only forward
> > >> > WALEdits with cluster-id same as the local cluster-id. This seems
> > like a
> > >> > straight forward patch to put in.
> > >> >
> > >> > Any thoughts on the suggested approach or alternatives?
> > >> >
> > >> > --
> > >> > *Ishan Chhabra *| Rocket Scientist | RocketFuel Inc.
> > >> >
> > >>
> > >
> > >
> > >
> > > --
> > > *Ishan Chhabra *| Rocket Scientist | RocketFuel Inc.
> > >
> >
> >
> >
> > --
> > *Ishan Chhabra *| Rocket Scientist | RocketFuel Inc.
> >
>



-- 
*Ishan Chhabra *| Rocket Scientist | RocketFuel Inc.

Re: Setting up NxN replication

Posted by Ted Yu <yu...@gmail.com>.

Ishan:
In your use case, is the same table written to in all 10 clusters at roughly
the same time?

Please clarify.


On Fri, Nov 8, 2013 at 2:29 PM, Ishan Chhabra <ic...@rocketfuel.com>wrote:

> @Demai,
> We actually have 10 clusters in different locations.
> The replication scope is not an issue for me since I have only one column
> family and we want it replicated to each location.
> Can you elaborate more on why a replication setup of more than 3-4 clusters
> would be a headache in your opinion?
>
>
> On Fri, Nov 8, 2013 at 2:16 PM, Ishan Chhabra <ichhabra@rocketfuel.com
> >wrote:
>
> > @Demai,
> > Writes from B should also go to A and C. So, if I were to continue on
> your
> > suggestion, I would setup A-B master master and B-C master-master, which
> is
> > what I was proposing in the 2nd approach (MST based).
> >
> > @Vladimir
> > That is classified. :P
> >
> >
> > On Fri, Nov 8, 2013 at 1:20 PM, Vladimir Rodionov <
> vladrodionov@gmail.com>wrote:
> >
> >> *I want to setup NxN replication i.e. N clusters each replicating to
> each
> >> other. N is expected to be around 10.*
> >>
> >> Preparing to thermonuclear war?
> >>
> >>
> >>
> >>
> >> On Fri, Nov 8, 2013 at 1:14 PM, Ishan Chhabra <ichhabra@rocketfuel.com
> >> >wrote:
> >>
> >> > I want to setup NxN replication i.e. N clusters each replicating to
> each
> >> > other. N is expected to be around 10.
> >> >
> >> > On doing some research, I realize it is possible after HBASE-7709 fix,
> >> but
> >> > it would lead to much more data flowing in the system. eg.
> >> >
> >> > Lets say we have 3 clusters: A,B and C.
> >> > A new write to A will go to B and then C, and also go to C directly
> via
> >> the
> >> > direct path. This leads to unnecessary network usage and writes to WAL
> >> of
> >> > B, that should be avoided. Now imagine this with 10 clusters, it won’t
> >> > scale.
> >> >
> >> > One option is to create a minimum spanning tree joining all the
> clusters
> >> > and make nodes replicate to their immediate peers in a master-master
> >> > fashion. This is much better than NxN mesh, but still has extra
> network
> >> and
> >> > WAL usage. It also suffers from a failure scenarios where the a single
> >> > cluster going down will pause replication to clusters downstream.
> >> >
> >> > What I really want is that the ReplicationSource should only forward
> >> > WALEdits with cluster-id same as the local cluster-id. This seems
> like a
> >> > straight forward patch to put in.
> >> >
> >> > Any thoughts on the suggested approach or alternatives?
> >> >
> >> > --
> >> > *Ishan Chhabra *| Rocket Scientist | RocketFuel Inc.
> >> >
> >>
> >
> >
> >
> > --
> > *Ishan Chhabra *| Rocket Scientist | RocketFuel Inc.
> >
>
>
>
> --
> *Ishan Chhabra *| Rocket Scientist | RocketFuel Inc.
>

Re: Setting up NxN replication

Posted by Ishan Chhabra <ic...@rocketfuel.com>.

@Demai,
We actually have 10 clusters in different locations.
The replication scope is not an issue for me since I have only one column
family and we want it replicated to each location.
Can you elaborate more on why a replication setup of more than 3-4 clusters
would be a headache in your opinion?


On Fri, Nov 8, 2013 at 2:16 PM, Ishan Chhabra <ic...@rocketfuel.com>wrote:

> @Demai,
> Writes from B should also go to A and C. So, if I were to continue on your
> suggestion, I would setup A-B master master and B-C master-master, which is
> what I was proposing in the 2nd approach (MST based).
>
> @Vladimir
> That is classified. :P
>
>
> On Fri, Nov 8, 2013 at 1:20 PM, Vladimir Rodionov <vl...@gmail.com>wrote:
>
>> *I want to setup NxN replication i.e. N clusters each replicating to each
>> other. N is expected to be around 10.*
>>
>> Preparing to thermonuclear war?
>>
>>
>>
>>
>> On Fri, Nov 8, 2013 at 1:14 PM, Ishan Chhabra <ichhabra@rocketfuel.com
>> >wrote:
>>
>> > I want to setup NxN replication i.e. N clusters each replicating to each
>> > other. N is expected to be around 10.
>> >
>> > On doing some research, I realize it is possible after HBASE-7709 fix,
>> but
>> > it would lead to much more data flowing in the system. eg.
>> >
>> > Lets say we have 3 clusters: A,B and C.
>> > A new write to A will go to B and then C, and also go to C directly via
>> the
>> > direct path. This leads to unnecessary network usage and writes to WAL
>> of
>> > B, that should be avoided. Now imagine this with 10 clusters, it won’t
>> > scale.
>> >
>> > One option is to create a minimum spanning tree joining all the clusters
>> > and make nodes replicate to their immediate peers in a master-master
>> > fashion. This is much better than NxN mesh, but still has extra network
>> and
>> > WAL usage. It also suffers from a failure scenarios where the a single
>> > cluster going down will pause replication to clusters downstream.
>> >
>> > What I really want is that the ReplicationSource should only forward
>> > WALEdits with cluster-id same as the local cluster-id. This seems like a
>> > straight forward patch to put in.
>> >
>> > Any thoughts on the suggested approach or alternatives?
>> >
>> > --
>> > *Ishan Chhabra *| Rocket Scientist | RocketFuel Inc.
>> >
>>
>
>
>
> --
> *Ishan Chhabra *| Rocket Scientist | RocketFuel Inc.
>



-- 
*Ishan Chhabra *| Rocket Scientist | RocketFuel Inc.

Re: Setting up NxN replication

Posted by Ishan Chhabra <ic...@rocketfuel.com>.

@Demai,
Writes from B should also go to A and C. So, if I were to continue on your
suggestion, I would set up A-B master-master and B-C master-master, which is
what I was proposing in the 2nd approach (MST-based).
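
To make the traffic difference concrete, here is a rough Python sketch (not
HBase code) that counts cluster-to-cluster shipments for a single write. It
assumes a simplified model of the loop prevention discussed in this thread:
each edit carries the list of clusters it has already passed through, and a
cluster ships an edit to every peer not yet on that list. The names
`count_shipments`, `full_mesh` and the `origin_only` flag are made up for
illustration; `origin_only=True` models the proposed change where a cluster
forwards only edits that originated locally.

```python
from collections import deque


def full_mesh(names):
    """Every cluster peers with every other cluster (hypothetical helper)."""
    return {n: {m for m in names if m != n} for n in names}


def count_shipments(peers, origin, origin_only=False):
    """Count shipments caused by one write at `origin`.

    Simplified model (an assumption, not HBase internals): an edit carries
    the clusters it has visited, and is shipped to any peer not in that list.
    With origin_only=True, only the originating cluster ships the edit.
    """
    shipments = 0
    queue = deque([(origin, (origin,))])
    while queue:
        cluster, visited = queue.popleft()
        if origin_only and cluster != origin:
            continue  # proposed filter: do not re-forward foreign edits
        for peer in sorted(peers[cluster]):
            if peer in visited:
                continue  # loop prevention: peer has already seen this edit
            shipments += 1
            queue.append((peer, visited + (peer,)))
    return shipments


if __name__ == "__main__":
    mesh = full_mesh("ABC")
    chain = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}}  # A-B and B-C master-master
    print("mesh:", count_shipments(mesh, "A"))
    print("mesh, origin-only:", count_shipments(mesh, "A", origin_only=True))
    print("chain:", count_shipments(chain, "A"))
```

Under this model, a write at A in the 3-cluster mesh is shipped four times,
against two for the chain and two for the mesh with origin-only forwarding.
Note that origin-only forwarding reaches every cluster only when each cluster
peers directly with all the others.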

@Vladimir
That is classified. :P


On Fri, Nov 8, 2013 at 1:20 PM, Vladimir Rodionov <vl...@gmail.com>wrote:

> *I want to setup NxN replication i.e. N clusters each replicating to each
> other. N is expected to be around 10.*
>
> Preparing to thermonuclear war?
>
>
>
>
> On Fri, Nov 8, 2013 at 1:14 PM, Ishan Chhabra <ichhabra@rocketfuel.com
> >wrote:
>
> > I want to setup NxN replication i.e. N clusters each replicating to each
> > other. N is expected to be around 10.
> >
> > On doing some research, I realize it is possible after HBASE-7709 fix,
> but
> > it would lead to much more data flowing in the system. eg.
> >
> > Lets say we have 3 clusters: A,B and C.
> > A new write to A will go to B and then C, and also go to C directly via
> the
> > direct path. This leads to unnecessary network usage and writes to WAL of
> > B, that should be avoided. Now imagine this with 10 clusters, it won’t
> > scale.
> >
> > One option is to create a minimum spanning tree joining all the clusters
> > and make nodes replicate to their immediate peers in a master-master
> > fashion. This is much better than NxN mesh, but still has extra network
> and
> > WAL usage. It also suffers from a failure scenarios where the a single
> > cluster going down will pause replication to clusters downstream.
> >
> > What I really want is that the ReplicationSource should only forward
> > WALEdits with cluster-id same as the local cluster-id. This seems like a
> > straight forward patch to put in.
> >
> > Any thoughts on the suggested approach or alternatives?
> >
> > --
> > *Ishan Chhabra *| Rocket Scientist | RocketFuel Inc.
> >
>



-- 
*Ishan Chhabra *| Rocket Scientist | RocketFuel Inc.

Re: Setting up NxN replication

Posted by Vladimir Rodionov <vl...@gmail.com>.

*I want to setup NxN replication i.e. N clusters each replicating to each
other. N is expected to be around 10.*

Preparing for thermonuclear war?




On Fri, Nov 8, 2013 at 1:14 PM, Ishan Chhabra <ic...@rocketfuel.com>wrote:

> I want to setup NxN replication i.e. N clusters each replicating to each
> other. N is expected to be around 10.
>
> On doing some research, I realize it is possible after HBASE-7709 fix, but
> it would lead to much more data flowing in the system. eg.
>
> Lets say we have 3 clusters: A,B and C.
> A new write to A will go to B and then C, and also go to C directly via the
> direct path. This leads to unnecessary network usage and writes to WAL of
> B, that should be avoided. Now imagine this with 10 clusters, it won’t
> scale.
>
> One option is to create a minimum spanning tree joining all the clusters
> and make nodes replicate to their immediate peers in a master-master
> fashion. This is much better than NxN mesh, but still has extra network and
> WAL usage. It also suffers from a failure scenarios where the a single
> cluster going down will pause replication to clusters downstream.
>
> What I really want is that the ReplicationSource should only forward
> WALEdits with cluster-id same as the local cluster-id. This seems like a
> straight forward patch to put in.
>
> Any thoughts on the suggested approach or alternatives?
>
> --
> *Ishan Chhabra *| Rocket Scientist | RocketFuel Inc.
>

Re: Setting up NxN replication

Posted by Demai Ni <ni...@gmail.com>.

Hi Ishan,

Can you please elaborate a bit on your use case?

For your example of 3 clusters, I think it should be set up as two M-S
relationships for the same table:family:
first one: A = Master, B = Slave
2nd one: B = Slave, C = Master.
There is no need to set up a direct relationship between A and C.
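
As a quick check of which clusters see which writes in the layout above, here
is a small Python sketch. It is only a graph walk over the replication links
as described (A ships to B, C ships to B); the function name `reachable` and
the dictionaries are illustrative and not part of HBase.

```python
def reachable(edges, origin):
    """Return the clusters that eventually receive a write made at `origin`.

    `edges` maps a cluster to the clusters it ships edits to.
    """
    seen = {origin}
    stack = [origin]
    while stack:
        cluster = stack.pop()
        for peer in edges.get(cluster, ()):
            if peer not in seen:
                seen.add(peer)
                stack.append(peer)
    return seen - {origin}


# Layout as described: A is master for B, and C is master for B.
master_slave = {"A": ["B"], "C": ["B"]}

# For comparison: the same two links run master-master.
master_master = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}

if __name__ == "__main__":
    for name, layout in (("M-S", master_slave), ("M-M", master_master)):
        for origin in "ABC":
            print(name, origin, "->", sorted(reachable(layout, origin)))
```

With the two master-slave links, writes made at A or C reach only B, and
writes made at B reach no other cluster; with master-master on the same links
every write reaches the other two clusters.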

Demai


On Fri, Nov 8, 2013 at 1:14 PM, Ishan Chhabra <ic...@rocketfuel.com>wrote:

> I want to setup NxN replication i.e. N clusters each replicating to each
> other. N is expected to be around 10.
>
> On doing some research, I realize it is possible after HBASE-7709 fix, but
> it would lead to much more data flowing in the system. eg.
>
> Lets say we have 3 clusters: A,B and C.
> A new write to A will go to B and then C, and also go to C directly via the
> direct path. This leads to unnecessary network usage and writes to WAL of
> B, that should be avoided. Now imagine this with 10 clusters, it won’t
> scale.
>
> One option is to create a minimum spanning tree joining all the clusters
> and make nodes replicate to their immediate peers in a master-master
> fashion. This is much better than NxN mesh, but still has extra network and
> WAL usage. It also suffers from a failure scenarios where the a single
> cluster going down will pause replication to clusters downstream.
>
> What I really want is that the ReplicationSource should only forward
> WALEdits with cluster-id same as the local cluster-id. This seems like a
> straight forward patch to put in.
>
> Any thoughts on the suggested approach or alternatives?
>
> --
> *Ishan Chhabra *| Rocket Scientist | RocketFuel Inc.
>
