Posted to user@cassandra.apache.org by Bryan Cheng <br...@blockcypher.com> on 2015/09/01 02:10:15 UTC

Rebuild new DC nodes against new DC?

Hi list,

We're bringing up a second DC, and following the procedure outlined here:
http://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_add_dc_to_cluster_t.html

We have three nodes in the new DC that are members of the cluster and
indicate that they are running normally. We have begun the process of
altering the keyspaces for multi-DC and are streaming over data via
nodetool rebuild on a keyspace-by-keyspace basis.

I couldn't find a clear answer for this: at what point is it safe to
rebuild from the new DC versus the old?

In other words, I have machines a, b, and c in DC2 (the new DC). I build a
and b by specifying DC1 on the rebuild command line. Can I safely rebuild
against DC2 for machine c? Is this at all dependent on quorum settings?

Our DC's are linked by a VPN that doesn't have as big of a pipe as we'd
like- streaming in the new DC would make things faster and ease some
headaches.

Thanks for any help!

--Bryan

Re: Rebuild new DC nodes against new DC?

Posted by Alain RODRIGUEZ <ar...@gmail.com>.
Hi Bryan,

I don't have a definitive answer for you, but I can share some insights
from my understanding of this.

First, I am not sure that nodetool will let you "rebuild" from the DC the
node is in.
Second, if it does work, it would only work properly because you have 3
nodes and an RF of 2 or 3, so all the data is already present in your new
DC; otherwise you would be rebuilding from an incomplete DC. BTW, the
consistency level (quorum) has no impact here: CL applies to client
requests, and this is a server-side operation. What matters is the RF and
what data each node holds. Running 'repair' or copying SSTables directly
instead of rebuilding are options you might want to consider (in the case
where all your data is already present in DC2 with only 2 nodes loaded).
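
To illustrate the RF point, here is a minimal Python sketch of why 2 loaded
nodes out of 3 can already hold everything. It is only a toy model and
assumes one token range per node, a single rack, and replicas placed on
consecutive nodes around the ring; the function names are made up for this
example and are not part of Cassandra or nodetool.

```python
def replicas_for_range(range_index, nodes, rf):
    """Toy placement: the range's owner plus the next rf-1 nodes on the ring."""
    n = len(nodes)
    return {nodes[(range_index + k) % n] for k in range(rf)}


def covers_all_data(loaded_nodes, nodes, rf):
    """True if every token range has at least one replica on a loaded node."""
    loaded = set(loaded_nodes)
    return all(
        replicas_for_range(i, nodes, rf) & loaded
        for i in range(len(nodes))
    )


nodes = ["a", "b", "c"]  # the three DC2 nodes from the question
for rf in (1, 2, 3):
    # Only a and b have been rebuilt from DC1 so far.
    print("RF", rf, "covered by a+b:", covers_all_data(["a", "b"], nodes, rf))
```

In this model, a and b together hold a replica of every range when the RF
is 2 or 3, but not when it is 1. It says nothing about whether rebuild
will actually stream from them.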

That answers your question, but I would say you should stick with the
procedure. It should definitely work, since you have just done it twice.
Regarding "streaming in the new DC would make things faster and ease some
headaches": being creative and deviating from the standard procedure
sometimes works great, but it often increases headaches and makes things
slower. Take care: be sure of what you're doing or follow the procedure,
imho.

"Our DC's are linked by a VPN that doesn't have as big of a pipe" --> you
should rather try to solve this as much as possible, since you will need
to repair your cluster, which can consume quite a lot of bandwidth.

As general advice for a new DC: you might also want to disable
read_repair_chance on your tables to avoid cross-DC traffic at read time
(use dclocal_read_repair_chance instead), use "LOCAL_QUORUM" instead of
quorum, and have your clients stick to the DC local to them.
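
To show why LOCAL_QUORUM matters once the second DC is in the replication
settings, here is a small Python sketch of the quorum arithmetic (a quorum
is a majority of the replicas counted). The RF values are an assumption
for the example, not your actual settings, and the helper names are made
up.

```python
def quorum(replication):
    """QUORUM counts replicas across all DCs: floor(total_rf / 2) + 1."""
    return sum(replication.values()) // 2 + 1


def local_quorum(replication, local_dc):
    """LOCAL_QUORUM counts only the replicas in the coordinator's DC."""
    return replication[local_dc] // 2 + 1


def quorum_needs_remote_dc(replication, local_dc):
    """True if QUORUM cannot be satisfied by local replicas alone."""
    return quorum(replication) > replication[local_dc]


# Hypothetical per-DC replication factors for one keyspace.
replication = {"DC1": 3, "DC2": 3}

print("QUORUM needs", quorum(replication), "replicas")
print("LOCAL_QUORUM in DC2 needs", local_quorum(replication, "DC2"))
print("QUORUM crosses DCs:", quorum_needs_remote_dc(replication, "DC2"))
```

With these assumed settings QUORUM needs 4 replicas while each DC only has
3, so every QUORUM request has to wait on the VPN link; LOCAL_QUORUM needs
2 and can be served inside one DC.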

Hope this helps, even if I can't answer the "would it work" question
precisely.

C*heers,

Alain

