Posted to user@cassandra.apache.org by Kevin Burton <bu...@spinn3r.com> on 2015/10/06 21:32:08 UTC

Maximum node decommission // bootstrap at once.

We're in the middle of migrating datacenters.

We're migrating from 13 nodes to 30 nodes in the new datacenter.

The plan is to bootstrap the 30 nodes first, wait until they have joined,
and then decommission the old ones.

How many nodes can we bootstrap at once?  How many can we decommission?

I remember reading docs for this but hell if I can find it now :-P

I know what the answer is theoretically.  I just want to make sure we do
everything properly.

Kevin

-- 

We’re hiring if you know of any awesome Java Devops or Linux Operations
Engineers!

Founder/CEO Spinn3r.com
Location: *San Francisco, CA*
blog: http://burtonator.wordpress.com
… or check out my Google+ profile
<https://plus.google.com/102718274791889610666/posts>

Re: Maximum node decommission // bootstrap at once.

Posted by Robert Coli <rc...@eventbrite.com>.

On Tue, Oct 6, 2015 at 3:41 PM, Bryan Cheng <br...@blockcypher.com> wrote:

> Robert, I might be misinterpreting you but I *think* your link is talking
> about bootstrapping a new node by bulk loading replica data from your
> existing cluster?
>

Sure... more generically it's about treating the set of data from a source
cluster as if it might be the data of a target cluster.

> I was referring to using Cassandra's bootstrap to get the node to join and
> run (as a member of DC2 but with physical residence in DC1), and then
> transfer the /data directory to a new machine to assume the identity of the
> old. I *believe* that from the cluster point of view this is just the node
> being down for an extended period of time (so the usual caveats apply)
>

Sure, as long as you specify initial_token in the target DC, this is
approximately the same operation as doing it between two different clusters.

In the same cluster: the new node takes over the generation number of the old
node via auto_bootstrap: false, etc.
In different clusters: the node bootstraps (or starts with auto_bootstrap:
false) into the same ownership.

=Rob
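
As a rough illustration of the initial_token point above, here is a minimal sketch that formats the settings a replacement node would need in order to claim the same tokens as the node it stands in for. The setting names (num_tokens, initial_token, auto_bootstrap) are real cassandra.yaml options; the helper name token_settings and the sample token values are hypothetical, and how the old node's tokens are collected is left out. With vnodes the list has one entry per vnode, which is part of why the rsync approach gets harder.

```python
def token_settings(tokens: list[int]) -> str:
    """Format cassandra.yaml lines so a node claims exactly `tokens`.

    Hypothetical helper for illustration only. `tokens` is assumed to be
    the full token list of the node whose identity is being taken over.
    """
    if not tokens:
        raise ValueError("at least one token is required")
    # De-duplicate and sort so the output is stable and easy to diff.
    ordered = sorted(set(tokens))
    lines = [
        f"num_tokens: {len(ordered)}",
        # With vnodes, initial_token is a comma-separated list of tokens.
        f"initial_token: {','.join(str(t) for t in ordered)}",
        # Skip streaming on startup; the data files are copied separately.
        "auto_bootstrap: false",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up token values; a real vnode setup typically has many more.
    print(token_settings([30, -10, 20]))
```

The output is meant to be reviewed and merged into the node's configuration by hand, not applied blindly.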

Re: Maximum node decommission // bootstrap at once.

Posted by Bryan Cheng <br...@blockcypher.com>.

Robert, I might be misinterpreting you but I *think* your link is talking
about bootstrapping a new node by bulk loading replica data from your
existing cluster?

I was referring to using Cassandra's bootstrap to get the node to join and
run (as a member of DC2 but with physical residence in DC1), and then
transfer the /data directory to a new machine to assume the identity of the
old. I *believe* that from the cluster point of view this is just the node
being down for an extended period of time (so the usual caveats apply)

On Tue, Oct 6, 2015 at 2:20 PM, Robert Coli <rc...@eventbrite.com> wrote:

> On Tue, Oct 6, 2015 at 2:14 PM, Kevin Burton <bu...@spinn3r.com> wrote:
>
>> Plan B is to just rsync the data. Does it pretty much work just by
>> putting the data in a directory, or do you have to do anything special?
>>
>
> http://www.pythian.com/blog/bulk-loading-options-for-cassandra/
>
> Be careful, with vnodes the rsync approach gets meaningfully harder.
>
> =Rob
>

Re: Maximum node decommission // bootstrap at once.

Posted by Robert Coli <rc...@eventbrite.com>.

On Tue, Oct 6, 2015 at 2:14 PM, Kevin Burton <bu...@spinn3r.com> wrote:

> Plan B is to just rsync the data. Does it pretty much work just by
> putting the data in a directory, or do you have to do anything special?
>

http://www.pythian.com/blog/bulk-loading-options-for-cassandra/

Be careful, with vnodes the rsync approach gets meaningfully harder.

=Rob

Re: Maximum node decommission // bootstrap at once.

Posted by Kevin Burton <bu...@spinn3r.com>.

OH. interesting.  Yeah. That's another strategy.  We've already done a
bunch of TCP tuning... we get about 1Gbit with large TCP windows.  So I
think we have that part done.

It's sad that CS can't resume...

Plan B is to just rsync the data. Does it pretty much work just by
putting the data in a directory, or do you have to do anything special?

On Tue, Oct 6, 2015 at 1:34 PM, Bryan Cheng <br...@blockcypher.com> wrote:

> Honestly, we've had more luck bootstrapping in our old DC (defining
> topology properties as the new DC) and using rsync to migrate the data
> files to new machines in the new datacenter. We had 10gig within the
> datacenter but significantly less than this cross-DC, which led to a lot
> of broken streaming pipes and wasted effort. This might make sense
> depending on your link quality and the resources/time you have available to
> do TCP tuning.
>
> On Tue, Oct 6, 2015 at 1:29 PM, Kevin Burton <bu...@spinn3r.com> wrote:
>
>> I'm not sure which is faster/easier.  Just joining one box at a time and
>> then decommissioning or using replace_address.
>>
>> This stuff is always something you do rarely, and then it's more complex
>> than it needs to be.
>>
>> This complicates long-term migration too.  Having to have gigabit is
>> somewhat of a problem in that you might not actually have it where you're
>> going.
>>
>> We're migrating from Washington, DC to Germany so we have to change TCP
>> send/receive buffers to get decent bandwidth.
>>
>> But I think we can do this at 1Gb or so per box.
>>
>>
>> On Tue, Oct 6, 2015 at 12:48 PM, Robert Coli <rc...@eventbrite.com>
>> wrote:
>>
>>> On Tue, Oct 6, 2015 at 12:32 PM, Kevin Burton <bu...@spinn3r.com>
>>> wrote:
>>>
>>>> How many nodes can we bootstrap at once?  How many can we decommission?
>>>>
>>>
>>> short answer : only 1 node can join or part at a time
>>>
>>> longer answer : https://issues.apache.org/jira/browse/CASSANDRA-2434 /
>>> https://issues.apache.org/jira/browse/CASSANDRA-7069 /
>>> -Dconsistent.rangemovement
>>>
>>> Have you considered using replace_address to replace your existing 13
>>> nodes, at which point you just have to join 17 more?
>>>
>>> =Rob
>>>
>>>

Re: Maximum node decommission // bootstrap at once.

Posted by Bryan Cheng <br...@blockcypher.com>.

Honestly, we've had more luck bootstrapping in our old DC (defining
topology properties as the new DC) and using rsync to migrate the data
files to new machines in the new datacenter. We had 10gig within the
datacenter but significantly less than this cross-DC, which led to a lot
of broken streaming pipes and wasted effort. This might make sense
depending on your link quality and the resources/time you have available to
do TCP tuning.

On Tue, Oct 6, 2015 at 1:29 PM, Kevin Burton <bu...@spinn3r.com> wrote:

> I'm not sure which is faster/easier.  Just joining one box at a time and
> then decommissioning or using replace_address.
>
> This stuff is always something you do rarely, and then it's more complex
> than it needs to be.
>
> This complicates long-term migration too.  Having to have gigabit is
> somewhat of a problem in that you might not actually have it where you're
> going.
>
> We're migrating from Washington, DC to Germany so we have to change TCP
> send/receive buffers to get decent bandwidth.
>
> But I think we can do this at 1Gb or so per box.
>
>
> On Tue, Oct 6, 2015 at 12:48 PM, Robert Coli <rc...@eventbrite.com> wrote:
>
>> On Tue, Oct 6, 2015 at 12:32 PM, Kevin Burton <bu...@spinn3r.com> wrote:
>>
>>> How many nodes can we bootstrap at once?  How many can we decommission?
>>>
>>
>> short answer : only 1 node can join or part at a time
>>
>> longer answer : https://issues.apache.org/jira/browse/CASSANDRA-2434 /
>> https://issues.apache.org/jira/browse/CASSANDRA-7069 /
>> -Dconsistent.rangemovement
>>
>> Have you considered using replace_address to replace your existing 13
>> nodes, at which point you just have to join 17 more?
>>
>> =Rob
>>
>>

Re: Maximum node decommission // bootstrap at once.

Posted by Kevin Burton <bu...@spinn3r.com>.

I'm not sure which is faster/easier.  Just joining one box at a time and
then decommissioning or using replace_address.

This stuff is always something you do rarely, and then it's more complex
than it needs to be.

This complicates long-term migration too.  Having to have gigabit is
somewhat of a problem in that you might not actually have it where you're
going.

We're migrating from Washington, DC to Germany so we have to change TCP
send/receive buffers to get decent bandwidth.

But I think we can do this at 1Gb or so per box.
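
The reason the send/receive buffers matter on a long link is the bandwidth-delay product: a single TCP connection can have at most one window of data in flight per round trip. A minimal sketch of that arithmetic follows. The 100 ms round-trip time is an assumed figure for a transatlantic path, not a measurement from this thread, and the function names are made up for illustration.

```python
def bdp_bytes(bandwidth_bits_per_s: float, rtt_seconds: float) -> float:
    """Bandwidth-delay product: bytes that must be in flight to fill the link."""
    return bandwidth_bits_per_s * rtt_seconds / 8


def max_throughput_bps(window_bytes: float, rtt_seconds: float) -> float:
    """Upper bound on one connection's throughput for a given window size."""
    return window_bytes * 8 / rtt_seconds


if __name__ == "__main__":
    ASSUMED_RTT_S = 0.100  # assumption: roughly 100 ms Washington, DC <-> Germany
    LINK_BPS = 1e9         # the 1 Gbit/s target mentioned above

    needed = bdp_bytes(LINK_BPS, ASSUMED_RTT_S)
    print(f"window needed to fill the link: {needed / 1e6:.1f} MB")

    # A 64 KiB window (no window scaling) over the same path, for comparison.
    capped = max_throughput_bps(64 * 1024, ASSUMED_RTT_S)
    print(f"throughput with a 64 KiB window: {capped / 1e6:.2f} Mbit/s")
```

Under these assumptions the window has to be on the order of 12.5 MB to sustain 1 Gbit/s on one connection, which is why untuned buffers give poor cross-DC throughput.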

On Tue, Oct 6, 2015 at 12:48 PM, Robert Coli <rc...@eventbrite.com> wrote:

> On Tue, Oct 6, 2015 at 12:32 PM, Kevin Burton <bu...@spinn3r.com> wrote:
>
>> How many nodes can we bootstrap at once?  How many can we decommission?
>>
>
> short answer : only 1 node can join or part at a time
>
> longer answer : https://issues.apache.org/jira/browse/CASSANDRA-2434 /
> https://issues.apache.org/jira/browse/CASSANDRA-7069 /
> -Dconsistent.rangemovement
>
> Have you considered using replace_address to replace your existing 13
> nodes, at which point you just have to join 17 more?
>
> =Rob
>
>

Re: Maximum node decommission // bootstrap at once.

Posted by Robert Coli <rc...@eventbrite.com>.

On Tue, Oct 6, 2015 at 12:32 PM, Kevin Burton <bu...@spinn3r.com> wrote:

> How many nodes can we bootstrap at once?  How many can we decommission?
>

short answer : only 1 node can join or part at a time

longer answer : https://issues.apache.org/jira/browse/CASSANDRA-2434 /
https://issues.apache.org/jira/browse/CASSANDRA-7069 /
-Dconsistent.rangemovement

Have you considered using replace_address to replace your existing 13
nodes, at which point you just have to join 17 more?

=Rob
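
One way to respect the one-at-a-time rule in automation is to check that no node is still joining, leaving, or moving before starting the next bootstrap or decommission. The sketch below parses text in the shape of `nodetool status` output, where each node line starts with a two-letter code (U/D for up/down, then N/J/L/M for normal/joining/leaving/moving). The function names and the sample output are illustrative assumptions; running nodetool and handling its errors is left out.

```python
import re

# Node lines start with a status letter (U/D) and a state letter (N/J/L/M),
# followed by whitespace and the node address.
NODE_LINE = re.compile(r"^([UD])([NJLM])\s+(\S+)")


def nodes_in_motion(status_output: str) -> list[tuple[str, str]]:
    """Return (address, state) for every node whose state is not Normal."""
    moving = []
    for line in status_output.splitlines():
        match = NODE_LINE.match(line.strip())
        if match and match.group(2) != "N":
            moving.append((match.group(3), match.group(2)))
    return moving


def safe_to_start_next(status_output: str) -> bool:
    """True when no node is joining, leaving, or moving."""
    return not nodes_in_motion(status_output)


# Illustrative sample only; addresses, loads, and IDs are made up.
SAMPLE = """\
Datacenter: DC1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load     Tokens  Owns  Host ID  Rack
UN  10.0.0.1   1.2 TB   256     ?     aaaa     r1
UJ  10.0.0.2   300 GB   256     ?     bbbb     r1
"""

if __name__ == "__main__":
    print(nodes_in_motion(SAMPLE))
    print(safe_to_start_next(SAMPLE))
```

A wrapper script could poll this check and only kick off the next node once it returns True.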