Posted to user@cassandra.apache.org by Parth Setya <se...@gmail.com> on 2015/01/15 14:52:00 UTC

Add a node with existing data to a cluster

Hi

I am attempting to add a Cassandra node that has some existing data on it
to an existing cluster. Is this a legit thing to do?
And what will happen if the same data with different timestamps exists on
the node to be added and the existing cluster?
What will happen if the auto_bootstrap property is enabled and I also run a
repair as soon as the node is added to the cluster?

Best
Parth

Re: Add a node with existing data to a cluster

Posted by Robert Coli <rc...@eventbrite.com>.

On Thu, Jan 15, 2015 at 5:52 AM, Parth Setya <se...@gmail.com> wrote:

> I am attempting to add a Cassandra node that has some existing data on it
> to an existing cluster. Is this a legit thing to do?
>

Sure, it's similar to running "nodetool refresh", but without that
command's lack of safety. It may also interfere with bootstrapping. But
it's not entirely illegitimate.

> And what will happen if the same data with different timestamps exists on
> the node to be added and the existing cluster?
>

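The same reconciliation logic that would otherwise occur: for each cell,
the copy with the newest timestamp wins. There are cases where this could
unmask deleted data (for example, if the tombstone for a delete has already
been purged from the cluster, the old copy on the new node has nothing left
to lose to) and other unexpected-seeming things.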
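
To make the reconciliation point concrete, here is a minimal Python sketch
of timestamp-based, last-write-wins merging. It is a toy model, not
Cassandra's actual code: the Cell class and reconcile function are
hypothetical names, and the tie-break rule (a delete beats a live value at
equal timestamps, then the larger value wins) is a simplifying assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cell:
    """Toy stand-in for one stored cell (hypothetical, not a Cassandra type)."""
    value: Optional[str]  # None marks a tombstone, i.e. a delete
    timestamp: int        # write timestamp supplied by the client


def reconcile(a: Optional[Cell], b: Optional[Cell]) -> Optional[Cell]:
    """Pick the winner between two copies of the same cell.

    The copy with the higher timestamp wins. A missing copy (None)
    always loses to a copy that exists.
    """
    if a is None:
        return b
    if b is None:
        return a
    if a.timestamp != b.timestamp:
        return a if a.timestamp > b.timestamp else b
    # Equal timestamps (assumed tie-break): a tombstone beats a live value.
    if a.value is None:
        return a
    if b.value is None:
        return b
    return a if a.value >= b.value else b


old_copy = Cell("v1", timestamp=100)    # data already on the node being added
newer_copy = Cell("v2", timestamp=200)  # newer write held by the cluster
tombstone = Cell(None, timestamp=150)   # delete issued after the old write

# Same data, different timestamps: the newer write wins.
newer_wins = reconcile(old_copy, newer_copy)

# While the cluster still holds the tombstone, the delete wins.
still_deleted = reconcile(old_copy, tombstone)

# Once the tombstone has been purged, nothing shadows the old copy,
# so the deleted value becomes visible again.
resurrected = reconcile(old_copy, None)
```

In this model, resurrected is the old "v1" cell: the delete was real, but
the record of it is gone, so the stale copy wins by default.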

> What will happen if the auto_bootstrap property is enabled and I also run
> a repair as soon as the node is added to the cluster?
>

Depends on whether your version contains:

https://issues.apache.org/jira/browse/CASSANDRA-6961

This allows you to start a node with old data, repair it while it
"hibernate"s as a non-full-member of the cluster, and then have it join.

Otherwise, the node can serve stale reads at consistency levels below
QUORUM until the repair completes.
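
The reason QUORUM matters here can be sketched with a small Python model.
It assumes a replication factor of 3 where only the newly added node holds
an outdated copy and the other two replicas are current; the function names
and the timestamps are illustrative, not anything from Cassandra itself.

```python
from itertools import combinations


def quorum(replication_factor: int) -> int:
    """Number of replicas that make up a quorum."""
    return replication_factor // 2 + 1


def can_read_stale(replica_timestamps, consistency_level: int) -> bool:
    """True if some choice of replicas could return outdated data.

    A read consults `consistency_level` replicas and returns the newest
    copy among them. The read is stale if none of the consulted replicas
    holds the latest timestamp present anywhere in the replica set.
    """
    latest = max(replica_timestamps)
    return any(
        max(chosen) < latest
        for chosen in combinations(replica_timestamps, consistency_level)
    )


# Two up-to-date replicas (timestamp 200) and the newly added node,
# which still holds an old copy (timestamp 100).
replicas = [200, 200, 100]

stale_at_one = can_read_stale(replicas, 1)                        # read at ONE
stale_at_quorum = can_read_stale(replicas, quorum(len(replicas))) # read at QUORUM
```

Under these assumptions a read at ONE may land on the stale node alone,
while any two replicas always include at least one current copy, so a
QUORUM read reconciles to the newest value.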

=Rob