Posted to user@cassandra.apache.org by Voytek Jarnot <vo...@gmail.com> on 2019/09/11 19:41:28 UTC

nodetool rebuild on non-empty nodes?

Pardon the convoluted scenario, but we face some pretty ridiculous
infrastructure restrictions.

datacenter DC1: nodes containing many years of data written before
2019-09-01 (for example)

datacenter DC2: nodes containing data written after 2019-09-01

These start out as independent clusters. We then connect them into a
multi-DC cluster and alter our keyspace to replicate to both DCs.

What is the effect of running `nodetool rebuild -- DC1` on nodes in DC2? I
know we'll get that historical DC1 data, but my concern is about the new
data that has been written to the DC2 datacenter. Would the rebuild end up
dropping our post-2019-09-01 data?
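
A minimal sketch of the two steps described above, assuming a keyspace
named my_keyspace and a replication factor of 3 in each datacenter (both
are hypothetical placeholders, not values from this thread). DC1 and DC2
are the datacenter names used above. The script only prints the commands
for review; it does not run cqlsh or nodetool itself.

```shell
#!/bin/sh
# Sketch only. KEYSPACE and the replication factors are hypothetical
# placeholders; DC1 and DC2 are the datacenter names from this thread.
KEYSPACE="${KEYSPACE:-my_keyspace}"
RF_DC1="${RF_DC1:-3}"
RF_DC2="${RF_DC2:-3}"

# CQL statement that makes the keyspace replicate to both datacenters.
alter_keyspace_cql() {
    printf "ALTER KEYSPACE %s WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': %s, 'DC2': %s};\n" \
        "$KEYSPACE" "$RF_DC1" "$RF_DC2"
}

# Command to run on each DC2 node; the argument names the source datacenter.
rebuild_command() {
    printf 'nodetool rebuild -- %s\n' "$1"
}

# Print the steps for review instead of executing them.
echo "# Step 1, once, from any node:"
echo "cqlsh -e \"$(alter_keyspace_cql)\""
echo "# Step 2, on every node in DC2:"
rebuild_command DC1
```

Printing rather than executing keeps the sketch safe to run anywhere; the
printed lines can be copied to the relevant nodes once the placeholder
values are replaced.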

Thanks,
Voytek Jarnot

Re: nodetool rebuild on non-empty nodes?

Posted by Voytek Jarnot <vo...@gmail.com>.
Apologies for the bump, but I'm wondering if anyone has any thoughts on the
question below, specifically about running nodetool rebuild on a
destination that has data that does not exist in the source.

Thanks.

On Wed, Sep 11, 2019 at 2:41 PM Voytek Jarnot <vo...@gmail.com>
wrote:

> Pardon the convoluted scenario, but we face some pretty ridiculous
> infrastructure restrictions.
>
> datacenter DC1: nodes containing many years of data written before
> 2019-09-01 (for example)
>
> datacenter DC2: nodes containing data written after 2019-09-01
>
> The idea is that these are independent clusters. We now connect them into
> a multi-DC cluster, and alter our keyspace to replicate to both DCs.
>
> What is the effect of running `nodetool rebuild -- DC1` on nodes in DC2? I
> know we'll get that historical DC1 data, but my concern is about the new
> data that had been written to the DC2 datacenter. Would the rebuild end up
> dropping our post 2019-09-01 data?
>
> Thanks,
> Voytek Jarnot
>