Posted to user@cassandra.apache.org by Yudong Gao <st...@umich.edu> on 2011/04/12 21:56:04 UTC

Update the Keyspace replication factor online

Hi,

What operations will be executed (and what is the associated overhead)
when the Keyspace replication factor is changed online, in a
multi-datacenter setup with NetworkTopologyStrategy?

I checked the wiki and the mailing list archive and found the passage
below, but it is not very complete.

http://wiki.apache.org/cassandra/Operations
"
Replication factor is not really intended to be changed in a live
cluster either, but increasing it may be done if you (a) use
ConsistencyLevel.QUORUM or ALL (depending on your existing replication
factor) to make sure that a replica that actually has the data is
consulted, (b) are willing to accept downtime while anti-entropy
repair runs (see below), or (c) are willing to live with some clients
potentially being told no data exists if they read from the new
replica location(s) until repair is done.
"
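
The caveat "(depending on your existing replication factor)" in the quoted wiki text can be made concrete with a little arithmetic. The sketch below is editorial Python, not Cassandra code: before repair finishes, only the old replicas actually hold the data, so a QUORUM read against the enlarged replica set is guaranteed to consult a data-bearing replica exactly when the new quorum size plus the old RF exceeds the new RF (pigeonhole).

```python
def quorum(rf):
    """Quorum size for a replication factor: a strict majority of replicas."""
    return rf // 2 + 1

def quorum_read_sees_data(old_rf, new_rf):
    """After raising RF from old_rf to new_rf (and before repair finishes),
    only old_rf of the new_rf replicas actually hold the data. A QUORUM
    read consults quorum(new_rf) replicas, so by the pigeonhole principle
    it is guaranteed to reach a data-bearing replica exactly when
    quorum(new_rf) + old_rf > new_rf."""
    return quorum(new_rf) + old_rf > new_rf

# Raising RF by one: a quorum read still always sees the data.
print(quorum_read_sees_data(2, 3))  # True
# Jumping from RF=1 to RF=3: a quorum of 2 can land on two empty replicas.
print(quorum_read_sees_data(1, 3))  # False
```

This is why the wiki says QUORUM "or ALL (depending on your existing replication factor)": when the overlap condition fails, only ALL guarantees a data-bearing replica is consulted.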

More specifically, in this scenario:

{DC1:1, DC2:1} -> {DC2:1, DC3:1}

1. Can this be done online without shutting down the cluster? I
thought there was an "update keyspace" command in cassandra-cli.

2. If so, what operations will be executed? Will new replicas be
created in new locations (in DC3) and existing replicas be deleted in
old locations (in DC1)?

3. Or will replicas be updated only by reads at
ConsistencyLevel.QUORUM or ALL, or by "nodetool repair"?

Thanks!

Yudong

Re: Update the Keyspace replication factor online

Posted by Yudong Gao <st...@umich.edu>.
Thanks, Aaron! I will try the scenario at a small scale first.

I would appreciate it if anyone else who has tried this before could
share their experience with us.

Thanks!

Yudong


Re: Update the Keyspace replication factor online

Posted by aaron morton <aa...@thelastpickle.com>.
It looks like you are dropping DC1; in that case perhaps you could just move the nodes from DC1 into DC3.

I *think* in your case if you made the RF change, ran repair on the nodes, and worked at QUORUM or ALL your clients would be OK. *BUT* I've not done this myself, so please take care or ask for a grown-up to help.

The warning about downtime during repair has to do with the potential impact of repair slowing nodes way down.

Hope that helps 
Aaron
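
Aaron's suggested sequence (change the RF, repair, then cleanup) can be sketched as an ops outline. This is illustrative only: the keyspace name, hostnames, and the cassandra-cli statement (whose syntax varies by Cassandra version) are assumptions, not taken from the thread; `nodetool repair` and `nodetool cleanup` are the real tools involved.

```shell
# Illustrative sketch only: keyspace name, hosts, and exact cassandra-cli
# syntax are hypothetical and version-dependent. Do not run as-is.

# 1. Update the keyspace's replica placement (once, from cassandra-cli):
#      update keyspace MyKS
#        with placement_strategy = 'org.apache.cassandra.locator.NetworkTopologyStrategy'
#        and strategy_options = {DC2:1, DC3:1};

# 2. Repair each node that now owns new replicas, one node at a time:
nodetool -h node1.dc3.example.com repair MyKS

# 3. Only after ALL repairs have finished, remove data from nodes that
#    no longer own it (e.g. the DC1 nodes):
nodetool -h node1.dc1.example.com cleanup MyKS
```

Repairing one node at a time limits the slowdown Aaron warns about; running cleanup before every repair has completed risks deleting the only surviving copy of some data.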
 


Re: Update the Keyspace replication factor online

Posted by Yudong Gao <st...@umich.edu>.
Thanks for the reply, Aaron!

On Tue, Apr 12, 2011 at 10:52 PM, aaron morton <aa...@thelastpickle.com> wrote:
> Are you changing the replication factor or moving nodes?

I am just changing the replication factor, without touching the node
configuration.

>
> To change the RF you need to repair, and then once all repairing is done, run cleanup to remove the old data.

Do I need to shut down the cluster when running the repair? If I just
repair the nodes one by one, will some users get a "no data exists"
error if the node responsible for the new replica has not yet been
repaired?

Yudong


Re: Update the Keyspace replication factor online

Posted by aaron morton <aa...@thelastpickle.com>.
Are you changing the replication factor or moving nodes?

To change the RF you need to repair, and then once all repairing is done, run cleanup to remove the old data.

You can move whole nodes by moving all their data with them, assigning a new IP, and updating the topology file if one is used.

Aaron
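
The "topology file" Aaron mentions is, when PropertyFileSnitch is in use, cassandra-topology.properties, which maps each node's IP to a datacenter and rack. A minimal sketch with made-up IPs (the addresses and rack names here are illustrative, not from the thread):

```properties
# cassandra-topology.properties (illustrative)
# Format: <node IP>=<datacenter>:<rack>
10.2.0.1=DC2:RAC1
10.3.0.1=DC3:RAC1
10.3.0.2=DC3:RAC1
# Fallback assignment for nodes not listed above
default=DC2:RAC1
```

When a node is moved between datacenters as Aaron describes, its entry here must be updated so NetworkTopologyStrategy places replicas correctly.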
