Posted to user@cassandra.apache.org by Brice Argenson <ba...@gmail.com> on 2015/05/22 21:00:38 UTC

Periodic Anti-Entropy repair

Hi everyone,

We are currently migrating from DSE to Apache Cassandra and we would like
to put in place an automatic and periodic *nodetool repair* execution to
replace the one executed by OpsCenter.

I wanted to create a script / service that would run something like that:




token_rings = `nodetool ring | awk '{print $8}'`
for(int i = 0; i < token_rings.length; i += 2) {
   `nodetool repair -st token_rings[i] -et token_rings[i+1]`
}

That script / service would run every week (our GCGrace is 10 days) and
would repair all the ranges of the ring one by one.
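
A small Python sketch of how the loop index walks the ring, under stated assumptions: the token list is made up and assumed to be already sorted, and `pairs` is a hypothetical helper, not part of nodetool. It only prints the command lines. Stepping the index by 2 visits every other pair of neighbouring tokens, so a step of 1 is needed to reach every range between two tokens; the range that wraps from the last token back to the first is left out here.

```python
def pairs(tokens, step):
    """Return (start, end) pairs of neighbouring tokens, advancing by `step`."""
    return [(tokens[i], tokens[i + 1]) for i in range(0, len(tokens) - 1, step)]


tokens = [-6000, -2000, 3000, 7000]  # made-up, already sorted token values

every_other = pairs(tokens, 2)  # the pairs that `i += 2` visits
consecutive = pairs(tokens, 1)  # every range between neighbouring tokens

for start, end in consecutive:
    marker = "" if (start, end) in every_other else "   # skipped by i += 2"
    print(f"nodetool repair -st {start} -et {end}{marker}")
```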

I also looked a bit on Google and I found this script:
https://github.com/BrianGallew/cassandra_range_repair
It seems to do something equivalent, but it also seems to run the repair
node by node instead of over the complete ring.
From my understanding, that would mean that the script has to be run on
every node of the cluster and that every token range would be repaired as
many times as there are replicas containing it.


Is there something I misunderstand?
Which approach is better?
How do you handle your Periodic Anti-Entropy Repairs?


Thanks a lot!

Re: Periodic Anti-Entropy repair

Posted by Alprema <al...@alprema.com>.

Have you looked at Reaper? https://github.com/spotify/cassandra-reaper
It's a project created by Spotify to address this issue. I have not
evaluated it yet, but it looks promising.

RE: Periodic Anti-Entropy repair

Posted by SE...@homedepot.com.

We run nodetool repair -pr on each node over the course of the week. Basically, I have a cron job on each host that checks a schedule to see “should I run repair today?” If yes, it kicks off the repair.

Sean Durity
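
The schedule check described above could look roughly like the following Python sketch. Everything specific in it is an assumption made for illustration: the host names, the `REPAIR_DAY` mapping, and the helper names are hypothetical, and the post does not say how the real schedule is stored. The sketch only prints the command; a real cron job would pass it to something like `subprocess.run`.

```python
import datetime
import socket

# Hypothetical schedule: host name -> weekday on which it repairs (Monday = 0).
REPAIR_DAY = {"cass-node-1": 0, "cass-node-2": 2, "cass-node-3": 4}


def should_repair_today(host, today, schedule=REPAIR_DAY):
    """True when `host` is scheduled to repair on the weekday of `today`."""
    return schedule.get(host) == today.weekday()


def repair_command():
    """Primary-range repair of the local node, as mentioned in the post."""
    return ["nodetool", "repair", "-pr"]


host = socket.gethostname()
if should_repair_today(host, datetime.date.today()):
    # A real cron job would hand this list to subprocess.run(...).
    print(" ".join(repair_command()))
else:
    print(f"{host}: no repair scheduled today")
```

Spreading the hosts over different weekdays keeps only one node repairing at a time, while every node still gets its turn within the week.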

RE: Periodic Anti-Entropy repair

Posted by "Tiwari, Tarun" <Ta...@Kronos.com>.

We have run into very long-running nodetool repairs when we ran it node by node on a really large dataset. In some cases it kept running for a week.
IMO your strategy of repairing by -st and -et is a good one: it does the same work in small increments, whose logs can be analyzed easily.

In addition, I would suggest using the -h option to connect to the node from outside, and taking care in the 'for' loop that nodetool ring also returns negative token values. You can go from -2^63 to the first ring value, then from (there+1) to the next token value. Better not to use i+=2 because token values are not necessarily even numbers.

Regards,
Tarun
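
A hedged Python sketch of the walk suggested above, starting at -2^63 and moving up through the ring tokens, with the commands addressed to a node via -h. Assumptions: the cluster uses Murmur3Partitioner (token space -2^63 to 2^63-1), the tokens and host address are made up, and `full_ring_ranges` and `remote_repair_command` are hypothetical helper names. By default the sketch reuses each end token as the next start, on the assumption that repair treats the start token as exclusive and the end token as inclusive; `start_exclusive=False` gives the "(there+1)" variant instead. Check which one matches your Cassandra version before relying on either.

```python
MIN_TOKEN = -2**63      # lowest Murmur3Partitioner token (assumed partitioner)
MAX_TOKEN = 2**63 - 1   # highest Murmur3Partitioner token


def full_ring_ranges(tokens, start_exclusive=True):
    """Split MIN_TOKEN..MAX_TOKEN at every ring token, lowest first."""
    inner = sorted(set(tokens) - {MIN_TOKEN, MAX_TOKEN})
    bounds = [MIN_TOKEN] + inner + [MAX_TOKEN]
    ranges = []
    for i in range(len(bounds) - 1):
        start = bounds[i]
        if i > 0 and not start_exclusive:
            start += 1  # the "(there+1)" variant from the reply
        ranges.append((start, bounds[i + 1]))
    return ranges


def remote_repair_command(host, start, end):
    """Repair one sub-range, connecting to `host` from outside with -h."""
    return ["nodetool", "-h", host, "repair", "-st", str(start), "-et", str(end)]


for start, end in full_ring_ranges([-6000, 3000]):  # made-up tokens
    print(" ".join(remote_repair_command("10.0.0.1", start, end)))  # made-up host
```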

Re: Periodic Anti-Entropy repair

Posted by Anuj Wadehra <an...@yahoo.co.in>.

You should use nodetool repair -pr on every node to make sure that each range is repaired only once.



Thanks

Anuj Wadehra
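
To make the "each range only once" point concrete, here is a toy Python model that counts how often each range gets repaired when a repair is run on every node. It is a simplification under stated assumptions: one token per node, and the range owned by node i is also replicated on the next rf - 1 nodes of the ring (SimpleStrategy-like placement). The function names are hypothetical.

```python
from collections import Counter


def ranges_repaired(node, node_count, rf, primary_only):
    """Ranges (named by their primary owner) covered by a repair on `node`."""
    if primary_only:
        return [node]  # models -pr: only the node's own primary range
    # A full repair covers every range the node holds a replica of.
    return [(node - k) % node_count for k in range(rf)]


def repair_counts(node_count, rf, primary_only):
    """How many times each range is repaired after repairing every node."""
    counts = Counter()
    for node in range(node_count):
        counts.update(ranges_repaired(node, node_count, rf, primary_only))
    return dict(counts)


print(repair_counts(6, 3, primary_only=False))
print(repair_counts(6, 3, primary_only=True))
```

In this model every range is counted rf times in the first case and exactly once in the second, which is the redundancy that running with -pr on every node avoids.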
