Posted to user@cassandra.apache.org by Tamar Fraenkel <ta...@tok-media.com> on 2012/06/04 13:24:50 UTC

repair

Hi!
I apologize for this naive question.
When I run nodetool repair, is it enough to run it on one of the nodes, or do
I need to run it on each one of them?
Thanks

Tamar Fraenkel
Senior Software Engineer, TOK Media

tamar@tok-media.com
Tel:   +972 2 6409736
Mob:  +972 54 8356490
Fax:   +972 2 5612956

Re: repair

Posted by Tamar Fraenkel <ta...@tok-media.com>.

Thank you all!

Re: repair

Posted by Tamar Fraenkel <ta...@tok-media.com>.

Thanks, one more question: should I run repair on the system keyspace
on a regular basis?


RE: repair

Posted by Viktor Jevdokimov <Vi...@adform.com>.

Why without -pr when recovering from a crash?

Repair without -pr runs a full repair of the cluster: the node that receives the command acts as the repair controller, and ALL nodes synchronize replicas at the same time, streaming data between each other.
The following problems may arise:

- When streaming hangs (it tends to hang even on a stable network), the repair session hangs (does any version re-stream?)
- The network will be highly saturated
- In case of high inconsistency, some nodes may receive a lot of data, and disk usage can grow to much more than 2x (depending on RF)
- A lot of compactions will be pending

IMO, the best way to run repair is from a script, with -pr, for a single CF on a single node at a time, monitoring progress, like:
nodetool -h node1 repair -pr ks1 cf1
nodetool -h node2 repair -pr ks1 cf1
nodetool -h node3 repair -pr ks1 cf1
nodetool -h node1 repair -pr ks1 cf2
nodetool -h node2 repair -pr ks1 cf2
nodetool -h node3 repair -pr ks1 cf2
With some progress check or other control in between, your choice.

Use repair with care; do not let your cluster go down.
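
A minimal sketch of such a script in Python, following the per-CF, per-node sequence listed above. The host names, keyspace, and column family names are the placeholders from that list; plan_repairs and run_repairs are hypothetical helper names, not part of Cassandra. It assumes nodetool is on the PATH and is invoked as "nodetool -h <host> repair -pr <keyspace> <cf>"; check this against your version. With dry_run=True nothing is executed, the steps are only printed.

```python
import subprocess
from itertools import product


def plan_repairs(nodes, keyspace, column_families):
    """Return nodetool argument lists, one CF at a time across all nodes."""
    # product() varies the last iterable fastest: all nodes for cf1, then cf2.
    return [
        ["nodetool", "-h", node, "repair", "-pr", keyspace, cf]
        for cf, node in product(column_families, nodes)
    ]


def run_repairs(commands, dry_run=True):
    """Run the commands strictly one after another; stop on the first failure."""
    done = []
    for cmd in commands:
        print("repair step:", " ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit code,
            # so a failed repair is not silently followed by the next one.
            subprocess.run(cmd, check=True)
        done.append(cmd)
    return done


if __name__ == "__main__":
    plan = plan_repairs(["node1", "node2", "node3"], "ks1", ["cf1", "cf2"])
    run_repairs(plan, dry_run=True)
```

Note that an exit-code check does not catch a hung streaming session; passing a timeout to subprocess.run, or watching the logs between steps, is one possible form of the "control in between".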





Best regards / Pagarbiai
Viktor Jevdokimov
Senior Developer

Email: Viktor.Jevdokimov@adform.com
Phone: +370 5 212 3063, Fax +370 5 261 0453
J. Jasinskio 16C, LT-01112 Vilnius, Lithuania

Re: repair

Posted by "R. Verlangen" <ro...@us2.nl>.

The "repair -pr" option only repairs the node's primary range, so it is only
useful for day-to-day use. When you're recovering from a crash, run repair
without -pr.

-- 
With kind regards,

Robin Verlangen
Software engineer
W www.robinverlangen.nl
E robin@us2.nl

Disclaimer: The information contained in this message and attachments is
intended solely for the attention and use of the named addressee and may be
confidential. If you are not the intended recipient, you are reminded that
the information remains the property of the sender. You must not use,
disclose, distribute, copy, print or rely on this e-mail. If you have
received this message in error, please contact the sender immediately and
irrevocably delete this message and any copies.

Re: repair

Posted by Romain HARDOUIN <ro...@urssaf.fr>.

Run "repair -pr" in your cron.
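
As a rough illustration, staggered crontab lines (one per node, each installed in that node's own crontab) could be generated like this. The node names, the weekly schedule, the 01:00 start, and the two-hour gap are all assumptions, and staggered_crontab is a hypothetical helper. What matters is that every node gets repaired within gc_grace_seconds and that the runs do not overlap.

```python
def staggered_crontab(nodes, start_hour=1, gap_hours=2, weekday=0):
    """Return {node: crontab line} with start times spaced gap_hours apart."""
    entries = {}
    for i, node in enumerate(nodes):
        total = start_hour + i * gap_hours
        hour = total % 24
        # Roll over to the next weekday (0-6, Sunday = 0) when passing midnight.
        day = (weekday + total // 24) % 7
        # Fields: minute hour day-of-month month day-of-week command
        entries[node] = f"0 {hour} * * {day} nodetool repair -pr"
    return entries


if __name__ == "__main__":
    for node, line in staggered_crontab(["node1", "node2", "node3"]).items():
        print(f"{node}: {line}")
```

A fixed gap does not guarantee that one repair has finished before the next starts, so the gap should be chosen from observed repair durations.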


Re: repair

Posted by Tamar Fraenkel <ta...@tok-media.com>.

Thanks.

I actually did just that, with cron jobs running at different hours.

I asked the question because I saw that when one of the nodes was running
the repair, all nodes logged some repair-related entries in
/var/log/cassandra/system.log

Thanks again,

RE: repair

Posted by Rishabh Agrawal <ri...@impetus.co.in>.

Hello,

As far as I know, repair works on a per-node basis, so you have to run it on each node. I would suggest that you not run it simultaneously on all nodes in a production environment.

Regards
Rishabh Agrawal

NOTE: This message may contain information that is confidential, proprietary, privileged or otherwise protected by law. The message is intended solely for the named addressee. If received in error, please destroy and notify the sender. Any use of this email is prohibited when received in error. Impetus does not represent, warrant and/or guarantee, that the integrity of this communication has been maintained nor that the communication is free of errors, virus, interception or interference.

RE: repair

Posted by Samuel CARRIERE <sa...@urssaf.fr>.

Hi,
It is not enough to run the repair on one node, unless that node contains
all the data (e.g. a 3-node cluster with RF=3).
In the general case, the best approach is to launch the repair on every
node with the "-pr" option (use -pr to repair only the first range returned
by the partitioner).
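
To make the reasoning concrete, here is a toy model of a ring in Python. It is a sketch under simplifying assumptions, not Cassandra code: node i is taken to be the primary owner of range i, and each range is also stored on the next RF - 1 nodes clockwise (SimpleStrategy-like placement). The function names are hypothetical.

```python
def replica_ranges(num_nodes, rf):
    """Map node index -> set of ranges (named by primary owner) it stores."""
    held = {n: set() for n in range(num_nodes)}
    for rng in range(num_nodes):
        # Range rng lives on its primary owner and the next rf - 1 nodes.
        for k in range(min(rf, num_nodes)):
            held[(rng + k) % num_nodes].add(rng)
    return held


def covered_by_repairing(nodes, num_nodes, rf, primary_only):
    """Ranges touched when repair (or repair -pr) is run on `nodes`."""
    held = replica_ranges(num_nodes, rf)
    covered = set()
    for n in nodes:
        # With -pr only the node's own primary range is repaired.
        covered |= {n} if primary_only else held[n]
    return covered


if __name__ == "__main__":
    # 3 nodes, RF=3: one node holds everything, so one full repair suffices.
    print(covered_by_repairing([0], 3, 3, primary_only=False))
    # 6 nodes, RF=3: a full repair on one node covers only part of the ring.
    print(covered_by_repairing([0], 6, 3, primary_only=False))
    # 6 nodes, RF=3: -pr on every node covers each range exactly once.
    print(covered_by_repairing(range(6), 6, 3, primary_only=True))
```

In this model, running -pr on every node touches each range once, while running without -pr on every node would repair each range RF times.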




