Posted to user@cassandra.apache.org by Ian Rose <ia...@fullstory.com> on 2015/03/19 18:30:11 UTC

best way to measure repair times?

Howdy -

I'd like to (a) monitor how long my repairs are taking, and (b) know when a
repair is finished so that I can take some kind of followup action.  What's
the best way to tackle either or both of these?

Some potentially relevant details:

- running community apache cassandra (not DSE)
- version 2.0.13
- we currently trigger repairs via an external timer that
calls forceRepairAsync on the StorageService mbean via JMX

Thanks!
- Ian

Re: best way to measure repair times?

Posted by Robert Coli <rc...@eventbrite.com>.

On Thu, Mar 19, 2015 at 4:56 PM, Rahul Neelakantan <ra...@rahul.be> wrote:

> Wouldn't GC Grace set to 34 days increase the bloat in the DB?
>

Yes, but as I say in the ticket, my belief is that the fixed cost of repair
combined with the fact that it frequently doesn't work at all (hangs
forever, etc.) is much more expensive than the on-disk bloat. With
incremental and/or snapshot repair which actually works (arriving Real Soon
Now) the inputs into the cost/benefit analysis change.

On ticket, Jonathan Ellis waves his hands and asserts that current costs
are likely to be equal in the typical user's case. This suggests that the
typical user is doing a bunch of DELETE in a log structured database with
immutable data files, which seems rather unlikely to me...

=Rob

Re: best way to measure repair times?

Posted by Rahul Neelakantan <ra...@rahul.be>.

Wouldn't GC Grace set to 34 days increase the bloat in the DB?

Rahul

> On Mar 19, 2015, at 3:02 PM, Robert Coli <rc...@eventbrite.com> wrote:
> 
>> On Thu, Mar 19, 2015 at 10:30 AM, Ian Rose <ia...@fullstory.com> wrote:
>> I'd like to (a) monitor how long my repairs are taking, and (b) know when a repair is finished so that I can take some kind of followup action.  What's the best way to tackle either or both of these?
> 
> https://issues.apache.org/jira/browse/CASSANDRA-5483
> 
> Also consider increasing your gc_grace_seconds to 34 days by default (CASSANDRA-5850) to decrease the frequency of repair.
> 
> =Rob
>

Re: best way to measure repair times?

Posted by Robert Coli <rc...@eventbrite.com>.

On Thu, Mar 19, 2015 at 10:30 AM, Ian Rose <ia...@fullstory.com> wrote:

> I'd like to (a) monitor how long my repairs are taking, and (b) know when
> a repair is finished so that I can take some kind of followup action.
> What's the best way to tackle either or both of these?
>

https://issues.apache.org/jira/browse/CASSANDRA-5483

Also consider increasing your gc_grace_seconds to 34 days by default
(CASSANDRA-5850) to decrease the frequency of repair.
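
For reference, gc_grace_seconds is set in seconds, so "34 days" has to be converted before it goes into a schema change. The following is only a rough sketch of that arithmetic; the keyspace and table names are placeholders that do not come from this thread, and the statement is built as a string rather than executed.

```python
from datetime import timedelta


def gc_grace_for_days(days: int) -> int:
    """Convert a number of days into the seconds value gc_grace_seconds expects."""
    return int(timedelta(days=days).total_seconds())


def alter_statement(keyspace: str, table: str, days: int) -> str:
    """Build a CQL ALTER TABLE statement (keyspace/table names are hypothetical)."""
    return (
        f"ALTER TABLE {keyspace}.{table} "
        f"WITH gc_grace_seconds = {gc_grace_for_days(days)};"
    )


# Print the statement for a 34-day grace period on a placeholder table.
print(alter_statement("my_keyspace", "my_table", 34))
```

With a 34-day value, a repair scheduled once per calendar month still lands inside the grace window with a few days of slack.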

=Rob

RE: best way to measure repair times?

Posted by Jason Kushmaul | WDA <ja...@wda.com>.

Ian,

In my experience I don’t get any output from repair (2.0.7) that is useful until the keyspace is finished.  Perhaps this has been solved but we do something much more painful:


We tail the log on the node having repair run on it, watching for the first repair session, and then count each “session completed” line.  Each keyspace being repaired will produce num_tokens worth of messages.

Find the start time:
$ grep AntiEntropy /var/log/cassandra/system.log | grep -m 1 "new session"
INFO [AntiEntropySessions:1] 2015-01-06 08:00:01,817 RepairSession.java (line 244) [repair #1c1023c0-95b0-11e4-abc7-9d8c76a06ae7] new session: will sync /10.x.y.z, /10.x.y.z on range (2770269247941187446,2771538486312712323] for menomena.[x, y, z]
Note: you have to catch the *first* message; more will follow.  It would be great if the log output had a differentiator to tell the initial start of a repair apart from the start of a new range.


So start_time = 2015-01-06 08:00:01,817


From there you count session completed messages:
$ grep AntiEntropy /var/log/cassandra/system.log | grep "session completed" | wc -l
Each line being counted looks like this:
INFO [AntiEntropySessions:192] 2015-01-06 14:35:13,874 RepairSession.java (line 282) [repair #1c1023c0-95b0-11e4-abc7-9d8c76a06ae7] session completed successfully

Since I have num_tokens=256, if I see a count of 412, I know that OpsCenter (256) is finished and menomena (256) is about 60% finished (156 of its 256 sessions done, about 40% still to go).

As Jan said, you could then use this to calculate remaining time from the start time and the remainder of the ranges.

I’ve found this gives me an immediate indication of progress, rather than having to wait for the keyspace to finish.  We are running 2.0.7; maybe some of this has since been exposed through nodetool repair (which would be sweet).  This seems to be more or less accurate, but please correct me if I am wrong.  We use this more for automatically detecting long-running repairs than for simply watching progress; our internal Zabbix server will whine to my team about them.
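
The counting approach above can be sketched in a few lines of Python. This is a minimal illustration, not a tested monitoring tool: it assumes the system.log line format quoted in this message, and it assumes (as described above, not verified against the Cassandra source) that each keyspace produces num_tokens "session completed" lines. The function names are made up for the example.

```python
import re
from datetime import datetime

# Timestamp format seen in the system.log lines quoted above.
STAMP = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}")


def stamp_of(line):
    """Return the first log timestamp on the line, or None if there is none."""
    m = STAMP.search(line)
    return datetime.strptime(m.group(0), "%Y-%m-%d %H:%M:%S,%f") if m else None


def keyspace_progress(completed, num_tokens):
    """Split a session count into (keyspaces finished, fraction of the current one)."""
    done, partial = divmod(completed, num_tokens)
    return done, partial / num_tokens


def estimate(lines, num_tokens, keyspaces):
    """Count completed sessions and extrapolate the remaining time linearly."""
    start = last = None
    completed = 0
    for line in lines:
        if "AntiEntropy" not in line:
            continue
        if "new session" in line:
            if start is None:          # only the *first* new session marks the start
                start = stamp_of(line)
        elif "session completed" in line:
            completed += 1
            last = stamp_of(line)
    total = num_tokens * keyspaces     # assumption: num_tokens sessions per keyspace
    remaining = None
    if start and last and completed:
        remaining = (last - start) / completed * (total - completed)
    return {"start": start, "completed": completed,
            "total": total, "remaining": remaining}
```

For the numbers in this message, keyspace_progress(412, 256) reports one finished keyspace and a fraction of 156/256 for the one in progress. The remaining-time figure is a straight-line extrapolation, so it is only as good as the assumption that ranges take similar time.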


Jason Kushmaul | V.P. Mobile Engineering
4050 Hunsaker Drive | East Lansing, MI 48823 USA
517-337-2701 x 5225| 517-337-2754 (fax)

From: Jan [mailto:cnet62@yahoo.com]
Sent: Thursday, March 19, 2015 4:04 PM
To: user@cassandra.apache.org
Subject: Re: best way to measure repair times?

Ian;

to respond to your specific question:

You could pipe the output of your repair into a file and subsequently determine the time taken.
example:

nodetool repair -dc DC1
[2014-07-24 21:59:55,326] Nothing to repair for keyspace 'system'
[2014-07-24 21:59:55,617] Starting repair command #2, repairing 490 ranges
  for keyspace system_traces (seq=true, full=true)
[2014-07-24 22:23:14,299] Repair session 323b9490-137e-11e4-88e3-c972e09793ca
  for range (820981369067266915,822627736366088177] finished
[2014-07-24 22:23:14,320] Repair session 38496a61-137e-11e4-88e3-c972e09793ca
  for range (2506042417712465541,2515941262699962473] finished



What to look for:

a)  Look for the specific name of the Keyspace & the word 'starting repair'

b)  Look for the word 'finished'.

c)  Compute the average time per keyspace and you would be able to have a rough idea of how long your repairs would take on a regular basis.  This is only for continual operational repair, not the first time it's done.



hope this helps

Jan/





On Thursday, March 19, 2015 12:55 PM, Paulo Motta <pa...@gmail.com> wrote:

From: http://www.datastax.com/dev/blog/modern-hinted-handoff
Repair and the fine print
At first glance, it may appear that Hinted Handoff lets you safely get away without needing repair. This is only true if you never have hardware failure. Hardware failure means that

 1.  We lose “historical” data for which the write has already finished, so there is nothing to tell the rest of the cluster exactly what data has gone missing
 2.  We can also lose hints-not-yet-replayed from requests the failed node coordinated
With sufficient dedication, you can get by with “only run repair after hardware failure and rely on hinted handoff the rest of the time,” but as your clusters grow (and hardware failure becomes more common) performing repair as a one-off special case will become increasingly difficult to do perfectly. Thus, we continue to recommend running a full repair weekly.


2015-03-19 16:42 GMT-03:00 Robert Coli <rc...@eventbrite.com>:
On Thu, Mar 19, 2015 at 12:13 PM, Ali Akhtar <al...@gmail.com> wrote:
Cassandra doesn't guarantee eventual consistency?

If you run regularly scheduled repair, it does. If you do not run repair, it does not.

Hinted handoff, for example, is considered an optimization for repair, and does not assert that it provides a consistency guarantee.

=Rob
http://twitter.com/rcolidba



--
Paulo Ricardo

--
European Master in Distributed Computing
Royal Institute of Technology - KTH
Instituto Superior Técnico - IST
http://paulormg.com

Re: best way to measure repair times?

Posted by Robert Coli <rc...@eventbrite.com>.

On Thu, Mar 19, 2015 at 1:03 PM, Jan <cn...@yahoo.com> wrote:

> to respond to your specific question:
>
> You could pipe the output of your repair into a file and subsequently
> determine the time taken.
>

By this method, what is the duration of a repair which will never complete?
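
One hedged way to deal with a repair that never completes is to stop measuring "time until finished" and instead track "time since the last sign of progress", flagging the repair as stalled once that exceeds a limit. The sketch below assumes the caller already knows how many sessions have finished, how many are expected, and when the last one finished; the function name and the threshold are arbitrary choices for illustration.

```python
from datetime import datetime, timedelta


def repair_state(finished, expected, last_progress, now, max_idle):
    """Classify a repair as 'finished', 'stalled' or 'running'.

    finished      -- number of sessions seen to finish so far
    expected      -- number of sessions expected in total
    last_progress -- datetime of the most recent finished session (or the start)
    now           -- current datetime
    max_idle      -- timedelta after which silence counts as a hang (arbitrary)
    """
    if expected and finished >= expected:
        return "finished"
    if now - last_progress > max_idle:
        return "stalled"
    return "running"


# Example: 412 of 512 sessions done, nothing new for three hours, limit of one hour.
state = repair_state(
    finished=412,
    expected=512,
    last_progress=datetime(2015, 1, 6, 14, 35, 13),
    now=datetime(2015, 1, 6, 17, 35, 13),
    max_idle=timedelta(hours=1),
)
print(state)
```

A stalled repair then has a bounded, reportable outcome (an alert) instead of an unbounded duration.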

=Rob

Re: best way to measure repair times?

Posted by Ian Rose <ia...@fullstory.com>.

Thanks Jan, although I'm a bit unsure of the details.  It looks like when
you run a repair this actually occurs over several "sessions".  e.g. in
your example above there are 2 different "repair session [...] finished"
lines.  So does it make sense that I would want to measure between when I
first see the "Starting repair command..." line until the *last* "repair
session [...] finished" line?  If so, how do I know when I have seen the
last session finish?  Is there a way to know how many sessions there will
be (perhaps 1 per range)?  And how do I correlate session logs to the
repair, since the session logs identify the repair like
"#22f77ad0-cad0-11e4-8f34-77e1731d15ff" whereas the "starting repair" log
identifies it with a much smaller number (e.g. "repair command #2").
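
To make the question concrete, here is a rough Python sketch of the measurement being described. It rests on an assumption the thread does not confirm: that there is one session per range, so the "repairing N ranges" figure on the "Starting repair command" line is the number of "finished" lines to wait for. It parses only the two line shapes shown in Jan's sample output and re-joins lines that were wrapped onto a second line; the function names are invented for the example.

```python
import re
from datetime import datetime

STAMPED = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\]\s*(.*)$")
START = re.compile(r"Starting repair command #(\d+), repairing (\d+) ranges")
DONE = re.compile(r"Repair session (\S+)\s+for range .* finished")


def unwrap(lines):
    """Re-join continuation lines (those without a leading '[timestamp]')."""
    out = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("[") or not out:
            out.append(line)
        else:
            out[-1] += " " + line
    return out


def summarize(lines):
    """Summarize the most recent repair command found in nodetool output."""
    command = expected = started = last_done = None
    sessions = set()
    for line in unwrap(lines):
        m = STAMPED.match(line)
        if not m:
            continue
        when = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S,%f")
        body = m.group(2)
        s = START.search(body)
        if s:
            # A new command resets the counters.
            command, expected = int(s.group(1)), int(s.group(2))
            started, last_done = when, None
            sessions.clear()
            continue
        d = DONE.search(body)
        if d and started is not None:
            sessions.add(d.group(1))
            last_done = when
    return {
        "command": command,
        "expected": expected,            # assumption: one session per range
        "finished": len(sessions),
        "elapsed": (last_done - started) if started and last_done else None,
        "complete": expected is not None and len(sessions) >= expected,
    }
```

Because the output is read sequentially, every session line that follows a "Starting repair command #N" line is attributed to that command, which sidesteps the mismatch between the short command number and the long session identifiers.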

- Ian


On Thu, Mar 19, 2015 at 4:03 PM, Jan <cn...@yahoo.com> wrote:

> Ian;
>
> to respond to your specific question:
>
> You could pipe the output of your repair into a file and subsequently
> determine the time taken.
> example:
>
> nodetool repair -dc DC1
> [2014-07-24 21:59:55,326] Nothing to repair for keyspace 'system'
> [2014-07-24 21:59:55,617] Starting repair command #2, repairing 490 ranges
>   for keyspace system_traces (seq=true, full=true)
> [2014-07-24 22:23:14,299] Repair session 323b9490-137e-11e4-88e3-c972e09793ca
>   for range (820981369067266915,822627736366088177] finished
> [2014-07-24 22:23:14,320] Repair session 38496a61-137e-11e4-88e3-c972e09793ca
>   for range (2506042417712465541,2515941262699962473] finished
>
>
> What to look for:
>
> a)  Look for the specific name of the Keyspace & the word 'starting repair'
>
> b)  Look for the word 'finished'.
>
> c)  Compute the average time per keyspace and you would be able to have a rough idea of how long your repairs would take on a regular basis.  This is only for continual operational repair, not the first time it's done.
>
>
> hope this helps
>
> Jan/
>
>
>
>
>
>
>   On Thursday, March 19, 2015 12:55 PM, Paulo Motta <
> pauloricardomg@gmail.com> wrote:
>
>
> From: http://www.datastax.com/dev/blog/modern-hinted-handoff
> Repair and the fine print
> At first glance, it may appear that Hinted Handoff lets you safely get
> away without needing repair. This is only true if you never have hardware
> failure. Hardware failure means that
>
>    1. We lose “historical” data for which the write has already finished,
>    so there is nothing to tell the rest of the cluster exactly what data has
>    gone missing
>    2. We can also lose hints-not-yet-replayed from requests the failed
>    node coordinated
>
> With sufficient dedication, you can get by with “only run repair after
> hardware failure and rely on hinted handoff the rest of the time,” but as
> your clusters grow (and hardware failure becomes more common) performing
> repair as a one-off special case will become increasingly difficult to do
> perfectly. Thus, we continue to recommend running a full repair weekly.
>
>
> 2015-03-19 16:42 GMT-03:00 Robert Coli <rc...@eventbrite.com>:
>
> On Thu, Mar 19, 2015 at 12:13 PM, Ali Akhtar <al...@gmail.com> wrote:
>
> Cassandra doesn't guarantee eventual consistency?
>
>
> If you run regularly scheduled repair, it does. If you do not run repair,
> it does not.
>
> Hinted handoff, for example, is considered an optimization for repair, and
> does not assert that it provides a consistency guarantee.
>
> =Rob
> http://twitter.com/rcolidba
>
>
>
>
> --
> Paulo Ricardo
>
> --
> European Master in Distributed Computing
>
> *Royal Institute of Technology - KTH*
> *Instituto Superior Técnico - IST*
> *http://paulormg.com <http://paulormg.com/>*
>
>
>

Re: best way to measure repair times?

Posted by Jan <cn...@yahoo.com>.

Ian;

to respond to your specific question:

You could pipe the output of your repair into a file and subsequently determine the time taken.
example:

nodetool repair -dc DC1
[2014-07-24 21:59:55,326] Nothing to repair for keyspace 'system'
[2014-07-24 21:59:55,617] Starting repair command #2, repairing 490 ranges
  for keyspace system_traces (seq=true, full=true)
[2014-07-24 22:23:14,299] Repair session 323b9490-137e-11e4-88e3-c972e09793ca
  for range (820981369067266915,822627736366088177] finished
[2014-07-24 22:23:14,320] Repair session 38496a61-137e-11e4-88e3-c972e09793ca
  for range (2506042417712465541,2515941262699962473] finished

What to look for:

a)  Look for the specific name of the keyspace & the words 'starting repair'
b)  Look for the word 'finished'.
c)  Compute the average time per keyspace and you would be able to have a rough idea of how long your repairs would take on a regular basis.  This is only for continual operational repair, not the first time it's done.

hope this helps

Jan/
 



     On Thursday, March 19, 2015 12:55 PM, Paulo Motta <pa...@gmail.com> wrote:
   

 From: http://www.datastax.com/dev/blog/modern-hinted-handoff

Repair and the fine print
At first glance, it may appear that Hinted Handoff lets you safely get away without needing repair. This is only true if you never have hardware failure. Hardware failure means that   
   - We lose “historical” data for which the write has already finished, so there is nothing to tell the rest of the cluster exactly what data has gone missing
   - We can also lose hints-not-yet-replayed from requests the failed node coordinated
With sufficient dedication, you can get by with “only run repair after hardware failure and rely on hinted handoff the rest of the time,” but as your clusters grow (and hardware failure becomes more common) performing repair as a one-off special case will become increasingly difficult to do perfectly. Thus, we continue to recommend running a full repair weekly.

2015-03-19 16:42 GMT-03:00 Robert Coli <rc...@eventbrite.com>:

On Thu, Mar 19, 2015 at 12:13 PM, Ali Akhtar <al...@gmail.com> wrote:

Cassandra doesn't guarantee eventual consistency? 

If you run regularly scheduled repair, it does. If you do not run repair, it does not.
Hinted handoff, for example, is considered an optimization for repair, and does not assert that it provides a consistency guarantee.
=Rob
http://twitter.com/rcolidba



-- 
Paulo Ricardo
-- 
European Master in Distributed Computing
Royal Institute of Technology - KTH
Instituto Superior Técnico - IST
http://paulormg.com

Re: best way to measure repair times?

Posted by Robert Coli <rc...@eventbrite.com>.

On Thu, Mar 19, 2015 at 12:53 PM, Paulo Motta <pa...@gmail.com>
wrote:

> This is only true if you never have hardware failure. Hardware failure
> means that
>
For the record, I hate this formulation for being a little too clever.

"This is never true, because we live in a world where hardware fails."

Would be a better phrasing.

=Rob

Re: best way to measure repair times?

Posted by Paulo Motta <pa...@gmail.com>.

From: http://www.datastax.com/dev/blog/modern-hinted-handoff
Repair and the fine print

At first glance, it may appear that Hinted Handoff lets you safely get away
without needing repair. This is only true if you never have hardware
failure. Hardware failure means that

   1. We lose “historical” data for which the write has already finished,
   so there is nothing to tell the rest of the cluster exactly what data has
   gone missing
   2. We can also lose hints-not-yet-replayed from requests the failed node
   coordinated

With sufficient dedication, you can get by with “only run repair after
hardware failure and rely on hinted handoff the rest of the time,” but as
your clusters grow (and hardware failure becomes more common) performing
repair as a one-off special case will become increasingly difficult to do
perfectly. Thus, we continue to recommend running a full repair weekly.

2015-03-19 16:42 GMT-03:00 Robert Coli <rc...@eventbrite.com>:

> On Thu, Mar 19, 2015 at 12:13 PM, Ali Akhtar <al...@gmail.com> wrote:
>
>> Cassandra doesn't guarantee eventual consistency?
>>
>
> If you run regularly scheduled repair, it does. If you do not run repair,
> it does not.
>
> Hinted handoff, for example, is considered an optimization for repair, and
> does not assert that it provides a consistency guarantee.
>
> =Rob
> http://twitter.com/rcolidba
>

-- 
Paulo Ricardo

-- 
European Master in Distributed Computing

*Royal Institute of Technology - KTH*
*Instituto Superior Técnico - IST*
*http://paulormg.com <http://paulormg.com>*

Re: best way to measure repair times?

Posted by Robert Coli <rc...@eventbrite.com>.

On Thu, Mar 19, 2015 at 12:13 PM, Ali Akhtar <al...@gmail.com> wrote:

> Cassandra doesn't guarantee eventual consistency?
>

If you run regularly scheduled repair, it does. If you do not run repair,
it does not.

Hinted handoff, for example, is considered an optimization for repair, and
does not assert that it provides a consistency guarantee.

=Rob
http://twitter.com/rcolidba

Re: best way to measure repair times?

Posted by Ali Akhtar <al...@gmail.com>.

Cassandra doesn't guarantee eventual consistency?

On Fri, Mar 20, 2015 at 12:04 AM, Robert Coli <rc...@eventbrite.com> wrote:

> On Thu, Mar 19, 2015 at 10:32 AM, Ali Akhtar <al...@gmail.com> wrote:
>
>> Just wondering - why do you have to trigger the repairs? Is that
>> necessary in Cassandra?
>>
>
> Manual repair is the only mechanism in Cassandra which guarantees
> consistency.
>
> A repair must be run once per gc_grace_seconds in every column family that
> does DELETE-like[1] operations.
>
> =Rob
> [1] including some forms of CQL UPDATE, etc.
>
>

Re: best way to measure repair times?

Posted by Robert Coli <rc...@eventbrite.com>.

On Thu, Mar 19, 2015 at 10:32 AM, Ali Akhtar <al...@gmail.com> wrote:

> Just wondering - why do you have to trigger the repairs? Is that necessary
> in Cassandra?
>

Manual repair is the only mechanism in Cassandra which guarantees
consistency.

A repair must be run once per gc_grace_seconds in every column family that
does DELETE-like[1] operations.

=Rob
[1] including some forms of CQL UPDATE, etc.

Re: best way to measure repair times?

Posted by Ali Akhtar <al...@gmail.com>.

Just wondering - why do you have to trigger the repairs? Is that necessary
in Cassandra?

(Sorry for the off topic question)

On Thu, Mar 19, 2015 at 10:30 PM, Ian Rose <ia...@fullstory.com> wrote:

> Howdy -
>
> I'd like to (a) monitor how long my repairs are taking, and (b) know when
> a repair is finished so that I can take some kind of followup action.
> What's the best way to tackle either or both of these?
>
> Some potentially relevant details:
>
> - running community apache cassandra (not DSE)
> - version 2.0.13
> - we currently trigger repairs via an external timer that
> calls forceRepairAsync on the StorageService mbean via JMX
>
> Thanks!
> - Ian
>
>