Posted to user@cassandra.apache.org by Lokesh Shrivastava <lo...@gmail.com> on 2016/09/18 12:58:21 UTC

Nodetool repair

Hi,

I tried to run the nodetool repair command on one of my keyspaces and found
that it took a lot more time than I anticipated. Is there a way to know in
advance the ETA of a manual repair before triggering it? I believe repair
performs the following operations -

1) Major compaction
2) Exchange of merkle trees with neighbouring nodes.

Is there any other operation performed during manual repair? What if I kill
the process in the middle?

Thanks.
Lokesh

Re: Nodetool repair

Posted by "Li, Guangxing" <gu...@pearson.com>.
Romain,

I was trying what you mentioned as below:

a. nodetool stop VALIDATION
b. echo run -b org.apache.cassandra.db:type=StorageService
forceTerminateAllRepairSessions | java -jar
/tmp/jmxterm/jmxterm-1.0-alpha-4-uber.jar
-l 127.0.0.1:7199

to stop a seemingly never-ending repair, but I am seeing really odd behavior
with C* 2.0.9. Here is what I did:
1. First, I ran 'nodetool tpstats' on all nodes in the cluster and saw that
only one node had 1 active and 1 pending AntiEntropySessions. All other
nodes did not have any pending or active AntiEntropySessions.
2. Then I grepped for 'Repair' in all logs on all nodes and saw absolutely
no repair-related activity in these logs for the past day.
3. Then, on the node that had the active AntiEntropySessions, I did steps
'a' and 'b' above. Now all of a sudden I started seeing repair activity: on
nodes that did not have pending AntiEntropySessions, I am seeing the
following in their logs:
INFO [NonPeriodicTasks:1] 2016-09-29 17:12:53,469 StreamingRepairTask.java
(line 87) [repair #e80e17d0-8667-11e6-a801-e172d7a67134] streaming task
succeed, returning response to /10.253.2.166
On node 10.253.2.166, which has the active/pending AntiEntropySessions, I am
seeing the following in the log:
INFO [AntiEntropySessions:136] 2016-09-29 17:03:02,405 RepairSession.java
(line 282) [repair #812dafe0-8666-11e6-a801-e172d7a67134] session completed
successfully

So it seems to me that doing forceTerminateAllRepairSessions actually
'wakes up' the dormant repair so it goes again. So far, the only way I have
found to stop a repair is to restart the C* node where the repair command
was initiated.
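
For reference, the restart sequence looks like this (just a sketch assuming
a service-managed package install; adjust the service name and privileges
to your environment):

nodetool drain
sudo service cassandra restart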

Thanks.

George.

On Fri, Sep 23, 2016 at 6:20 AM, Romain Hardouin <ro...@yahoo.fr>
wrote:

> OK. If you still have issues after setting streaming_socket_timeout_in_ms
> != 0, consider increasing request_timeout_in_ms to a high value, say 1 or 2
> minutes. See comments in
> https://issues.apache.org/jira/browse/CASSANDRA-7904
> Regarding 2.1, be sure to test incremental repair on your data before
> running it in production ;-)
>
> Romain
>

Re: Nodetool repair

Posted by Romain Hardouin <ro...@yahoo.fr>.
OK. If you still have issues after setting streaming_socket_timeout_in_ms != 0, consider increasing request_timeout_in_ms to a high value, say 1 or 2 minutes. See comments in https://issues.apache.org/jira/browse/CASSANDRA-7904
Regarding 2.1, be sure to test incremental repair on your data before running it in production ;-)
Romain

Re: Nodetool repair

Posted by "Li, Guangxing" <gu...@pearson.com>.
Thanks a lot, guys. That is lots of useful info to digest.
In my cassandra.yaml, request_timeout_in_ms is set to
10000, and streaming_socket_timeout_in_ms is not set, hence it takes the
default of 0. Looks like 2.1.x has made quite some improvement in this
area. Besides, I can use incremental repair. So for right now, I will kill
the repair using JMX when it hangs. I am looking into upgrading to 2.1.x.
Many thanks again. Great stuff!

George.

On Thu, Sep 22, 2016 at 9:47 AM, Romain Hardouin <ro...@yahoo.fr>
wrote:

> Alain, you replied faster, I didn't see your answer :-D
>

Re: Nodetool repair

Posted by Romain Hardouin <ro...@yahoo.fr>.
Alain, you replied faster, I didn't see your answer :-D

Re: Nodetool repair

Posted by Romain Hardouin <ro...@yahoo.fr>.
Hi,
@Matija: George wrote that he uses C* 2.0.9, so the Spotify master is OK for him :-) But you're right about C* >= 2.1, we also use a fork to run it against our 2.1 clusters.
@George: your repair might be slow and not necessarily stuck. As Alain said, check the progression of nodetool netstats. Did you set streaming_socket_timeout_in_ms to a value different than 0? What is the value of request_timeout_in_ms? Also, I suggest you upgrade to the latest 2.0.x (i.e. 2.0.17). No need to upgrade SSTables, but be sure to read https://github.com/apache/cassandra/blob/cassandra-2.0/NEWS.txt
Again, you should have a look at cassandra-reaper and the GUI; you will have a progress bar to follow the repair.
Finally, if you want to kill a repair, you can invoke forceTerminateAllRepairSessions with jmxterm on each node:
1. nodetool stop VALIDATION
2. echo run -b org.apache.cassandra.db:type=StorageService forceTerminateAllRepairSessions | java -jar /tmp/jmxterm/jmxterm-1.0-alpha-4-uber.jar -l 127.0.0.1:7199
jmxterm download: http://downloads.sourceforge.net/cyclops-group/jmxterm-1.0-alpha-4-uber.jar
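
For example, a small wrapper to run both steps on every node could look
like this (only a sketch; the host list is a placeholder and the jmxterm
path must match where you downloaded it):

for host in node1 node2 node3; do
  ssh "$host" "nodetool stop VALIDATION; \
    echo run -b org.apache.cassandra.db:type=StorageService forceTerminateAllRepairSessions \
      | java -jar /tmp/jmxterm/jmxterm-1.0-alpha-4-uber.jar -l 127.0.0.1:7199"
done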
Best,
Romain

On Thursday, September 22, 2016 at 16:45, "Li, Guangxing" <gu...@pearson.com> wrote:



Romain,

I had another repair that seemed to just hang last night. When I did 'nodetool tpstats' on the nodes, I see the following on the node where I initiated the repair:
AntiEntropySessions               1         1
On all other nodes, I see:
AntiEntropySessions               0         0

When I check the log for the pattern "session completed successfully" in system.log, I see the last finished range occurred 14 hours ago. So I think it is safe to say that the repair has hung somehow. In order to start another repair, do we need to 'kill' this repair? If so, how do we do that?

Thanks.

George.


On Thu, Sep 22, 2016 at 6:23 AM, Romain Hardouin <ro...@yahoo.fr> wrote:

I meant that pending (and active) AntiEntropySessions are a simple way to check if a repair is still running on a cluster. Also have a look at Cassandra reaper:
>- https://github.com/spotify/cassandra-reaper
>
>- https://github.com/spodkowinski/cassandra-reaper-ui
>
>Best,
>Romain
>
>
>
>
>On Wednesday, September 21, 2016 at 22:32, "Li, Guangxing" <gu...@pearson.com> wrote:
>
>Romain,
>
>I started running a new repair. If I see such behavior again, I will try what you mentioned.
>
>Thanks.
>

Re: Nodetool repair

Posted by Alain RODRIGUEZ <ar...@gmail.com>.
As Matija mentioned, my coworker Alexander worked on Reaper. I believe the
branches of most interest would be:

Incremental repairs on Reaper:
https://github.com/adejanovski/cassandra-reaper/tree/inc-repair-that-works
UI integration with incremental repairs on Reaper:
https://github.com/adejanovski/cassandra-reaper/tree/inc-repair-support-with-ui

@George

> When I check the log for pattern "session completed successfully" in
> system.log, I see the last finished range occurred 14 hours ago. So I
> think it is safe to say that the repair has hung somehow.
>

What is your current setting for 'streaming_socket_timeout_in_ms'? You
might want to be aware of
https://issues.apache.org/jira/browse/CASSANDRA-8611 and
https://issues.apache.org/jira/browse/CASSANDRA-11840

Depending on how long the streams are expected to be, you might want to try
3600000 ms (1 hour) if you are currently using 0, or increase the value if
it is already set and you think you might be hitting
https://issues.apache.org/jira/browse/CASSANDRA-11840
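
In cassandra.yaml that would look like the following (just a sketch; the 1
hour value is only the illustration above, not a universal recommendation,
and a node restart is needed for it to take effect):

# cassandra.yaml
streaming_socket_timeout_in_ms: 3600000    # 1 hour, instead of the default 0 (no timeout)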

> In order to start another repair, do we need to 'kill' this repair? If so,
> how do we do that?


Restarting the node is a straightforward way of doing that.

If you do not want to restart for some reason, you can use JMX (
forceTerminateAllRepairSessions). If you are going to use JMX and don't
know much about it, this video of the presentation done by Nate, another
coworker, at the Cassandra Summit 2016 might be of interest
https://www.youtube.com/watch?v=uiUThbonnpc&index=21&list=PLm-EPIkBI3YoiA-02vufoEj4CgYvIQgIk
.

C*heers,
-----------------------
Alain Rodriguez - @arodream - alain@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com


2016-09-22 16:45 GMT+02:00 Li, Guangxing <gu...@pearson.com>:

> Romain,
>
> I had another repair that seemed to just hang last night. When I did 'nodetool
> tpstats' on the nodes, I see the following on the node where I initiated the
> repair:
> AntiEntropySessions               1         1
> On all other nodes, I see:
> AntiEntropySessions               0         0
> When I check the log for pattern "session completed successfully" in
> system.log, I see the last finished range occurred 14 hours ago. So I
> think it is safe to say that the repair has hung somehow. In order to
> start another repair, do we need to 'kill' this repair? If so, how do we do
> that?
>
> Thanks.
>
> George.
>
> On Thu, Sep 22, 2016 at 6:23 AM, Romain Hardouin <ro...@yahoo.fr>
> wrote:
>
>> I meant that pending (and active) AntiEntropySessions are a simple way to
>> check if a repair is still running on a cluster. Also have a look at
>> Cassandra reaper:
>> - https://github.com/spotify/cassandra-reaper
>>
>> - https://github.com/spodkowinski/cassandra-reaper-ui
>>
>> Best,
>> Romain
>>
>>
>>
>> On Wednesday, September 21, 2016 at 22:32, "Li, Guangxing" <
>> guangxing.li@pearson.com> wrote:
>>
>> Romain,
>>
>> I started running a new repair. If I see such behavior again, I will try
>> what you mentioned.
>>
>> Thanks.
>>
>
>

Re: Nodetool repair

Posted by "Li, Guangxing" <gu...@pearson.com>.
Romain,

I had another repair that seemed to just hang last night. When I did 'nodetool
tpstats' on the nodes, I see the following on the node where I initiated the
repair:
AntiEntropySessions               1         1
On all other nodes, I see:
AntiEntropySessions               0         0
When I check the log for pattern "session completed successfully" in
system.log, I see the last finished range occurred 14 hours ago. So I
think it is safe to say that the repair has hung somehow. In order to
start another repair, do we need to 'kill' this repair? If so, how do we do
that?

Thanks.

George.

On Thu, Sep 22, 2016 at 6:23 AM, Romain Hardouin <ro...@yahoo.fr>
wrote:

> I meant that pending (and active) AntiEntropySessions are a simple way to
> check if a repair is still running on a cluster. Also have a look at
> Cassandra reaper:
> - https://github.com/spotify/cassandra-reaper
>
> - https://github.com/spodkowinski/cassandra-reaper-ui
>
> Best,
> Romain
>
>
>
> On Wednesday, September 21, 2016 at 22:32, "Li, Guangxing" <
> guangxing.li@pearson.com> wrote:
>
> Romain,
>
> I started running a new repair. If I see such behavior again, I will try
> what you mentioned.
>
> Thanks.
>

Re: Nodetool repair

Posted by Matija Gobec <ma...@gmail.com>.
Spotify has stopped working on their reaper. Alexander Dejanovski did a pretty
good job of forking it and adding incremental repair support.
Check his fork at https://github.com/adejanovski/cassandra-reaper

Matija

On Thu, Sep 22, 2016 at 2:23 PM, Romain Hardouin <ro...@yahoo.fr>
wrote:

> I meant that pending (and active) AntiEntropySessions are a simple way to
> check if a repair is still running on a cluster. Also have a look at
> Cassandra reaper:
> - https://github.com/spotify/cassandra-reaper
>
> - https://github.com/spodkowinski/cassandra-reaper-ui
>
> Best,
> Romain
>
>
>
> On Wednesday, September 21, 2016 at 22:32, "Li, Guangxing" <
> guangxing.li@pearson.com> wrote:
>
> Romain,
>
> I started running a new repair. If I see such behavior again, I will try
> what you mentioned.
>
> Thanks.
>

Re: Nodetool repair

Posted by Romain Hardouin <ro...@yahoo.fr>.
I meant that pending (and active) AntiEntropySessions are a simple way to check if a repair is still running on a cluster. Also have a look at Cassandra reaper:
- https://github.com/spotify/cassandra-reaper

- https://github.com/spodkowinski/cassandra-reaper-ui

Best,
Romain



On Wednesday, September 21, 2016 at 22:32, "Li, Guangxing" <gu...@pearson.com> wrote:

Romain,

I started running a new repair. If I see such behavior again, I will try what you mentioned.

Thanks.

Re: Nodetool repair

Posted by "Li, Guangxing" <gu...@pearson.com>.
Romain,

I started running a new repair. If I see such behavior again, I will try
what you mentioned.

Thanks.

On Wed, Sep 21, 2016 at 9:51 AM, Romain Hardouin <ro...@yahoo.fr>
wrote:

> Do you see any pending AntiEntropySessions (not AntiEntropyStage) with
> nodetool tpstats on nodes?
>
> Romain
>
>
> On Wednesday, September 21, 2016 at 16:45, "Li, Guangxing" <
> guangxing.li@pearson.com> wrote:
>
>
> Alain,
>
> my script actually greps through all the log files, including those
> system.log.*. So it was probably due to a failed session. So now my script
> assumes the repair has finished (possibly due to failure) if it does not
> see any more repair-related logs after 2 hours.
>
> Thanks.
>
> George.
>
> On Wed, Sep 21, 2016 at 3:03 AM, Alain RODRIGUEZ <ar...@gmail.com>
> wrote:
>
> Hi George,
>
> That's the best way to monitor repairs "out of the box" I could think of.
> When you're not seeing 2048 (in your case), it might be due to log rotation
> or to a session failure. Have you had a look at repair failures?
>
> I am wondering why the implementor did not put something in the log (e.g.
> ... Repair command #41 has ended...) to clearly state that the repair has
> completed.
>
>
> +1, and some information about ranges successfully repaired and the
> ranges that failed could be a very good thing as well. It would be easy to
> then read the repair result and to know what to do next (re-run repair on
> some ranges, move to the next node, etc).
>
>
> 2016-09-20 17:00 GMT+02:00 Li, Guangxing <gu...@pearson.com>:
>
> Hi,
>
> I am using version 2.0.9. I have been looking into the logs to see if a
> repair is finished. Each time a repair is started on a node, I am seeing
> log line like "INFO [Thread-112920] 2016-09-16 19:00:43,805
> StorageService.java (line 2646) Starting repair command #41, repairing 2048
> ranges for keyspace groupmanager" in system.log. So I know that I am
> expecting to see 2048 log lines like "INFO [AntiEntropySessions:109]
> 2016-09-16 19:27:20,662 RepairSession.java (line 282) [repair
> #8b910950-7c43-11e6-88f3-f147ea74230b] session completed successfully".
> Once I see 2048 such log lines, I know this repair has completed. But this
> is not dependable since sometimes I am seeing less than 2048 but I know
> there is no repair going on since I do not see any trace of repair in
> system.log for a long time. So it seems to me that there is a clear way to
> tell that a repair has started but there is no clear way to tell a repair
> has ended. The only thing you can do is to watch the log and if you do not
> see repair activity for a long time, the repair is done somehow. I am
> wondering why the implementor did not put something in the log (e.g. ...
> Repair command #41 has ended...) to clearly state that the repair has
> completed.
>
> Thanks.
>
> George.
>
> On Tue, Sep 20, 2016 at 2:54 AM, Jens Rantil <je...@tink.se> wrote:
>
> On Mon, Sep 19, 2016 at 3:07 PM Alain RODRIGUEZ <ar...@gmail.com>
> wrote:
>
> ...
>
> - The size of your data
> - The number of vnodes
> - The compaction throughput
> - The streaming throughput
> - The hardware available
> - The load of the cluster
> - ...
>
>
> I've also heard that the number of clustering keys per partition key could
> have an impact. Might be worth investigating.
>
> Cheers,
> Jens
> --
> Jens Rantil
> Backend Developer @ Tink
> Tink AB, Wallingatan 5, 111 60 Stockholm, Sweden
> For urgent matters you can reach me at +46-708-84 18 32.
>
>
>
>
>
>
>

Re: Nodetool repair

Posted by Romain Hardouin <ro...@yahoo.fr>.
Do you see any pending AntiEntropySessions (not AntiEntropyStage) with nodetool tpstats on nodes?
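
A quick way to check this on every node at once (just a sketch; replace the host list with your own):

for host in node1 node2 node3; do
  echo "== $host =="
  ssh "$host" "nodetool tpstats | grep -E 'Pool Name|AntiEntropySessions'"
done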
Romain
 

On Wednesday, September 21, 2016 at 16:45, "Li, Guangxing" <gu...@pearson.com> wrote:
 

Alain,
my script actually greps through all the log files, including those system.log.*. So it was probably due to a failed session. So now my script assumes the repair has finished (possibly due to failure) if it does not see any more repair-related logs after 2 hours.
Thanks.
George.
On Wed, Sep 21, 2016 at 3:03 AM, Alain RODRIGUEZ <ar...@gmail.com> wrote:

Hi George,
That's the best way to monitor repairs "out of the box" I could think of. When you're not seeing 2048 (in your case), it might be due to log rotation or to a session failure. Have you had a look at repair failures?

I am wondering why the implementor did not put something in the log (e.g. ... Repair command #41 has ended...) to clearly state that the repair has completed.

+1, and some information about ranges successfully repaired and the ranges that failed could be a very good thing as well. It would be easy to then read the repair result and to know what to do next (re-run repair on some ranges, move to the next node, etc).

2016-09-20 17:00 GMT+02:00 Li, Guangxing <gu...@pearson.com>:

Hi,
I am using version 2.0.9. I have been looking into the logs to see if a repair is finished. Each time a repair is started on a node, I am seeing log line like "INFO [Thread-112920] 2016-09-16 19:00:43,805 StorageService.java (line 2646) Starting repair command #41, repairing 2048 ranges for keyspace groupmanager" in system.log. So I know that I am expecting to see 2048 log lines like "INFO [AntiEntropySessions:109] 2016-09-16 19:27:20,662 RepairSession.java (line 282) [repair #8b910950-7c43-11e6-88f3-f147ea74230b] session completed successfully". Once I see 2048 such log lines, I know this repair has completed. But this is not dependable since sometimes I am seeing less than 2048 but I know there is no repair going on since I do not see any trace of repair in system.log for a long time. So it seems to me that there is a clear way to tell that a repair has started but there is no clear way to tell a repair has ended. The only thing you can do is to watch the log and if you do not see repair activity for a long time, the repair is done somehow. I am wondering why the implementor did not put something in the log (e.g. ... Repair command #41 has ended...) to clearly state that the repair has completed.
Thanks.
George.
On Tue, Sep 20, 2016 at 2:54 AM, Jens Rantil <je...@tink.se> wrote:

On Mon, Sep 19, 2016 at 3:07 PM Alain RODRIGUEZ <ar...@gmail.com> wrote:

...
- The size of your data
- The number of vnodes
- The compaction throughput
- The streaming throughput
- The hardware available
- The load of the cluster
- ...

I've also heard that the number of clustering keys per partition key could have an impact. Might be worth investigating.
Cheers,
Jens
--
Jens Rantil
Backend Developer @ Tink
Tink AB, Wallingatan 5, 111 60 Stockholm, Sweden
For urgent matters you can reach me at +46-708-84 18 32.







   

Re: Nodetool repair

Posted by "Li, Guangxing" <gu...@pearson.com>.
Alain,

my script actually greps through all the log files, including those
system.log.*. So it was probably due to a failed session. So now my script
assumes the repair has finished (possibly due to failure) if it does not
see any more repair-related logs after 2 hours.

Thanks.

George.

On Wed, Sep 21, 2016 at 3:03 AM, Alain RODRIGUEZ <ar...@gmail.com> wrote:

> Hi George,
>
> That's the best way to monitor repairs "out of the box" I could think of.
> When you're not seeing 2048 (in your case), it might be due to log rotation
> or to a session failure. Have you had a look at repair failures?
>
> I am wondering why the implementor did not put something in the log (e.g.
>> ... Repair command #41 has ended...) to clearly state that the repair has
>> completed.
>
>
> +1, and some information about ranges successfully repaired and the
> ranges that failed could be a very good thing as well. It would be easy to
> then read the repair result and to know what to do next (re-run repair on
> some ranges, move to the next node, etc).
>
>
> 2016-09-20 17:00 GMT+02:00 Li, Guangxing <gu...@pearson.com>:
>
>> Hi,
>>
>> I am using version 2.0.9. I have been looking into the logs to see if a
>> repair is finished. Each time a repair is started on a node, I am seeing
>> log line like "INFO [Thread-112920] 2016-09-16 19:00:43,805
>> StorageService.java (line 2646) Starting repair command #41, repairing 2048
>> ranges for keyspace groupmanager" in system.log. So I know that I am
>> expecting to see 2048 log lines like "INFO [AntiEntropySessions:109]
>> 2016-09-16 19:27:20,662 RepairSession.java (line 282) [repair
>> #8b910950-7c43-11e6-88f3-f147ea74230b] session completed successfully".
>> Once I see 2048 such log lines, I know this repair has completed. But this
>> is not dependable since sometimes I am seeing less than 2048 but I know
>> there is no repair going on since I do not see any trace of repair in
>> system.log for a long time. So it seems to me that there is a clear way to
>> tell that a repair has started but there is no clear way to tell a repair
>> has ended. The only thing you can do is to watch the log and if you do not
>> see repair activity for a long time, the repair is done somehow. I am
>> wondering why the implementor did not put something in the log (e.g. ...
>> Repair command #41 has ended...) to clearly state that the repair has
>> completed.
>>
>> Thanks.
>>
>> George.
>>
>> On Tue, Sep 20, 2016 at 2:54 AM, Jens Rantil <je...@tink.se> wrote:
>>
>>> On Mon, Sep 19, 2016 at 3:07 PM Alain RODRIGUEZ <ar...@gmail.com>
>>> wrote:
>>>
>>> ...
>>>
>>>> - The size of your data
>>>> - The number of vnodes
>>>> - The compaction throughput
>>>> - The streaming throughput
>>>> - The hardware available
>>>> - The load of the cluster
>>>> - ...
>>>>
>>>
>>> I've also heard that the number of clustering keys per partition key
>>> could have an impact. Might be worth investigating.
>>>
>>> Cheers,
>>> Jens
>>> --
>>>
>>> Jens Rantil
>>> Backend Developer @ Tink
>>>
>>> Tink AB, Wallingatan 5, 111 60 Stockholm, Sweden
>>> For urgent matters you can reach me at +46-708-84 18 32.
>>>
>>
>>
>

Re: Nodetool repair

Posted by Alain RODRIGUEZ <ar...@gmail.com>.
Hi George,

That's the best way to monitor repairs "out of the box" I could think of.
When you're not seeing 2048 (in your case), it might be due to log rotation
or to a session failure. Have you had a look at repair failures?
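
A simple way to look for failures in the logs (a sketch; the log path and the exact wording of error messages can vary by version):

grep -i 'repair' /var/log/cassandra/system.log* | grep -iE 'error|fail|exception'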

> I am wondering why the implementor did not put something in the log (e.g.
> ... Repair command #41 has ended...) to clearly state that the repair has
> completed.


+1, and some information about ranges successfully repaired and the ranges
that failed could be a very good thing as well. It would be easy to then
read the repair result and to know what to do next (re-run repair on some
ranges, move to the next node, etc).


2016-09-20 17:00 GMT+02:00 Li, Guangxing <gu...@pearson.com>:

> Hi,
>
> I am using version 2.0.9. I have been looking into the logs to see if a
> repair is finished. Each time a repair is started on a node, I am seeing
> log line like "INFO [Thread-112920] 2016-09-16 19:00:43,805
> StorageService.java (line 2646) Starting repair command #41, repairing 2048
> ranges for keyspace groupmanager" in system.log. So I know that I am
> expecting to see 2048 log lines like "INFO [AntiEntropySessions:109]
> 2016-09-16 19:27:20,662 RepairSession.java (line 282) [repair
> #8b910950-7c43-11e6-88f3-f147ea74230b] session completed successfully".
> Once I see 2048 such log lines, I know this repair has completed. But this
> is not dependable since sometimes I am seeing less than 2048 but I know
> there is no repair going on since I do not see any trace of repair in
> system.log for a long time. So it seems to me that there is a clear way to
> tell that a repair has started but there is no clear way to tell a repair
> has ended. The only thing you can do is to watch the log and if you do not
> see repair activity for a long time, the repair is done somehow. I am
> wondering why the implementor did not put something in the log (e.g. ...
> Repair command #41 has ended...) to clearly state that the repair has
> completed.
>
> Thanks.
>
> George.
>
> On Tue, Sep 20, 2016 at 2:54 AM, Jens Rantil <je...@tink.se> wrote:
>
>> On Mon, Sep 19, 2016 at 3:07 PM Alain RODRIGUEZ <ar...@gmail.com>
>> wrote:
>>
>> ...
>>
>>> - The size of your data
>>> - The number of vnodes
>>> - The compaction throughput
>>> - The streaming throughput
>>> - The hardware available
>>> - The load of the cluster
>>> - ...
>>>
>>
>> I've also heard that the number of clustering keys per partition key
>> could have an impact. Might be worth investigating.
>>
>> Cheers,
>> Jens
>> --
>>
>> Jens Rantil
>> Backend Developer @ Tink
>>
>> Tink AB, Wallingatan 5, 111 60 Stockholm, Sweden
>> For urgent matters you can reach me at +46-708-84 18 32.
>>
>
>

Re: Nodetool repair

Posted by "Li, Guangxing" <gu...@pearson.com>.
Hi,

I am using version 2.0.9. I have been looking into the logs to see if a
repair is finished. Each time a repair is started on a node, I am seeing
log line like "INFO [Thread-112920] 2016-09-16 19:00:43,805
StorageService.java (line 2646) Starting repair command #41, repairing 2048
ranges for keyspace groupmanager" in system.log. So I know that I am
expecting to see 2048 log lines like "INFO [AntiEntropySessions:109]
2016-09-16 19:27:20,662 RepairSession.java (line 282) [repair
#8b910950-7c43-11e6-88f3-f147ea74230b] session completed successfully".
Once I see 2048 such log lines, I know this repair has completed. But this
is not dependable since sometimes I am seeing less than 2048 but I know
there is no repair going on since I do not see any trace of repair in
system.log for a long time. So it seems to me that there is a clear way to
tell that a repair has started but there is no clear way to tell a repair
has ended. The only thing you can do is to watch the log and if you do not
see repair activity for a long time, the repair is done somehow. I am
wondering why the implementor did not put something in the log (e.g. ...
Repair command #41 has ended...) to clearly state that the repair has
completed.
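
A sketch of the counting I am doing (the log path is from my setup; note this also counts sessions from any previous repair still present in the logs):

expected=2048   # from the "Starting repair command #41, repairing 2048 ranges" line
completed=$(grep -h 'session completed successfully' /var/log/cassandra/system.log* | wc -l)
echo "$completed / $expected sessions completed"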

Thanks.

George.

On Tue, Sep 20, 2016 at 2:54 AM, Jens Rantil <je...@tink.se> wrote:

> On Mon, Sep 19, 2016 at 3:07 PM Alain RODRIGUEZ <ar...@gmail.com>
> wrote:
>
> ...
>
>> - The size of your data
>> - The number of vnodes
>> - The compaction throughput
>> - The streaming throughput
>> - The hardware available
>> - The load of the cluster
>> - ...
>>
>
> I've also heard that the number of clustering keys per partition key could
> have an impact. Might be worth investigating.
>
> Cheers,
> Jens
> --
>
> Jens Rantil
> Backend Developer @ Tink
>
> Tink AB, Wallingatan 5, 111 60 Stockholm, Sweden
> For urgent matters you can reach me at +46-708-84 18 32.
>

Re: Nodetool repair

Posted by Jens Rantil <je...@tink.se>.
On Mon, Sep 19, 2016 at 3:07 PM Alain RODRIGUEZ <ar...@gmail.com> wrote:

...

> - The size of your data
> - The number of vnodes
> - The compaction throughput
> - The streaming throughput
> - The hardware available
> - The load of the cluster
> - ...
>

I've also heard that the number of clustering keys per partition key could
have an impact. Might be worth investigating.

Cheers,
Jens
-- 

Jens Rantil
Backend Developer @ Tink

Tink AB, Wallingatan 5, 111 60 Stockholm, Sweden
For urgent matters you can reach me at +46-708-84 18 32.

Re: Nodetool repair

Posted by Alain RODRIGUEZ <ar...@gmail.com>.
Hi Lokesh,

Repair is a regular, very common, and yet non-trivial operation in
Cassandra. A lot of people struggle with it.

Some good talks about repairs were given during the summit; you might want
to have a look at the DataStax YouTube channel in a few days :-).
https://www.youtube.com/user/DataStaxMedia

> Is there a way to know in advance the ETA of a manual repair before
> triggering it?
>

There is no such thing, probably because the duration of the repair is
going to depend on:

- The size of your data
- The number of vnodes
- The compaction throughput
- The streaming throughput
- The hardware available
- The load of the cluster
- ...

So the best thing to do is to benchmark it in your own environment. You can
track repairs using the logs. I used something like this in the past:

for i in $(echo "SELECT columnfamily_name FROM system.schema_columns WHERE
keyspace_name = 'my_keyspace';" | cqlsh | uniq | tail -n +4 | head -n -2);
do echo Sessions synced for $i: $(grep -i "$i is fully synced"
/var/log/cassandra/system.log* | wc -l); done

Depending on your version of Cassandra - and the path to your logs - this
may or may not work; you might need to adjust it. The number of "sessions"
depends on the number of nodes and of vnodes. But the number of sessions
will be the same on all the tables, from all the nodes, if you are using
the same number of vnodes.

So you will soon have a good idea of how long it takes to repair a table /
a keyspace, and some information about the completeness of the repairs (be
aware of log rotation and of previous repairs' logs if using the command
above).

How fast repair can go will also depend on the options and techniques you
are using:

- Subranges: https://github.com/BrianGallew/cassandra_range_repair ?
- Incremental / Full repairs ?

> I believe repair performs the following operations -
>
> 1) Major compaction
> 2) Exchange of merkle trees with neighbouring nodes.
>

> AFAIK, a repair doesn't trigger a major compaction, but I might be wrong
> here.


Jens is right, there is no major compaction in there. This is how repair
(roughly) works. There are 2 main steps:

- Compare / exchange merkle trees (done through a VALIDATION compaction,
like a compaction, but without the write phase)
- Streaming: Any mismatch detected in the previous validation is fixed by
streaming a larger block of data (read more about that:
http://www.datastax.com/dev/blog/advanced-repair-techniques)

To monitor those operations, use:

- validation: nodetool compactionstats -H (Look for "VALIDATION COMPACTION"
off the top of my head)
- streaming: watch -d 'nodetool netstats -H | grep -v 100%'

You should think about what would be a good repair strategy according to
your use case and workload (run repairs by night? Use subranges?). Keep
in mind that "nodetool repair" is useful to reduce entropy in your cluster,
and so reduces the risk of inconsistencies. Repair also prevents deleted
data from reappearing (zombies) as long as it is run cluster-wide within
gc_grace_seconds (a per-table option).
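
For example, to check a table's current gc_grace_seconds from the shell on a 2.0 cluster (a sketch; the keyspace and table names are placeholders):

echo "SELECT gc_grace_seconds FROM system.schema_columnfamilies WHERE keyspace_name = 'my_keyspace' AND columnfamily_name = 'my_table';" | cqlsh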

> What if I kill the process in the middle?


This is safe; some parts of the data will simply not be repaired on this
node, that's it. You can either restart the node or use the right JMX
command.

C*heers,
-----------------------
Alain Rodriguez - @arodream - alain@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

2016-09-19 11:18 GMT+02:00 Jens Rantil <je...@tink.se>:

> Hi Lokesh,
>
> Which version of Cassandra are you using? Which compaction strategy are
> you using?
>
> AFAIK, a repair doesn't trigger a major compaction, but I might be wrong
> here.
>
> What you could do is to run a repair for a subset of the ring (see `-st`
> and `-et` `nodetool repair` parameters). If you repair 1/1000 of the ring,
> repairing the whole ring will take ~1000x longer than your sample.
>
> Also, you might want to look at incremental repairs.
>
> If you kill the process in the middle the repair will not start again. You
> will need to reissue it.
>
> Cheers,
> Jens
>
> On Sun, Sep 18, 2016 at 2:58 PM Lokesh Shrivastava <
> lokesh.shrivastava@gmail.com> wrote:
>
>> Hi,
>>
>> I tried to run the nodetool repair command on one of my keyspaces and found
>> that it took a lot more time than I anticipated. Is there a way to know in
>> advance the ETA of a manual repair before triggering it? I believe repair
>> performs the following operations -
>>
>> 1) Major compaction
>> 2) Exchange of merkle trees with neighbouring nodes.
>>
>> Is there any other operation performed during manual repair? What if I
>> kill the process in the middle?
>>
>> Thanks.
>> Lokesh
>>
> --
>
> Jens Rantil
> Backend Developer @ Tink
>
> Tink AB, Wallingatan 5, 111 60 Stockholm, Sweden
> For urgent matters you can reach me at +46-708-84 18 32.
>

Re: Nodetool repair

Posted by Jens Rantil <je...@tink.se>.
Hi Lokesh,

Which version of Cassandra are you using? Which compaction strategy are you
using?

AFAIK, a repair doesn't trigger a major compaction, but I might be wrong
here.

What you could do is to run a repair for a subset of the ring (see `-st`
and `-et` `nodetool repair` parameters). If you repair 1/1000 of the ring,
repairing the whole ring will take ~1000x longer than your sample.
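
A sketch of such a sample run (the token values are made up for illustration; with Murmur3Partitioner the full ring spans -2^63 to 2^63-1, so this covers roughly 1/1000 of it):

time nodetool repair my_keyspace -st -9223372036854775808 -et -9204925292781066256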

Also, you might want to look at incremental repairs.

If you kill the process in the middle the repair will not start again. You
will need to reissue it.

Cheers,
Jens

On Sun, Sep 18, 2016 at 2:58 PM Lokesh Shrivastava <
lokesh.shrivastava@gmail.com> wrote:

> Hi,
>
> I tried to run the nodetool repair command on one of my keyspaces and found
> that it took a lot more time than I anticipated. Is there a way to know in
> advance the ETA of a manual repair before triggering it? I believe repair
> performs the following operations -
>
> 1) Major compaction
> 2) Exchange of merkle trees with neighbouring nodes.
>
> Is there any other operation performed during manual repair? What if I
> kill the process in the middle?
>
> Thanks.
> Lokesh
>
-- 

Jens Rantil
Backend Developer @ Tink

Tink AB, Wallingatan 5, 111 60 Stockholm, Sweden
For urgent matters you can reach me at +46-708-84 18 32.