Posted to user@cassandra.apache.org by Haithem Jarraya <ha...@struq.com> on 2013/02/19 10:29:19 UTC

Long running nodetool repair

Hi,

I am new to Cassandra and I am not sure if this is normal behavior, but
nodetool repair runs for a very long time even with a small dataset per node.
I started a nodetool repair last night at 18:41; it is now 9:18 and it is
still running. The size of my data is only ~500 MB per node.
We have:
3-node cluster in DC1 with RF 3
1-node cluster in DC2 with RF 1
1-node cluster in DC3 with RF 1

and we are running Cassandra 1.2.1 with 256 vnodes.

In the Cassandra logs I no longer see any AntiEntropy entries, only compaction
task and FlushWriter entries.

Is this normal behaviour for nodetool repair?
Does the running time grow linearly with the size of the data?

Any help or direction will be much appreciated.


Thanks,

H

Re: Long running nodetool repair

Posted by Michael Kjellman <mk...@barracuda.com>.
This is very normal (unfortunately). Are you doing a repair -pr or a straight-up repair?

Does nodetool netstats show anything? I frequently see repair hang in 1.2.1, and I haven't been able to figure out why yet. Feel free to take a thread dump with jstack on the node doing the repair and see if there are any deadlocks occurring after the Merkle trees are received.
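
As a rough illustration of the jstack check, here is a minimal Python sketch that scans a saved thread dump. It assumes the dump was written to a file first (for example with jstack <pid> > repair.dump, where the file name is only an example), that the JVM marks deadlocks with the phrase "Java-level deadlock", and that repair-related threads have "AntiEntropy" in their names. The function name summarize_jstack is hypothetical.

```python
import sys


def summarize_jstack(dump_text):
    """Return (deadlock_found, repair_threads) for the text of a jstack dump."""
    # Assumption: the JVM announces deadlocks with this phrase.
    deadlock_found = "Java-level deadlock" in dump_text
    repair_threads = []
    for line in dump_text.splitlines():
        # Thread header lines start with the thread name in double quotes.
        # Assumption: repair threads carry "AntiEntropy" in their name.
        if line.startswith('"') and "AntiEntropy" in line:
            repair_threads.append(line.split('"')[1])
    return deadlock_found, repair_threads


if __name__ == "__main__":
    # Usage: python summarize_jstack.py repair.dump
    with open(sys.argv[1]) as f:
        found, threads = summarize_jstack(f.read())
    print("deadlock reported:", found)
    for name in threads:
        print("repair thread:", name)
```

If no deadlock is reported, the states of the listed repair threads in the full dump may still hint at what the repair is waiting on.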

And to help more, do you have the last logs after AntiEntropy? Are there any streaming sessions from other nodes?
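
One possible way to collect those last log lines is sketched below in Python. The log path /var/log/cassandra/system.log is an assumption (it depends on how Cassandra was installed), and last_matching_lines is a hypothetical helper name; the only thing taken from this thread is that repair entries contain the string "AntiEntropy".

```python
import sys


def last_matching_lines(lines, needle="AntiEntropy", count=20):
    """Return the last `count` lines that contain `needle`, newlines stripped."""
    matches = [line.rstrip("\n") for line in lines if needle in line]
    return matches[-count:]


if __name__ == "__main__":
    # Assumed default location; pass the real path as the first argument.
    path = sys.argv[1] if len(sys.argv) > 1 else "/var/log/cassandra/system.log"
    with open(path) as log:
        for entry in last_matching_lines(log):
            print(entry)
```

Calling the same helper with a different needle filters for other kinds of entries in the same way.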

Bug is being tracked here: https://issues.apache.org/jira/browse/CASSANDRA-5146

Best,
Michael
