Posted to user@cassandra.apache.org by Reynald BOURTEMBOURG <re...@esrf.fr> on 2014/11/28 15:46:11 UTC

Repair hanging with C* 2.1.2

Hi,

We have a three-node cluster running Cassandra 2.1.2 on Debian 7 (Linux).
Yesterday, I upgraded the nodes one by one from Cassandra 2.1.1 to
Cassandra 2.1.2, waiting several minutes between each upgrade.

More than two hours later, I executed "nodetool repair" on one of the
nodes (cass2).
It started repairing the keyspace we created (RF=3) and got stuck there.

The nodetool repair command still hasn't returned.

In OpsCenter, in the Activities section, I can see that one table
(always the same) is listed as being repaired on one of the nodes (cass3).
The status of this activity is "Running", with a progress of 0%.
It has been like that since yesterday afternoon.
We can also observe that the Cassandra daemon on that node has been using
100% of one CPU core ever since the repair started.
nodetool compactionstats on this node keeps giving the following output:

# nodetool compactionstats
pending tasks: 1
   compaction type   keyspace                    table   completed        total   unit   progress
        Validation        hdb   att_array_devdouble_rw         474   3623135133   bytes      0.00%
Active compaction remaining time :   0h00m00s

This command gives the following output on the other two nodes:
# nodetool compactionstats
pending tasks: 0

No obvious error was seen in the system.log files.
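
Regarding the 100% CPU on cass3: here is roughly how the busy thread could be
mapped to a Cassandra thread name, to check whether it really is the
validation. This is only a sketch; the pgrep pattern, the Debian-style setup,
the cassandra user and the availability of jstack (JDK) are assumptions about
this particular install.

# Find the Cassandra JVM and its per-thread CPU usage (TID = native thread id).
CASS_PID=$(pgrep -f CassandraDaemon | head -n 1)
ps -L -o tid,pcpu,comm -p "$CASS_PID" | sort -k 2 -rn | head -n 5

# Thread names (ValidationExecutor, CompactionExecutor, ...) only appear in a
# JVM thread dump, where the native id is printed in hex as nid=0x...
sudo -u cassandra jstack "$CASS_PID" > /tmp/cass_threads.txt
BUSY_TID=12345    # replace with the busiest TID from the ps output above
grep -A 5 "nid=0x$(printf '%x' "$BUSY_TID")" /tmp/cass_threads.txt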

What I could observe from the system logs is the following:
A flush of the att_array_devdouble_rw table occurred on the three nodes
just before a Merkle tree request for this table.
It seems that the node currently using 100% of one CPU core (cass3) is
the only one that actually acted on this Merkle tree request: in the
logs, it is the only node running a ValidationExecutor for this
specific table.
We can also observe that sstables of this att_array_devdouble_rw table
were being compacted by one of the nodes (cass5) when the Merkle tree
request for this table was sent.
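
For reference, the log lines mentioned above can be pulled out of each node's
system.log with something like the following (the Debian log path is an
assumption, and the keywords only approximate the actual log message wording,
which can differ between Cassandra versions):

grep 'att_array_devdouble_rw' /var/log/cassandra/system.log \
    | grep -E -i 'flush|merkle|ValidationExecutor|CompactionExecutor'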

I have attached a subset of the system logs in case the experts would
like to have a look.
To help interpret the logs:
nodetool repair was sent from the node named cass2 (ip address 
xxx.xxx.xxx.128).
The node taking 100% of one CPU core is named cass3 (ip address 
xxx.xxx.xxx.129).
The node that was compacting the att_array_devdouble_rw sstables at the 
time of the Merkle tree request for this table is named cass5 (ip 
address xxx.xxx.xxx.131).

att_array_devdouble_rw compaction on node cass5 started at 17:19:08,751 
and was completed at 17:20:47,221.

The Merkle tree request for att_array_devdouble_rw table that 
triggered(?) the problem was sent on 2014-11-27 17:20:45,698 from cass2.

As an additional piece of information, here is the schema of the table
on which the repair got stuck:

cqlsh:hdb> DESCRIBE TABLE att_array_devdouble_rw ;

CREATE TABLE hdb.att_array_devdouble_rw (
     att_conf_id timeuuid,
     period text,
     event_time timestamp,
     event_time_us int,
     dim_x int,
     dim_y int,
     insert_time timestamp,
     insert_time_us int,
     recv_time timestamp,
     recv_time_us int,
     value_r list<double>,
     value_w list<double>,
     PRIMARY KEY ((att_conf_id, period), event_time, event_time_us)
) WITH CLUSTERING ORDER BY (event_time ASC, event_time_us ASC)
     AND bloom_filter_fp_chance = 0.1
     AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
     AND comment = 'Array DevDouble ReadWrite Values Table'
     AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy', 'max_threshold': '32'}
     AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
     AND dclocal_read_repair_chance = 0.1
     AND default_time_to_live = 0
     AND gc_grace_seconds = 864000
     AND max_index_interval = 2048
     AND memtable_flush_period_in_ms = 0
     AND min_index_interval = 128
     AND read_repair_chance = 0.0
     AND speculative_retry = '99.0PERCENTILE';

Is there anything you would suggest I do, before restarting any node,
to better understand the origin of this problem, in case this is an
unknown issue?
Do you know whether someone has already encountered a similar problem?

Thank you very much for your advice.

Reynald

Re: Repair hanging with C* 2.1.2

Posted by Robert Coli <rc...@eventbrite.com>.
On Fri, Nov 28, 2014 at 6:46 AM, Reynald BOURTEMBOURG <
reynald.bourtembourg@esrf.fr> wrote:

> We have a three-node cluster running Cassandra 2.1.2 on Debian 7 (Linux).
> More than two hours later, I executed "nodetool repair" on one of the nodes
> (cass2).
> It started repairing the keyspace we created (RF=3) and got stuck there.
> The nodetool repair command still hasn't returned.
>

Yes, repair has historically not really worked, and still hangs sometimes.
Search the archives for tons of posts where I link various JIRA tickets
with background.

On the plus side, so many people have had such a negative experience for so
long that the squeaky wheel is finally getting the grease. Significant
improvements have been made in the reliability and transparency of repair
in recent versions.


> Is there anything you would suggest I do, before restarting any node, to
> better understand the origin of this problem, in case this is an unknown
> issue?
>

You could use the JMX endpoint to stop repair, but if you have vnodes it's
probably easier to just restart the affected nodes.

https://issues.apache.org/jira/browse/CASSANDRA-3486
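
For completeness, a minimal sketch of what that JMX call could look like from
the command line, assuming unauthenticated JMX on the default port 7199 and a
local copy of the jmxterm uber-jar (the exact file name depends on the version
you download); forceTerminateAllRepairSessions is, as far as I recall, the
StorageService operation behind the ticket above:

echo 'run -b org.apache.cassandra.db:type=StorageService forceTerminateAllRepairSessions' \
    | java -jar jmxterm-uber.jar -l localhost:7199 -n

Run it on each node that still shows a stuck repair session; with vnodes,
restarting the affected nodes may well remain the simpler option.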

=Rob