Posted to commits@cassandra.apache.org by "WeiFan (JIRA)" <ji...@apache.org> on 2015/06/26 18:03:04 UTC

[jira] [Created] (CASSANDRA-9661) Endless compaction to a tiny, tombstoned SStable

WeiFan created CASSANDRA-9661:
---------------------------------

             Summary: Endless compaction to a tiny, tombstoned SStable
                 Key: CASSANDRA-9661
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9661
             Project: Cassandra
          Issue Type: Bug
          Components: Core
            Reporter: WeiFan


We deployed a 3-node cluster (running 2.1.5) under a stable write load (about 2k writes/s) to a CF with DTCS, a default TTL of 43200s, and gc_grace of 21600s. The CF contained insert-only, complete time-series data. We found that Cassandra would occasionally keep writing logs like this:

INFO  [CompactionExecutor:30551] 2015-06-26 18:10:06,195 CompactionTask.java:270 - Compacted 1 sstables to [/home/cassandra/workdata/data/sen_vaas_test/nodestatus-f96c7c50155811e589f69752ac9b06c7/sen_vaas_test-nodestatus-ka-2516270,].  449 bytes to 449 (~100% of original) in 12ms = 0.035683MB/s.  4 total partitions merged to 4.  Partition merge counts were {1:4, }
INFO  [CompactionExecutor:30551] 2015-06-26 18:10:06,241 CompactionTask.java:140 - Compacting [SSTableReader(path='/home/cassandra/workdata/data/sen_vaas_test/nodestatus-f96c7c50155811e589f69752ac9b06c7/sen_vaas_test-nodestatus-ka-2516270-Data.db')]
INFO  [CompactionExecutor:30551] 2015-06-26 18:10:06,253 CompactionTask.java:270 - Compacted 1 sstables to [/home/cassandra/workdata/data/sen_vaas_test/nodestatus-f96c7c50155811e589f69752ac9b06c7/sen_vaas_test-nodestatus-ka-2516271,].  449 bytes to 449 (~100% of original) in 12ms = 0.035683MB/s.  4 total partitions merged to 4.  Partition merge counts were {1:4, }

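As a side note, the throughput figure in those log lines is simply bytes written divided by elapsed time (with 1 MB = 1,048,576 bytes). A quick sketch reproducing the number, using the values from the log above (this is our own arithmetic, not Cassandra's code):

```python
# Reproduce the "0.035683MB/s" figure from the compaction log above:
# 449 bytes were rewritten in 12 ms.
bytes_written = 449
elapsed_s = 0.012

rate_mb_per_s = bytes_written / elapsed_s / (1024 * 1024)
print(f"{rate_mb_per_s:.6f}MB/s")  # prints 0.035683MB/s
```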
It seems that Cassandra kept compacting a single SSTable, several times per second, for many hours. Tons of logs were produced and one CPU core was exhausted during this time. The endless compaction finally ended when another compaction started with a group of SSTables (including the previous one). All three of our nodes have been hit by this problem, but at different times.

We could not figure out how the problematic SSTable came about because the log has wrapped around.

We dumped the records in the SSTable and found that it held the oldest data in our CF (again, our data is time series), and all of the records in this SSTable had been expired for more than 18 hours (12 hrs TTL + 6 hrs gc_grace), so they should have been dropped. However, Cassandra did nothing to this SSTable but compact it again and again, until more SSTables were out-dated enough to be considered for compaction together with this one by DTCS.
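The expiry reasoning above can be sketched as a small check (hypothetical helper, not Cassandra's actual droppable-tombstone logic): a cell written with the CF's default TTL expires after 43200s (12 h), and its tombstone only becomes purgeable once gc_grace_seconds (21600s, 6 h) have also passed, i.e. 18 h after the write:

```python
from datetime import datetime, timedelta

# Values taken from the CF configuration described in this report.
DEFAULT_TTL = timedelta(seconds=43200)  # 12 h default TTL
GC_GRACE = timedelta(seconds=21600)     # 6 h gc_grace_seconds

def is_droppable(write_time: datetime, now: datetime) -> bool:
    """True once a TTL'd cell's tombstone is past gc_grace (assumed check)."""
    return now >= write_time + DEFAULT_TTL + GC_GRACE

written = datetime(2015, 6, 25, 18, 0)
print(is_droppable(written, written + timedelta(hours=18)))  # True
print(is_droppable(written, written + timedelta(hours=17)))  # False
```

By this arithmetic, every record in the stuck SSTable was well past the droppable point, yet the repeated single-table compactions left all 449 bytes in place.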



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)