Posted to commits@cassandra.apache.org by "Danh Kieu (JIRA)" <ji...@apache.org> on 2013/08/14 20:14:48 UTC

[jira] [Created] (CASSANDRA-5890) Exception: CorruptSSTableException after nightly 'enqueuing flush'

Danh Kieu created CASSANDRA-5890:
------------------------------------

             Summary: Exception: CorruptSSTableException after nightly 'enqueuing flush'
                 Key: CASSANDRA-5890
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5890
             Project: Cassandra
          Issue Type: Bug
          Components: Core
         Environment: Release: 1.2.5
2 Cassandra nodes in a cluster
OS: Ubuntu 12.04.2 LTS
CQL: 3.0.2
            Reporter: Danh Kieu


The 2 Cassandra nodes worked well for 2 days. However, after a nightly 'enqueuing flush' process, a few column families (across all keyspaces) became corrupted, and the following errors appear in the log file when trying to access the affected CF:

ERROR [ReadStage:413277] 2013-08-14 10:16:13,322 CassandraDaemon.java (line 175) Exception in thread Thread[ReadStage:413277,5,main]
org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.EOFException
        at org.apache.cassandra.db.columniterator.SSTableNamesIterator.<init>(SSTableNamesIterator.java:65)
        at org.apache.cassandra.db.filter.NamesQueryFilter.getSSTableColumnIterator(NamesQueryFilter.java:81)
        at org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:68)
        at org.apache.cassandra.db.CollationController.collectTimeOrderedData(CollationController.java:133)
        at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
        at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1357)
        at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1214)
        at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1126)
        at org.apache.cassandra.db.Table.getRow(Table.java:347)
        at org.apache.cassandra.db.SliceByNamesReadCommand.getRow(SliceByNamesReadCommand.java:64)
        at org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:44)
        at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:722)
Caused by: java.io.EOFException
        at java.io.RandomAccessFile.readFully(RandomAccessFile.java:416)
        at java.io.RandomAccessFile.readFully(RandomAccessFile.java:394)
        at org.apache.cassandra.io.util.RandomAccessReader.readBytes(RandomAccessReader.java:380)
        at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:392)
        at org.apache.cassandra.utils.ByteBufferUtil.readWithShortLength(ByteBufferUtil.java:371)
        at org.apache.cassandra.db.columniterator.SSTableNamesIterator.read(SSTableNamesIterator.java:116)
        at org.apache.cassandra.db.columniterator.SSTableNamesIterator.<init>(SSTableNamesIterator.java:60)
        ... 14 more
ERROR [ReadStage:413292] 2013-08-14 10:16:53,700 CassandraDaemon.java (line 175) Exception in thread Thread[ReadStage:413292,5,main]
java.lang.AssertionError: DecoratedKey(-8619398030348476976, 796177616431406d6e75626f2e636f6d) != DecoratedKey(-8430385117828588592, 79624079622e636f6d) in /var/lib/cassandra/data/newui/user/newui-user-ic-1-Data.db
        at org.apache.cassandra.db.columniterator.SSTableNamesIterator.read(SSTableNamesIterator.java:119)
        at org.apache.cassandra.db.columniterator.SSTableNamesIterator.<init>(SSTableNamesIterator.java:60)
        at org.apache.cassandra.db.filter.NamesQueryFilter.getSSTableColumnIterator(NamesQueryFilter.java:81)
        at org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:68)
        at org.apache.cassandra.db.CollationController.collectTimeOrderedData(CollationController.java:133)
        at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
        at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1357)
        at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1214)
        at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1126)
        at org.apache.cassandra.db.Table.getRow(Table.java:347)
        at org.apache.cassandra.db.SliceByNamesReadCommand.getRow(SliceByNamesReadCommand.java:64)
        at org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:44)
        at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:722)
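
For readers triaging the AssertionError above: the second field of each DecoratedKey is the raw row key printed as hex, so decoding it shows which two rows were mixed up. The following is a minimal Python sketch, not part of Cassandra; the helper name decode_keys is hypothetical, and it assumes the row keys are UTF-8 text, which may not hold for other schemas.

```python
import re

# Log line copied from the AssertionError above.
LOG_LINE = (
    "java.lang.AssertionError: "
    "DecoratedKey(-8619398030348476976, 796177616431406d6e75626f2e636f6d) != "
    "DecoratedKey(-8430385117828588592, 79624079622e636f6d) "
    "in /var/lib/cassandra/data/newui/user/newui-user-ic-1-Data.db"
)

# Matches DecoratedKey(<token>, <hex-encoded key bytes>).
DECORATED_KEY = re.compile(r"DecoratedKey\((-?\d+), ([0-9a-fA-F]+)\)")


def decode_keys(line):
    """Return (token, key_text) pairs for each DecoratedKey in a log line.

    Assumption: the key bytes are UTF-8 text; undecodable bytes are replaced.
    """
    return [
        (int(token), bytes.fromhex(hex_key).decode("utf-8", errors="replace"))
        for token, hex_key in DECORATED_KEY.findall(line)
    ]


if __name__ == "__main__":
    for token, key in decode_keys(LOG_LINE):
        print(token, key)
```

If the two decoded keys are both valid rows of the same CF, that suggests the read landed on the wrong position in the data file rather than on random garbage.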


Here are the traces of the 'enqueuing flush' process:
 INFO [OptionalTasks:1] 2013-08-14 04:01:50,586 ColumnFamilyStore.java (line 631) Enqueuing flush of Memtable-user@848766585(3130/3130 serialized/live bytes, 96 ops)
 INFO [FlushWriter:381] 2013-08-14 04:01:50,632 Memtable.java (line 461) Writing Memtable-user@848766585(3130/3130 serialized/live bytes, 96 ops)
 INFO [FlushWriter:381] 2013-08-14 04:01:50,641 Memtable.java (line 495) Completed flushing /var/lib/cassandra/data/newui/user/newui-user-ic-1-Data.db (1513 bytes) for commitlog position ReplayPosition(segmentId=1373063890581, position=9254)

The issue happens quite regularly, each time on a random CF. Please note that we don't have a lot of data (at most 20 records per CF).

We have to run 'nodetool scrub' and 'nodetool repair' to recover. Please let me know whether this issue is caused by a configuration problem, incorrect use of Cassandra, or a real bug.
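
The recovery step above can be narrowed to the affected column family instead of the whole node. The following is a minimal Python sketch that only builds the command lines and does not run them. The helper name nodetool_commands is hypothetical, and it assumes the data file layout <data_dir>/<keyspace>/<cf>/<keyspace>-<cf>-...-Data.db seen in the logged path.

```python
from pathlib import PurePosixPath


def nodetool_commands(data_file):
    """Build scrub and repair command lines for the CF that owns data_file.

    Assumption: the path follows <data_dir>/<keyspace>/<cf>/<keyspace>-<cf>-...,
    as in the path reported in the log above.
    """
    path = PurePosixPath(data_file)
    column_family = path.parent.name
    keyspace = path.parent.parent.name
    # Guard against paths that do not follow the assumed layout.
    if not path.name.startswith(f"{keyspace}-{column_family}-"):
        raise ValueError(f"unexpected SSTable file name: {path.name}")
    return [
        ["nodetool", "scrub", keyspace, column_family],
        ["nodetool", "repair", keyspace, column_family],
    ]


if __name__ == "__main__":
    sstable = "/var/lib/cassandra/data/newui/user/newui-user-ic-1-Data.db"
    for command in nodetool_commands(sstable):
        print(" ".join(command))
```

Returning argument lists rather than strings means the result can be passed to subprocess.run without shell quoting, should someone choose to automate the step.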
