Posted to user@cassandra.apache.org by Jake Maizel <ja...@soundcloud.com> on 2011/02/22 13:32:19 UTC

Help with Error on reading sstable

I'm getting this error after a disk-space problem caused issues during a
repair operation on one of six nodes in our cluster:

2011-02-22_11:54:50.26788 'ERROR [ROW-READ-STAGE:305] 11:54:50,267
CassandraDaemon.java:87 Uncaught exception in thread
Thread[ROW-READ-STAGE:305,5,main]
2011-02-22_11:54:50.26789 'java.lang.ArrayIndexOutOfBoundsException
2011-02-22_11:54:50.26789       at
org.apache.cassandra.io.util.BufferedRandomAccessFile.read(BufferedRandomAccessFile.java:326)
2011-02-22_11:54:50.26790       at
java.io.RandomAccessFile.readFully(RandomAccessFile.java:381)
2011-02-22_11:54:50.26790       at
java.io.DataInputStream.readUTF(DataInputStream.java:592)
2011-02-22_11:54:50.26790       at
java.io.RandomAccessFile.readUTF(RandomAccessFile.java:887)
2011-02-22_11:54:50.26791       at
org.apache.cassandra.db.filter.SSTableSliceIterator$ColumnGroupReader.<init>(SSTableSliceIterator.java:125)
2011-02-22_11:54:50.26791       at
org.apache.cassandra.db.filter.SSTableSliceIterator.<init>(SSTableSliceIterator.java:59)
2011-02-22_11:54:50.26792       at
org.apache.cassandra.db.filter.SliceQueryFilter.getSSTableColumnIterator(SliceQueryFilter.java:63)
2011-02-22_11:54:50.26792       at
org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:990)
2011-02-22_11:54:50.26793       at
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:901)
2011-02-22_11:54:50.26793       at
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:870)
2011-02-22_11:54:50.26794       at
org.apache.cassandra.db.Table.getRow(Table.java:382)
2011-02-22_11:54:50.26794       at
org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:59)
2011-02-22_11:54:50.26794       at
org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:70)
2011-02-22_11:54:50.26795       at
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:49)
2011-02-22_11:54:50.26795       at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
2011-02-22_11:54:50.26796       at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
2011-02-22_11:54:50.26796       at java.lang.Thread.run(Thread.java:619)
2011-02-22_11:54:54.71933 'ERROR [ROW-READ-STAGE:302] 11:54:54,718
DebuggableThreadPoolExecutor.java:102 Error in ThreadPoolExecutor
2011-02-22_11:54:54.71935 'java.lang.ArrayIndexOutOfBoundsException
2011-02-22_11:54:54.71935       at
org.apache.cassandra.io.util.BufferedRandomAccessFile.read(BufferedRandomAccessFile.java:326)
2011-02-22_11:54:54.71936       at
java.io.RandomAccessFile.readFully(RandomAccessFile.java:381)
2011-02-22_11:54:54.71936       at
java.io.DataInputStream.readUTF(DataInputStream.java:592)
2011-02-22_11:54:54.71937       at
java.io.RandomAccessFile.readUTF(RandomAccessFile.java:887)
2011-02-22_11:54:54.71937       at
org.apache.cassandra.db.filter.SSTableSliceIterator$ColumnGroupReader.<init>(SSTableSliceIterator.java:125)
2011-02-22_11:54:54.71937       at
org.apache.cassandra.db.filter.SSTableSliceIterator.<init>(SSTableSliceIterator.java:59)
2011-02-22_11:54:54.71938       at
org.apache.cassandra.db.filter.SliceQueryFilter.getSSTableColumnIterator(SliceQueryFilter.java:63)
2011-02-22_11:54:54.71938       at
org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:990)
2011-02-22_11:54:54.71939       at
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:901)
2011-02-22_11:54:54.71939       at
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:870)
2011-02-22_11:54:54.71941       at
org.apache.cassandra.db.Table.getRow(Table.java:382)
2011-02-22_11:54:54.71942       at
org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:59)
2011-02-22_11:54:54.71942       at
org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:70)
2011-02-22_11:54:54.71942       at
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:49)
2011-02-22_11:54:54.71943       at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
2011-02-22_11:54:54.71943       at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
2011-02-22_11:54:54.71944       at java.lang.Thread.run(Thread.java:619)

I am thinking that there was a failure writing out an SSTable because
of the lack of space and now it's corrupt.  Also, the repair used a
huge amount of disk space and the node therefore ran out.  Currently,
is there a way to clear space in this situation?  Would running a
cleanup help?

Running ver 0.6.6.

Thanks,

-- 
Jake Maizel

Re: Help with Error on reading sstable

Posted by Robert Coli <rc...@digg.com>.
On Tue, Feb 22, 2011 at 4:32 AM, Jake Maizel <ja...@soundcloud.com> wrote:
> I'm getting this error after a disk-space problem caused issues during a
> repair operation on one of six nodes in our cluster:
> ...
> I am thinking that there was a failure writing out an SSTable because
> of the lack of space and now it's corrupt.

If this is the case, the easiest way to get this node back up is to
temporarily move the corrupt SSTable out of the data directory and then
try to recover its data offline.
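
Roughly, that would look something like the sketch below (the keyspace,
column family name and paths here are examples only, not your actual
ones; stop the node first and set the files aside somewhere safe rather
than deleting them):

  # Stop Cassandra on the affected node first.
  # In 0.6, each SSTable is a set of -Data.db, -Index.db and -Filter.db
  # files sharing one generation number, under the DataFileDirectory
  # configured in storage-conf.xml.
  cd /var/lib/cassandra/data/MyKeyspace   # example data directory
  ls -ltr *-Data.db                       # files written during the failed repair are the likely suspects
  mkdir -p /var/tmp/sstable-quarantine
  mv MyCF-123-Data.db MyCF-123-Index.db MyCF-123-Filter.db /var/tmp/sstable-quarantine/
  # Restart Cassandra and check the log to confirm the read errors are gone.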

> Also, the repair used a huge amount of disk space and the node therefore ran out.

https://issues.apache.org/jira/browse/CASSANDRA-1674

> Currently, is there a way to clear space in this situation?  Would running a
> cleanup help?

If your disk is still full, it's going to be difficult for you to
complete a cleanup compaction. However, if you are able to do so, I
believe it would in fact help.
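
Something along these lines, assuming the default 0.6 JMX port of 8080
(substitute your node's address):

  # Cleanup compaction drops data this node is no longer responsible for
  # and rewrites the remaining SSTables, which itself needs some free
  # disk to work in.
  nodetool -host <node-address> -port 8080 cleanup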

> Running ver 0.6.6.

0.6.6 does not contain the patch from:

https://issues.apache.org/jira/browse/CASSANDRA-1676

and is therefore highly unlikely to repair successfully except by
accident. If your intent is to successfully repair, I strongly suggest
either upgrading to 0.6.8 (which contains the IMO-critical 1676, but
not the important-but-not-critical 1674) or just patching 1676 and
1674 into 0.6.6.
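
If you take the patch route, the general shape is something like the
following (illustrative only: the real attachment names are on the JIRA
tickets, and whether -p0 or -p1 is right depends on how each patch was
generated):

  cd apache-cassandra-0.6.6-src               # your 0.6.6 source tree
  patch -p0 < /path/to/CASSANDRA-1676.patch   # attachment from the 1676 ticket
  patch -p0 < /path/to/CASSANDRA-1674.patch   # attachment from the 1674 ticket
  ant                                         # rebuild and redeploy the jar to each node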

=Rob