Posted to commits@cassandra.apache.org by "Sammy Yu (JIRA)" <ji...@apache.org> on 2009/09/23 00:08:15 UTC

[jira] Created: (CASSANDRA-452) Corrupt SSTable

Corrupt SSTable
---------------

                 Key: CASSANDRA-452
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-452
             Project: Cassandra
          Issue Type: Bug
         Environment: Pre 0.4 based on r805615 on trunk w/ #370, #392, #394, #405, #406, #418
            Reporter: Sammy Yu


We noticed that the number of SSTables on one of our nodes is growing.  It appears that the compaction thread is still alive.  However, compaction is failing because one of the SSTables is corrupt:

ERROR [ROW-READ-STAGE:475] 2009-09-21 00:29:17,068 DebuggableThreadPoolExecutor.java (line 125) Error in ThreadPoolExecutor
java.lang.RuntimeException: java.io.EOFException
        at org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:110)
        at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:44)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)
Caused by: java.io.EOFException
        at java.io.RandomAccessFile.readFully(RandomAccessFile.java:383)
        at java.io.RandomAccessFile.readFully(RandomAccessFile.java:361)
        at org.apache.cassandra.utils.FBUtilities.readByteArray(FBUtilities.java:390)
        at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:64)
        at org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:349)
        at org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:309)
        at org.apache.cassandra.db.filter.SSTableNamesIterator.<init>(SSTableNamesIterator.java:102)
        at org.apache.cassandra.db.filter.NamesQueryFilter.getSSTableColumnIterator(NamesQueryFilter.java:69)
        at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1467)
        at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1401)
        at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1420)
        at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1401)
        at org.apache.cassandra.db.Table.getRow(Table.java:589)
        at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:65)
        at org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:78)
        ... 4 more
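
The trace above ends in RandomAccessFile.readFully, called from FBUtilities.readByteArray, which suggests a length-prefixed read that ran past the end of the data file. The following is only a minimal sketch of that failure mode, not Cassandra's actual code: the class TruncatedRecordDemo and its methods are hypothetical, and the 4-byte length prefix is an assumed on-disk layout. It uses an in-memory stream instead of a RandomAccessFile so that it is self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; not part of Cassandra.
public class TruncatedRecordDemo {

    // Reads a 4-byte length followed by that many bytes (assumed layout).
    // readFully throws EOFException if fewer bytes remain than the prefix claims.
    public static byte[] readByteArray(DataInput in) throws IOException {
        int length = in.readInt();
        byte[] bytes = new byte[length];
        in.readFully(bytes);
        return bytes;
    }

    // Writes a value in the same assumed layout: length prefix, then payload.
    public static byte[] encode(byte[] value) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buffer);
        out.writeInt(value.length);
        out.write(value);
        out.flush();
        return buffer.toByteArray();
    }

    // Returns true if decoding the given bytes hits end-of-data early.
    public static boolean failsWithEof(byte[] data) {
        try {
            readByteArray(new DataInputStream(new ByteArrayInputStream(data)));
            return false;
        } catch (EOFException e) {
            return true;
        } catch (IOException e) {
            throw new IllegalStateException("unexpected I/O error", e);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] intact = encode("column-value".getBytes(StandardCharsets.UTF_8));
        // Simulate a file cut short: drop the last three payload bytes.
        byte[] truncated = Arrays.copyOf(intact, intact.length - 3);

        System.out.println("intact record hits EOF:    " + failsWithEof(intact));
        System.out.println("truncated record hits EOF: " + failsWithEof(truncated));
    }
}
```

A corrupt length prefix that claims more bytes than the file holds would fail the same way as a truncated payload, so the exception alone does not show which of the two happened in the attached sstable.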

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (CASSANDRA-452) Corrupt SSTable

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-452:
-------------------------------------

    Attachment: keys.txt

keys from corrupt sstable file

> Corrupt SSTable
> ---------------
>
>                 Key: CASSANDRA-452
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-452
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: Pre 0.4 based on r805615 on trunk w/ #370, #392, #394, #405, #406, #418
>            Reporter: Sammy Yu
>         Attachments: FriendActions-17120.tar.gz, FriendActions-17122-Data.db, FriendActions-17122-Filter.db, FriendActions-17122-Index.db, keys.txt
>
>
> We noticed that the number of SSTables on one of our nodes is growing.  The compaction thread is alive and running.  We can see that it is constantly trying to compact the same set of SSTables.  However, it is failing because one of the SSTables is corrupt:
> ERROR [ROW-READ-STAGE:475] 2009-09-21 00:29:17,068 DebuggableThreadPoolExecutor.java (line 125) Error in ThreadPoolExecutor
> java.lang.RuntimeException: java.io.EOFException
>         at org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:110)
>         at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:44)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:619)
> Caused by: java.io.EOFException
>         at java.io.RandomAccessFile.readFully(RandomAccessFile.java:383)
>         at java.io.RandomAccessFile.readFully(RandomAccessFile.java:361)
>         at org.apache.cassandra.utils.FBUtilities.readByteArray(FBUtilities.java:390)
>         at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:64)
>         at org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:349)
>         at org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:309)
>         at org.apache.cassandra.db.filter.SSTableNamesIterator.<init>(SSTableNamesIterator.java:102)
>         at org.apache.cassandra.db.filter.NamesQueryFilter.getSSTableColumnIterator(NamesQueryFilter.java:69)
>         at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1467)
>         at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1401)
>         at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1420)
>         at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1401)
>         at org.apache.cassandra.db.Table.getRow(Table.java:589)
>         at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:65)
>         at org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:78)
>         ... 4 more



[jira] Updated: (CASSANDRA-452) Corrupt SSTable

Posted by "Sammy Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sammy Yu updated CASSANDRA-452:
-------------------------------

    Attachment: FriendActions-17122-Index.db
                FriendActions-17122-Filter.db
                FriendActions-17122-Data.db

corrupt sstable


> Corrupt SSTable
> ---------------
>
>                 Key: CASSANDRA-452
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-452
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: Pre 0.4 based on r805615 on trunk w/ #370, #392, #394, #405, #406, #418
>            Reporter: Sammy Yu
>         Attachments: FriendActions-17122-Data.db, FriendActions-17122-Filter.db, FriendActions-17122-Index.db


[jira] Commented: (CASSANDRA-452) Corrupt SSTable

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12758540#action_12758540 ] 

Jonathan Ellis commented on CASSANDRA-452:
------------------------------------------

What is the definition for this CF?

> Corrupt SSTable
> ---------------
>
>                 Key: CASSANDRA-452
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-452
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: Pre 0.4 based on r805615 on trunk w/ #370, #392, #394, #405, #406, #418
>            Reporter: Sammy Yu
>         Attachments: FriendActions-17120.tar.gz, FriendActions-17122-Data.db, FriendActions-17122-Filter.db, FriendActions-17122-Index.db


[jira] Commented: (CASSANDRA-452) Corrupt SSTable

Posted by "Chris Goffinet (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12758541#action_12758541 ] 

Chris Goffinet commented on CASSANDRA-452:
------------------------------------------

<ColumnFamily ColumnType="Super" CompareWith="BytesType" CompareSubcolumnsWith="BytesType" Name="FriendActions"/>

> Corrupt SSTable
> ---------------
>
>                 Key: CASSANDRA-452
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-452
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: Pre 0.4 based on r805615 on trunk w/ #370, #392, #394, #405, #406, #418
>            Reporter: Sammy Yu
>         Attachments: FriendActions-17120.tar.gz, FriendActions-17122-Data.db, FriendActions-17122-Filter.db, FriendActions-17122-Index.db


[jira] Commented: (CASSANDRA-452) Corrupt SSTable

Posted by "Chris Goffinet (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12775272#action_12775272 ] 

Chris Goffinet commented on CASSANDRA-452:
------------------------------------------

This is safe to close for now; we haven't run into a corruption issue yet.

> Corrupt SSTable
> ---------------
>
>                 Key: CASSANDRA-452
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-452
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: Pre 0.4 based on r805615 on trunk w/ #370, #392, #394, #405, #406, #418
>            Reporter: Sammy Yu
>         Attachments: FriendActions-17120.tar.gz, FriendActions-17122-Data.db, FriendActions-17122-Filter.db, FriendActions-17122-Index.db, FriendActions-22478.tar.gz, keys.txt


[jira] Resolved: (CASSANDRA-452) Corrupt SSTable

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis resolved CASSANDRA-452.
--------------------------------------

       Resolution: Fixed
    Fix Version/s: 0.5
         Assignee: Jonathan Ellis

> Corrupt SSTable
> ---------------
>
>                 Key: CASSANDRA-452
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-452
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: Pre 0.4 based on r805615 on trunk w/ #370, #392, #394, #405, #406, #418
>            Reporter: Sammy Yu
>            Assignee: Jonathan Ellis
>             Fix For: 0.5
>
>         Attachments: FriendActions-17120.tar.gz, FriendActions-17122-Data.db, FriendActions-17122-Filter.db, FriendActions-17122-Index.db, FriendActions-22478.tar.gz, keys.txt


[jira] Commented: (CASSANDRA-452) Corrupt SSTable

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12775078#action_12775078 ] 

Jonathan Ellis commented on CASSANDRA-452:
------------------------------------------

Haven't heard anything on this in a while. Does it look like the new compaction code in 0.5 fixes it, then?

> Corrupt SSTable
> ---------------
>
>                 Key: CASSANDRA-452
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-452
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: Pre 0.4 based on r805615 on trunk w/ #370, #392, #394, #405, #406, #418
>            Reporter: Sammy Yu
>         Attachments: FriendActions-17120.tar.gz, FriendActions-17122-Data.db, FriendActions-17122-Filter.db, FriendActions-17122-Index.db, FriendActions-22478.tar.gz, keys.txt


[jira] Commented: (CASSANDRA-452) Corrupt SSTable

Posted by "Sammy Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765538#action_12765538 ] 

Sammy Yu commented on CASSANDRA-452:
------------------------------------

This is still almost 0.4.


> Corrupt SSTable
> ---------------
>
>                 Key: CASSANDRA-452
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-452
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: Pre 0.4 based on r805615 on trunk w/ #370, #392, #394, #405, #406, #418
>            Reporter: Sammy Yu
>         Attachments: FriendActions-17120.tar.gz, FriendActions-17122-Data.db, FriendActions-17122-Filter.db, FriendActions-17122-Index.db, FriendActions-22478.tar.gz, keys.txt


[jira] Updated: (CASSANDRA-452) Corrupt SSTable

Posted by "Sammy Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sammy Yu updated CASSANDRA-452:
-------------------------------

    Description: 
We noticed that the number of SSTables on one of our nodes is growing.  The compaction thread is alive and running.  We can see that it is constantly trying to compact the same set of SSTables.  However, it is failing because one of the SSTables is corrupt:

ERROR [ROW-READ-STAGE:475] 2009-09-21 00:29:17,068 DebuggableThreadPoolExecutor.java (line 125) Error in ThreadPoolExecutor
java.lang.RuntimeException: java.io.EOFException
        at org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:110)
        at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:44)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)
Caused by: java.io.EOFException
        at java.io.RandomAccessFile.readFully(RandomAccessFile.java:383)
        at java.io.RandomAccessFile.readFully(RandomAccessFile.java:361)
        at org.apache.cassandra.utils.FBUtilities.readByteArray(FBUtilities.java:390)
        at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:64)
        at org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:349)
        at org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:309)
        at org.apache.cassandra.db.filter.SSTableNamesIterator.<init>(SSTableNamesIterator.java:102)
        at org.apache.cassandra.db.filter.NamesQueryFilter.getSSTableColumnIterator(NamesQueryFilter.java:69)
        at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1467)
        at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1401)
        at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1420)
        at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1401)
        at org.apache.cassandra.db.Table.getRow(Table.java:589)
        at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:65)
        at org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:78)
        ... 4 more

  was:
We noticed on one of our node the number of SStables is growing.  It appears that compaction thread is still alive.  However, compaction is failing because one of the sstable is corrupt:



> Corrupt SSTable
> ---------------
>
>                 Key: CASSANDRA-452
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-452
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: Pre 0.4 based on r805615 on trunk w/ #370, #392, #394, #405, #406, #418
>            Reporter: Sammy Yu
>
> We noticed that on one of our nodes the number of SSTables is growing.  The compaction thread is alive and running.  We can see that it is constantly trying to compact the same set of sstables.  However, it is failing because one of the sstables is corrupt:
> ERROR [ROW-READ-STAGE:475] 2009-09-21 00:29:17,068 DebuggableThreadPoolExecutor.java (line 125) Error in ThreadPoolExecutor
> java.lang.RuntimeException: java.io.EOFException
>         at org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:110)
>         at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:44)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:619)
> Caused by: java.io.EOFException
>         at java.io.RandomAccessFile.readFully(RandomAccessFile.java:383)
>         at java.io.RandomAccessFile.readFully(RandomAccessFile.java:361)
>         at org.apache.cassandra.utils.FBUtilities.readByteArray(FBUtilities.java:390)
>         at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:64)
>         at org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:349)
>         at org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:309)
>         at org.apache.cassandra.db.filter.SSTableNamesIterator.<init>(SSTableNamesIterator.java:102)
>         at org.apache.cassandra.db.filter.NamesQueryFilter.getSSTableColumnIterator(NamesQueryFilter.java:69)
>         at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1467)
>         at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1401)
>         at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1420)
>         at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1401)
>         at org.apache.cassandra.db.Table.getRow(Table.java:589)
>         at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:65)
>         at org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:78)
>         ... 4 more

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (CASSANDRA-452) Corrupt SSTable

Posted by "Sammy Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12758808#action_12758808 ] 

Sammy Yu commented on CASSANDRA-452:
------------------------------------

We determined last night that 17120 is a result of compaction.  I'm also working on CASSANDRA-453, which will help us validate the integrity of sstables offline.



> Corrupt SSTable
> ---------------
>
>                 Key: CASSANDRA-452
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-452
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: Pre 0.4 based on r805615 on trunk w/ #370, #392, #394, #405, #406, #418
>            Reporter: Sammy Yu
>         Attachments: FriendActions-17120.tar.gz, FriendActions-17122-Data.db, FriendActions-17122-Filter.db, FriendActions-17122-Index.db, keys.txt


[jira] Updated: (CASSANDRA-452) Corrupt SSTable

Posted by "Sammy Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sammy Yu updated CASSANDRA-452:
-------------------------------

    Attachment: FriendActions-17120.tar.gz

good sstable

> Corrupt SSTable
> ---------------
>
>                 Key: CASSANDRA-452
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-452
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: Pre 0.4 based on r805615 on trunk w/ #370, #392, #394, #405, #406, #418
>            Reporter: Sammy Yu
>         Attachments: FriendActions-17120.tar.gz, FriendActions-17122-Data.db, FriendActions-17122-Filter.db, FriendActions-17122-Index.db


[jira] Commented: (CASSANDRA-452) Corrupt SSTable

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765527#action_12765527 ] 

Jonathan Ellis commented on CASSANDRA-452:
------------------------------------------

Is this still your almost-0.4 code, or trunk?

> Corrupt SSTable
> ---------------
>
>                 Key: CASSANDRA-452
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-452
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: Pre 0.4 based on r805615 on trunk w/ #370, #392, #394, #405, #406, #418
>            Reporter: Sammy Yu
>         Attachments: FriendActions-17120.tar.gz, FriendActions-17122-Data.db, FriendActions-17122-Filter.db, FriendActions-17122-Index.db, FriendActions-22478.tar.gz, keys.txt


[jira] Commented: (CASSANDRA-452) Corrupt SSTable

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765606#action_12765606 ] 

Jonathan Ellis commented on CASSANDRA-452:
------------------------------------------

I don't think I can do a whole lot until someone runs this with SnapshotBeforeCompaction (from CASSANDRA-426) turned on, and gives me a "compacting this set of sstables produces this corrupt row" case (or less likely, proves that a row was already corrupt when it was first flushed from a memtable).  0.4.1 and trunk both include this patch now.



[jira] Commented: (CASSANDRA-452) Corrupt SSTable

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12758570#action_12758570 ] 

Jonathan Ellis commented on CASSANDRA-452:
------------------------------------------

17120 is corrupt, all right.

The local deletion time is 0, which is nonsense (it's generated from System.currentTimeMillis) and the timestamp associated with that delete doesn't look like your other timestamps.  After that it's clearly reading nonsense and eventually EOFs while trying to read 19988495 columns.  Some debug output from my local code:

DEBUG - key is 148447622005950731053602871503233733033:itemdiggs15891086
...
DEBUG - reading name of length 7
DEBUG - deserializing SC ironeus deleted @0/21025451015143424; reading 19988495 columns
DEBUG - deserializing subcolumn 0
DEBUG - reading name of length 27237
DEBUG - deserializing rryjamesstoneJ�� ...
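
To illustrate why a corrupt count ends in an EOFException from readFully, here is a minimal sketch of a length-prefixed reader that sanity-checks the count before looping.  The layout (a 4-byte count, then a 2-byte length plus bytes per name) and the GuardedReader class are assumptions made up for this example, not Cassandra's actual on-disk format or code.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical reader for an assumed layout: int count, then (unsigned short length, bytes) per name.
final class GuardedReader {
    /** Reads all names from one serialized row, rejecting counts that cannot fit in the row. */
    static List<String> readNames(byte[] row) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(row));
        int count = in.readInt();
        // Every name needs at least its 2-byte length prefix, so a count above
        // remaining / 2 cannot be valid and is reported up front.
        int remaining = row.length - 4;
        if (count < 0 || count > remaining / 2) {
            throw new IOException("implausible column count " + count
                    + " for " + remaining + " remaining bytes");
        }
        List<String> names = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            int length = in.readUnsignedShort();
            byte[] name = new byte[length];
            in.readFully(name); // still throws EOFException if a length prefix is corrupt
            names.add(new String(name, StandardCharsets.UTF_8));
        }
        return names;
    }
}
```

Without the check, a count such as 19988495 would be trusted and the loop would keep reading garbage lengths until the input runs out, which matches the trace above.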

I can't see how the compaction code could cause this kind of corruption.  (Your logs should show: is this even the product of a compaction?  Or is it a direct result of a memtable or BMt?)

If it is a product of compaction, do you have a snapshot of that sstable any time prior to that compaction?  Can you reproduce the bug compacting those files?

I hate to blame hardware, but there are a couple of things that indicate this might actually be caused by that.  First, a localDeletionTime of zero is exactly 1 bit away from Integer.MIN_VALUE in two's complement.  All the other localDT values are Integer.MIN_VALUE, as would be expected if no deletes are done.  Second, no bytes are being skipped (which is often how you see some expected small number of columns get huge) -- the row sizes are correct and all the keys are readable where they should be.
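
The single-bit observation can be checked directly.  This is only an illustrative snippet; the BitFlipCheck class and its method are hypothetical names, not part of Cassandra.

```java
// Illustrative only: counts the bit positions in which two 32-bit ints differ.
final class BitFlipCheck {
    /** Hamming distance between two ints. */
    static int differingBits(int a, int b) {
        return Integer.bitCount(a ^ b);
    }

    public static void main(String[] args) {
        int expected = Integer.MIN_VALUE; // 0x80000000, the value seen when nothing was deleted
        int observed = 0;                 // 0x00000000, the value in the corrupt row
        System.out.println(differingBits(expected, observed)); // prints 1
    }
}
```

Integer.MIN_VALUE has only the sign bit set, so clearing that one bit yields exactly 0.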

I will attach a list of keys in this sstable so you can force read repair on them.  Tomorrow I will patch compaction to be able to recover from this error, and if Sammy or Chris can do CASSANDRA-426 then we will be able to reproduce any such future errors (assuming they are compaction related, rather than memtable/BMt).



[jira] Updated: (CASSANDRA-452) Corrupt SSTable

Posted by "Sammy Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sammy Yu updated CASSANDRA-452:
-------------------------------

    Attachment: FriendActions-22478.tar.gz

Another corrupt sstable:
Digg/FriendActions-22478-Data.db: could not be fully read last key=106990576908512342565493723105254981184:itemdiggs15991619 at position 1039957
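
For readers wondering what such an offline check might look like, here is a rough sketch that walks a data file's records and reports the last key it could fully read and the offset where it stopped.  The record layout (2-byte key length, key bytes, 4-byte row size, row bytes) and the SSTableScan class are assumptions for illustration only; this is not the CASSANDRA-453 tool, and it validates record framing, not row contents.

```java
import java.nio.BufferUnderflowException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical scanner over an assumed layout: (unsigned short keyLength, key, int rowSize, row) repeated.
final class SSTableScan {
    /** Outcome of a scan. */
    static final class Result {
        final String lastKey;   // last key whose record was fully readable, or null if none
        final int position;     // offset of the first unreadable record, or data length if clean
        final boolean complete; // true if every record could be framed

        Result(String lastKey, int position, boolean complete) {
            this.lastKey = lastKey;
            this.position = position;
            this.complete = complete;
        }
    }

    static Result scan(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data);
        String lastKey = null;
        while (buf.hasRemaining()) {
            int recordStart = buf.position();
            try {
                int keyLength = buf.getShort() & 0xFFFF;
                byte[] key = new byte[keyLength];
                buf.get(key);
                int rowSize = buf.getInt();
                if (rowSize < 0 || rowSize > buf.remaining()) {
                    return new Result(lastKey, recordStart, false);
                }
                buf.position(buf.position() + rowSize); // skip the row body
                lastKey = new String(key, StandardCharsets.UTF_8);
            } catch (BufferUnderflowException e) {
                // Ran off the end while reading a key or size field.
                return new Result(lastKey, recordStart, false);
            }
        }
        return new Result(lastKey, data.length, true);
    }
}
```

A caller could print something like "could not be fully read last key=" + result.lastKey + " at position " + result.position when result.complete is false.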
