Posted to commits@cassandra.apache.org by "Michael Theroux (JIRA)" <ji...@apache.org> on 2012/07/21 21:15:34 UTC

[jira] [Commented] (CASSANDRA-4456) AssertionError in ColumnFamilyStore.getOverlappingSSTables() during repair

    [ https://issues.apache.org/jira/browse/CASSANDRA-4456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419912#comment-13419912 ] 

Michael Theroux commented on CASSANDRA-4456:
--------------------------------------------

I hit this problem myself today on a single node in a six-node cluster. I was running nodetool repair, and it halted with this exception in the log. I was monitoring the repair closely and have a couple of observations:

1) It happened while a compaction of the same column family was running.
2) When I re-ran the repair, it worked.

Note: I am not a Cassandra developer, but I looked at the code. My uneducated guess is that an sstable was compacted and deleted while validation still expected it to be there.
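
To make that guess concrete, here is a minimal toy sketch of the kind of interleaving I have in mind. It is not Cassandra code: the class, the method names, and the "everything captured is still tracked" precondition are all hypothetical, and the real assertion at ColumnFamilyStore.java:874 may check something different.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Toy model only: none of these names come from the Cassandra code base.
class SSTableRaceSketch {
    // Hypothetical stand-in for the sstables a column family currently tracks.
    private final Set<String> live =
            new HashSet<>(Arrays.asList("sstable-1", "sstable-2", "sstable-3"));

    // Validation captures the set of sstables it intends to read.
    Set<String> snapshotForValidation() {
        return new HashSet<>(live);
    }

    // Compaction merges two inputs into one output and drops the inputs.
    void compact(String a, String b, String merged) {
        live.remove(a);
        live.remove(b);
        live.add(merged);
    }

    // Assumed precondition: everything validation captured is still tracked.
    boolean stillTracked(Set<String> captured) {
        return live.containsAll(captured);
    }

    // Runs the sequence and reports whether the assumed precondition survives.
    static boolean run(boolean compactInBetween) {
        SSTableRaceSketch cf = new SSTableRaceSketch();
        Set<String> captured = cf.snapshotForValidation();
        if (compactInBetween) {
            cf.compact("sstable-1", "sstable-2", "sstable-4");
        }
        return cf.stillTracked(captured);
    }

    public static void main(String[] args) {
        System.out.println("no concurrent compaction: " + run(false));
        System.out.println("compaction in between:    " + run(true));
    }
}
```

In this toy model the check only fails when a compaction lands between the capture and the check, which would match both observations above: the failure coincided with a compaction, and a later re-run passed.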
                
> AssertionError in ColumnFamilyStore.getOverlappingSSTables() during repair
> --------------------------------------------------------------------------
>
>                 Key: CASSANDRA-4456
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4456
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.1.2
>         Environment: Ubuntu 11.04 64-bit
>            Reporter: Mike Heffner
>
> We have hit the following exception on several nodes while running repairs across our 1.1.2 ring. We've observed it happen on either the node executing the repair or a participating replica in the repair operation. The result in either case is that the repair hangs.
> ERROR [ValidationExecutor:9] 2012-07-21 01:54:03,019 AbstractCassandraDaemon.java (line 134) Exception in thread Thread[ValidationExecutor:9,1,main]
> java.lang.AssertionError
>         at org.apache.cassandra.db.ColumnFamilyStore.getOverlappingSSTables(ColumnFamilyStore.java:874)
>         at org.apache.cassandra.db.compaction.CompactionController.<init>(CompactionController.java:69)
>         at org.apache.cassandra.db.compaction.CompactionManager$ValidationCompactionIterable.<init>(CompactionManager.java:834)
>         at org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:698)
>         at org.apache.cassandra.db.compaction.CompactionManager.access$600(CompactionManager.java:68)
>         at org.apache.cassandra.db.compaction.CompactionManager$8.call(CompactionManager.java:438)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:662)
> In building this ring we migrated sstables from an identical 0.8.8 ring by:
>  1. Creating the schema on our new 1.1.2 ring.
>  2. Rsyncing over sstables from 0.8.8 ring.
>  3. Renaming the sstables to match the directory and file naming structure of 1.1.x.
>  4. Running nodetool refresh <keyspace> <cf> for each CF across each node.
>  5. Running nodetool upgradesstables for each CF across each node.
> When those steps had completed, we began rolling repairs. Not all of the repair operations have hit the exception -- some of the repairs have completed successfully.
