Posted to commits@cassandra.apache.org by "Mike Heffner (JIRA)" <ji...@apache.org> on 2012/07/21 16:07:34 UTC

[jira] [Created] (CASSANDRA-4456) AssertionError in ColumnFamilyStore.getOverlappingSSTables() during repair

Mike Heffner created CASSANDRA-4456:
---------------------------------------

             Summary: AssertionError in ColumnFamilyStore.getOverlappingSSTables() during repair
                 Key: CASSANDRA-4456
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4456
             Project: Cassandra
          Issue Type: Bug
          Components: Core
    Affects Versions: 1.1.2
         Environment: Ubuntu 11.04 64-bit
            Reporter: Mike Heffner


We have hit the following exception on several nodes while running repairs across our 1.1.2 ring. We've observed it happen on either the node executing the repair or a participating replica in the repair operation. The result in either case is that the repair hangs.


ERROR [ValidationExecutor:9] 2012-07-21 01:54:03,019 AbstractCassandraDaemon.java (line 134) Exception in thread Thread[ValidationExecutor:9,1,main]
java.lang.AssertionError
        at org.apache.cassandra.db.ColumnFamilyStore.getOverlappingSSTables(ColumnFamilyStore.java:874)
        at org.apache.cassandra.db.compaction.CompactionController.<init>(CompactionController.java:69)
        at org.apache.cassandra.db.compaction.CompactionManager$ValidationCompactionIterable.<init>(CompactionManager.java:834)
        at org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:698)
        at org.apache.cassandra.db.compaction.CompactionManager.access$600(CompactionManager.java:68)
        at org.apache.cassandra.db.compaction.CompactionManager$8.call(CompactionManager.java:438)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)


In building this ring we migrated sstables from an identical 0.8.8 ring by:

 1. Creating the schema on our new 1.1.2 ring.
 2. Rsyncing the sstables over from the 0.8.8 ring.
 3. Renaming the sstables to match the directory and file naming structure of 1.1.x.
 4. Running nodetool refresh <keyspace> <cf> for each CF on each node.
 5. Running nodetool upgradesstables for each CF on each node.

When those steps had completed, we began rolling repairs. Not all of the repair operations have hit the exception -- some of the repairs have completed successfully.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-4456) AssertionError in ColumnFamilyStore.getOverlappingSSTables() during repair

Posted by "Sylvain Lebresne (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13420746#comment-13420746 ] 

Sylvain Lebresne commented on CASSANDRA-4456:
---------------------------------------------

Actually, I think this can happen even when snapshots are not used, since an sstable can finish being compacted between the moment we choose the sstables for repair and the moment we create the CompactionController for the validation compaction. In particular, I wonder whether Michael and Mike used -snapshot for their repair. It is true, though, that repairing on a snapshot makes this far more likely to happen.

But I don't think we need to call getOverlappingSSTables at all for repair: it is used only to decide whether we can purge, and repair does not purge. Attaching a simple patch to skip the call entirely.
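
The idea of skipping the overlap lookup for validation can be sketched roughly as below. This is only an illustration under stated assumptions: Store, Controller and ValidationController are simplified, hypothetical stand-ins, not Cassandra's actual classes and not the contents of 4456.txt, and sstables are modelled as plain strings.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

final class ValidationControllerSketch {
    /** Hypothetical stand-in for a store that tracks the live sstable set. */
    static class Store {
        final Set<String> live = new HashSet<>();

        /** Models the failing precondition: every input must still be live. */
        Set<String> overlapping(Set<String> inputs) {
            if (!live.containsAll(inputs))
                throw new AssertionError("input sstable is not in the live set");
            Set<String> others = new HashSet<>(live);
            others.removeAll(inputs);
            return others;
        }
    }

    /** Regular compaction: asks for the overlap set to decide whether purging is safe. */
    static class Controller {
        final Set<String> overlaps;

        Controller(Store store, Set<String> inputs) { this(store.overlapping(inputs)); }
        protected Controller(Set<String> overlaps) { this.overlaps = overlaps; }

        boolean shouldPurge() { return overlaps.isEmpty(); }
    }

    /** Validation: never purges, so it never asks the store for overlaps. */
    static class ValidationController extends Controller {
        ValidationController() { super(Collections.<String>emptySet()); }

        @Override boolean shouldPurge() { return false; }
    }

    /** Returns true if building a regular controller trips the live-set check. */
    static boolean regularControllerFails() {
        Store store = new Store();
        store.live.addAll(Arrays.asList("b", "c"));
        Set<String> inputs = new HashSet<>(Arrays.asList("a", "b")); // "a" is not live
        try {
            new Controller(store, inputs);
            return false;
        } catch (AssertionError expected) {
            return true;
        }
    }

    /** The validation controller is built without touching the store at all. */
    static boolean validationShouldPurge() {
        return new ValidationController().shouldPurge();
    }

    public static void main(String[] args) {
        System.out.println("regular controller fails: " + regularControllerFails());
        System.out.println("validation purges: " + validationShouldPurge());
    }
}
```

In this model the validation path cannot hit the assertion because it has no dependency on the live set, which is the property the proposed patch is after.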

                
> AssertionError in ColumnFamilyStore.getOverlappingSSTables() during repair
> --------------------------------------------------------------------------
>
>                 Key: CASSANDRA-4456
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4456
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.1.2
>         Environment: Ubuntu 11.04 64-bit
>            Reporter: Mike Heffner
>            Assignee: Sylvain Lebresne
>             Fix For: 1.1.3
>
>         Attachments: 4456.txt
>
>

[jira] [Updated] (CASSANDRA-4456) AssertionError in ColumnFamilyStore.getOverlappingSSTables() during repair

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-4456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-4456:
--------------------------------------

    Assignee: Sylvain Lebresne

I think this was introduced by CASSANDRA-3721: getOverlappingSSTables assumes that the sstables we check for overlaps are part of the live set, but now we can validate over a snapshot instead.
                

[jira] [Updated] (CASSANDRA-4456) AssertionError in ColumnFamilyStore.getOverlappingSSTables() during repair

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-4456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-4456:
--------------------------------------

    Comment: was deleted

(was: I think this was introduced by CASSANDRA-3721: getOverlappingSSTables assumes that the sstables we check for overlaps are part of the live set, but now we can validate over a snapshot instead.)
    

[jira] [Commented] (CASSANDRA-4456) AssertionError in ColumnFamilyStore.getOverlappingSSTables() during repair

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13420755#comment-13420755 ] 

Jonathan Ellis commented on CASSANDRA-4456:
-------------------------------------------

You need to wire VCC into ValidationCompactionIterable, but otherwise +1.
                

[jira] [Commented] (CASSANDRA-4456) AssertionError in ColumnFamilyStore.getOverlappingSSTables() during repair

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13420685#comment-13420685 ] 

Jonathan Ellis commented on CASSANDRA-4456:
-------------------------------------------

I think this was introduced by CASSANDRA-3721: getOverlappingSSTables assumes that the sstables we check for overlaps are part of the live set, but now we can validate over a snapshot instead.
                

[jira] [Commented] (CASSANDRA-4456) AssertionError in ColumnFamilyStore.getOverlappingSSTables() during repair

Posted by "Michael Theroux (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419913#comment-13419913 ] 

Michael Theroux commented on CASSANDRA-4456:
--------------------------------------------

I am also on 1.1.2.
                

[jira] [Updated] (CASSANDRA-4456) AssertionError in ColumnFamilyStore.getOverlappingSSTables() during repair

Posted by "Sylvain Lebresne (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-4456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne updated CASSANDRA-4456:
----------------------------------------

    Attachment: 4456.txt
    

[jira] [Commented] (CASSANDRA-4456) AssertionError in ColumnFamilyStore.getOverlappingSSTables() during repair

Posted by "Michael Theroux (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-4456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419912#comment-13419912 ] 

Michael Theroux commented on CASSANDRA-4456:
--------------------------------------------

I just hit this problem myself today, on a single node in a six-node cluster. I was running nodetool repair, and it halted with this exception in the log. I was monitoring the repair pretty closely. A couple of observations:

1) It happened while a compaction of the same column family was running.
2) When I re-ran the repair, it worked.

Note: I am not a Cassandra developer, but I looked at the code. A highly uneducated guess is that an sstable was compacted and deleted while validation was expecting it to be there.
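
The timing described in this guess, and the reason a re-run can succeed, can be modelled with a small sketch. It assumes a toy model in which sstables are plain strings; the names cf-1 through cf-4 and the helper methods are hypothetical and do not come from Cassandra.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

final class RepairRaceSketch {
    /** True when every sstable chosen for validation is still in the live set. */
    static boolean stillLive(Set<String> live, Set<String> chosen) {
        return live.containsAll(chosen);
    }

    /** Replaces the compacted inputs with a single merged output in the live set. */
    static void finishCompaction(Set<String> live, Set<String> inputs, String output) {
        live.removeAll(inputs);
        live.add(output);
    }

    /** First attempt: a compaction completes between selection and the check. */
    static boolean firstAttempt() {
        Set<String> live = new LinkedHashSet<>(Arrays.asList("cf-1", "cf-2", "cf-3"));
        Set<String> chosen = new LinkedHashSet<>(live); // repair picks its inputs
        finishCompaction(live, new LinkedHashSet<>(Arrays.asList("cf-1", "cf-2")), "cf-4");
        return stillLive(live, chosen); // chosen still names cf-1 and cf-2
    }

    /** Re-run: inputs are picked again from the current live set, with no compaction in between. */
    static boolean retry() {
        Set<String> live = new LinkedHashSet<>(Arrays.asList("cf-3", "cf-4"));
        Set<String> chosen = new LinkedHashSet<>(live);
        return stillLive(live, chosen);
    }

    public static void main(String[] args) {
        System.out.println("first attempt inputs live: " + firstAttempt());
        System.out.println("retry inputs live: " + retry());
    }
}
```

In this model the first attempt sees stale inputs and the retry does not, which matches the two observations above without claiming to reproduce the real code path.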
                