Posted to commits@cassandra.apache.org by "Dan Hendry (Created) (JIRA)" <ji...@apache.org> on 2011/11/11 20:04:51 UTC

[jira] [Created] (CASSANDRA-3484) Bizarre Compaction Manager Behaviour

Bizarre Compaction Manager Behaviour
------------------------------------

                 Key: CASSANDRA-3484
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3484
             Project: Cassandra
          Issue Type: Bug
          Components: Core
    Affects Versions: 1.0.2
         Environment: RHEL 6
java version "1.6.0_26"
6 node cluster (5 nodes 0.8.6, 1 node 1.0.2 minus CASSANDRA-2503)
            Reporter: Dan Hendry


It seems the CompactionManager has gotten itself into a bad state. My 1.0.2 node has been up for 20 hours now - checking via JMX, the compaction manager is reporting that it has completed 14,797,412,000 tasks. Yep, that's right: 14 billion tasks, and increasing at a rate of roughly 208,400/second.
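
As a rough sanity check on those figures, the lifetime average can be compared with the observed rate. This is only a sketch: the class and method names are hypothetical, and it assumes the uptime is exactly 20 hours, which the report gives only approximately.

```java
/** Hypothetical helper; the inputs are the figures quoted in the report. */
final class TaskRateCheck {
    /** Average completed tasks per second over the whole uptime (integer division). */
    static long averagePerSecond(long completedTasks, long uptimeHours) {
        return completedTasks / (uptimeHours * 3600L);
    }

    public static void main(String[] args) {
        long average = averagePerSecond(14_797_412_000L, 20);
        System.out.println(average + " tasks/second on average");
    }
}
```

The average comes out at about 205,500 tasks/second, close to the observed 208,400/second, which would be consistent with the counter having climbed at this pace for nearly the entire uptime.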

I should point out that I am currently running a major compaction on the node. My theory is that this problem was introduced by CASSANDRA-3363. It looks like SizeTieredCompactionStrategy.getBackgroundTasks() returns a set of tasks without consideration for any in-progress compactions. Compactions are only kicked off if task.markSSTablesForCompaction() returns true (CompactionManager line 127), but the task resubmission is based only on the task list not being empty (CompactionManager line 141). Should the logic not be to only reschedule if a task has actually been executed?
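
The behaviour described above can be sketched as follows. This is a minimal illustration based only on the description in this report, not the actual CompactionManager source: the Task interface and the checkAsReported, checkAsProposed and countChecks methods are hypothetical names, and the cap exists only so that the demonstration terminates.

```java
import java.util.Collections;
import java.util.List;

final class RescheduleSketch {
    /** Hypothetical stand-in for a compaction task; not Cassandra's real class. */
    interface Task {
        /** Returns false when the task's SSTables are already being compacted. */
        boolean markSSTablesForCompaction();
    }

    /** Reported logic: reschedule whenever the task list is not empty. */
    static boolean checkAsReported(List<Task> tasks) {
        for (Task task : tasks) {
            if (task.markSSTablesForCompaction()) {
                // A real implementation would execute the compaction here.
            }
        }
        return !tasks.isEmpty();
    }

    /** Suggested logic: reschedule only if at least one task was executed. */
    static boolean checkAsProposed(List<Task> tasks) {
        boolean taskExecuted = false;
        for (Task task : tasks) {
            if (task.markSSTablesForCompaction()) {
                taskExecuted = true;
            }
        }
        return taskExecuted;
    }

    /** Counts how many checks run back to back, stopping at the cap. */
    static int countChecks(List<Task> tasks, boolean proposed, int cap) {
        int checks = 0;
        boolean again = true;
        while (again && checks < cap) {
            checks++;
            again = proposed ? checkAsProposed(tasks) : checkAsReported(tasks);
        }
        return checks;
    }

    public static void main(String[] args) {
        // A task whose SSTables are held by a running major compaction,
        // so marking them always fails.
        Task blocked = () -> false;
        List<Task> tasks = Collections.singletonList(blocked);
        System.out.println("reported: " + countChecks(tasks, false, 1000));
        System.out.println("proposed: " + countChecks(tasks, true, 1000));
    }
}
```

With a single task that can never be marked, the reported logic keeps rescheduling until it hits the cap, while the suggested logic stops after one check. That would match a completed-task counter that climbs without any compaction actually running.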

I am now just waiting for the major compaction to finish to see whether the problem goes away, as my theory would suggest.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Issue Comment Edited] (CASSANDRA-3484) Bizarre Compaction Manager Behaviour

Posted by "Dan Hendry (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13148662#comment-13148662 ] 

Dan Hendry edited comment on CASSANDRA-3484 at 11/11/11 7:07 PM:
-----------------------------------------------------------------

JMX evidence
                
      was (Author: dhendry):
    JMX
                  

        

[jira] [Commented] (CASSANDRA-3484) Bizarre Compaction Manager Behaviour

Posted by "Dan Hendry (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13148681#comment-13148681 ] 

Dan Hendry commented on CASSANDRA-3484:
---------------------------------------

Yes, it would. The issue seemed to result in some pretty significant temporary performance degradation. Any chance of getting 2407 into 1.0.3 instead of 1.1?
                

        

[jira] [Updated] (CASSANDRA-3484) Bizarre Compaction Manager Behaviour

Posted by "Dan Hendry (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-3484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dan Hendry updated CASSANDRA-3484:
----------------------------------

    Attachment: 3484.txt

Patch to reschedule another compaction check only when the active check resulted in a task being executed.
                

        

[jira] [Commented] (CASSANDRA-3484) Bizarre Compaction Manager Behaviour

Posted by "Jonathan Ellis (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13148675#comment-13148675 ] 

Jonathan Ellis commented on CASSANDRA-3484:
-------------------------------------------

I think the patch on #2407 would also fix this.
                

        

[jira] [Resolved] (CASSANDRA-3484) Bizarre Compaction Manager Behaviour

Posted by "Sylvain Lebresne (Resolved) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-3484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne resolved CASSANDRA-3484.
-----------------------------------------

       Resolution: Fixed
    Fix Version/s: 1.0.3
         Reviewer: slebresne

Alright, +1 on the patch here, committed.
                

        

[jira] [Commented] (CASSANDRA-3484) Bizarre Compaction Manager Behaviour

Posted by "Sylvain Lebresne (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13148697#comment-13148697 ] 

Sylvain Lebresne commented on CASSANDRA-3484:
---------------------------------------------

We can commit either this patch or the one on CASSANDRA-2407 for this issue, since both fix the problem here. But the goal of #2407 is a bit different, so we should just let that issue solve the problem it wants to solve.
                

        

[jira] [Updated] (CASSANDRA-3484) Bizarre Compaction Manager Behaviour

Posted by "Dan Hendry (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-3484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dan Hendry updated CASSANDRA-3484:
----------------------------------

    Attachment: compaction.png

JMX
                