Posted to commits@cassandra.apache.org by "Brandon Williams (Created) (JIRA)" <ji...@apache.org> on 2011/11/30 20:23:40 UTC

[jira] [Created] (CASSANDRA-3543) Infinite hang during shutdown

Infinite hang during shutdown
-----------------------------

                 Key: CASSANDRA-3543
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3543
             Project: Cassandra
          Issue Type: Bug
          Components: Core
            Reporter: Brandon Williams
         Attachments: hung_stack.txt

While testing CASSANDRA-3541, stress at some point timed out completely.  I proceeded to shut the cluster down, and 2 of the 3 JVMs hung indefinitely.  After a while, one of them logged:

{noformat}
WARN 19:07:50,133 Some hints were not written before shutdown.  This is not supposed to happen.  You should (a) run repair, and (b) file a bug report
{noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-3543) Infinite hang during shutdown

Posted by "Rick Branson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13160307#comment-13160307 ] 

Rick Branson commented on CASSANDRA-3543:
-----------------------------------------

This is a bug in the new commit log allocator from CASSANDRA-3411. The MutationStage threads are all blocked because the CommitLogExecutor queue is full. The COMMIT-LOG-WRITER thread, which drains that queue, is itself blocked in fetchSegment(), which waits for CommitLogAllocator to push newly created segments onto its queue of available segments. Brandon also stated that only 1 commit log segment existed afterwards, which suggests the allocator never created a second one.
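
To make the stall described above concrete, here is a minimal Java sketch. It is not Cassandra code: the class name, queue names, capacity, and timeout are all assumptions chosen for illustration. A bounded queue stands in for the CommitLogExecutor queue, and a second queue stands in for whatever fetchSegment() waits on. When nothing ever supplies a segment, the writer never drains the bounded queue and producers stop making progress.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Illustrative only: every name here is hypothetical, not taken from Cassandra.
public class CommitLogStallSketch {

    /** Returns true if producers could enqueue all of their writes, false if they stalled. */
    public static boolean producersMakeProgress(boolean allocatorSuppliesSegment) {
        // Bounded queue of pending writes (stand-in for the executor queue).
        final BlockingQueue<String> pendingWrites = new ArrayBlockingQueue<>(2);
        // Queue of ready segments (stand-in for what the writer waits on).
        final BlockingQueue<String> availableSegments = new LinkedBlockingQueue<>();
        if (allocatorSuppliesSegment) {
            availableSegments.add("segment-2");
        }

        Thread writer = new Thread(() -> {
            try {
                // The writer needs a fresh segment before it drains anything.
                availableSegments.take();
                while (true) {
                    pendingWrites.take();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "writer-sketch");
        writer.setDaemon(true);
        writer.start();

        try {
            // Producers (stand-in for the mutation threads) submit more writes
            // than the bounded queue can hold.
            for (int i = 0; i < 5; i++) {
                if (!pendingWrites.offer("mutation-" + i, 500, TimeUnit.MILLISECONDS)) {
                    return false; // queue stayed full: the writer is stuck
                }
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            writer.interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("no segment supplied -> progress: " + producersMakeProgress(false));
        System.out.println("segment supplied    -> progress: " + producersMakeProgress(true));
    }
}
```

The sketch uses a timed offer() so that it terminates; a plain put() on the full queue would block forever, which is closer to the hang seen in the thread dump.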
                


        

[jira] [Commented] (CASSANDRA-3543) Commit Log Allocator deadlock during shutdown

Posted by "Rick Branson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13160391#comment-13160391 ] 

Rick Branson commented on CASSANDRA-3543:
-----------------------------------------

Steps to reproduce:

1) $ rm -rf /var/lib/cassandra/*
2) Start Cassandra
3) $ stress -F1 -C 999999999

Stress will run until requests start timing out; on my box that was at ~400,000.
                
> Commit Log Allocator deadlock during shutdown
> ---------------------------------------------
>
>                 Key: CASSANDRA-3543
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3543
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.1
>            Reporter: Brandon Williams
>            Assignee: Rick Branson
>         Attachments: hung_stack.txt


        

[jira] [Updated] (CASSANDRA-3543) Infinite hang during shutdown

Posted by "Brandon Williams (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-3543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brandon Williams updated CASSANDRA-3543:
----------------------------------------

    Attachment: hung_stack.txt

Here is a thread dump.
                


        

[jira] [Commented] (CASSANDRA-3543) Commit Log Allocator deadlock after first start with empty commitlog directory

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13161914#comment-13161914 ] 

Hudson commented on CASSANDRA-3543:
-----------------------------------

Integrated in Cassandra #1234 (See [https://builds.apache.org/job/Cassandra/1234/])
    enableReserveSegmentCreation even when there is nothing to replay
patch by Rick Branson; reviewed by jbellis for CASSANDRA-3543

jbellis : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1209724
Files : 
* /cassandra/trunk/CHANGES.txt
* /cassandra/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLog.java
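
The commit message above, together with the workaround of restarting after the first start with an empty commitlog directory, suggests a control-flow shape like the following. This is a hedged sketch, not the actual patch: only the name enableReserveSegmentCreation comes from the commit message, while the class names, method names, and the early return are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

// Illustrative only: a guess at the shape of the bug, not the real CommitLog.java.
public class ReserveSegmentSketch {

    /** Hypothetical allocator that creates spare segments only once enabled. */
    public static final class Allocator {
        private final AtomicBoolean reserveCreationEnabled = new AtomicBoolean(false);

        public void enableReserveSegmentCreation() {
            reserveCreationEnabled.set(true);
        }

        public boolean createsReserveSegments() {
            return reserveCreationEnabled.get();
        }
    }

    /** Assumed buggy shape: an empty directory returns before the enable call. */
    public static void recoverBuggy(List<String> segmentFiles, Allocator allocator) {
        if (segmentFiles.isEmpty()) {
            return; // nothing to replay, but reserve creation stays disabled
        }
        // ... replay of existing segments would happen here ...
        allocator.enableReserveSegmentCreation();
    }

    /** Assumed fixed shape: enable reserve creation whether or not anything was replayed. */
    public static void recoverFixed(List<String> segmentFiles, Allocator allocator) {
        if (!segmentFiles.isEmpty()) {
            // ... replay of existing segments would happen here ...
        }
        allocator.enableReserveSegmentCreation();
    }

    public static void main(String[] args) {
        Allocator firstStart = new Allocator();
        recoverBuggy(Collections.<String>emptyList(), firstStart);
        System.out.println("buggy, empty directory: " + firstStart.createsReserveSegments());

        Allocator afterRestart = new Allocator();
        recoverBuggy(Arrays.asList("CommitLog-1.log"), afterRestart);
        System.out.println("buggy, after restart:   " + afterRestart.createsReserveSegments());

        Allocator patched = new Allocator();
        recoverFixed(Collections.<String>emptyList(), patched);
        System.out.println("fixed, empty directory: " + patched.createsReserveSegments());
    }
}
```

Under these assumptions, the second case is why a restart would work around the problem: by then the directory is no longer empty, so the enable call is reached.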

                
> Commit Log Allocator deadlock after first start with empty commitlog directory
> ------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-3543
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3543
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.1
>            Reporter: Brandon Williams
>            Assignee: Rick Branson
>             Fix For: 1.1
>
>         Attachments: 3543.txt, hung_stack.txt


        

[jira] [Issue Comment Edited] (CASSANDRA-3543) Commit Log Allocator deadlock during shutdown

Posted by "Rick Branson (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13160391#comment-13160391 ] 

Rick Branson edited comment on CASSANDRA-3543 at 11/30/11 10:31 PM:
--------------------------------------------------------------------

Steps to reproduce:

1) $ rm -rf /var/lib/cassandra/*
2) Start Cassandra
3) $ stress -F1 -n 999999999

Stress will run until requests start timing out; on my box that was at ~400,000.
                
      was (Author: rbranson):
    Steps to reproduce:

1) $ rm -rf /var/lib/cassandra/*
2) Start Cassandra
3) $ stress -F1 -C 999999999

The stress will run until requests start timing out, on my box it was ~400,000.
                  
> Commit Log Allocator deadlock during shutdown
> ---------------------------------------------
>
>                 Key: CASSANDRA-3543
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3543
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.1
>            Reporter: Brandon Williams
>            Assignee: Rick Branson
>         Attachments: hung_stack.txt


        

[jira] [Updated] (CASSANDRA-3543) Commit Log Allocator deadlock after first start with empty commitlog directory

Posted by "Rick Branson (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-3543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rick Branson updated CASSANDRA-3543:
------------------------------------

    Attachment: 3543.txt
    
> Commit Log Allocator deadlock after first start with empty commitlog directory
> ------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-3543
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3543
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.1
>            Reporter: Brandon Williams
>            Assignee: Rick Branson
>         Attachments: 3543.txt, hung_stack.txt


        

[jira] [Updated] (CASSANDRA-3543) Commit Log Allocator deadlock after first start with empty commitlog directory

Posted by "Rick Branson (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-3543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rick Branson updated CASSANDRA-3543:
------------------------------------

    Summary: Commit Log Allocator deadlock after first start with empty commitlog directory  (was: Commit Log Allocator deadlock during shutdown)

A workaround is to restart Cassandra immediately after it comes up the first time with an empty commitlog directory.
                
> Commit Log Allocator deadlock after first start with empty commitlog directory
> ------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-3543
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3543
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.1
>            Reporter: Brandon Williams
>            Assignee: Rick Branson
>         Attachments: hung_stack.txt


        

[jira] [Assigned] (CASSANDRA-3543) Infinite hang during shutdown

Posted by "Rick Branson (Assigned) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-3543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rick Branson reassigned CASSANDRA-3543:
---------------------------------------

    Assignee: Rick Branson
    
> Infinite hang during shutdown
> -----------------------------
>
>                 Key: CASSANDRA-3543
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3543
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.1
>            Reporter: Brandon Williams
>            Assignee: Rick Branson
>         Attachments: hung_stack.txt


        

[jira] [Updated] (CASSANDRA-3543) Commit Log Allocator deadlock during shutdown

Posted by "Rick Branson (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-3543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rick Branson updated CASSANDRA-3543:
------------------------------------

    Affects Version/s: 1.1
              Summary: Commit Log Allocator deadlock during shutdown  (was: Infinite hang during shutdown)
    
> Commit Log Allocator deadlock during shutdown
> ---------------------------------------------
>
>                 Key: CASSANDRA-3543
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3543
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.1
>            Reporter: Brandon Williams
>            Assignee: Rick Branson
>         Attachments: hung_stack.txt
