You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@cassandra.apache.org by "John Chakerian (Created) (JIRA)" <ji...@apache.org> on 2012/01/18 06:20:39 UTC

[jira] [Created] (CASSANDRA-3751) Possible livelock during commit log playback

Possible livelock during commit log playback
--------------------------------------------

                 Key: CASSANDRA-3751
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3751
             Project: Cassandra
          Issue Type: Bug
          Components: Core
    Affects Versions: 1.0.1
         Environment: Linux (CentOS 5.7)
            Reporter: John Chakerian


In CommitLog.recover, there seems to be the possibility of concurrent inserts to tablesRecovered (a HashSet) in the Runnables instantiated a bit below (line 323 in 1.0.7). This apparently happened during a commit log playback during startup of a node that had not shut down cleanly (the cluster was under heavy load previously and there were several gigabytes of commit logs), resulting in two threads running in perpetuity (2 cores were at 100% from running these threads), preventing the node from coming up. The relevant portion of the stack trace is:

{noformat}
INFO   | jvm 1    | 2012/01/16 16:54:42 | "MutationStage:25" prio=10 tid=0x00002aaad01e0800 nid=0x6f62 runnable [0x0000000044d54000]
INFO   | jvm 1    | 2012/01/16 16:54:42 |    java.lang.Thread.State: RUNNABLE
INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.HashMap.put(HashMap.java:374)
INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.HashSet.add(HashSet.java:200)
INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at org.apache.cassandra.db.commitlog.CommitLog$2.runMayThrow(CommitLog.java:338)
INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.lang.Thread.run(Thread.java:662)
INFO   | jvm 1    | 2012/01/16 16:54:42 | 
INFO   | jvm 1    | 2012/01/16 16:54:42 | "MutationStage:21" prio=10 tid=0x00002aaad00a2800 nid=0x6f5e runnable [0x0000000044950000]
INFO   | jvm 1    | 2012/01/16 16:54:42 |    java.lang.Thread.State: RUNNABLE
INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.HashMap.put(HashMap.java:374)
INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.HashSet.add(HashSet.java:200)
INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at org.apache.cassandra.db.commitlog.CommitLog$2.runMayThrow(CommitLog.java:338)
INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.lang.Thread.run(Thread.java:662)

{noformat}

The most recently modified file in the commit log directory was this entry:
{noformat}
-rw-r----- 1 <redacted> <redacted>    0 Jan 16 16:03 CommitLog-1326758622599.log
{noformat}
though I'm not sure if this was related or not. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-3751) Possible livelock during commit log playback

Posted by "Rick Branson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13188285#comment-13188285 ] 

Rick Branson commented on CASSANDRA-3751:
-----------------------------------------

Which commitlog_sync option do you have configured on this node?
                
> Possible livelock during commit log playback
> --------------------------------------------
>
>                 Key: CASSANDRA-3751
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3751
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.1
>         Environment: Linux (CentOS 5.7)
>            Reporter: John Chakerian
>             Fix For: 1.0.8
>
>
> In CommitLog.recover, there seems to be the possibility of concurrent inserts to tablesRecovered (a HashSet) in the Runnables instantiated a bit below (line 323 in 1.0.7). This apparently happened during a commit log playback during startup of a node that had not shut down cleanly (the cluster was under heavy load previously and there were several gigabytes of commit logs), resulting in two threads running in perpetuity (2 cores were at 100% from running these threads), preventing the node from coming up. The relevant portion of the stack trace is:
> {noformat}
> INFO   | jvm 1    | 2012/01/16 16:54:42 | "MutationStage:25" prio=10 tid=0x00002aaad01e0800 nid=0x6f62 runnable [0x0000000044d54000]
> INFO   | jvm 1    | 2012/01/16 16:54:42 |    java.lang.Thread.State: RUNNABLE
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.HashMap.put(HashMap.java:374)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.HashSet.add(HashSet.java:200)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at org.apache.cassandra.db.commitlog.CommitLog$2.runMayThrow(CommitLog.java:338)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.lang.Thread.run(Thread.java:662)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 
> INFO   | jvm 1    | 2012/01/16 16:54:42 | "MutationStage:21" prio=10 tid=0x00002aaad00a2800 nid=0x6f5e runnable [0x0000000044950000]
> INFO   | jvm 1    | 2012/01/16 16:54:42 |    java.lang.Thread.State: RUNNABLE
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.HashMap.put(HashMap.java:374)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.HashSet.add(HashSet.java:200)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at org.apache.cassandra.db.commitlog.CommitLog$2.runMayThrow(CommitLog.java:338)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.lang.Thread.run(Thread.java:662)
> {noformat}
> The most recently modified file in the commit log directory was this entry:
> {noformat}
> -rw-r----- 1 <redacted> <redacted>    0 Jan 16 16:03 CommitLog-1326758622599.log
> {noformat}
> though I'm not sure if this was related or not. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (CASSANDRA-3751) Possible livelock during commit log playback

Posted by "Brandon Williams (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-3751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brandon Williams updated CASSANDRA-3751:
----------------------------------------

    Fix Version/s: 1.0.8
    
> Possible livelock during commit log playback
> --------------------------------------------
>
>                 Key: CASSANDRA-3751
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3751
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.1
>         Environment: Linux (CentOS 5.7)
>            Reporter: John Chakerian
>             Fix For: 1.0.8
>
>
> In CommitLog.recover, there seems to be the possibility of concurrent inserts to tablesRecovered (a HashSet) in the Runnables instantiated a bit below (line 323 in 1.0.7). This apparently happened during a commit log playback during startup of a node that had not shut down cleanly (the cluster was under heavy load previously and there were several gigabytes of commit logs), resulting in two threads running in perpetuity (2 cores were at 100% from running these threads), preventing the node from coming up. The relevant portion of the stack trace is:
> {noformat}
> INFO   | jvm 1    | 2012/01/16 16:54:42 | "MutationStage:25" prio=10 tid=0x00002aaad01e0800 nid=0x6f62 runnable [0x0000000044d54000]
> INFO   | jvm 1    | 2012/01/16 16:54:42 |    java.lang.Thread.State: RUNNABLE
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.HashMap.put(HashMap.java:374)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.HashSet.add(HashSet.java:200)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at org.apache.cassandra.db.commitlog.CommitLog$2.runMayThrow(CommitLog.java:338)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.lang.Thread.run(Thread.java:662)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 
> INFO   | jvm 1    | 2012/01/16 16:54:42 | "MutationStage:21" prio=10 tid=0x00002aaad00a2800 nid=0x6f5e runnable [0x0000000044950000]
> INFO   | jvm 1    | 2012/01/16 16:54:42 |    java.lang.Thread.State: RUNNABLE
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.HashMap.put(HashMap.java:374)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.HashSet.add(HashSet.java:200)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at org.apache.cassandra.db.commitlog.CommitLog$2.runMayThrow(CommitLog.java:338)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.lang.Thread.run(Thread.java:662)
> {noformat}
> The most recently modified file in the commit log directory was this entry:
> {noformat}
> -rw-r----- 1 <redacted> <redacted>    0 Jan 16 16:03 CommitLog-1326758622599.log
> {noformat}
> though I'm not sure if this was related or not. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (CASSANDRA-3751) Possible livelock during commit log playback

Posted by "Jonathan Ellis (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-3751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-3751:
--------------------------------------

    Attachment: 3751.txt

You're absolutely right, tablesRecovered is not threadsafe but it's mutated by up to concurrent_writes threads during log replay.  Patch attached to switch to NBHS.
                
> Possible livelock during commit log playback
> --------------------------------------------
>
>                 Key: CASSANDRA-3751
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3751
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: Linux (CentOS 5.7)
>            Reporter: John Chakerian
>              Labels: commitlog
>             Fix For: 0.8.10, 1.0.8
>
>         Attachments: 3751.txt
>
>
> In CommitLog.recover, there seems to be the possibility of concurrent inserts to tablesRecovered (a HashSet) in the Runnables instantiated a bit below (line 323 in 1.0.7). This apparently happened during a commit log playback during startup of a node that had not shut down cleanly (the cluster was under heavy load previously and there were several gigabytes of commit logs), resulting in two threads running in perpetuity (2 cores were at 100% from running these threads), preventing the node from coming up. The relevant portion of the stack trace is:
> {noformat}
> INFO   | jvm 1    | 2012/01/16 16:54:42 | "MutationStage:25" prio=10 tid=0x00002aaad01e0800 nid=0x6f62 runnable [0x0000000044d54000]
> INFO   | jvm 1    | 2012/01/16 16:54:42 |    java.lang.Thread.State: RUNNABLE
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.HashMap.put(HashMap.java:374)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.HashSet.add(HashSet.java:200)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at org.apache.cassandra.db.commitlog.CommitLog$2.runMayThrow(CommitLog.java:338)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.lang.Thread.run(Thread.java:662)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 
> INFO   | jvm 1    | 2012/01/16 16:54:42 | "MutationStage:21" prio=10 tid=0x00002aaad00a2800 nid=0x6f5e runnable [0x0000000044950000]
> INFO   | jvm 1    | 2012/01/16 16:54:42 |    java.lang.Thread.State: RUNNABLE
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.HashMap.put(HashMap.java:374)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.HashSet.add(HashSet.java:200)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at org.apache.cassandra.db.commitlog.CommitLog$2.runMayThrow(CommitLog.java:338)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.lang.Thread.run(Thread.java:662)
> {noformat}
> The most recently modified file in the commit log directory was this entry:
> {noformat}
> -rw-r----- 1 <redacted> <redacted>    0 Jan 16 16:03 CommitLog-1326758622599.log
> {noformat}
> though I'm not sure if this was related or not. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-3751) Possible livelock during commit log playback

Posted by "Rick Branson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13196600#comment-13196600 ] 

Rick Branson commented on CASSANDRA-3751:
-----------------------------------------

Reviewed, +1
                
> Possible livelock during commit log playback
> --------------------------------------------
>
>                 Key: CASSANDRA-3751
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3751
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: Linux (CentOS 5.7)
>            Reporter: John Chakerian
>            Assignee: Jonathan Ellis
>              Labels: commitlog
>             Fix For: 0.8.10, 1.0.8
>
>         Attachments: 3751.txt
>
>
> In CommitLog.recover, there seems to be the possibility of concurrent inserts to tablesRecovered (a HashSet) in the Runnables instantiated a bit below (line 323 in 1.0.7). This apparently happened during a commit log playback during startup of a node that had not shut down cleanly (the cluster was under heavy load previously and there were several gigabytes of commit logs), resulting in two threads running in perpetuity (2 cores were at 100% from running these threads), preventing the node from coming up. The relevant portion of the stack trace is:
> {noformat}
> INFO   | jvm 1    | 2012/01/16 16:54:42 | "MutationStage:25" prio=10 tid=0x00002aaad01e0800 nid=0x6f62 runnable [0x0000000044d54000]
> INFO   | jvm 1    | 2012/01/16 16:54:42 |    java.lang.Thread.State: RUNNABLE
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.HashMap.put(HashMap.java:374)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.HashSet.add(HashSet.java:200)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at org.apache.cassandra.db.commitlog.CommitLog$2.runMayThrow(CommitLog.java:338)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.lang.Thread.run(Thread.java:662)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 
> INFO   | jvm 1    | 2012/01/16 16:54:42 | "MutationStage:21" prio=10 tid=0x00002aaad00a2800 nid=0x6f5e runnable [0x0000000044950000]
> INFO   | jvm 1    | 2012/01/16 16:54:42 |    java.lang.Thread.State: RUNNABLE
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.HashMap.put(HashMap.java:374)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.HashSet.add(HashSet.java:200)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at org.apache.cassandra.db.commitlog.CommitLog$2.runMayThrow(CommitLog.java:338)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.lang.Thread.run(Thread.java:662)
> {noformat}
> The most recently modified file in the commit log directory was this entry:
> {noformat}
> -rw-r----- 1 <redacted> <redacted>    0 Jan 16 16:03 CommitLog-1326758622599.log
> {noformat}
> though I'm not sure if this was related or not. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-3751) Possible livelock during commit log playback

Posted by "John Chakerian (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-3751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13188289#comment-13188289 ] 

John Chakerian commented on CASSANDRA-3751:
-------------------------------------------

periodic
                
> Possible livelock during commit log playback
> --------------------------------------------
>
>                 Key: CASSANDRA-3751
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3751
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.1
>         Environment: Linux (CentOS 5.7)
>            Reporter: John Chakerian
>             Fix For: 1.0.8
>
>
> In CommitLog.recover, there seems to be the possibility of concurrent inserts to tablesRecovered (a HashSet) in the Runnables instantiated a bit below (line 323 in 1.0.7). This apparently happened during a commit log playback during startup of a node that had not shut down cleanly (the cluster was under heavy load previously and there were several gigabytes of commit logs), resulting in two threads running in perpetuity (2 cores were at 100% from running these threads), preventing the node from coming up. The relevant portion of the stack trace is:
> {noformat}
> INFO   | jvm 1    | 2012/01/16 16:54:42 | "MutationStage:25" prio=10 tid=0x00002aaad01e0800 nid=0x6f62 runnable [0x0000000044d54000]
> INFO   | jvm 1    | 2012/01/16 16:54:42 |    java.lang.Thread.State: RUNNABLE
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.HashMap.put(HashMap.java:374)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.HashSet.add(HashSet.java:200)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at org.apache.cassandra.db.commitlog.CommitLog$2.runMayThrow(CommitLog.java:338)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.lang.Thread.run(Thread.java:662)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 
> INFO   | jvm 1    | 2012/01/16 16:54:42 | "MutationStage:21" prio=10 tid=0x00002aaad00a2800 nid=0x6f5e runnable [0x0000000044950000]
> INFO   | jvm 1    | 2012/01/16 16:54:42 |    java.lang.Thread.State: RUNNABLE
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.HashMap.put(HashMap.java:374)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.HashSet.add(HashSet.java:200)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at org.apache.cassandra.db.commitlog.CommitLog$2.runMayThrow(CommitLog.java:338)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> INFO   | jvm 1    | 2012/01/16 16:54:42 | 	at java.lang.Thread.run(Thread.java:662)
> {noformat}
> The most recently modified file in the commit log directory was this entry:
> {noformat}
> -rw-r----- 1 <redacted> <redacted>    0 Jan 16 16:03 CommitLog-1326758622599.log
> {noformat}
> though I'm not sure if this was related or not. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira