Posted to commits@cassandra.apache.org by "Jonathan Ellis (JIRA)" <ji...@apache.org> on 2011/09/25 03:24:26 UTC

[jira] [Updated] (CASSANDRA-3253) inherent deadlock situation in commitLog flush?

     [ https://issues.apache.org/jira/browse/CASSANDRA-3253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-3253:
--------------------------------------

    Attachment: 3253.txt

Excellent diagnosis of the problem, Yang.

Patch attached to move the flush calls off the commit log (CL) executor.
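
For readers without the attachment, here is a minimal sketch of the general idea, not the contents of 3253.txt. It assumes a second, hypothetical executor (flushExecutor) that takes over the flush; the names logExecutor, getContext, flush and appender are illustrative stand-ins, not Cassandra code.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class OffloadedFlushSketch {
    // Returns the value produced by the stand-in getContext task.
    static int appendThenFlush() throws Exception {
        // Stand-in for the single-threaded commit log executor.
        ExecutorService logExecutor = Executors.newSingleThreadExecutor();
        // Hypothetical separate executor that runs flushes (an assumption).
        ExecutorService flushExecutor = Executors.newSingleThreadExecutor();
        try {
            // Stand-in for getContext(): work that must run on the log executor.
            Callable<Integer> getContext = () -> 42;
            // The flush blocks on the log executor, so it must not run on it.
            Callable<Integer> flush = () -> logExecutor.submit(getContext).get();
            // The appender only hands the flush off and returns at once,
            // which frees the log thread to run getContext later.
            Callable<Future<Integer>> appender = () -> flushExecutor.submit(flush);

            Future<Integer> pendingFlush = logExecutor.submit(appender).get();
            return pendingFlush.get(5, TimeUnit.SECONDS);
        } finally {
            logExecutor.shutdownNow();
            flushExecutor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("flush completed with context " + appendThenFlush());
    }
}
```

Because the log thread never waits on a task queued behind itself, the flush completes instead of hanging.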

> inherent deadlock situation in commitLog flush?
> -----------------------------------------------
>
>                 Key: CASSANDRA-3253
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3253
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 1.0.0
>            Reporter: Yang Yang
>         Attachments: 3253.txt
>
>
> After my system has run for a while, it consistently goes into a frozen state where all the mutation stage threads are waiting
> on the switchlock.
> The reason is that the switchlock is held by the commit log, as shown by the following thread dump:
> "COMMIT-LOG-WRITER" prio=10 tid=0x00000000010df000 nid=0x32d3 waiting on condition [0x00007f2d81557000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x00007f3579eec060> (a java.util.concurrent.FutureTask$Sync)
>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:838)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:998)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
>         at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:248)
>         at java.util.concurrent.FutureTask.get(FutureTask.java:111)
>         at org.apache.cassandra.db.commitlog.CommitLog.getContext(CommitLog.java:386)
>         at org.apache.cassandra.db.ColumnFamilyStore.maybeSwitchMemtable(ColumnFamilyStore.java:650)
>         at org.apache.cassandra.db.ColumnFamilyStore.forceFlush(ColumnFamilyStore.java:722)
>         at org.apache.cassandra.db.commitlog.CommitLog.createNewSegment(CommitLog.java:573)
>         at org.apache.cassandra.db.commitlog.CommitLog.access$300(CommitLog.java:81)
>         at org.apache.cassandra.db.commitlog.CommitLog$LogRecordAdder.run(CommitLog.java:596)
>         at org.apache.cassandra.db.commitlog.PeriodicCommitLogExecutorService$1.runMayThrow(PeriodicCommitLogExecutorService.java:49)
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>         at java.lang.Thread.run(Thread.java:679)
> We can clearly see that the COMMIT-LOG-WRITER thread is running the regular appender (LogRecordAdder), but the appender itself calls getContext(), which submits a new Callable to the same executor and then waits on it. The new Callable is never going to be executed, since the executor has only *one* thread and that thread is the one waiting.
> I believe this is a deterministic bug.
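
The pattern described above can be reproduced outside Cassandra. The following is a minimal sketch, assuming a plain java.util.concurrent single-thread executor in place of the commit log executor; the class and method names are illustrative. A timed get() is used so that the demo terminates, whereas the untimed get() in the thread dump would park forever.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class SingleThreadDeadlockDemo {
    // Returns true if the nested task could not run while the outer task
    // occupied the executor's only worker thread.
    static boolean nestedSubmitTimesOut() throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            // Stand-in for the appender running on the single worker thread.
            Callable<Boolean> outer = () -> {
                // Stand-in for getContext(): queued behind the running task.
                Future<Integer> inner = executor.submit(() -> 42);
                try {
                    inner.get(500, TimeUnit.MILLISECONDS);
                    return false; // inner ran, so there was no self-wait
                } catch (TimeoutException e) {
                    return true;  // inner never started: the worker is busy here
                }
            };
            return executor.submit(outer).get();
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // prints "nested submit timed out: true"
        System.out.println("nested submit timed out: " + nestedSubmitTimesOut());
    }
}
```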

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira