You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@cassandra.apache.org by "Jonathan Ellis (JIRA)" <ji...@apache.org> on 2011/09/25 03:24:26 UTC
[jira] [Updated] (CASSANDRA-3253) inherent deadlock situation in
commitLog flush?
[ https://issues.apache.org/jira/browse/CASSANDRA-3253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Ellis updated CASSANDRA-3253:
--------------------------------------
Attachment: 3253.txt
excellent diagnosis of the problem, Yang.
patch attached to push the flush calls off of the CL executor.
> inherent deadlock situation in commitLog flush?
> -----------------------------------------------
>
> Key: CASSANDRA-3253
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3253
> Project: Cassandra
> Issue Type: Bug
> Affects Versions: 1.0.0
> Reporter: Yang Yang
> Attachments: 3253.txt
>
>
> after my system ran for a while, it consitently goes into frozen state where all the mutations stage threads are waiting
> on the switchlock,
> the reason is that the switchlock is held by commit log, as shown by the following thread dump:
> "COMMIT-LOG-WRITER" prio=10 tid=0x00000000010df000 nid=0x32d3 waiting on condition [0x00007f2d81557000]
> java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for <0x00007f3579eec060> (a java.util.concurrent.FutureTask$Sync)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:838)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:998)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
> at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:248)
> at java.util.concurrent.FutureTask.get(FutureTask.java:111)
> at org.apache.cassandra.db.commitlog.CommitLog.getContext(CommitLog.java:386)
> at org.apache.cassandra.db.ColumnFamilyStore.maybeSwitchMemtable(ColumnFamilyStore.java:650)
> at org.apache.cassandra.db.ColumnFamilyStore.forceFlush(ColumnFamilyStore.java:722)
> at org.apache.cassandra.db.commitlog.CommitLog.createNewSegment(CommitLog.java:573)
> at org.apache.cassandra.db.commitlog.CommitLog.access$300(CommitLog.java:81)
> at org.apache.cassandra.db.commitlog.CommitLog$LogRecordAdder.run(CommitLog.java:596)
> at org.apache.cassandra.db.commitlog.PeriodicCommitLogExecutorService$1.runMayThrow(PeriodicCommitLogExecutorService.java:49)
> at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
> at java.lang.Thread.run(Thread.java:679)
> we can clearly see that the COMMIT-LOG-WRITER thread is running the regular appender , but the appender itself calls getContext(), which again submits a new Callable to be executed, and waits on the Callable. but the new Callable is never going to be executed since the executor has only *one* thread.
> I believe this is a deterministic bug.
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira