Posted to commits@cassandra.apache.org by "Mark Manley (JIRA)" <ji...@apache.org> on 2016/03/29 13:01:25 UTC

[jira] [Comment Edited] (CASSANDRA-11447) Flush writer deadlock in Cassandra 2.2.5

    [ https://issues.apache.org/jira/browse/CASSANDRA-11447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215841#comment-15215841 ] 

Mark Manley edited comment on CASSANDRA-11447 at 3/29/16 11:01 AM:
-------------------------------------------------------------------

Of course:

{code}
INFO  [CompactionExecutor:224] 2016-03-28 16:37:55,107 CompactionManager.java:1464 - Compaction interrupted: Compaction@bff6b700-45f1-11e5-9621-a322a3bdb126(counter_service, dsp_metrics_ts_5min, 98651672/2163791990)bytes
INFO  [CompactionExecutor:226] 2016-03-28 16:37:55,150 CompactionManager.java:1464 - Compaction interrupted: Compaction@bff6b700-45f1-11e5-9621-a322a3bdb126(counter_service, dsp_metrics_ts_5min, 9698634605/15761279246)bytes
INFO  [CompactionExecutor:225] 2016-03-28 16:38:55,206 CompactionManager.java:1464 - Compaction interrupted: Compaction@bff6b700-45f1-11e5-9621-a322a3bdb126(counter_service, dsp_metrics_ts_5min, 0/49808303118)bytes
INFO  [CompactionExecutor:229] 2016-03-28 18:05:31,170 CompactionManager.java:1464 - Compaction interrupted: Compaction@c31471c0-45f1-11e5-9621-a322a3bdb126(counter_service, dsp_metrics_log, 9302818642/12244413096)bytes
{code}



> Flush writer deadlock in Cassandra 2.2.5
> ----------------------------------------
>
>                 Key: CASSANDRA-11447
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11447
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Mark Manley
>             Fix For: 2.2.x
>
>         Attachments: cassandra.jstack.out
>
>
> When writing heavily to one of my Cassandra tables, I got a deadlock similar to CASSANDRA-9882:
> {code}
> "MemtableFlushWriter:4589" #34721 daemon prio=5 os_prio=0 tid=0x0000000005fc11d0 nid=0x7664 waiting for monitor entry [0x00007fb83f0e5000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>         at org.apache.cassandra.db.compaction.WrappingCompactionStrategy.handleNotification(WrappingCompactionStrategy.java:266)
>         - waiting to lock <0x0000000400956258> (a org.apache.cassandra.db.compaction.WrappingCompactionStrategy)
>         at org.apache.cassandra.db.lifecycle.Tracker.notifyAdded(Tracker.java:400)
>         at org.apache.cassandra.db.lifecycle.Tracker.replaceFlushed(Tracker.java:332)
>         at org.apache.cassandra.db.compaction.AbstractCompactionStrategy.replaceFlushed(AbstractCompactionStrategy.java:235)
>         at org.apache.cassandra.db.ColumnFamilyStore.replaceFlushed(ColumnFamilyStore.java:1580)
>         at org.apache.cassandra.db.Memtable$FlushRunnable.runMayThrow(Memtable.java:362)
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>         at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
>         at org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1139)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> {code}
> The compaction strategies in this keyspace are mixed: one table uses LCS and the rest use DTCS.  None of the tables except the LCS one seems to have a large SSTable count:
> {code}
> 		Table: active_counters
> 		SSTable count: 2
> --
> 		Table: aggregation_job_entries
> 		SSTable count: 2
> --
> 		Table: dsp_metrics_log
> 		SSTable count: 207
> --
> 		Table: dsp_metrics_ts_5min
> 		SSTable count: 3
> --
> 		Table: dsp_metrics_ts_day
> 		SSTable count: 2
> --
> 		Table: dsp_metrics_ts_hour
> 		SSTable count: 2
> {code}
> Yet the symptoms are similar. 
> The "dsp_metrics_ts_5min" table went through a major compaction shortly before all this, to get rid of its 400+ SSTable files before this system went into use, so those files should already have been eliminated.
> Have other people seen this?  I am attaching a stack trace.
> Thanks!
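
The stack trace in the description shows a MemtableFlushWriter thread BLOCKED on the WrappingCompactionStrategy monitor, but not which thread holds that monitor. A minimal sketch of how the attached jstack output could be scanned to pair each blocked thread with the holder of the monitor it waits on follows. It assumes the usual HotSpot text format ("- waiting to lock <addr>" and "- locked <addr>" lines under a quoted thread name). The class name JstackMonitorScan and the method blockedBy are illustrative only and are not part of Cassandra or the JDK.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: scans a jstack text dump for monitor contention.
public class JstackMonitorScan {
    // A thread section starts with the thread name in double quotes.
    private static final Pattern THREAD_HEADER = Pattern.compile("^\"([^\"]+)\".*");
    private static final Pattern WAITING = Pattern.compile("- waiting to lock <(0x[0-9a-fA-F]+)>");
    private static final Pattern LOCKED = Pattern.compile("- locked <(0x[0-9a-fA-F]+)>");

    /**
     * Maps each thread that is waiting to lock a monitor to the name of the
     * thread that reports holding it, or "unknown" if no holder is in the dump.
     */
    public static Map<String, String> blockedBy(String dump) {
        Map<String, String> waitingOn = new LinkedHashMap<>(); // thread -> monitor address
        Map<String, String> heldBy = new LinkedHashMap<>();    // monitor address -> thread
        String current = null;
        for (String line : dump.split("\\R")) {
            Matcher header = THREAD_HEADER.matcher(line);
            if (header.matches()) {
                current = header.group(1);
                continue;
            }
            if (current == null) {
                continue; // preamble before the first thread section
            }
            Matcher waiting = WAITING.matcher(line);
            if (waiting.find()) {
                waitingOn.put(current, waiting.group(1));
            }
            Matcher locked = LOCKED.matcher(line);
            if (locked.find()) {
                heldBy.put(locked.group(1), current);
            }
        }
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : waitingOn.entrySet()) {
            result.put(e.getKey(), heldBy.getOrDefault(e.getValue(), "unknown"));
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.err.println("usage: java JstackMonitorScan <jstack-output-file>");
            return;
        }
        String dump = new String(Files.readAllBytes(Paths.get(args[0])));
        blockedBy(dump).forEach((thread, owner) ->
                System.out.println(thread + " is blocked by " + owner));
    }
}
```

Run against a file such as cassandra.jstack.out, this prints one line per blocked thread. One caveat: a thread parked in Object.wait() can still list the monitor under "- locked", so a reported holder should be checked against its full stack before drawing conclusions.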


