You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@cassandra.apache.org by "Jonathan Ellis (JIRA)" <ji...@apache.org> on 2012/12/11 17:49:21 UTC

[jira] [Commented] (CASSANDRA-4492) HintsColumnFamily compactions hang when using multithreaded compaction

    [ https://issues.apache.org/jira/browse/CASSANDRA-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13529111#comment-13529111 ] 

Jonathan Ellis commented on CASSANDRA-4492:
-------------------------------------------

If you can give us a set of sstables that reliably cause the hang (snapshot before compaction option should be useful here) then we can troubleshoot.  Nothing is obviously wrong from just eyeballing things.
                
> HintsColumnFamily compactions hang when using multithreaded compaction
> ----------------------------------------------------------------------
>
>                 Key: CASSANDRA-4492
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4492
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 1.0.11
>            Reporter: Jason Harvey
>            Priority: Minor
>         Attachments: jstack.txt
>
>
> Running into an issue on a 6 node ring running 1.0.11 where HintsColumnFamily compactions often hang indefinitely when using multithreaded compaction. Nothing of note in the logs. In some cases, the compaction hangs before a tmp sstable is even created.
> I've wiped out every hints sstable and restarted several times. The issue always comes back rather quickly and predictably. The compactions sometimes complete if the hint sstables are very small. Disabling multithreaded compaction stops this issue from occurring.
> Compactions of all other CFs seem to work just fine.
> This ring was upgraded from 1.0.7. I didn't keep any hints from the upgrade.
> I should note that the ring gets a huge amount of writes, and as a result the HintedHandoff rows get be quite wide. I didn't see any large-row compaction notices when the compaction was hanging (perhaps the bug was triggered by incremental compaction?). After disabling multithreaded compaction, several of the rows that were successfully compacted were over 1GB.
> Here is the output I see from compactionstats where a compaction has hung. The 'bytes compacted' column never changes.
> {code}
> pending tasks: 1
>           compaction type        keyspace   column family bytes compacted     bytes total  progress
>                Compaction          systemHintsColumnFamily          268082       464784758     0.06%
> {code}
> The hung thread stack is as follows: (full jstack attached, as well)
> {code}
> "CompactionExecutor:37" daemon prio=10 tid=0x00000000063df800 nid=0x49d9 waiting on condition [0x00007eb8c6ffa000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x000000050f2e0e58> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
>         at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
>         at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer.computeNext(ParallelCompactionIterable.java:329)
>         at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer.computeNext(ParallelCompactionIterable.java:281)
>         at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
>         at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
>         at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:147)
>         at org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:126)
>         at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:100)
>         at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
>         at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
>         at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext(ParallelCompactionIterable.java:101)
>         at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext(ParallelCompactionIterable.java:88)
>         at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
>         at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
>         at com.google.common.collect.Iterators$7.computeNext(Iterators.java:614)
>         at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
>         at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
>         at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:141)
>         at org.apache.cassandra.db.compaction.CompactionManager$7.call(CompactionManager.java:395)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:662)
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira