
Posted to commits@cassandra.apache.org by "Jason Harvey (JIRA)" <ji...@apache.org> on 2012/08/05 06:14:02 UTC

[jira] [Created] (CASSANDRA-4492) Large HintsColumnFamily compactions hang

Jason Harvey created CASSANDRA-4492:
---------------------------------------

Summary: Large HintsColumnFamily compactions hang
Key: CASSANDRA-4492
URL: https://issues.apache.org/jira/browse/CASSANDRA-4492
Project: Cassandra
Issue Type: Bug
Affects Versions: 1.0.11
Reporter: Jason Harvey
Priority: Minor

Running into an issue on a 6-node ring running 1.0.11 where, whenever a somewhat large set of hints builds up (seen with as little as 400MB), compaction on the hints CF hangs indefinitely. Nothing of note in the logs. In some cases, the compaction hangs before a tmp sstable is even created.

I've wiped out every hints sstable I have and restarted several times. The issue always comes back rather quickly and predictably after wiping the sstables. Compaction always seems to succeed if the hints CFs are rather small.

Hints are enabled, and my hint window is the default of 1hr. I do have some copies of HintsColumnFamily sstables that reproduce this issue. However, the hints may contain confidential data. If they'd be helpful in troubleshooting this issue, let me know and I can see about sending them directly.

This ring was upgraded from 1.0.7. I didn't keep any hints from the upgrade.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-4492) Large HintsColumnFamily compactions hang

Posted by "Jason Harvey (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Harvey updated CASSANDRA-4492:
------------------------------------

    Description: 
Running into an issue on a 6 node ring running 1.0.11 where whenever a somewhat large set of hints build up (seen as low as 400MB), compaction on the hints CF hangs indefinitely. Nothing of note in the logs. In some cases, the compaction hangs before a tmp sstable is even created.

I've wiped out every hints sstable I have and restarted several times. The issue always comes back rather quickly and predictably after wiping the sstables. Compaction always seems to succeed if the hints CFs are rather small.

Hints are enabled, and my hint window is the default of 1hr. I do have some copies of HintsColumnFamily sstables that do replicate this issue. However, the hints may contain confidential data. If they'd be helpful in troubleshooting this issue, let me know and I can see about sending them directly.

This ring was upgraded from 1.0.7. I didn't keep any hints from the upgrade.

Here is the output I see from compactionstats where a compaction has hung. The 'bytes compacted' column never changes.

{code}
pending tasks: 1
          compaction type        keyspace       column family bytes compacted     bytes total  progress
               Compaction          system   HintsColumnFamily          268082       464784758     0.06%
{code}
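
As an illustration of the symptom described above (a 'bytes compacted' value that never changes between samples), here is a minimal Java sketch that compares two rows captured from compactionstats output. The class and method names are hypothetical, and the parsing assumes the row layout shown above, reading the numeric columns from the right so that it does not matter whether the keyspace and column family names run together.

```java
import java.util.OptionalLong;

// Hypothetical helper, not part of Cassandra or nodetool.
class CompactionStallCheck {

    /**
     * Extracts the 'bytes compacted' value from one data row of
     * compactionstats output. Columns are counted from the right
     * (progress, bytes total, bytes compacted), so a fused
     * keyspace/column family field does not affect the result.
     */
    static OptionalLong bytesCompacted(String row) {
        String[] tokens = row.trim().split("\\s+");
        if (tokens.length < 3) {
            return OptionalLong.empty();
        }
        try {
            return OptionalLong.of(Long.parseLong(tokens[tokens.length - 3]));
        } catch (NumberFormatException e) {
            // Header lines and other non-data rows end up here.
            return OptionalLong.empty();
        }
    }

    /** True when both rows parse and report the same 'bytes compacted'. */
    static boolean looksStalled(String earlierRow, String laterRow) {
        OptionalLong earlier = bytesCompacted(earlierRow);
        OptionalLong later = bytesCompacted(laterRow);
        return earlier.isPresent() && later.isPresent()
                && earlier.getAsLong() == later.getAsLong();
    }

    public static void main(String[] args) {
        String row = "               Compaction          systemHintsColumnFamily"
                + "          268082       464784758     0.06%";
        // Two identical samples taken some time apart suggest a hang.
        System.out.println(looksStalled(row, row));
    }
}
```

How far apart the two samples need to be is a judgment call; a single unchanged pair is only a hint that the compaction is stuck, not proof.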

  was:
Running into an issue on a 6 nodes 1.0.11 ring where whenever a somewhat large set of hints build up (seen as low as 400MB), compaction on the hints CF hangs indefinitely. Nothing of note in the logs. In some cases, the compaction hangs before a tmp sstable is even created.

I've wiped out every hints sstable I have and restarted several times. The issue always comes back rather quickly and predictably after wiping the sstables. Compaction always seems to succeed if the hints CFs are rather small.

Hints are enabled, and my hint window is the default of 1hr. I do have some copies of HintsColumnFamily sstables that do replicate this issue. However, the hints may contain confidential data. If they'd be helpful in troubleshooting this issue, let me know and I can see about sending them directly.

This ring was upgraded from 1.0.7. I didn't keep any hints from the upgrade.

Here is the output I see from compactionstats where a compaction has hung. The 'bytes compacted' column never changes.

{code}
pending tasks: 1
          compaction type        keyspace       column family bytes compacted     bytes total  progress
               Compaction          system   HintsColumnFamily          268082       464784758     0.06%
{code}

    

[jira] [Commented] (CASSANDRA-4492) HintsColumnFamily compactions hang when using multithreaded compaction

Posted by "Brandon Williams (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428854#comment-13428854 ] 

Brandon Williams commented on CASSANDRA-4492:
---------------------------------------------

MT compaction is unfortunately a) not widely used and b) known to be prone to issues. The best course of action right now is to just not use it. :(
                
> HintsColumnFamily compactions hang when using multithreaded compaction
> ----------------------------------------------------------------------
>
>                 Key: CASSANDRA-4492
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4492
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 1.0.11
>            Reporter: Jason Harvey
>            Priority: Minor
>         Attachments: jstack.txt
>
>
> Running into an issue on a 6 node ring running 1.0.11 where HintsColumnFamily compactions often hang indefinitely when using multithreaded compaction. Nothing of note in the logs. In some cases, the compaction hangs before a tmp sstable is even created.
> I've wiped out every hints sstable and restarted several times. The issue always comes back rather quickly and predictably. The compactions sometimes complete if the hint sstables are very small. Disabling multithreaded compaction stops this issue from occurring.
> Compactions of all other CFs seem to work just fine.
> This ring was upgraded from 1.0.7. I didn't keep any hints from the upgrade.
> I should note that the ring gets a huge amount of writes, and as a result the HintedHandoff rows can get quite wide. I didn't see any large-row compaction notices when the compaction was hanging (perhaps the bug was triggered by incremental compaction?). After disabling multithreaded compaction, several of the rows that were successfully compacted were over 1GB.
> Here is the output I see from compactionstats where a compaction has hung. The 'bytes compacted' column never changes.
> {code}
> pending tasks: 1
>           compaction type        keyspace       column family bytes compacted     bytes total  progress
>                Compaction          system   HintsColumnFamily          268082       464784758     0.06%
> {code}
> The hung thread stack is as follows: (full jstack attached, as well)
> {code}
> "CompactionExecutor:37" daemon prio=10 tid=0x00000000063df800 nid=0x49d9 waiting on condition [0x00007eb8c6ffa000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x000000050f2e0e58> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
>         at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
>         at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer.computeNext(ParallelCompactionIterable.java:329)
>         at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer.computeNext(ParallelCompactionIterable.java:281)
>         at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
>         at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
>         at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:147)
>         at org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:126)
>         at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:100)
>         at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
>         at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
>         at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext(ParallelCompactionIterable.java:101)
>         at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext(ParallelCompactionIterable.java:88)
>         at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
>         at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
>         at com.google.common.collect.Iterators$7.computeNext(Iterators.java:614)
>         at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
>         at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
>         at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:141)
>         at org.apache.cassandra.db.compaction.CompactionManager$7.call(CompactionManager.java:395)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:662)
> {code}


[jira] [Updated] (CASSANDRA-4492) HintsColumnFamily compactions hang when using multithreaded compaction

Posted by "Jason Harvey (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Harvey updated CASSANDRA-4492:
------------------------------------

    Description: 
Running into an issue on a 6 node ring running 1.0.11 where HintsColumnFamily compactions often hang indefinitely when using multithreaded compaction. Nothing of note in the logs. In some cases, the compaction hangs before a tmp sstable is even created.

I've wiped out every hints sstable I have and restarted several times. The issue always comes back rather quickly and predictably after wiping the sstables. The compactions sometimes complete if the hint sstables are very small. Disabling multithreaded compaction stops this issue from occurring.

Compactions of all other CFs seem to work just fine.

This ring was upgraded from 1.0.7. I didn't keep any hints from the upgrade.

Here is the output I see from compactionstats where a compaction has hung. The 'bytes compacted' column never changes.

{code}
pending tasks: 1
          compaction type        keyspace       column family bytes compacted     bytes total  progress
               Compaction          system   HintsColumnFamily          268082       464784758     0.06%
{code}


The hung thread stack is as follows: (full jstack attached, as well)

{code}
"CompactionExecutor:37" daemon prio=10 tid=0x00000000063df800 nid=0x49d9 waiting on condition [0x00007eb8c6ffa000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x000000050f2e0e58> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
        at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
        at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer.computeNext(ParallelCompactionIterable.java:329)
        at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer.computeNext(ParallelCompactionIterable.java:281)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:147)
        at org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:126)
        at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:100)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext(ParallelCompactionIterable.java:101)
        at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext(ParallelCompactionIterable.java:88)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at com.google.common.collect.Iterators$7.computeNext(Iterators.java:614)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:141)
        at org.apache.cassandra.db.compaction.CompactionManager$7.call(CompactionManager.java:395)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
{code}
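
The top frames of the stack above show the compaction thread parked inside LinkedBlockingQueue.take(), called from ParallelCompactionIterable$Deserializer.computeNext. The following is a minimal Java sketch of that symptom only. It is not Cassandra's code and does not claim to show the root cause; the class and method names are hypothetical. It shows that a consumer calling take() on a queue nobody feeds stays in the WAITING state indefinitely, while a timed poll() gives up and returns null.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not taken from Cassandra.
class BlockedTakeDemo {

    /**
     * Starts a consumer that calls take() on an empty queue that no
     * producer ever fills, and returns the thread state observed once
     * the consumer has parked (or after a 5 second observation limit).
     */
    static Thread.State stateOfStarvedConsumer() {
        LinkedBlockingQueue<String> queue = new LinkedBlockingQueue<>();
        Thread consumer = new Thread(() -> {
            try {
                queue.take(); // parks until an element arrives
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "demo-consumer");
        consumer.setDaemon(true);
        consumer.start();
        try {
            long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
            while (consumer.getState() != Thread.State.WAITING
                    && System.nanoTime() < deadline) {
                Thread.sleep(10);
            }
            Thread.State observed = consumer.getState();
            // Release the consumer so the demo itself does not hang.
            consumer.interrupt();
            consumer.join(1000);
            return observed;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("demo interrupted", e);
        }
    }

    /** A bounded wait: poll() returns null when nothing arrives in time. */
    static boolean timedPollGivesUp() {
        LinkedBlockingQueue<String> queue = new LinkedBlockingQueue<>();
        try {
            return queue.poll(100, TimeUnit.MILLISECONDS) == null;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("demo interrupted", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("take() consumer state: " + stateOfStarvedConsumer());
        System.out.println("timed poll gave up: " + timedPollGivesUp());
    }
}
```

A thread in this state logs nothing, which would be consistent with "Nothing of note in the logs", but the sketch does not establish why the queue was never fed in the reported case.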

  was:
Running into an issue on a 6 node ring running 1.0.11 where HintsColumnFamily compactions often hang indefinitely when using multithreaded compaction. Nothing of note in the logs. In some cases, the compaction hangs before a tmp sstable is even created.

I've wiped out every hints sstable I have and restarted several times. The issue always comes back rather quickly and predictably after wiping the sstables. The compactions sometimes complete if the hint sstables are very small. Disabling multithreaded compaction stops this issue from occurring.

This ring was upgraded from 1.0.7. I didn't keep any hints from the upgrade.

Here is the output I see from compactionstats where a compaction has hung. The 'bytes compacted' column never changes.

{code}
pending tasks: 1
          compaction type        keyspace       column family bytes compacted     bytes total  progress
               Compaction          system   HintsColumnFamily          268082       464784758     0.06%
{code}


The hung thread stack is as follows: (full jstack attached, as well)

{code}
"CompactionExecutor:37" daemon prio=10 tid=0x00000000063df800 nid=0x49d9 waiting on condition [0x00007eb8c6ffa000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x000000050f2e0e58> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
        at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
        at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer.computeNext(ParallelCompactionIterable.java:329)
        at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer.computeNext(ParallelCompactionIterable.java:281)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:147)
        at org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:126)
        at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:100)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext(ParallelCompactionIterable.java:101)
        at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext(ParallelCompactionIterable.java:88)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at com.google.common.collect.Iterators$7.computeNext(Iterators.java:614)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:141)
        at org.apache.cassandra.db.compaction.CompactionManager$7.call(CompactionManager.java:395)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
{code}

    

[jira] [Updated] (CASSANDRA-4492) HintsColumnFamily compactions hang when using multithreaded compaction

Posted by "Jason Harvey (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Harvey updated CASSANDRA-4492:
------------------------------------

    Summary: HintsColumnFamily compactions hang when using multithreaded compaction  (was: HintsColumnFamily compactions hang when using MultiThreadedCompaction)
    

[jira] [Commented] (CASSANDRA-4492) HintsColumnFamily compactions hang when using multithreaded compaction

Posted by "T Jake Luciani (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13477221#comment-13477221 ] 

T Jake Luciani commented on CASSANDRA-4492:
-------------------------------------------

I hit this recently as well.

In my case, though, I was able to reproduce it: if you call truncate while a compaction is in progress, both the truncate and the parallel compaction iterator hang.


                
> HintsColumnFamily compactions hang when using multithreaded compaction
> ----------------------------------------------------------------------
>
>                 Key: CASSANDRA-4492
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4492
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 1.0.11
>            Reporter: Jason Harvey
>            Priority: Minor
>         Attachments: jstack.txt
>
>
> Running into an issue on a 6 node ring running 1.0.11 where HintsColumnFamily compactions often hang indefinitely when using multithreaded compaction. Nothing of note in the logs. In some cases, the compaction hangs before a tmp sstable is even created.
> I've wiped out every hints sstable and restarted several times. The issue always comes back rather quickly and predictably. The compactions sometimes complete if the hint sstables are very small. Disabling multithreaded compaction stops this issue from occurring.
> Compactions of all other CFs seem to work just fine.
> This ring was upgraded from 1.0.7. I didn't keep any hints from the upgrade.
> I should note that the ring gets a huge amount of writes, and as a result the HintedHandoff rows can get quite wide. I didn't see any large-row compaction notices when the compaction was hanging (perhaps the bug was triggered by incremental compaction?). After disabling multithreaded compaction, several of the rows that were successfully compacted were over 1GB.
> Here is the output I see from compactionstats where a compaction has hung. The 'bytes compacted' column never changes.
> {code}
> pending tasks: 1
>           compaction type        keyspace   column family bytes compacted     bytes total  progress
>                Compaction          systemHintsColumnFamily          268082       464784758     0.06%
> {code}
> The hung thread stack is as follows: (full jstack attached, as well)
> {code}
> "CompactionExecutor:37" daemon prio=10 tid=0x00000000063df800 nid=0x49d9 waiting on condition [0x00007eb8c6ffa000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x000000050f2e0e58> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
>         at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
>         at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer.computeNext(ParallelCompactionIterable.java:329)
>         at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer.computeNext(ParallelCompactionIterable.java:281)
>         at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
>         at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
>         at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:147)
>         at org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:126)
>         at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:100)
>         at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
>         at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
>         at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext(ParallelCompactionIterable.java:101)
>         at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext(ParallelCompactionIterable.java:88)
>         at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
>         at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
>         at com.google.common.collect.Iterators$7.computeNext(Iterators.java:614)
>         at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
>         at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
>         at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:141)
>         at org.apache.cassandra.db.compaction.CompactionManager$7.call(CompactionManager.java:395)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:662)
> {code}


[jira] [Updated] (CASSANDRA-4492) HintsColumnFamily compactions hang when using MultiThreadedCompaction

Posted by "Jason Harvey (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Harvey updated CASSANDRA-4492:
------------------------------------

    Description: 
Running into an issue on a 6 node ring running 1.0.11 where HintsColumnFamily compactions often hang indefinitely. Nothing of note in the logs. In some cases, the compaction hangs before a tmp sstable is even created.

I've wiped out every hints sstable I have and restarted several times. The issue always comes back rather quickly and predictably after wiping the sstables. Disabling multithreaded compaction stops this issue from occurring.

This ring was upgraded from 1.0.7. I didn't keep any hints from the upgrade.

Here is the output I see from compactionstats where a compaction has hung. The 'bytes compacted' column never changes.

{code}
pending tasks: 1
          compaction type        keyspace   column family bytes compacted     bytes total  progress
               Compaction          systemHintsColumnFamily          268082       464784758     0.06%
{code}


The hung thread stack is as follows: (full jstack attached, as well)

{code}
"CompactionExecutor:37" daemon prio=10 tid=0x00000000063df800 nid=0x49d9 waiting on condition [0x00007eb8c6ffa000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x000000050f2e0e58> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
        at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
        at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer.computeNext(ParallelCompactionIterable.java:329)
        at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer.computeNext(ParallelCompactionIterable.java:281)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:147)
        at org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:126)
        at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:100)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext(ParallelCompactionIterable.java:101)
        at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext(ParallelCompactionIterable.java:88)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at com.google.common.collect.Iterators$7.computeNext(Iterators.java:614)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:141)
        at org.apache.cassandra.db.compaction.CompactionManager$7.call(CompactionManager.java:395)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
{code}

  was:
Running into an issue on a 6 node ring running 1.0.11 where whenever a somewhat large set of hints build up (seen as low as 400MB), compaction on the hints CF hangs indefinitely. Nothing of note in the logs. In some cases, the compaction hangs before a tmp sstable is even created.

I've wiped out every hints sstable I have and restarted several times. The issue always comes back rather quickly and predictably after wiping the sstables. Compaction always seems to succeed if the hints CFs are rather small.

Hints are enabled, and my hint window is the default of 1hr. I do have some copies of HintsColumnFamily sstables that do replicate this issue. However, the hints may contain confidential data. If they'd be helpful in troubleshooting this issue, let me know and I can see about sending them directly.

This ring was upgraded from 1.0.7. I didn't keep any hints from the upgrade.

Here is the output I see from compactionstats where a compaction has hung. The 'bytes compacted' column never changes.

{code}
pending tasks: 1
          compaction type        keyspace   column family bytes compacted     bytes total  progress
               Compaction          systemHintsColumnFamily          268082       464784758     0.06%
{code}


The hung thread stack is as follows: (full jstack attached, as well)

{code}
"CompactionExecutor:37" daemon prio=10 tid=0x00000000063df800 nid=0x49d9 waiting on condition [0x00007eb8c6ffa000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x000000050f2e0e58> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
        at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
        at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer.computeNext(ParallelCompactionIterable.java:329)
        at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer.computeNext(ParallelCompactionIterable.java:281)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:147)
        at org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:126)
        at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:100)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext(ParallelCompactionIterable.java:101)
        at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext(ParallelCompactionIterable.java:88)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at com.google.common.collect.Iterators$7.computeNext(Iterators.java:614)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:141)
        at org.apache.cassandra.db.compaction.CompactionManager$7.call(CompactionManager.java:395)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
{code}

        Summary: HintsColumnFamily compactions hang when using MultiThreadedCompaction  (was: Large HintsColumnFamily compactions hang)
    
> HintsColumnFamily compactions hang when using MultiThreadedCompaction
> ---------------------------------------------------------------------
>
>                 Key: CASSANDRA-4492
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4492
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 1.0.11
>            Reporter: Jason Harvey
>            Priority: Minor
>         Attachments: jstack.txt
>
>
> Running into an issue on a 6 node ring running 1.0.11 where HintsColumnFamily compactions often hang indefinitely. Nothing of note in the logs. In some cases, the compaction hangs before a tmp sstable is even created.
> I've wiped out every hints sstable I have and restarted several times. The issue always comes back rather quickly and predictably after wiping the sstables. Disabling multithreaded compaction stops this issue from occurring.
> This ring was upgraded from 1.0.7. I didn't keep any hints from the upgrade.
> Here is the output I see from compactionstats where a compaction has hung. The 'bytes compacted' column never changes.
> {code}
> pending tasks: 1
>           compaction type        keyspace   column family bytes compacted     bytes total  progress
>                Compaction          systemHintsColumnFamily          268082       464784758     0.06%
> {code}
> The hung thread stack is as follows: (full jstack attached, as well)
> {code}
> "CompactionExecutor:37" daemon prio=10 tid=0x00000000063df800 nid=0x49d9 waiting on condition [0x00007eb8c6ffa000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x000000050f2e0e58> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
>         at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
>         at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer.computeNext(ParallelCompactionIterable.java:329)
>         at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer.computeNext(ParallelCompactionIterable.java:281)
>         at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
>         at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
>         at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:147)
>         at org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:126)
>         at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:100)
>         at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
>         at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
>         at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext(ParallelCompactionIterable.java:101)
>         at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext(ParallelCompactionIterable.java:88)
>         at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
>         at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
>         at com.google.common.collect.Iterators$7.computeNext(Iterators.java:614)
>         at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
>         at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
>         at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:141)
>         at org.apache.cassandra.db.compaction.CompactionManager$7.call(CompactionManager.java:395)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:662)
> {code}


[jira] [Updated] (CASSANDRA-4492) HintsColumnFamily compactions hang when using multithreaded compaction

Posted by "Jason Harvey (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Harvey updated CASSANDRA-4492:
------------------------------------

    Description: 
Running into an issue on a 6 node ring running 1.0.11 where HintsColumnFamily compactions often hang indefinitely when using multithreaded compaction. Nothing of note in the logs. In some cases, the compaction hangs before a tmp sstable is even created.

I've wiped out every hints sstable and restarted several times. The issue always comes back rather quickly and predictably. The compactions sometimes complete if the hint sstables are very small. Disabling multithreaded compaction stops this issue from occurring.

Compactions of all other CFs seem to work just fine.

This ring was upgraded from 1.0.7. I didn't keep any hints from the upgrade.

Here is the output I see from compactionstats where a compaction has hung. The 'bytes compacted' column never changes.

{code}
pending tasks: 1
          compaction type        keyspace   column family bytes compacted     bytes total  progress
               Compaction          systemHintsColumnFamily          268082       464784758     0.06%
{code}


The hung thread stack is as follows: (full jstack attached, as well)

{code}
"CompactionExecutor:37" daemon prio=10 tid=0x00000000063df800 nid=0x49d9 waiting on condition [0x00007eb8c6ffa000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x000000050f2e0e58> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
        at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
        at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer.computeNext(ParallelCompactionIterable.java:329)
        at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer.computeNext(ParallelCompactionIterable.java:281)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:147)
        at org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:126)
        at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:100)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext(ParallelCompactionIterable.java:101)
        at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext(ParallelCompactionIterable.java:88)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at com.google.common.collect.Iterators$7.computeNext(Iterators.java:614)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:141)
        at org.apache.cassandra.db.compaction.CompactionManager$7.call(CompactionManager.java:395)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
{code}

  was:
Running into an issue on a 6 node ring running 1.0.11 where HintsColumnFamily compactions often hang indefinitely when using multithreaded compaction. Nothing of note in the logs. In some cases, the compaction hangs before a tmp sstable is even created.

I've wiped out every hints sstable I and restarted several times. The issue always comes back rather quickly and predictably. The compactions sometimes complete if the hint sstables are very small. Disabling multithreaded compaction stops this issue from occurring.

Compactions of all other CFs seem to work just fine.

This ring was upgraded from 1.0.7. I didn't keep any hints from the upgrade.

Here is the output I see from compactionstats where a compaction has hung. The 'bytes compacted' column never changes.

{code}
pending tasks: 1
          compaction type        keyspace   column family bytes compacted     bytes total  progress
               Compaction          systemHintsColumnFamily          268082       464784758     0.06%
{code}


The hung thread stack is as follows: (full jstack attached, as well)

{code}
"CompactionExecutor:37" daemon prio=10 tid=0x00000000063df800 nid=0x49d9 waiting on condition [0x00007eb8c6ffa000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x000000050f2e0e58> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
        at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
        at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer.computeNext(ParallelCompactionIterable.java:329)
        at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer.computeNext(ParallelCompactionIterable.java:281)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:147)
        at org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:126)
        at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:100)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext(ParallelCompactionIterable.java:101)
        at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext(ParallelCompactionIterable.java:88)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at com.google.common.collect.Iterators$7.computeNext(Iterators.java:614)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:141)
        at org.apache.cassandra.db.compaction.CompactionManager$7.call(CompactionManager.java:395)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
{code}

    
> HintsColumnFamily compactions hang when using multithreaded compaction
> ----------------------------------------------------------------------
>
>                 Key: CASSANDRA-4492
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4492
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 1.0.11
>            Reporter: Jason Harvey
>            Priority: Minor
>         Attachments: jstack.txt
>
>
> Running into an issue on a 6 node ring running 1.0.11 where HintsColumnFamily compactions often hang indefinitely when using multithreaded compaction. Nothing of note in the logs. In some cases, the compaction hangs before a tmp sstable is even created.
> I've wiped out every hints sstable and restarted several times. The issue always comes back rather quickly and predictably. The compactions sometimes complete if the hint sstables are very small. Disabling multithreaded compaction stops this issue from occurring.
> Compactions of all other CFs seem to work just fine.
> This ring was upgraded from 1.0.7. I didn't keep any hints from the upgrade.
> Here is the output I see from compactionstats where a compaction has hung. The 'bytes compacted' column never changes.
> {code}
> pending tasks: 1
>           compaction type        keyspace   column family bytes compacted     bytes total  progress
>                Compaction          systemHintsColumnFamily          268082       464784758     0.06%
> {code}
> The hung thread stack is as follows: (full jstack attached, as well)
> {code}
> "CompactionExecutor:37" daemon prio=10 tid=0x00000000063df800 nid=0x49d9 waiting on condition [0x00007eb8c6ffa000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x000000050f2e0e58> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
>         at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
>         at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer.computeNext(ParallelCompactionIterable.java:329)
>         at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer.computeNext(ParallelCompactionIterable.java:281)
>         at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
>         at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
>         at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:147)
>         at org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:126)
>         at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:100)
>         at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
>         at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
>         at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext(ParallelCompactionIterable.java:101)
>         at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext(ParallelCompactionIterable.java:88)
>         at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
>         at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
>         at com.google.common.collect.Iterators$7.computeNext(Iterators.java:614)
>         at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
>         at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
>         at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:141)
>         at org.apache.cassandra.db.compaction.CompactionManager$7.call(CompactionManager.java:395)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:662)
> {code}


[jira] [Updated] (CASSANDRA-4492) Large HintsColumnFamily compactions hang

Posted by "Jason Harvey (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Harvey updated CASSANDRA-4492:
------------------------------------

    Attachment:     (was: jstack)
    
> Large HintsColumnFamily compactions hang
> ----------------------------------------
>
>                 Key: CASSANDRA-4492
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4492
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 1.0.11
>            Reporter: Jason Harvey
>            Priority: Minor
>
> Running into an issue on a 6 node ring running 1.0.11 where whenever a somewhat large set of hints build up (seen as low as 400MB), compaction on the hints CF hangs indefinitely. Nothing of note in the logs. In some cases, the compaction hangs before a tmp sstable is even created.
> I've wiped out every hints sstable I have and restarted several times. The issue always comes back rather quickly and predictably after wiping the sstables. Compaction always seems to succeed if the hints CFs are rather small.
> Hints are enabled, and my hint window is the default of 1hr. I do have some copies of HintsColumnFamily sstables that do replicate this issue. However, the hints may contain confidential data. If they'd be helpful in troubleshooting this issue, let me know and I can see about sending them directly.
> This ring was upgraded from 1.0.7. I didn't keep any hints from the upgrade.
> Here is the output I see from compactionstats where a compaction has hung. The 'bytes compacted' column never changes.
> {code}
> pending tasks: 1
>           compaction type        keyspace   column family bytes compacted     bytes total  progress
>                Compaction          system HintsColumnFamily          268082       464784758     0.06%
> {code}
> The hung thread stack is as follows: (full jstack attached, as well)
> {code}
> "CompactionExecutor:37" daemon prio=10 tid=0x00000000063df800 nid=0x49d9 waiting on condition [0x00007eb8c6ffa000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x000000050f2e0e58> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
>         at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
>         at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer.computeNext(ParallelCompactionIterable.java:329)
>         at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer.computeNext(ParallelCompactionIterable.java:281)
>         at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
>         at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
>         at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:147)
>         at org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:126)
>         at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:100)
>         at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
>         at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
>         at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext(ParallelCompactionIterable.java:101)
>         at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext(ParallelCompactionIterable.java:88)
>         at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
>         at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
>         at com.google.common.collect.Iterators$7.computeNext(Iterators.java:614)
>         at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
>         at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
>         at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:141)
>         at org.apache.cassandra.db.compaction.CompactionManager$7.call(CompactionManager.java:395)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:662)
> {code}
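
For readers less familiar with the trace above: the thread is parked in LinkedBlockingQueue.take(), which blocks until another thread puts an element on the queue. The sketch below is not Cassandra code. It is a minimal standalone Java illustration (the class and method names are hypothetical) of how a consumer calling take() waits indefinitely when nothing is ever enqueued, and how a timed poll() returns instead. Whether a missing or stalled producer is the actual cause of this hang is an assumption, not something the report confirms.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; not part of Cassandra.
public class BlockedTakeSketch {

    /**
     * Starts a consumer that calls take() on a queue nobody writes to and
     * reports whether it is still waiting after waitMillis (use a value > 0).
     */
    public static boolean consumerStillParked(long waitMillis) throws InterruptedException {
        final LinkedBlockingQueue<String> queue = new LinkedBlockingQueue<>();
        Thread consumer = new Thread(() -> {
            try {
                queue.take(); // parks until an element arrives
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // restore the flag and exit
            }
        }, "consumer-sketch");
        consumer.setDaemon(true);
        consumer.start();
        consumer.join(waitMillis);            // give it a bounded time to finish
        boolean parked = consumer.isAlive();  // still alive means still blocked in take()
        consumer.interrupt();                 // release the consumer so the demo can end
        return parked;
    }

    /** Bounded wait: poll() returns null on timeout instead of parking forever. */
    public static String pollWithTimeout(long waitMillis) throws InterruptedException {
        LinkedBlockingQueue<String> queue = new LinkedBlockingQueue<>();
        return queue.poll(waitMillis, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("consumer still parked: " + consumerStillParked(200));
        System.out.println("timed poll result: " + pollWithTimeout(200));
    }
}
```

A thread blocked this way shows up in jstack as WAITING (parking), as in the trace above, which is why the full thread dump is useful for seeing what the other compaction threads were doing at the time.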


[jira] [Updated] (CASSANDRA-4492) Large HintsColumnFamily compactions hang

Posted by "Jason Harvey (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Harvey updated CASSANDRA-4492:
------------------------------------

    Description: 
Running into an issue on a 6-node ring running 1.0.11 where, whenever a somewhat large set of hints builds up (seen as low as 400MB), compaction on the hints CF hangs indefinitely. Nothing of note in the logs. In some cases, the compaction hangs before a tmp sstable is even created.

I've wiped out every hints sstable I have and restarted several times. The issue always comes back rather quickly and predictably after wiping the sstables. Compaction always seems to succeed if the hints CFs are rather small.

Hints are enabled, and my hint window is the default of 1hr. I do have some copies of HintsColumnFamily sstables that do replicate this issue. However, the hints may contain confidential data. If they'd be helpful in troubleshooting this issue, let me know and I can see about sending them directly.

This ring was upgraded from 1.0.7. I didn't keep any hints from the upgrade.

Here is the output I see from compactionstats where a compaction has hung. The 'bytes compacted' column never changes.

{code}
pending tasks: 1
          compaction type        keyspace   column family bytes compacted     bytes total  progress
               Compaction          system HintsColumnFamily          268082       464784758     0.06%
{code}

  was:
Running into an issue on a 6-node ring running 1.0.11 where, whenever a somewhat large set of hints builds up (seen as low as 400MB), compaction on the hints CF hangs indefinitely. Nothing of note in the logs. In some cases, the compaction hangs before a tmp sstable is even created.

I've wiped out every hints sstable I have and restarted several times. The issue always comes back rather quickly and predictably after wiping the sstables. Compaction always seems to succeed if the hints CFs are rather small.

Hints are enabled, and my hint window is the default of 1hr. I do have some copies of HintsColumnFamily sstables that do replicate this issue. However, the hints may contain confidential data. If they'd be helpful in troubleshooting this issue, let me know and I can see about sending them directly.

This ring was upgraded from 1.0.7. I didn't keep any hints from the upgrade.

    
> Large HintsColumnFamily compactions hang
> ----------------------------------------
>
>                 Key: CASSANDRA-4492
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4492
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 1.0.11
>            Reporter: Jason Harvey
>            Priority: Minor
>
> Running into an issue on a 6-node ring running 1.0.11 where, whenever a somewhat large set of hints builds up (seen as low as 400MB), compaction on the hints CF hangs indefinitely. Nothing of note in the logs. In some cases, the compaction hangs before a tmp sstable is even created.
> I've wiped out every hints sstable I have and restarted several times. The issue always comes back rather quickly and predictably after wiping the sstables. Compaction always seems to succeed if the hints CFs are rather small.
> Hints are enabled, and my hint window is the default of 1hr. I do have some copies of HintsColumnFamily sstables that do replicate this issue. However, the hints may contain confidential data. If they'd be helpful in troubleshooting this issue, let me know and I can see about sending them directly.
> This ring was upgraded from 1.0.7. I didn't keep any hints from the upgrade.
> Here is the output I see from compactionstats where a compaction has hung. The 'bytes compacted' column never changes.
> {code}
> pending tasks: 1
>           compaction type        keyspace   column family bytes compacted     bytes total  progress
>                Compaction          system HintsColumnFamily          268082       464784758     0.06%
> {code}
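
As a rough check on the compactionstats output above, the progress column is just bytes compacted divided by bytes total, and a hang shows up as the 'bytes compacted' value not moving between two samples. The following Java sketch is illustrative only: the class and method names are hypothetical, and it assumes the three numeric columns are the last three whitespace-separated fields of the row, as in the sample shown.

```java
import java.util.Locale;

// Hypothetical helper; not part of Cassandra or nodetool.
public class CompactionStallSketch {

    /** Returns {bytesCompacted, bytesTotal} read from the end of a compactionstats row. */
    public static long[] parseRow(String row) {
        String[] cols = row.trim().split("\\s+");
        // The last field is the percentage; the two before it are the byte counters.
        long compacted = Long.parseLong(cols[cols.length - 3]);
        long total = Long.parseLong(cols[cols.length - 2]);
        return new long[] { compacted, total };
    }

    /** Formats progress as a percentage with two decimals, e.g. "0.06%". */
    public static String progress(long compacted, long total) {
        return String.format(Locale.ROOT, "%.2f%%", 100.0 * compacted / total);
    }

    /** True when 'bytes compacted' is identical in two samples taken some time apart. */
    public static boolean looksStalled(String earlierRow, String laterRow) {
        return parseRow(earlierRow)[0] == parseRow(laterRow)[0];
    }

    public static void main(String[] args) {
        String row = "Compaction  systemHintsColumnFamily  268082  464784758  0.06%";
        long[] counters = parseRow(row);
        System.out.println(progress(counters[0], counters[1]));
        System.out.println("stalled: " + looksStalled(row, row));
    }
}
```

With the figures from the report, 268082 / 464784758 is about 0.0577%, which matches the 0.06% shown. An unchanged counter is only a hint of a hang; how long to wait between samples is a judgment call.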


[jira] [Updated] (CASSANDRA-4492) Large HintsColumnFamily compactions hang

Posted by "Jason Harvey (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Harvey updated CASSANDRA-4492:
------------------------------------

    Attachment: jstack
    
> Large HintsColumnFamily compactions hang
> ----------------------------------------
>
>                 Key: CASSANDRA-4492
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4492
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 1.0.11
>            Reporter: Jason Harvey
>            Priority: Minor
>
> Running into an issue on a 6-node ring running 1.0.11 where, whenever a somewhat large set of hints builds up (seen as low as 400MB), compaction on the hints CF hangs indefinitely. Nothing of note in the logs. In some cases, the compaction hangs before a tmp sstable is even created.
> I've wiped out every hints sstable I have and restarted several times. The issue always comes back rather quickly and predictably after wiping the sstables. Compaction always seems to succeed if the hints CFs are rather small.
> Hints are enabled, and my hint window is the default of 1hr. I do have some copies of HintsColumnFamily sstables that do replicate this issue. However, the hints may contain confidential data. If they'd be helpful in troubleshooting this issue, let me know and I can see about sending them directly.
> This ring was upgraded from 1.0.7. I didn't keep any hints from the upgrade.
> Here is the output I see from compactionstats where a compaction has hung. The 'bytes compacted' column never changes.
> {code}
> pending tasks: 1
>           compaction type        keyspace   column family bytes compacted     bytes total  progress
>                Compaction          system HintsColumnFamily          268082       464784758     0.06%
> {code}
> The hung thread stack is as follows: (full jstack attached, as well)
> {code}
> "CompactionExecutor:37" daemon prio=10 tid=0x00000000063df800 nid=0x49d9 waiting on condition [0x00007eb8c6ffa000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x000000050f2e0e58> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
>         at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
>         at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer.computeNext(ParallelCompactionIterable.java:329)
>         at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer.computeNext(ParallelCompactionIterable.java:281)
>         at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
>         at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
>         at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:147)
>         at org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:126)
>         at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:100)
>         at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
>         at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
>         at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext(ParallelCompactionIterable.java:101)
>         at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext(ParallelCompactionIterable.java:88)
>         at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
>         at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
>         at com.google.common.collect.Iterators$7.computeNext(Iterators.java:614)
>         at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
>         at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
>         at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:141)
>         at org.apache.cassandra.db.compaction.CompactionManager$7.call(CompactionManager.java:395)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:662)
> {code}


[jira] [Updated] (CASSANDRA-4492) HintsColumnFamily compactions hang when using multithreaded compaction

Posted by "Jason Harvey (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Harvey updated CASSANDRA-4492:
------------------------------------

    Description: 
Running into an issue on a 6 node ring running 1.0.11 where HintsColumnFamily compactions often hang indefinitely when using multithreaded compaction. Nothing of note in the logs. In some cases, the compaction hangs before a tmp sstable is even created.

I've wiped out every hints sstable I have and restarted several times. The issue always comes back rather quickly and predictably. The compactions sometimes complete if the hint sstables are very small. Disabling multithreaded compaction stops this issue from occurring.

Compactions of all other CFs seem to work just fine.

This ring was upgraded from 1.0.7. I didn't keep any hints from the upgrade.

Here is the output I see from compactionstats where a compaction has hung. The 'bytes compacted' column never changes.

{code}
pending tasks: 1
          compaction type        keyspace   column family bytes compacted     bytes total  progress
               Compaction          system HintsColumnFamily          268082       464784758     0.06%
{code}


The hung thread stack is as follows: (full jstack attached, as well)

{code}
"CompactionExecutor:37" daemon prio=10 tid=0x00000000063df800 nid=0x49d9 waiting on condition [0x00007eb8c6ffa000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x000000050f2e0e58> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
        at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
        at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer.computeNext(ParallelCompactionIterable.java:329)
        at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer.computeNext(ParallelCompactionIterable.java:281)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:147)
        at org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:126)
        at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:100)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext(ParallelCompactionIterable.java:101)
        at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext(ParallelCompactionIterable.java:88)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at com.google.common.collect.Iterators$7.computeNext(Iterators.java:614)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:141)
        at org.apache.cassandra.db.compaction.CompactionManager$7.call(CompactionManager.java:395)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
{code}

  was:
Running into an issue on a 6 node ring running 1.0.11 where HintsColumnFamily compactions often hang indefinitely when using multithreaded compaction. Nothing of note in the logs. In some cases, the compaction hangs before a tmp sstable is even created.

I've wiped out every hints sstable I have and restarted several times. The issue always comes back rather quickly and predictably after wiping the sstables. The compactions sometimes complete if the hint sstables are very small. Disabling multithreaded compaction stops this issue from occurring.

Compactions of all other CFs seem to work just fine.

This ring was upgraded from 1.0.7. I didn't keep any hints from the upgrade.

Here is the output I see from compactionstats where a compaction has hung. The 'bytes compacted' column never changes.

{code}
pending tasks: 1
          compaction type        keyspace   column family bytes compacted     bytes total  progress
               Compaction          system HintsColumnFamily          268082       464784758     0.06%
{code}


The hung thread stack is as follows: (full jstack attached, as well)

{code}
"CompactionExecutor:37" daemon prio=10 tid=0x00000000063df800 nid=0x49d9 waiting on condition [0x00007eb8c6ffa000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x000000050f2e0e58> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
        at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
        at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer.computeNext(ParallelCompactionIterable.java:329)
        at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer.computeNext(ParallelCompactionIterable.java:281)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:147)
        at org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:126)
        at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:100)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext(ParallelCompactionIterable.java:101)
        at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext(ParallelCompactionIterable.java:88)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at com.google.common.collect.Iterators$7.computeNext(Iterators.java:614)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:141)
        at org.apache.cassandra.db.compaction.CompactionManager$7.call(CompactionManager.java:395)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
{code}

    
> HintsColumnFamily compactions hang when using multithreaded compaction
> ----------------------------------------------------------------------
>
>                 Key: CASSANDRA-4492
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4492
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 1.0.11
>            Reporter: Jason Harvey
>            Priority: Minor
>         Attachments: jstack.txt
>
>
> Running into an issue on a 6 node ring running 1.0.11 where HintsColumnFamily compactions often hang indefinitely when using multithreaded compaction. Nothing of note in the logs. In some cases, the compaction hangs before a tmp sstable is even created.
> I've wiped out every hints sstable I have and restarted several times. The issue always comes back rather quickly and predictably. The compactions sometimes complete if the hint sstables are very small. Disabling multithreaded compaction stops this issue from occurring.
> Compactions of all other CFs seem to work just fine.
> This ring was upgraded from 1.0.7. I didn't keep any hints from the upgrade.
> Here is the output I see from compactionstats where a compaction has hung. The 'bytes compacted' column never changes.
> {code}
> pending tasks: 1
>           compaction type        keyspace   column family bytes compacted     bytes total  progress
>                Compaction          system HintsColumnFamily          268082       464784758     0.06%
> {code}
> The hung thread stack is as follows: (full jstack attached, as well)
> {code}
> "CompactionExecutor:37" daemon prio=10 tid=0x00000000063df800 nid=0x49d9 waiting on condition [0x00007eb8c6ffa000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x000000050f2e0e58> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
>         at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
>         at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer.computeNext(ParallelCompactionIterable.java:329)
>         at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer.computeNext(ParallelCompactionIterable.java:281)
>         at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
>         at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
>         at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:147)
>         at org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:126)
>         at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:100)
>         at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
>         at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
>         at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext(ParallelCompactionIterable.java:101)
>         at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext(ParallelCompactionIterable.java:88)
>         at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
>         at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
>         at com.google.common.collect.Iterators$7.computeNext(Iterators.java:614)
>         at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
>         at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
>         at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:141)
>         at org.apache.cassandra.db.compaction.CompactionManager$7.call(CompactionManager.java:395)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:662)
> {code}


[jira] [Updated] (CASSANDRA-4492) Large HintsColumnFamily compactions hang

Posted by "Jason Harvey (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Harvey updated CASSANDRA-4492:
------------------------------------

    Attachment: jstack.txt
    
> Large HintsColumnFamily compactions hang
> ----------------------------------------
>
>                 Key: CASSANDRA-4492
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4492
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 1.0.11
>            Reporter: Jason Harvey
>            Priority: Minor
>         Attachments: jstack.txt
>
>
> Running into an issue on a 6-node ring running 1.0.11 where, whenever a somewhat large set of hints builds up (seen as low as 400MB), compaction on the hints CF hangs indefinitely. Nothing of note in the logs. In some cases, the compaction hangs before a tmp sstable is even created.
> I've wiped out every hints sstable I have and restarted several times. The issue always comes back rather quickly and predictably after wiping the sstables. Compaction always seems to succeed if the hints CFs are rather small.
> Hints are enabled, and my hint window is the default of 1hr. I do have some copies of HintsColumnFamily sstables that do replicate this issue. However, the hints may contain confidential data. If they'd be helpful in troubleshooting this issue, let me know and I can see about sending them directly.
> This ring was upgraded from 1.0.7. I didn't keep any hints from the upgrade.
> Here is the output I see from compactionstats where a compaction has hung. The 'bytes compacted' column never changes.
> {code}
> pending tasks: 1
>           compaction type        keyspace   column family bytes compacted     bytes total  progress
>                Compaction          system HintsColumnFamily          268082       464784758     0.06%
> {code}
> The hung thread stack is as follows: (full jstack attached, as well)
> {code}
> "CompactionExecutor:37" daemon prio=10 tid=0x00000000063df800 nid=0x49d9 waiting on condition [0x00007eb8c6ffa000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x000000050f2e0e58> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
>         at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
>         at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer.computeNext(ParallelCompactionIterable.java:329)
>         at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer.computeNext(ParallelCompactionIterable.java:281)
>         at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
>         at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
>         at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:147)
>         at org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:126)
>         at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:100)
>         at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
>         at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
>         at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext(ParallelCompactionIterable.java:101)
>         at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext(ParallelCompactionIterable.java:88)
>         at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
>         at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
>         at com.google.common.collect.Iterators$7.computeNext(Iterators.java:614)
>         at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
>         at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
>         at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:141)
>         at org.apache.cassandra.db.compaction.CompactionManager$7.call(CompactionManager.java:395)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:662)
> {code}


[jira] [Updated] (CASSANDRA-4492) Large HintsColumnFamily compactions hang

Posted by "Jason Harvey (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Harvey updated CASSANDRA-4492:
------------------------------------

    Description: 
Running into an issue on a 6-node ring running 1.0.11 where, whenever a somewhat large set of hints builds up (seen as low as 400MB), compaction on the hints CF hangs indefinitely. Nothing of note in the logs. In some cases, the compaction hangs before a tmp sstable is even created.

I've wiped out every hints sstable I have and restarted several times. The issue always comes back rather quickly and predictably after wiping the sstables. Compaction always seems to succeed if the hints CFs are rather small.

Hints are enabled, and my hint window is the default of 1hr. I do have some copies of HintsColumnFamily sstables that do replicate this issue. However, the hints may contain confidential data. If they'd be helpful in troubleshooting this issue, let me know and I can see about sending them directly.

This ring was upgraded from 1.0.7. I didn't keep any hints from the upgrade.

Here is the output I see from compactionstats where a compaction has hung. The 'bytes compacted' column never changes.

{code}
pending tasks: 1
          compaction type        keyspace       column family bytes compacted     bytes total  progress
               Compaction          system   HintsColumnFamily          268082       464784758     0.06%
{code}
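
As a quick check on the figures above, the progress value follows from the two byte counts, and "never changes" can be tested by comparing successive samples of 'bytes compacted'. The Java sketch below is only an illustration under those assumptions: the class and method names (CompactionProgress, progressPercent, looksStalled) are hypothetical and not part of Cassandra or nodetool, and the samples are assumed to be collected by hand from repeated compactionstats runs.

```java
import java.util.Locale;

public class CompactionProgress {
    /** Percentage of the compaction completed, formatted to two decimals. */
    static String progressPercent(long bytesCompacted, long bytesTotal) {
        if (bytesTotal <= 0) {
            return "n/a"; // avoid dividing by zero when no total is reported
        }
        return String.format(Locale.ROOT, "%.2f%%", 100.0 * bytesCompacted / bytesTotal);
    }

    /** True when successive samples of 'bytes compacted' never increase. */
    static boolean looksStalled(long[] samples) {
        if (samples.length < 2) {
            return false; // a single sample cannot show a stall
        }
        for (int i = 1; i < samples.length; i++) {
            if (samples[i] > samples[i - 1]) {
                return false; // forward progress was observed
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Values taken from the compactionstats output in this report.
        System.out.println(progressPercent(268082L, 464784758L)); // prints 0.06%
        // Hypothetical: the same value seen in three consecutive samples.
        System.out.println(looksStalled(new long[] {268082L, 268082L, 268082L})); // prints true
    }
}
```

How long to wait between samples is a judgment call; a large row can legitimately keep the counter flat for a while.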


The hung thread stack is as follows: (full jstack attached, as well)

{code}
"CompactionExecutor:37" daemon prio=10 tid=0x00000000063df800 nid=0x49d9 waiting on condition [0x00007eb8c6ffa000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x000000050f2e0e58> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
        at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
        at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer.computeNext(ParallelCompactionIterable.java:329)
        at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer.computeNext(ParallelCompactionIterable.java:281)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:147)
        at org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:126)
        at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:100)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext(ParallelCompactionIterable.java:101)
        at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext(ParallelCompactionIterable.java:88)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at com.google.common.collect.Iterators$7.computeNext(Iterators.java:614)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:141)
        at org.apache.cassandra.db.compaction.CompactionManager$7.call(CompactionManager.java:395)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
{code}

  was:
Running into an issue on a 6-node ring running 1.0.11: whenever a somewhat large set of hints builds up (seen with as little as 400MB), compaction on the hints CF hangs indefinitely. Nothing of note in the logs. In some cases, the compaction hangs before a tmp sstable is even created.

I've wiped out every hints sstable I have and restarted several times. The issue always comes back rather quickly and predictably after wiping the sstables. Compaction always seems to succeed if the hints CFs are rather small.

Hints are enabled, and my hint window is the default of 1hr. I do have some copies of HintsColumnFamily sstables that do replicate this issue. However, the hints may contain confidential data. If they'd be helpful in troubleshooting this issue, let me know and I can see about sending them directly.

This ring was upgraded from 1.0.7. I didn't keep any hints from the upgrade.

Here is the output I see from compactionstats where a compaction has hung. The 'bytes compacted' column never changes.

{code}
pending tasks: 1
          compaction type        keyspace       column family bytes compacted     bytes total  progress
               Compaction          system   HintsColumnFamily          268082       464784758     0.06%
{code}

    

[jira] [Updated] (CASSANDRA-4492) HintsColumnFamily compactions hang when using MultiThreadedCompaction

Posted by "Jason Harvey (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Harvey updated CASSANDRA-4492:
------------------------------------

    Description: 
Running into an issue on a 6 node ring running 1.0.11 where HintsColumnFamily compactions often hang indefinitely when using multithreaded compaction. Nothing of note in the logs. In some cases, the compaction hangs before a tmp sstable is even created.

I've wiped out every hints sstable I have and restarted several times. The issue always comes back rather quickly and predictably after wiping the sstables. The compactions sometimes complete if the hint sstables are very small. Disabling multithreaded compaction stops this issue from occurring.

This ring was upgraded from 1.0.7. I didn't keep any hints from the upgrade.

Here is the output I see from compactionstats where a compaction has hung. The 'bytes compacted' column never changes.

{code}
pending tasks: 1
          compaction type        keyspace       column family bytes compacted     bytes total  progress
               Compaction          system   HintsColumnFamily          268082       464784758     0.06%
{code}


The hung thread stack is as follows: (full jstack attached, as well)

{code}
"CompactionExecutor:37" daemon prio=10 tid=0x00000000063df800 nid=0x49d9 waiting on condition [0x00007eb8c6ffa000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x000000050f2e0e58> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
        at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
        at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer.computeNext(ParallelCompactionIterable.java:329)
        at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer.computeNext(ParallelCompactionIterable.java:281)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:147)
        at org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:126)
        at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:100)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext(ParallelCompactionIterable.java:101)
        at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext(ParallelCompactionIterable.java:88)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at com.google.common.collect.Iterators$7.computeNext(Iterators.java:614)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:141)
        at org.apache.cassandra.db.compaction.CompactionManager$7.call(CompactionManager.java:395)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
{code}
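
The top frames of this trace show the compaction thread parked in LinkedBlockingQueue.take(), called from ParallelCompactionIterable$Deserializer.computeNext. A take() on an empty queue waits with no timeout, so if nothing is ever enqueued the caller stays in WAITING indefinitely, which would match the symptom. The Java sketch below reproduces only that general behaviour. It is not Cassandra code, it says nothing about why the queue stayed empty, and the names BlockedTakeDemo, stateOfConsumerOnEmptyQueue and pollWithTimeout are made up for illustration.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class BlockedTakeDemo {
    /** Starts a consumer that calls take() on a queue nobody fills and reports its thread state. */
    static Thread.State stateOfConsumerOnEmptyQueue() throws InterruptedException {
        LinkedBlockingQueue<String> queue = new LinkedBlockingQueue<>();
        Thread consumer = new Thread(() -> {
            try {
                queue.take(); // blocks until an element arrives; none ever does
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // interrupted by the caller below
            }
        }, "demo-consumer");
        consumer.setDaemon(true);
        consumer.start();

        // Give the consumer a bounded amount of time to park inside take().
        long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
        while (consumer.getState() != Thread.State.WAITING && System.nanoTime() < deadline) {
            Thread.sleep(10);
        }
        Thread.State state = consumer.getState();
        consumer.interrupt(); // release the parked thread so the demo can end cleanly
        return state;
    }

    /** For contrast: poll with a timeout returns null instead of waiting forever. */
    static String pollWithTimeout(long millis) throws InterruptedException {
        LinkedBlockingQueue<String> queue = new LinkedBlockingQueue<>();
        return queue.poll(millis, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("take() on an empty queue: " + stateOfConsumerOnEmptyQueue());
        System.out.println("poll(100 ms) on an empty queue: " + pollWithTimeout(100));
    }
}
```

A jstack of the demo consumer should show the same Unsafe.park / ConditionObject.await / LinkedBlockingQueue.take frames as the trace above. A timed poll only makes the wait observable; it would not by itself fix whatever stops the producer side.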

  was:
Running into an issue on a 6 node ring running 1.0.11 where HintsColumnFamily compactions often hang indefinitely. Nothing of note in the logs. In some cases, the compaction hangs before a tmp sstable is even created.

I've wiped out every hints sstable I have and restarted several times. The issue always comes back rather quickly and predictably after wiping the sstables. Disabling multithreaded compaction stops this issue from occurring.

This ring was upgraded from 1.0.7. I didn't keep any hints from the upgrade.

Here is the output I see from compactionstats where a compaction has hung. The 'bytes compacted' column never changes.

{code}
pending tasks: 1
          compaction type        keyspace       column family bytes compacted     bytes total  progress
               Compaction          system   HintsColumnFamily          268082       464784758     0.06%
{code}


The hung thread stack is as follows: (full jstack attached, as well)

{code}
"CompactionExecutor:37" daemon prio=10 tid=0x00000000063df800 nid=0x49d9 waiting on condition [0x00007eb8c6ffa000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x000000050f2e0e58> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
        at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
        at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer.computeNext(ParallelCompactionIterable.java:329)
        at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer.computeNext(ParallelCompactionIterable.java:281)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:147)
        at org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:126)
        at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:100)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext(ParallelCompactionIterable.java:101)
        at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext(ParallelCompactionIterable.java:88)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at com.google.common.collect.Iterators$7.computeNext(Iterators.java:614)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:141)
        at org.apache.cassandra.db.compaction.CompactionManager$7.call(CompactionManager.java:395)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
{code}

    
> HintsColumnFamily compactions hang when using MultiThreadedCompaction
> ---------------------------------------------------------------------
>
>                 Key: CASSANDRA-4492
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4492
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 1.0.11
>            Reporter: Jason Harvey
>            Priority: Minor
>         Attachments: jstack.txt
>
>
> Running into an issue on a 6 node ring running 1.0.11 where HintsColumnFamily compactions often hang indefinitely when using multithreaded compaction. Nothing of note in the logs. In some cases, the compaction hangs before a tmp sstable is even created.
> I've wiped out every hints sstable I have and restarted several times. The issue always comes back rather quickly and predictably after wiping the sstables. The compactions sometimes complete if the hint sstables are very small. Disabling multithreaded compaction stops this issue from occurring.
> This ring was upgraded from 1.0.7. I didn't keep any hints from the upgrade.
> Here is the output I see from compactionstats where a compaction has hung. The 'bytes compacted' column never changes.
> {code}
> pending tasks: 1
>           compaction type        keyspace       column family bytes compacted     bytes total  progress
>                Compaction          system   HintsColumnFamily          268082       464784758     0.06%
> {code}
> The hung thread stack is as follows: (full jstack attached, as well)
> {code}
> "CompactionExecutor:37" daemon prio=10 tid=0x00000000063df800 nid=0x49d9 waiting on condition [0x00007eb8c6ffa000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x000000050f2e0e58> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
>         at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
>         at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer.computeNext(ParallelCompactionIterable.java:329)
>         at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer.computeNext(ParallelCompactionIterable.java:281)
>         at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
>         at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
>         at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:147)
>         at org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:126)
>         at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:100)
>         at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
>         at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
>         at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext(ParallelCompactionIterable.java:101)
>         at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext(ParallelCompactionIterable.java:88)
>         at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
>         at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
>         at com.google.common.collect.Iterators$7.computeNext(Iterators.java:614)
>         at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
>         at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
>         at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:141)
>         at org.apache.cassandra.db.compaction.CompactionManager$7.call(CompactionManager.java:395)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:662)
> {code}

[jira] [Commented] (CASSANDRA-4492) HintsColumnFamily compactions hang when using multithreaded compaction

Posted by "Thomas Vachon (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13506806#comment-13506806 ] 

Thomas Vachon commented on CASSANDRA-4492:
------------------------------------------

This is actually severe. Because the compactions hang, they block all schema changes.
                
> HintsColumnFamily compactions hang when using multithreaded compaction
> ----------------------------------------------------------------------
>
>                 Key: CASSANDRA-4492
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4492
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 1.0.11
>            Reporter: Jason Harvey
>            Priority: Minor
>         Attachments: jstack.txt
>
>
> Running into an issue on a 6 node ring running 1.0.11 where HintsColumnFamily compactions often hang indefinitely when using multithreaded compaction. Nothing of note in the logs. In some cases, the compaction hangs before a tmp sstable is even created.
> I've wiped out every hints sstable and restarted several times. The issue always comes back rather quickly and predictably. The compactions sometimes complete if the hint sstables are very small. Disabling multithreaded compaction stops this issue from occurring.
> Compactions of all other CFs seem to work just fine.
> This ring was upgraded from 1.0.7. I didn't keep any hints from the upgrade.
> I should note that the ring gets a huge number of writes, and as a result the HintedHandoff rows get to be quite wide. I didn't see any large-row compaction notices when the compaction was hanging (perhaps the bug was triggered by incremental compaction?). After disabling multithreaded compaction, several of the rows that were successfully compacted were over 1GB.
> Here is the output I see from compactionstats where a compaction has hung. The 'bytes compacted' column never changes.
> {code}
> pending tasks: 1
>           compaction type        keyspace       column family bytes compacted     bytes total  progress
>                Compaction          system   HintsColumnFamily          268082       464784758     0.06%
> {code}
> The hung thread stack is as follows: (full jstack attached, as well)
> {code}
> "CompactionExecutor:37" daemon prio=10 tid=0x00000000063df800 nid=0x49d9 waiting on condition [0x00007eb8c6ffa000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x000000050f2e0e58> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
>         at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
>         at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer.computeNext(ParallelCompactionIterable.java:329)
>         at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer.computeNext(ParallelCompactionIterable.java:281)
>         at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
>         at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
>         at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:147)
>         at org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:126)
>         at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:100)
>         at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
>         at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
>         at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext(ParallelCompactionIterable.java:101)
>         at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext(ParallelCompactionIterable.java:88)
>         at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
>         at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
>         at com.google.common.collect.Iterators$7.computeNext(Iterators.java:614)
>         at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
>         at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
>         at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:141)
>         at org.apache.cassandra.db.compaction.CompactionManager$7.call(CompactionManager.java:395)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:662)
> {code}

[jira] [Commented] (CASSANDRA-4492) HintsColumnFamily compactions hang when using multithreaded compaction

Posted by "Michael Kjellman (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/CASSANDRA-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13477500#comment-13477500 ] 

Michael Kjellman commented on CASSANDRA-4492:
---------------------------------------------

Hit this as well on both 1.1.5 and 1.1.6. Turned multithreaded compaction off and the HintsColumnFamily compaction finished very quickly. Nothing of interest in the logs, and not reproducible.
                


[jira] [Updated] (CASSANDRA-4492) HintsColumnFamily compactions hang when using multithreaded compaction

Posted by "Jason Harvey (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Harvey updated CASSANDRA-4492:
------------------------------------

    Description: 
Running into an issue on a 6 node ring running 1.0.11 where HintsColumnFamily compactions often hang indefinitely when using multithreaded compaction. Nothing of note in the logs. In some cases, the compaction hangs before a tmp sstable is even created.

I've wiped out every hints sstable and restarted several times. The issue always comes back rather quickly and predictably. The compactions sometimes complete if the hint sstables are very small. Disabling multithreaded compaction stops this issue from occurring.

Compactions of all other CFs seem to work just fine.

This ring was upgraded from 1.0.7. I didn't keep any hints from the upgrade.

I should note that the ring gets a huge amount of writes, and as a result the HintedHandoff rows can get quite wide. I didn't see any large-row compaction notices when the compaction was hanging (perhaps the bug was triggered by incremental compaction?). After disabling multithreaded compaction, several of the rows that were successfully compacted were over 1GB.

Here is the output I see from compactionstats where a compaction has hung. The 'bytes compacted' column never changes.

{code}
pending tasks: 1
          compaction type        keyspace   column family bytes compacted     bytes total  progress
               Compaction          systemHintsColumnFamily          268082       464784758     0.06%
{code}
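
As a rough illustration of how the 'progress' figure relates to the two byte counts above, and of how one might flag a compaction whose 'bytes compacted' value never moves between samples, here is a minimal sketch. It is not part of Cassandra or nodetool; the class and method names are hypothetical, and the only real values are the two byte counts copied from the output above.

```java
import java.util.Locale;

public class CompactionProgressSketch {
    // Percentage of bytes compacted, matching the 'progress' column.
    static double progressPercent(long bytesCompacted, long bytesTotal) {
        return bytesTotal == 0 ? 0.0 : 100.0 * bytesCompacted / bytesTotal;
    }

    // True when successive samples of 'bytes compacted' never increase.
    // Assumption: samples are taken far enough apart that a healthy
    // compaction would have made visible progress between them.
    static boolean looksStalled(long[] samples) {
        if (samples.length < 2) {
            return false; // not enough data to judge
        }
        for (int i = 1; i < samples.length; i++) {
            if (samples[i] > samples[i - 1]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Byte counts taken from the compactionstats output above.
        double pct = progressPercent(268082L, 464784758L);
        System.out.println(String.format(Locale.ROOT, "%.2f%%", pct));

        // Three identical samples, as seen when the compaction hangs.
        long[] samples = {268082L, 268082L, 268082L};
        System.out.println("stalled: " + looksStalled(samples));
    }
}
```

How far apart to take the samples is a judgment call; a very wide row can legitimately hold the counter still for a while.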


The hung thread stack is as follows: (full jstack attached, as well)

{code}
"CompactionExecutor:37" daemon prio=10 tid=0x00000000063df800 nid=0x49d9 waiting on condition [0x00007eb8c6ffa000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x000000050f2e0e58> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
        at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
        at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer.computeNext(ParallelCompactionIterable.java:329)
        at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer.computeNext(ParallelCompactionIterable.java:281)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:147)
        at org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:126)
        at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:100)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext(ParallelCompactionIterable.java:101)
        at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext(ParallelCompactionIterable.java:88)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at com.google.common.collect.Iterators$7.computeNext(Iterators.java:614)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:141)
        at org.apache.cassandra.db.compaction.CompactionManager$7.call(CompactionManager.java:395)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
{code}
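
For anyone reading the trace above: the top frames show the compaction thread parked inside LinkedBlockingQueue.take(), which only returns once some other thread puts an element on the queue. The sketch below is not Cassandra code and says nothing about why the producer side stops delivering here; the class and method names are hypothetical. It only reproduces the same thread state using a queue that nothing ever writes to.

```java
import java.util.concurrent.LinkedBlockingQueue;

public class ParkedConsumerSketch {
    // Starts a consumer that calls take() on a queue nobody writes to,
    // waits up to waitMs (must be > 0), and reports the consumer's state.
    static Thread.State stateOfStarvedConsumer(long waitMs) {
        LinkedBlockingQueue<String> queue = new LinkedBlockingQueue<>();
        Thread consumer = new Thread(() -> {
            try {
                queue.take(); // parks until an element arrives
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "consumer");
        consumer.setDaemon(true);
        consumer.start();
        try {
            // Returns after the timeout because the consumer never finishes.
            consumer.join(waitMs);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        Thread.State state = consumer.getState();
        consumer.interrupt(); // release the parked thread so it can exit
        return state;
    }

    public static void main(String[] args) {
        // Expected to report the same state as the jstack output,
        // provided the consumer had time to start and park.
        System.out.println(stateOfStarvedConsumer(500));
    }
}
```

A consumer written with poll(timeout, unit) instead of take() would return null after the timeout rather than park indefinitely, which is one way such a hang could at least be made visible.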

  was:
Running into an issue on a 6 node ring running 1.0.11 where HintsColumnFamily compactions often hang indefinitely when using multithreaded compaction. Nothing of note in the logs. In some cases, the compaction hangs before a tmp sstable is even created.

I've wiped out every hints sstable and restarted several times. The issue always comes back rather quickly and predictably. The compactions sometimes complete if the hint sstables are very small. Disabling multithreaded compaction stops this issue from occurring.

Compactions of all other CFs seem to work just fine.

This ring was upgraded from 1.0.7. I didn't keep any hints from the upgrade.

Here is the output I see from compactionstats where a compaction has hung. The 'bytes compacted' column never changes.

{code}
pending tasks: 1
          compaction type        keyspace   column family bytes compacted     bytes total  progress
               Compaction          systemHintsColumnFamily          268082       464784758     0.06%
{code}


The hung thread stack is as follows: (full jstack attached, as well)

{code}
"CompactionExecutor:37" daemon prio=10 tid=0x00000000063df800 nid=0x49d9 waiting on condition [0x00007eb8c6ffa000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x000000050f2e0e58> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
        at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
        at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer.computeNext(ParallelCompactionIterable.java:329)
        at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer.computeNext(ParallelCompactionIterable.java:281)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:147)
        at org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:126)
        at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:100)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext(ParallelCompactionIterable.java:101)
        at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext(ParallelCompactionIterable.java:88)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at com.google.common.collect.Iterators$7.computeNext(Iterators.java:614)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
        at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:141)
        at org.apache.cassandra.db.compaction.CompactionManager$7.call(CompactionManager.java:395)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
{code}

    
> HintsColumnFamily compactions hang when using multithreaded compaction
> ----------------------------------------------------------------------
>
>                 Key: CASSANDRA-4492
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4492
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 1.0.11
>            Reporter: Jason Harvey
>            Priority: Minor
>         Attachments: jstack.txt


[jira] [Updated] (CASSANDRA-4492) Large HintsColumnFamily compactions hang

Posted by "Jason Harvey (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/CASSANDRA-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Harvey updated CASSANDRA-4492:
------------------------------------

    Description: 
Running into an issue on a 6-node 1.0.11 ring where, whenever a somewhat large set of hints builds up (seen as low as 400MB), compaction on the hints CF hangs indefinitely. Nothing of note in the logs. In some cases, the compaction hangs before a tmp sstable is even created.

I've wiped out every hints sstable I have and restarted several times. The issue always comes back rather quickly and predictably after wiping the sstables. Compaction always seems to succeed if the hints CFs are rather small.

Hints are enabled, and my hint window is the default of 1hr. I do have some copies of HintsColumnFamily sstables that do replicate this issue. However, the hints may contain confidential data. If they'd be helpful in troubleshooting this issue, let me know and I can see about sending them directly.

This ring was upgraded from 1.0.7. I didn't keep any hints from the upgrade.

  was:
Running into an issue on a 6-node 1.0.11 ring where, whenever a somewhat large set of hints builds up (seen as low as 400MB), compaction on the hints CF hangs indefinitely. Nothing of note in the logs. In some cases, the compaction hangs before a tmp sstable is even created.

I've wiped out every hints sstable I have and restarted several times. The issue always comes back rather quickly and predictably after wiping the sstables. Compaction always seems to succeed if the hints CFs are rather small.

Hints are enabled, and my hint window is the default of 1hr. I do have some copies of HintsColumnFamily sstables that do replicate this issue. However, the hints may contain confidential data. If they'd be helpful in troubleshooting this issue, let me know and see about sending them directly.

This ring was upgraded from 1.0.7. I didn't keep any hints from the upgrade.

    
> Large HintsColumnFamily compactions hang
> ----------------------------------------
>
>                 Key: CASSANDRA-4492
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4492
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 1.0.11
>            Reporter: Jason Harvey
>            Priority: Minor
