Posted to user@cassandra.apache.org by Dustin Wenz <du...@ebureau.com> on 2012/06/27 00:37:16 UTC

Amazingly bad compaction performance

We occasionally see fairly poor compaction performance on random nodes in our 7-node cluster, and I have no idea why. This is one example from the log:

	[CompactionExecutor:45] 2012-06-26 13:40:18,721 CompactionTask.java (line 221) Compacted to [/raid00/cassandra_data/main/basic/main-basic.basic_id_index-hd-160-Data.db,].  26,632,210 to 26,679,667 (~100% of original) bytes for 2 keys at 0.006250MB/s.  Time: 4,071,163ms.

That particular event took over an hour to compact only 25 megabytes. During that time, there was very little disk IO, and the java process (OpenJDK 7) was pegged at 200% CPU. The node was also completely unresponsive to network requests until the compaction was finished. Most compactions run just over 7MB/s. This is an extreme outlier, but users definitely notice the hit when it occurs.
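For what it's worth, the arithmetic in that log line is self-consistent:

	26,679,667 bytes ≈ 25.4 MB written
	4,071,163 ms ≈ 4,071 s ≈ 68 minutes
	25.4 MB / 4,071 s ≈ 0.0062 MB/s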

I grabbed a sample of the process using jstack, and this was the only thread in CompactionExecutor:

	"CompactionExecutor:54" daemon prio=1 tid=41247522816 nid=0x99a5ff740 runnable [140737253617664]
	   java.lang.Thread.State: RUNNABLE
		at org.xerial.snappy.SnappyNative.rawCompress(Native Method)
		at org.xerial.snappy.Snappy.rawCompress(Snappy.java:358)
		at org.apache.cassandra.io.compress.SnappyCompressor.compress(SnappyCompressor.java:80)
		at org.apache.cassandra.io.compress.CompressedSequentialWriter.flushData(CompressedSequentialWriter.java:89)
		at org.apache.cassandra.io.util.SequentialWriter.flushInternal(SequentialWriter.java:196)
		at org.apache.cassandra.io.util.SequentialWriter.reBuffer(SequentialWriter.java:260)
		at org.apache.cassandra.io.util.SequentialWriter.writeAtMost(SequentialWriter.java:128)
		at org.apache.cassandra.io.util.SequentialWriter.write(SequentialWriter.java:112)
		at java.io.DataOutputStream.write(DataOutputStream.java:107)
		- locked <36527862064> (a java.io.DataOutputStream)
		at org.apache.cassandra.db.compaction.PrecompactedRow.write(PrecompactedRow.java:142)
		at org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:156)
		at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:159)
		at org.apache.cassandra.db.compaction.CompactionManager$1.runMayThrow(CompactionManager.java:150)
		at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
		at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
		at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
		at java.util.concurrent.FutureTask.run(FutureTask.java:166)
		at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
		at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
		at java.lang.Thread.run(Thread.java:722)
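(For reference, the dump above came from an ordinary jstack <pid> run against the Cassandra process.)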

Is it possible that there is an issue with Snappy compression? Based on the lousy compression ratio, I think we could get by without it just fine. Can compression be changed or disabled on the fly with Cassandra?

	- .Dustin

Re: Amazingly bad compaction performance

Posted by Tyler Hobbs <ty...@datastax.com>.
On Wed, Jun 27, 2012 at 1:42 AM, Derek Andree <da...@lacunasystems.com> wrote:

> Last I heard only Oracle's JDK was officially supported with Cassandra,
> possibly nitpicky but is this still the case?
>
> On Jun 26, 2012, at 3:37 PM, Dustin Wenz wrote:
>
> > (OpenJDK 7) was pegged at 200% CPU
>
>
Java 7 still hasn't been that thoroughly tested, and from your description
of the problem, it sounds like that might indeed be the cause.

-- 
Tyler Hobbs
DataStax <http://datastax.com/>

Re: Amazingly bad compaction performance

Posted by Derek Andree <da...@lacunasystems.com>.
Last I heard only Oracle's JDK was officially supported with Cassandra, possibly nitpicky but is this still the case?

On Jun 26, 2012, at 3:37 PM, Dustin Wenz wrote:

> (OpenJDK 7) was pegged at 200% CPU


Re: Amazingly bad compaction performance

Posted by aaron morton <aa...@thelastpickle.com>.
>>> Can compression be changed or disabled on-the-fly with cassandra?
Yes. Disable it in the schema and then run nodetool upgradesstables.
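Roughly something like this, using the main keyspace / basic column family from your log line as the example (I have not double checked the cli syntax against your version, so treat this as a sketch):

	# in cassandra-cli: turn compression off in the schema
	use main;
	update column family basic with compression_options = null;

	# then rewrite the existing sstables so they pick up the new settings
	nodetool upgradesstables main basic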

As Tyler said, JDK7 is not officially supported yet and you may be running into issues others have not found. Any chance you could downgrade one node to JDK6 and check the performance? If it looks like a JDK issue, could you post your findings to https://issues.apache.org/jira/browse/CASSANDRA and include the schema details?

Thanks

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 29/06/2012, at 2:36 AM, Dustin Wenz wrote:

> My maximum and initial heap sizes are set to 6GB. Actual memory usage for the VM is around 11-12GB. The machine has 24GB of physical memory, so there isn't any paging going on.
> 
> I don't see any GC events logged that are longer than a few hundred milliseconds. Is it possible that GC is taking significant time without it being reported?
> 
> 	- .Dustin
> 
> On Jun 27, 2012, at 1:31 AM, Igor wrote:
> 
>> Hello
>> 
>> Too much GC? Check JVM heap settings and real usage.
>> 
>> On 06/27/2012 01:37 AM, Dustin Wenz wrote:
>>> [original message and jstack trace snipped; quoted in full at the top of this thread]
>> 
>> 
> 


Re: Amazingly bad compaction performance

Posted by Dustin Wenz <du...@ebureau.com>.
My maximum and initial heap sizes are set to 6GB. Actual memory usage for the VM is around 11-12GB. The machine has 24GB of physical memory, so there isn't any paging going on.

I don't see any GC events logged that are longer than a few hundred milliseconds. Is it possible that GC is taking significant time without it being reported?
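One thing I could try is turning on full GC logging so any unreported pauses show up. Assuming the stock cassandra-env.sh, adding something along these lines (the log path is just an example):

	JVM_OPTS="$JVM_OPTS -verbose:gc"
	JVM_OPTS="$JVM_OPTS -XX:+PrintGCDetails"
	JVM_OPTS="$JVM_OPTS -XX:+PrintGCDateStamps"
	JVM_OPTS="$JVM_OPTS -XX:+PrintGCApplicationStoppedTime"
	JVM_OPTS="$JVM_OPTS -Xloggc:/var/log/cassandra/gc.log"   # example path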

	- .Dustin

On Jun 27, 2012, at 1:31 AM, Igor wrote:

> Hello
> 
> Too much GC? Check JVM heap settings and real usage.
> 
> On 06/27/2012 01:37 AM, Dustin Wenz wrote:
>> [original message and jstack trace snipped; quoted in full at the top of this thread]
> 
> 


Re: Amazingly bad compaction performance

Posted by Igor <ig...@4friends.od.ua>.
Hello

Too much GC? Check JVM heap settings and real usage.
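For example (the pid and interval are just illustrative):

	nodetool info              # shows heap memory used / total for the node
	jstat -gcutil <pid> 5000   # GC counts and heap occupancy every 5 seconds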

On 06/27/2012 01:37 AM, Dustin Wenz wrote:
> We occasionally see fairly poor compaction performance on random nodes in our 7-node cluster, and I have no idea why. This is one example from the log:
>
> 	[CompactionExecutor:45] 2012-06-26 13:40:18,721 CompactionTask.java (line 221) Compacted to [/raid00/cassandra_data/main/basic/main-basic.basic_id_index-hd-160-Data.db,].  26,632,210 to 26,679,667 (~100% of original) bytes for 2 keys at 0.006250MB/s.  Time: 4,071,163ms.
>
> That particular event took over an hour to compact only 25 megabytes. During that time, there was very little disk IO, and the java process (OpenJDK 7) was pegged at 200% CPU. The node was also completely unresponsive to network requests until the compaction was finished. Most compactions run just over 7MB/s. This is an extreme outlier, but users definitely notice the hit when it occurs.
>
> I grabbed a sample of the process using jstack, and this was the only thread in CompactionExecutor:
>
> 	[jstack trace snipped; see the original message at the top of this thread]
>
> Is it possible that there is an issue with snappy compression? Based on the lousy compression ratio, I think we could get by without it just fine. Can compression be changed or disabled on-the-fly with cassandra?
>
> 	- .Dustin