Posted to commits@cassandra.apache.org by "Jonathan Ellis (JIRA)" <ji...@apache.org> on 2011/02/18 22:20:38 UTC
[jira] Resolved: (CASSANDRA-2198) Removing GC pressure

     [ https://issues.apache.org/jira/browse/CASSANDRA-2198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis resolved CASSANDRA-2198.
---------------------------------------

       Resolution: Invalid
    Fix Version/s:     (was: 0.7.3)

This is about the Hector side, not Cassandra.

> Removing GC pressure
> --------------------
>
>                 Key: CASSANDRA-2198
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2198
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.7.0
>            Reporter: Joaquin Casares
>            Priority: Minor
>
> We were debugging our client code for performance (we call our client the "driver").
> Against a 16-node Cassandra cluster we run a 64-node set of driver boxes; each box uses 20 threads to pump read traffic as fast as possible.
> On the resulting latency graph, most accesses are below 25ms, yet we periodically get short gaps of about 20-40ms where everyone freezes. After the freeze, everyone's queries finish at once, creating a cluster of high-latency measurements at the same time.
> This behavior strongly suggests GC. I turned on -verbose:gc, and indeed we see GC operations during the stalls.
> Currently we have these JVM flags:
> -XX:-UseGCOverheadLimit \ 
> -XX:+DisableExplicitGC \ 
> -XX:+PrintGCDateStamps -XX:+PrintGCDetails \ 
> -XX:+UseParallelGC -XX:+UseParallelOldGC \
> While there is room to tune the GC params, we went ahead to find the ***theoretical*** performance of Cassandra without GC issues. We used a machine with HUGE memory and allocated 62GB for the new-generation heap; this time, all the stalls disappeared. As a result, we have a 95% latency of 35ms (previously it was around 60ms).
> On the other hand, we can mitigate the GC problem by creating fewer unnecessary objects.
> Using YourKit, we can see clearly what allocates excessive objects: Hector's StringSerializer and some Thrift API calls create a lot of objects.
> It would be helpful to work on these areas to remove GC pressure.
> We are using the Cassandra 0.7.0 svn branch, version 1069075, and
> Hector-0.7.0-22 (we are moving to the -27 version soon).
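
The report ties the latency spikes to GC by reading -verbose:gc output next to the latency graph. The same correlation could also be checked from inside the client. Below is a minimal sketch, assuming a modern JVM and a hypothetical helper class named GcPauseProbe (it is not part of Hector, Thrift, or Cassandra). It uses the standard java.lang.management API to record how much GC time the JVM accumulated while a request was in flight. The counter is JVM-wide and approximate, so it only indicates that a collection overlapped the request, not that the request itself caused it.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper for illustration only; not part of Hector or Cassandra.
final class GcPauseProbe {

    // Accumulated GC time in milliseconds, summed over all collectors in this JVM.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    // Runs one request and returns { latencyMillis, gcMillisAccumulatedMeanwhile }.
    static long[] timeWithGc(Runnable request) {
        long gcBefore = totalGcMillis();
        long start = System.nanoTime();
        request.run();
        long latencyMs = (System.nanoTime() - start) / 1_000_000;
        long gcMs = totalGcMillis() - gcBefore;
        return new long[] { latencyMs, gcMs };
    }
}
```

A driver thread could wrap each read in GcPauseProbe.timeWithGc(...) and log the slow requests together with the second value. If the high-latency samples consistently show non-zero GC time, that supports the GC explanation without needing to line up separate log files.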

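The report also suggests creating fewer unnecessary objects on the string-serialization path. One generic way to do that, when the same strings (for example column names) are serialized over and over, is to cache the encoded bytes. The sketch below is an assumption-laden illustration: CachedUtf8Encoder is a hypothetical class, it does not reflect how Hector's StringSerializer is implemented, and whether it helps depends on what the profiler actually shows.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical class for illustration only; not Hector's StringSerializer.
final class CachedUtf8Encoder {

    // Unbounded cache: only appropriate for a small, fixed set of strings.
    private final ConcurrentHashMap<String, byte[]> cache = new ConcurrentHashMap<>();

    // Returns a read-only view over the cached UTF-8 bytes of s.
    // The bytes are encoded once per distinct string; later calls allocate
    // only the small buffer wrappers, not a fresh byte array.
    ByteBuffer encode(String s) {
        byte[] bytes = cache.computeIfAbsent(s, k -> k.getBytes(StandardCharsets.UTF_8));
        return ByteBuffer.wrap(bytes).asReadOnlyBuffer();
    }

    // Number of distinct strings encoded so far.
    int cachedEntries() {
        return cache.size();
    }
}
```

The read-only view keeps callers from modifying the shared bytes. For high-cardinality data such as row keys, an unbounded map like this would trade GC pressure for a memory leak, so a bounded cache or no cache at all would be the safer choice there.
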
-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira