Posted to user@cassandra.apache.org by Joel Samuelsson <sa...@gmail.com> on 2014/01/22 15:50:58 UTC

Extremely long GC

Hello,

We've been having problems with long GC pauses and can't seem to get rid of
them.

Our latest test is on a clean machine with Ubuntu 12.04 LTS, Java 1.7.0_45
and JNA installed.
It is a single-node cluster with mostly default settings; the only things
changed are the IP addresses, the cluster name and the partitioner (set to
RandomPartitioner).
We are running Cassandra 2.0.4.
We are running on a virtual machine with Xen.
We have 16 GB of RAM and the default memory settings for C* (i.e. a heap
size of 4 GB). The CPU is specified as 8 cores by our provider.
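For reference, the non-default cassandra.yaml settings mentioned above can be
double-checked with something like this (the path assumes a package install
and may differ on other setups):

    # Show the settings we touched; path assumes a package install.
    grep -E '^(cluster_name|partitioner|listen_address|rpc_address)' \
        /etc/cassandra/cassandra.yaml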

Right now, we have no data on the machine and no requests to it at all.
Still, we get ParNew GCs like the following:
INFO [ScheduledTasks:1] 2014-01-18 10:54:42,286 GCInspector.java (line 116)
GC for ParNew: 464 ms for 1 collections, 102838776 used; max is 4106223616

While this may not be extremely long, on other machines with the same setup
but some data (around 12GB) and around 10 read requests/s (i.e. basically no
load) we have seen ParNew GC pauses of 20 minutes or more. During this time,
the machine goes down completely (I can't even SSH to it). The requests are
mostly from OpsCenter and the rows requested are not extremely large
(typically less than 1KB).

We have tried a lot of different things to solve these issues (we've been
having them for a long time), including:
- Upgrading Cassandra to new versions
- Upgrading Java to new versions
- Printing promotion failures in the GC log (no failures found!); the flags
involved are sketched after this list
- Different overall heap sizes and different sizes for the individual GC
spaces (Eden etc.)
- Different versions of Ubuntu
- Running on Amazon EC2 instead of the provider we are using now (not with
the DataStax AMI)
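For the GC-log item above, enabling that logging amounts to adding standard
HotSpot flags in conf/cassandra-env.sh; a rough sketch (the log file path is
just an example):

    # Sketch of GC logging options appended in conf/cassandra-env.sh;
    # all flags are standard HotSpot options, the log path is an example.
    JVM_OPTS="$JVM_OPTS -Xloggc:/var/log/cassandra/gc.log"
    JVM_OPTS="$JVM_OPTS -XX:+PrintGCDetails"
    JVM_OPTS="$JVM_OPTS -XX:+PrintGCDateStamps"
    JVM_OPTS="$JVM_OPTS -XX:+PrintPromotionFailure"
    JVM_OPTS="$JVM_OPTS -XX:+PrintGCApplicationStoppedTime"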

Something that may be a clue: when running the DataStax Community AMI on
Amazon, we haven't seen the long GC pauses yet (it's been running for a week
or so). Just to be clear, the other Amazon EC2 test mentioned above (without
the DataStax AMI) does show the GC freezes.

If any other information is needed, just let me know.

Best regards,
Joel Samuelsson

Re: Extremely long GC

Posted by Yogi Nerella <yn...@gmail.com>.
Hi Joel,

A single log record like the one above will not help. Please provide the
entire command line the process is started with, including JVM options, and
the log file showing all GC messages.
Is it possible for you to collect verbose GC output, which would print the
before-and-after memory statistics?
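For example, something along these lines captures the full command line of
the running node (assuming a single Cassandra JVM on the machine; verbose GC
itself is enabled with the logging flags mentioned earlier in the thread):

    # Print the full command line, including all JVM options, of the
    # running Cassandra process; assumes one Cassandra JVM per machine.
    ps -o args= -p "$(pgrep -f CassandraDaemon)"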

Also include the version of Java, the version of Cassandra, the availability
of swap space, disk space, etc., and the CPU usage around the timeframe in
which the garbage collection happened.
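For instance (standard Linux commands; run them around the time of a long
collection):

    java -version        # Java version
    nodetool version     # Cassandra version
    free -m              # RAM and swap
    df -h                # disk space
    vmstat 5             # CPU and paging activity while a pause is happening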

What is the Java -Xms (minimum heap) setting? Try reducing it and see if it
helps.
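Note that in Cassandra the heap is normally set via conf/cassandra-env.sh
rather than with -Xms directly; a sketch of an explicit override (values are
examples only, not a recommendation):

    # cassandra-env.sh passes MAX_HEAP_SIZE as both -Xms and -Xmx, and
    # HEAP_NEWSIZE as -Xmn; these values are examples to experiment with.
    MAX_HEAP_SIZE="4G"
    HEAP_NEWSIZE="400M"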


Yogi

Re: Extremely long GC

Posted by Joel Samuelsson <sa...@gmail.com>.
Here is one example: 12GB of data, no load besides OpsCenter and perhaps 1-2
requests per minute.
INFO [ScheduledTasks:1] 2013-12-29 01:03:25,381 GCInspector.java (line 119)
GC for ParNew: 426400 ms for 1 collections, 2253360864 used; max is
4114612224
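Pauses like this can be pulled out of the Cassandra log with something like
the following (the default system.log location is assumed):

    # Print the timestamp and pause length (ms) of every ParNew collection
    # reported by GCInspector; assumes the default system.log location.
    awk '/GC for ParNew/ {
        for (i = 1; i <= NF; i++)
            if ($i == "ParNew:") print $3, $4, $(i+1), "ms"
    }' /var/log/cassandra/system.log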



Re: Extremely long GC

Posted by Yogi Nerella <yn...@gmail.com>.
Hi,

Can you share the GC logs for the systems you are running into problems on?

Yogi

