
Posted to user@cassandra.apache.org by Tobin Landricombe <to...@sensatus.com> on 2016/06/10 15:11:22 UTC

java.lang.OutOfMemoryError: Java heap space

Hi,

I've been googling various parts of this all day but none of the suggestions seem to fit.

I have 2 nodes, one of which is a seed. I'm trying to add a third node but, after a few minutes in the UJ state, the node dies with the above error (http://pastebin.com/iRvYfuAu).

Here are the warnings from the logs: http://pastebin.com/vYLvsHrv

I've googled them but nothing seems appropriate.

Debug log part 1: http://pastebin.com/b8ZSYtqV
Debug log part 2: http://pastebin.com/1Bbb7Vf8

Thanks for any suggestions,
Tobin

Re: java.lang.OutOfMemoryError: Java heap space

Posted by Tobin Landricombe <to...@sensatus.com>.

Hi Ben,

We're using the akka persistence layer which doesn't give me much scope for remodelling data.

So, on the assumption that the guys who wrote the persistence layer knew what they were doing, I followed your suggestion to increase RAM (still only to a miserly 8gig, which the startup script has decided means the JVM should be started with -Xms1968M, -Xmx1968M, -Xmn200M) and now the new nodes are coming up.
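
Those flags look consistent with the automatic sizing rule in the stock cassandra-env.sh of that era, which is understood to be max(min(1/2 RAM, 1024MB), min(1/4 RAM, 8192MB)) for the heap and min(100MB per core, 1/4 of the heap) for the young generation. A minimal Python sketch of that arithmetic follows. The function name is hypothetical, and the 7872MB figure is an assumption inferred from 1968 x 4, not something stated in the thread.

```python
def auto_heap_sizes(system_memory_mb, cpu_cores):
    """Return (max_heap_mb, young_gen_mb) using the sizing rule described above."""
    half = system_memory_mb // 2
    quarter = half // 2
    # Heap: max(min(1/2 RAM, 1024MB), min(1/4 RAM, 8192MB))
    max_heap_mb = max(min(half, 1024), min(quarter, 8192))
    # Young generation: min(100MB per core, 1/4 of the heap)
    young_gen_mb = min(100 * cpu_cores, max_heap_mb // 4)
    return max_heap_mb, young_gen_mb


if __name__ == "__main__":
    # 7872 is an assumed value for the usable RAM an "8gig" VM reports.
    print(auto_heap_sizes(7872, 2))
    # A nominal 4096MB machine with 2 cores, like the original VMs.
    print(auto_heap_sizes(4096, 2))
```

Under these assumptions the 8gig VM gets a 1968MB heap with a 200MB young generation, matching the flags above, while a nominal 4gig VM would get only about 1024MB of heap.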

Thanks for your help,
Tobin


Re: java.lang.OutOfMemoryError: Java heap space

Posted by Ben Slater <be...@instaclustr.com>.

I should add: there is probably an option (c) of fiddling with a bunch of
tuning parameters to try to nurse things through with your current config,
but I’m not sure that’s useful unless you really need to make the current
setup work for some reason.

-- 
————————
Ben Slater
Chief Product Officer
Instaclustr: Cassandra + Spark - Managed | Consulting | Support
+61 437 929 798

Re: java.lang.OutOfMemoryError: Java heap space

Posted by Ben Slater <be...@instaclustr.com>.

Hi Tobin,

4G RAM is a pretty small machine to be using to run Cassandra. As I
mentioned, 8G of heap is the normal recommendation for a production machine,
which means you need at least 14-16G total (and can get a performance benefit
from more).

I agree disk space doesn’t really look to be an issue here. I’m not sure
what impact degraded mode has, but it doesn’t sound good :-) (I think
it’s caused by "Is swap disabled? : false", i.e. you have swap enabled, which
is not recommended).
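
To see which startup check put the node into degraded mode, the log line quoted earlier in the thread can be split into its individual checks. A minimal sketch, assuming the line has exactly the "name? : true/false" shape shown in the thread; the function name is hypothetical:

```python
import re

# The degraded-mode line as quoted in the thread.
DEGRADED_LINE = (
    "Cassandra server running in degraded mode. Is swap disabled? : false,  "
    "Address space adequate? : true,  nofile limit adequate? : true, "
    "nproc limit adequate? : true"
)


def parse_startup_checks(line):
    """Map each 'name? : true/false' pair in the line to a boolean."""
    pairs = re.findall(r"([A-Za-z][A-Za-z ]*?)\? : (true|false)", line)
    return {name.strip(): result == "true" for name, result in pairs}


if __name__ == "__main__":
    checks = parse_startup_checks(DEGRADED_LINE)
    failed = [name for name, ok in checks.items() if not ok]
    print(failed)
```

For the line above, the only check reported as false is "Is swap disabled", which fits the reading that enabled swap, rather than the jemalloc or JMX warnings, is what triggers the degraded-mode message.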

In this case, I would expect that the relatively large partition(s) (175MB
in the warning) in conjunction with the low heap allocation is what is
causing C* to run out of heap. Heap exhaustion often manifests when C* has
to compact a large partition. When you add a new node, the data that gets
streamed across has to be compacted, which is why you’ll see it on the new
node but not the existing nodes (yet).

So, I’d say your options are either (a) get more memory and increase heap
space or (b) remodel your data with a partition key that does not create
such large partitions (generally, smaller is better if it meets your
functional needs; staying under 10MB avoids having to tune specifically
for large partitions). And there is a fair chance you
need to do (b) for a healthy cluster in the long run.
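
Option (b) usually means adding a bucket component to the partition key so that no single partition grows without bound. A minimal sketch of the arithmetic, in which the function names, the 1024-byte average event size, and the "id:bucket" key format are all illustrative assumptions rather than details taken from the thread's schema:

```python
import math

TARGET_PARTITION_BYTES = 10 * 1024 * 1024  # the 10MB guideline mentioned above


def max_events_per_partition(target_bytes, avg_event_bytes):
    """How many events fit in one partition of the target size."""
    return max(1, target_bytes // avg_event_bytes)


def bucketed_key(entity_id, sequence_nr, events_per_partition):
    """Build a partition key of the form 'id:bucket' from a sequence number."""
    bucket = sequence_nr // events_per_partition
    return f"{entity_id}:{bucket}"


def buckets_needed(total_bytes, target_bytes):
    """Number of target-sized partitions needed to hold total_bytes."""
    return math.ceil(total_bytes / target_bytes)


if __name__ == "__main__":
    per_partition = max_events_per_partition(TARGET_PARTITION_BYTES, 1024)
    print(per_partition)
    print(bucketed_key("MANAGER", 25000, per_partition))
    # The 175811867-byte partition from the warning, split into 10MB buckets.
    print(buckets_needed(175811867, TARGET_PARTITION_BYTES))
```

With these assumed numbers, each bucket holds 10240 events, event 25000 lands in "MANAGER:2", and the partition from the warning would spread over 17 buckets. When a persistence library owns the schema, the equivalent change is whatever partition-size setting that library exposes.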

Cheers
Ben


Re: java.lang.OutOfMemoryError: Java heap space

Posted by Tobin Landricombe <to...@sensatus.com>.

Hi Ben,

I think the degraded mode is caused by one or both of these...
	• WARN  [main] 2016-06-10 14:23:01,690 StartupChecks.java:118 - jemalloc shared library could not be preloaded to speed up memory allocations
	• WARN  [main] 2016-06-10 14:23:01,691 StartupChecks.java:150 - JMX is not enabled to receive remote connections. Please see cassandra-env.sh for more info.
...neither of which should cause the heap issue.

The disk space isn't low for our (test) usage and again shouldn't cause the heap issue.

Which leaves the large partition. I couldn't find what is considered a large partition. Is it possible that syncing the large partition is causing problems? Why would it only affect the new node, not the running ones?

I looked at increasing the heap space but after reviewing the docs, the current settings look correct for the machines.

All the nodes are running on VMs with 2 cores and 4gig RAM. Neither they nor the hypervisor are showing much load.

Thanks for your help,
Tobin


Re: java.lang.OutOfMemoryError: Java heap space

Posted by Ben Slater <be...@instaclustr.com>.

The short-term fix is probably to try increasing heap space (in
cassandra-env.sh). 8GB is the most standard setting, but more may help in some
circumstances.

That said, your logs are pointing to a number of other issues which won’t
be helping and probably need to be fixed for long-term stability:
- swap enabled (Cassandra server running in degraded mode. Is swap disabled? : false, Address space adequate? : true, nofile limit adequate? : true, nproc limit adequate? : true)
- low disk space (Only 36948 MB free across all data volumes. Consider adding more capacity to your cluster or removing obsolete snapshots)
- large partitions (Writing large partition feed/messages:MANAGER:0 (175811867 bytes))
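
The large-partition warning can be picked apart mechanically to see which keyspace, table and key are involved and how big the partition is. A minimal sketch, assuming the message has exactly the format quoted above; the function name is hypothetical, and the 100MB threshold is an assumed value for the configurable warning limit, not a figure from the thread:

```python
import re

WARNING = "Writing large partition feed/messages:MANAGER:0 (175811867 bytes)"

PATTERN = re.compile(
    r"Writing large partition (?P<keyspace>[^/]+)/(?P<table>[^:]+):(?P<key>.+)"
    r" \((?P<size>\d+) bytes\)"
)

# Assumed warning threshold in MB; check your own cassandra.yaml.
ASSUMED_THRESHOLD_MB = 100


def parse_large_partition(line):
    """Return keyspace, table, partition key and size in bytes, or None."""
    match = PATTERN.search(line)
    if match is None:
        return None
    return {
        "keyspace": match.group("keyspace"),
        "table": match.group("table"),
        "key": match.group("key"),
        "bytes": int(match.group("size")),
    }


if __name__ == "__main__":
    info = parse_large_partition(WARNING)
    size_mb = info["bytes"] / (1024 * 1024)
    print(info)
    print(f"{size_mb:.1f} MB, over threshold: {size_mb > ASSUMED_THRESHOLD_MB}")
```

For the warning above this yields keyspace feed, table messages, key MANAGER:0 and a size of roughly 168 MB when counted in units of 1024 x 1024 bytes (about 175 MB in decimal units).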

Cheers
Ben
