
Posted to user@cassandra.apache.org by "Thakrar, Jayesh" <jt...@conversantmedia.com> on 2017/08/22 14:42:27 UTC

Cassandra crashes....

Hi All,

We are somewhat new users to Cassandra 3.10 on Linux and wanted to ping the user group for their experiences.

Our usage profile is batch jobs that load millions of rows into Cassandra every hour.
There are similar periodic batch jobs that read millions of rows, do some processing, and write the results to HDFS (no issues with HDFS).

We often see the Cassandra daemons crash.
Key points of our environment are:
Pretty good servers: 54 cores (with hyperthreading), 256 GB RAM, 3.2 TB SSD drive
Compaction: TWCS compaction with 7 day windows as the data retention period is limited - about 120 days.
JDK: Java 1.8.0.71 and G1 GC
Heap Size: 16 GB
Large SSTables: 50 GB to 300+ GB
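
For reference, the retention and window figures above imply about 18 TWCS windows holding live data at steady state. Below is a minimal sketch of that arithmetic, together with a compaction options map for 7-day windows. The option names are the standard TWCS ones; the function names are hypothetical, and nothing here is taken from the actual table definition, which is not shown in this post.

```python
import math

RETENTION_DAYS = 120   # "about 120 days", from the post
WINDOW_DAYS = 7        # TWCS window size, from the post


def twcs_window_count(retention_days: int, window_days: int) -> int:
    """Number of time windows needed to span the retention period."""
    return math.ceil(retention_days / window_days)


def twcs_options(window_days: int) -> dict:
    """Compaction options map for a table using day-sized TWCS windows."""
    return {
        "class": "TimeWindowCompactionStrategy",
        "compaction_window_unit": "DAYS",
        "compaction_window_size": str(window_days),
    }


windows = twcs_window_count(RETENTION_DAYS, WINDOW_DAYS)
options = twcs_options(WINDOW_DAYS)
print(f"{windows} windows of {WINDOW_DAYS} days cover {RETENTION_DAYS} days")
print(options)
```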

We see the daemons crash after some back-to-back long GCs (1.5 to 3.5 seconds).
Note that we had set the GC pause target to 200 ms.

We have been somewhat able to tame the crashes by updating the TWCS compaction properties
to have min/max compaction sstables = 4 and by drastically reducing the size of the New/Eden space (to 5% of heap space = 800 MB).
It's been about 12 hours and our stop-the-world GC pauses are under 90 ms.
Since the servers have more than sufficient resources, we are not seeing any noticeable performance impact.
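
The young-generation figure above is simple arithmetic: 5% of a 16 GB heap is about 819 MB, in line with the "800 MB" quoted. The sketch below computes that and prints one possible set of JVM flags and a CQL statement expressing the changes described. The post does not say which flags or statement were actually used, so the flag choice (G1NewSizePercent and G1MaxNewSizePercent, which are experimental options on Java 8) and the table name my_ks.my_table are assumptions for illustration only.

```python
HEAP_GB = 16            # from the post
YOUNG_PERCENT = 5       # "5% of heap space", from the post
PAUSE_TARGET_MS = 200   # from the post
THRESHOLD = 4           # "min/max compaction sstables = 4", from the post


def young_gen_mb(heap_gb: int, percent: int) -> float:
    """Young-generation size in MB for a given heap size and percentage."""
    return heap_gb * 1024 * percent / 100


def jvm_flags(heap_gb: int, percent: int, pause_ms: int) -> list:
    """One assumed way to express the described settings as HotSpot flags."""
    return [
        f"-Xms{heap_gb}G",
        f"-Xmx{heap_gb}G",
        "-XX:+UseG1GC",
        f"-XX:MaxGCPauseMillis={pause_ms}",
        # The two percent flags are experimental on Java 8, hence the unlock.
        "-XX:+UnlockExperimentalVMOptions",
        f"-XX:G1NewSizePercent={percent}",
        f"-XX:G1MaxNewSizePercent={percent}",
    ]


def alter_compaction_cql(table: str, threshold: int, window_days: int = 7) -> str:
    """ALTER TABLE statement; the compaction map is restated in full
    because setting it replaces the previous map."""
    return (
        f"ALTER TABLE {table} WITH compaction = {{"
        "'class': 'TimeWindowCompactionStrategy', "
        "'compaction_window_unit': 'DAYS', "
        f"'compaction_window_size': '{window_days}', "
        f"'min_threshold': '{threshold}', "
        f"'max_threshold': '{threshold}'}};"
    )


print(f"young generation: {young_gen_mb(HEAP_GB, YOUNG_PERCENT):.1f} MB")
print("\n".join(jvm_flags(HEAP_GB, YOUNG_PERCENT, PAUSE_TARGET_MS)))
# my_ks.my_table is a hypothetical table name.
print(alter_compaction_cql("my_ks.my_table", THRESHOLD))
```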

Is this kind of tuning normal/expected?

Thanks,
Jayesh

Re: Cassandra crashes....

Posted by Jeff Jirsa <jj...@gmail.com>.

You typically don't want to set the eden space when you're using G1.
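
As background to this advice: G1 resizes the young generation on its own in order to meet the pause-time goal, and fixing its size explicitly takes that adjustment away. Below is a minimal sketch of a check for such settings, assuming the JVM options are available as a list of strings. The helper name and the sample option list are hypothetical and are not the poster's actual configuration.

```python
# HotSpot options that size or bound the young generation explicitly.
YOUNG_GEN_PREFIXES = (
    "-Xmn",
    "-XX:NewSize=",
    "-XX:MaxNewSize=",
    "-XX:NewRatio=",
    "-XX:G1NewSizePercent=",
    "-XX:G1MaxNewSizePercent=",
)


def fixed_young_gen_flags(jvm_options):
    """Return young-generation sizing flags, but only when G1 is enabled."""
    # Drop blank lines and '#' comments, as found in an options file.
    opts = [o.strip() for o in jvm_options]
    opts = [o for o in opts if o and not o.startswith("#")]
    if "-XX:+UseG1GC" not in opts:
        return []
    return [o for o in opts if o.startswith(YOUNG_GEN_PREFIXES)]


# Hypothetical option list for illustration.
sample = [
    "-Xms16G",
    "-Xmx16G",
    "-XX:+UseG1GC",
    "-XX:MaxGCPauseMillis=200",
    "-Xmn800M",
]
print(fixed_young_gen_flags(sample))
```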

-- 
Jeff Jirsa

