Posted to user@cassandra.apache.org by Satoshi Hikida <sa...@gmail.com> on 2016/10/11 17:23:43 UTC

Is there any way to throttle the memtable flushing throughput?

Hi,

I'm investigating the read/write performance of C* (ver. 2.2.8). However,
I have an issue with memtable flushing, which causes spiky write
throughput and in turn affects the latency of client requests.

So I would like to know the answers to the following questions.

1. Is there any way to throttle the write throughput of memtable
flushing? If there is, how can I do that?
2. Is there any way to reduce the write-bandwidth spikes during memtable
flushing?
   (This is a problem for me because request latency increases whenever a
write-bandwidth spike occurs.)

I'm using a single C* node for this investigation, and C* runs on an EC2
instance (2 vCPUs, 4GB memory). In addition, I have attached two magnetic
disks to the instance: one stores the system data (root file system, /),
and the other stores the C* data (data files and commit logs).

I also changed a few configuration settings:
- commitlog_sync: batch
- commitlog_sync_batch_window_in_ms: 2
(All other settings use their default values; a sketch of the relevant
part of my cassandra.yaml is below.)
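
For reference, this is roughly how the changed settings look in my
cassandra.yaml, together with the memtable-related settings I have been
looking at. I have left the memtable settings at their defaults; the
values and comments below are just my reading of the 2.2 yaml, so please
correct me if I'm wrong:

    commitlog_sync: batch
    commitlog_sync_batch_window_in_ms: 2

    # memtable settings, left commented out / at their defaults
    # memtable_heap_space_in_mb: 2048      # defaults to 1/4 of the heap
    # memtable_offheap_space_in_mb: 2048   # defaults to 1/4 of the heap
    # memtable_cleanup_threshold: 0.11     # 1 / (memtable_flush_writers + 1)
    # memtable_flush_writers: 8            # defaults to min(#disks, #cores)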


Regards,
Satoshi

Re: Is there any way to throttle the memtable flushing throughput?

Posted by Satoshi Hikida <sa...@gmail.com>.
Hi, Ben

Thank you for your reply.

> The AWS instance type you are using is not appropriate for a production
workload
I agree with you. I'm using it just to verify the behavior of C*.

So I really want to understand the actual mechanism behind the
write-request blocking. I would appreciate it if you could give me more
advice.

> The small amount of memory on the node could also mean your flush writers
are getting backed up (blocked)
Does that mean the flush writer threads are blocked, or that write
requests (from clients) are blocked?
Why does a small amount of memory lead to the flush writers being blocked?
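
In the meantime I'm watching the flush pool with nodetool tpstats to see
whether the MemtableFlushWriter pool reports blocked threads (if I read
the output correctly, the "Blocked" and "All time blocked" columns are
the ones to watch):

    nodetool tpstats | grep -i -E 'Pool Name|Flush'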

Anyway, I'll change some of my node's configuration settings according to
your advice.
Thanks a lot.

Regards,
Satoshi


Re: Is there any way to throttle the memtable flushing throughput?

Posted by Ben Bromhead <be...@instaclustr.com>.
A few thoughts on the larger problem at hand.

The AWS instance type you are using is not appropriate for a production
workload. Also, given that memtable flushes cause spiky write throughput,
it sounds like your commitlog is on the same disk as your data directory;
combined with the use of non-SSD EBS, I'm not surprised this is happening.
The small amount of memory on the node could also mean your flush writers
are getting backed up (blocked), possibly causing JVM heap pressure and
other fun things (you can check this with nodetool tpstats).

Before you get into tuning memtable flushing, I would do the following:

   - Reset your commitlog_sync settings back to the defaults
   - Use an EC2 instance type with at least 15GB of memory and 4 cores
   that is EBS optimized (dedicated EBS bandwidth)
   - Use gp2 or io2 EBS volumes
   - Put your commitlog on a separate EBS volume
   - Make sure your memtable_flush_writers are not being blocked; if they
   are, increase the number of flush writers (to no more than the number
   of cores) (see the cassandra.yaml sketch after this list)
   - Optimize your read_ahead_kb size and compression_chunk_length to keep
   those EBS reads as small as possible (example commands further below)
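
As a rough sketch of the cassandra.yaml side of the above (the directory
paths are placeholders, adjust them for your own volumes; the sync values
shown are the 2.2 defaults as I recall them):

    commitlog_sync: periodic               # back to the default sync mode
    commitlog_sync_period_in_ms: 10000
    commitlog_directory: /mnt/commitlog    # on its own EBS volume
    data_file_directories:
        - /mnt/cassandra/data              # data on a separate volume
    # memtable_flush_writers: 4            # raise only if tpstats shows the
                                           # MemtableFlushWriter pool blocked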

Once you have fixed the above, memtable flushing should not be an issue.
Even if you can't/don't want to upgrade the instance type, the other steps
will help things.
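
For the read-ahead and compression chunk size, something along these lines
(the device xvdf and the keyspace/table names are placeholders;
chunk_length_kb is the option name in 2.2, and a small value like 4KB is
just a starting point for small random reads):

    # lower read-ahead on the data volume (value is in KB)
    echo 16 | sudo tee /sys/block/xvdf/queue/read_ahead_kb

    -- reduce the compression chunk size for a read-heavy table (CQL)
    ALTER TABLE my_keyspace.my_table WITH compression =
      {'sstable_compression': 'LZ4Compressor', 'chunk_length_kb': '4'};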

Ben

--
Ben Bromhead
CTO | Instaclustr <https://www.instaclustr.com/>
+1 650 284 9692
Managed Cassandra / Spark on AWS, Azure and Softlayer