Posted to user@cassandra.apache.org by Stanley Xu <we...@gmail.com> on 2013/04/18 14:22:52 UTC

How to configure Cassandra params to handle heavy write, light read, short TTL scenario

Dear buddies,

We are using Cassandra in a scenario like the following:

1. A column family keyed by a Long, where each row has one and only
one Integer column, with a 2 hour TTL.
2. The write rate is about 45,000 writes per second; the read rate is
about 30-200 queries per second.
3. There isn't a "hot zone" for reads (each query is for a different
key), but most reads hit keys written in the last 30 minutes.
4. Every write is a new key with a new value; nothing is overwritten.


We had been running this with about 40 read QPS, but once the read QPS
increased, the IO_WAIT on the system rose sharply and we got a lot of
query timeouts (we set the timeout to 10ms).

Per my understanding, the main reason is that with our configuration
most of the queries hit the disk.
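
One way we plan to confirm this is to watch the disks with iostat
while the read load is on (iostat is from the standard sysstat
package; device names and numbers will differ per host):

    # 5-second extended samples; %util and await rising together with
    # read QPS would confirm that reads are going to disk
    iostat -x 5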

I am wondering whether the following would help us handle the load.

1. Increase the memtable size, so that most reads are served from the
memtable; since that data has not been flushed to disk yet, lookups
against the SSTables should be filtered out by the bloom filter, so no
disk seek happens.

But our major concern is that once a large memtable is flushed to
disk, the new incoming queries for that data will all go to disk and
the timeouts will come back.

Is it possible to configure something like a queue of memtables in
memory, say four memtables mem1 through mem4 ordered by time, so that
Cassandra flushes mem1, and once a mem5 fills up it flushes mem2, and
so on? Is that possible?
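
For reference, the memtable-related knobs we have been looking at in
cassandra.yaml (names from the 1.x series; the values below are only
illustrative, not our production settings):

    # cassandra.yaml (1.x) -- illustrative values only
    # Total heap space for memtables before the largest is flushed.
    memtable_total_space_in_mb: 2048
    # Full memtables allowed to queue up waiting for a flush writer;
    # writes block once this queue fills.
    memtable_flush_queue_size: 4
    # Flush writer threads, usually one per data directory.
    memtable_flush_writers: 1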


Best wishes,
Stanley Xu

Re: How to configure Cassandra params to handle heavy write, light read, short TTL scenario

Posted by aaron morton <aa...@thelastpickle.com>.
Use nodetool cfhistograms to get a better understanding of the latency. It will also tell you the number of SSTables involved in each read; if it's above 3 it may indicate the data model needs changes. Rows that are written to over a long period can become very fragmented.

Also check the disk IO to see if the disks are keeping up, and check the key cache hit rate using nodetool info.
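
For example (the keyspace and column family names here are
placeholders for your own):

    nodetool -h localhost cfhistograms YourKeyspace YourCF
    nodetool -h localhost info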

Check the logs for GC activity and consider tuning the GC for lower latency. Typically this means a smaller Eden space, which you can get by decreasing HEAP_NEWSIZE and/or increasing the survivor space. Before you do anything, enable full GC logging in cassandra-env.sh and watch how the node performs.
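
The stock cassandra-env.sh ships with the GC logging options commented
out; enabling them looks roughly like this (pick a log path that suits
your hosts):

    JVM_OPTS="$JVM_OPTS -XX:+PrintGCDetails"
    JVM_OPTS="$JVM_OPTS -XX:+PrintGCDateStamps"
    JVM_OPTS="$JVM_OPTS -XX:+PrintTenuringDistribution"
    JVM_OPTS="$JVM_OPTS -Xloggc:/var/log/cassandra/gc.log"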

Hope that helps. 

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 19/04/2013, at 1:55 PM, Stanley Xu <we...@gmail.com> wrote:

> Hi Aaron,
> 
> 1. 10ms is the maximum timeout we can accept.
> 2. It is a random key access, not a range scan.
> 3. We have only one column family in that keyspace, and we select the columns by name.
> 
> Thanks.
> 
> Best wishes,
> Stanley Xu
> 

Re: How to configure Cassandra params to handle heavy write, light read, short TTL scenario

Posted by Stanley Xu <we...@gmail.com>.
Hi Aaron,

1. 10ms is the maximum timeout we can accept.
2. It is a random key access, not a range scan.
3. We have only one column family in that keyspace, and we select the
columns by name.

Thanks.

Best wishes,
Stanley Xu



Re: How to configure Cassandra params to handle heavy write, light read, short TTL scenario

Posted by aaron morton <aa...@thelastpickle.com>.
> Is it possible to configure something like a queue of memtables in memory, say four memtables mem1 through mem4 ordered by time, so that Cassandra flushes mem1, and once a mem5 fills up it flushes mem2, and so on? Is that possible?
No. 

> We had been running this with about 40 read QPS, but once the read QPS increased, the IO_WAIT on the system rose sharply and we got a lot of query timeouts (we set the timeout to 10ms).
Look at the cfhistograms output for the CF. In the read latency column, the number on the left is the latency in microseconds and the value in the column is how many local reads took that long. Also look at the SSTables column, which shows how many SSTables were involved in each read.

Consider increasing the rpc_timeout to reduce the timeout errors until you reduce the read latency. 
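
That is rpc_timeout_in_ms in cassandra.yaml; the stock default is
10000 (10 seconds), so raising it is only a stopgap while you work on
the latency:

    # cassandra.yaml -- the shipped default, shown for reference
    rpc_timeout_in_ms: 10000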

Is the read a range scan or a select by row key?
When you do the read, do you select all columns in the row or select columns by name? The latter is more performant.
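
For example, with a hypothetical single-column schema in CQL (the
table and column names are made up):

    -- selects one named column; cheaper than selecting the whole row
    SELECT value FROM your_cf WHERE key = 12345;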

Cheers 
-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com
