Posted to user@cassandra.apache.org by Pracheer Agarwal <pr...@gmail.com> on 2015/06/16 11:38:45 UTC

Tuning Cassandra 2.1 for High Writes and Immediate Reads

Hi,

We are evaluating Cassandra 2.1 for our new production system. The
following are the requirements:

1. 15K writes/sec, each with a 5 KB blob in a single column of a column family.
2. Each write is followed immediately by reads from multiple consumer threads.
A read must return the entire row, not only the most recently updated column.
3. Around 1B unique keys.

So I am assuming that reads can be served from both the memtable (if it has
not been flushed yet) and the key cache. (The row cache is disabled.)

How can we optimize for higher Read throughput at the cost of Writes?

Machine configuration (10-node cluster):

   - 24 cores per machine
   - 64 GB RAM
   - 5 x 2 TB HDD per machine
   - 10G NIC


So far, we have done the following to optimize:
1. Allocated a 4 GB key cache.
2. Designed the partition key and clustering key so that every new event
creates a new cell; we never update a record.
3. No lightweight transactions.
4. Replication factor 3.
5. Write quorum 2, read quorum 2.
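With a replication factor of 3, writing to 2 replicas and reading from 2 replicas means every read overlaps with at least one replica that acknowledged the latest write, which matters for the "immediate reads" requirement. A minimal sketch of that check follows; the function names are illustrative and not part of any Cassandra API.

```python
def quorum(replication_factor: int) -> int:
    """Smallest majority of replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def read_overlaps_write(replication_factor: int, write_replicas: int,
                        read_replicas: int) -> bool:
    """True when every read set must intersect every write set (W + R > RF)."""
    return write_replicas + read_replicas > replication_factor


if __name__ == "__main__":
    rf = 3
    w = r = quorum(rf)
    print(f"RF={rf}, W={w}, R={r}, overlap={read_overlaps_write(rf, w, r)}")
    # For comparison: writing and reading at a single replica gives no guarantee.
    print(f"RF={rf}, W=1, R=1, overlap={read_overlaps_write(rf, 1, 1)}")
```

The same check shows the trade-off: lowering either side to 1 replica reduces latency but drops the overlap guarantee.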

How can we further optimize for the write/read patterns explained above?

Thanks,
Pracheer

Re: Tuning Cassandra 2.1 for High Writes and Immediate Reads

Posted by Sebastian Estevez <se...@datastax.com>.
If you use CLUSTERING ORDER BY and need to keep the top rows of each
partition in cache, look at the row cache in 2.1.

http://www.datastax.com/dev/blog/row-caching-in-cassandra-2-1
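For illustration, the 2.1 row cache is configured per table through the caching option, which takes a rows_per_partition setting. The small helper below only builds the CQL statement as a string; the keyspace and table names and the value of 10 rows are hypothetical placeholders, and the helper itself is not part of any driver. The row cache also needs a non-zero row_cache_size_in_mb in cassandra.yaml before it takes effect.

```python
def row_cache_cql(keyspace: str, table: str, rows_per_partition: int) -> str:
    """Build an ALTER TABLE statement enabling the Cassandra 2.1 row cache
    for the first `rows_per_partition` rows of each partition."""
    if rows_per_partition < 1:
        raise ValueError("rows_per_partition must be a positive integer")
    return (
        f"ALTER TABLE {keyspace}.{table} "
        f"WITH caching = {{'keys': 'ALL', "
        f"'rows_per_partition': '{rows_per_partition}'}};"
    )


if __name__ == "__main__":
    # Hypothetical keyspace and table names, for illustration only.
    print(row_cache_cql("events_ks", "events", 10))
```

The generated statement can be pasted into cqlsh after adjusting the names and the row count to the actual schema.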