Posted to user@cassandra.apache.org by amulya rattan <ta...@gmail.com> on 2013/02/21 20:03:49 UTC

Using Cassandra for read operations

Dear All,

We are currently evaluating Cassandra for an application involving strict
SLAs (service level agreements). We just need one column family with a long
key and rows of approximately 70-80 bytes. We are not concerned about write
performance but primarily about read performance. For our SLAs, a read of at
most 15-20 rows at once (using a multi slice) should not take more than 4 ms.
So far, on a single-node setup, using Cassandra's stress tool, the numbers
are promising. But I am guessing that's because there is no network latency
involved there, and since we set the memtable to around 2 GB (4 GB heap), we
never had to touch disk I/O.

Assuming our nodes have >32 GB RAM, a couple of questions regarding reads:

* To avoid disk I/O, the best option we thought of is to keep the data in
memory. Is it a good idea to set the memtable size to around 1/2 or 3/4 of
the heap? Obviously flushing will take a lot of time, but would that hurt the
node's performance big time?

* The Cassandra stress tool only reports average read latency. Is there a
way to figure out the max read latency for a batch of read operations?

* How big a row cache can one have? Given that Cassandra provides off-heap
row caching, on a machine with >32 GB RAM would it be wise to have a >10 GB
row cache with an 8 GB Java heap? And how big should the corresponding key
cache be then?

Any response is appreciated.

~Amulya

RE: Using Cassandra for read operations

Posted by Viktor Jevdokimov <Vi...@adform.com>.
Bill de hÓra has already answered; I'd like to add:

To achieve ~4ms reads (from the client's standpoint):
1. You can't use multi-slice, since different keys may live on different nodes, which requires internode communication. Design your data and your reads to use one key/row per request.
2. Use ConsistencyLevel.ONE to avoid waiting for other nodes.
3. Use a token-aware client that selects the endpoint by token (key) and sends each request to the appropriate node: Astyanax (Java), or write such a client yourself. A sketch follows this list.
4. Turn off the dynamic snitch. While the coordinator node may read locally, the dynamic snitch may redirect the read to another replica.
5. Use SSDs to soften the re-caching penalty when SSTables are compacted.
6. If you do writes, the remaining issue is GC. Unless you're on the Azul Zing JVM (which I can't confirm to be better than Oracle HotSpot or JRockit; both have GC issues), you can't tune the JVM to keep young-generation GC pauses as low as you need; you will be trading pause frequency against pause duration.
So if you can afford Zing, also check out Aerospike (ex-CitrusLeaf), an alternative to Cassandra which is written in C and has no GC issues.
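
A minimal sketch of points 2 and 3, assuming Astyanax 1.x over the Thrift port; cluster, keyspace, column family and seed addresses below are placeholders, not real settings:

    import com.netflix.astyanax.AstyanaxContext;
    import com.netflix.astyanax.Keyspace;
    import com.netflix.astyanax.connectionpool.NodeDiscoveryType;
    import com.netflix.astyanax.connectionpool.impl.ConnectionPoolConfigurationImpl;
    import com.netflix.astyanax.connectionpool.impl.ConnectionPoolType;
    import com.netflix.astyanax.connectionpool.impl.CountingConnectionPoolMonitor;
    import com.netflix.astyanax.impl.AstyanaxConfigurationImpl;
    import com.netflix.astyanax.model.ColumnFamily;
    import com.netflix.astyanax.model.ColumnList;
    import com.netflix.astyanax.model.ConsistencyLevel;
    import com.netflix.astyanax.serializers.LongSerializer;
    import com.netflix.astyanax.serializers.StringSerializer;
    import com.netflix.astyanax.thrift.ThriftFamilyFactory;

    public class TokenAwareRead {
        // Placeholder column family: long row keys, string column names.
        private static final ColumnFamily<Long, String> CF =
                new ColumnFamily<Long, String>("MyCf",
                        LongSerializer.get(), StringSerializer.get());

        public static void main(String[] args) throws Exception {
            AstyanaxContext<Keyspace> context = new AstyanaxContext.Builder()
                .forCluster("MyCluster")
                .forKeyspace("MyKeyspace")
                .withAstyanaxConfiguration(new AstyanaxConfigurationImpl()
                    // Point 3: route each request to a replica of the key's token.
                    .setConnectionPoolType(ConnectionPoolType.TOKEN_AWARE)
                    .setDiscoveryType(NodeDiscoveryType.RING_DESCRIBE)
                    // Point 2: don't wait for more than one replica.
                    .setDefaultReadConsistencyLevel(ConsistencyLevel.CL_ONE))
                .withConnectionPoolConfiguration(
                    new ConnectionPoolConfigurationImpl("MyPool")
                        .setPort(9160)
                        .setSeeds("10.0.0.1:9160,10.0.0.2:9160"))
                .withConnectionPoolMonitor(new CountingConnectionPoolMonitor())
                .buildKeyspace(ThriftFamilyFactory.getInstance());
            context.start();
            Keyspace keyspace = context.getClient();

            // Point 1: a single-key read, so the (token-aware) coordinator
            // can answer from local data with no internode hop.
            ColumnList<String> row = keyspace.prepareQuery(CF)
                    .getKey(42L)
                    .execute()
                    .getResult();
            System.out.println("columns read: " + row.size());

            context.shutdown();
        }
    }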



Best regards / Pagarbiai

Viktor Jevdokimov
Senior Developer

Email: Viktor.Jevdokimov@adform.com
Phone: +370 5 212 3063
Fax: +370 5 261 0453

J. Jasinskio 16C,
LT-01112 Vilnius,
Lithuania




Re: Using Cassandra for read operations

Posted by Bill de hÓra <bi...@dehora.net>.
> To avoid disk I/O, the best option we thought of is to keep the data in
> memory. Is it a good idea to set the memtable size to around 1/2 or 3/4
> of the heap? Obviously flushing will take a lot of time, but would that
> hurt the node's performance big time?

Start with the defaults and test your workload. If memtables start flushing aggressively (because of write load or bad settings), that can trigger compaction work on disk, which might impair read I/O.
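
For reference, the global bound on memtable memory lives in cassandra.yaml in the 1.x line; the figures below are illustrative, not a recommendation:

    # Total space allowed for all memtables; if unset it defaults
    # to one third of the heap.
    memtable_total_space_in_mb: 2048
    # Threads that flush memtables to disk in parallel.
    memtable_flush_writers: 1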


> Is there a way to figure out the max read latency for a batch of read operations?

Use nodetool's histogram feature to get a sense of outlier latency.
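
For example (keyspace and column family names here are placeholders):

    nodetool -h 127.0.0.1 cfhistograms MyKeyspace MyCf

The read latency column is a histogram in microseconds, so you can read off the tail (e.g. the 99th percentile and the max bucket) rather than the average.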


> We just need one column family with a long key

Take time to tune your key caches and bloom filters. They use memory and have an impact on read performance.
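
For instance, the key cache is sized globally in cassandra.yaml and the bloom filter per table in CQL; the numbers below are illustrative only:

    # cassandra.yaml; the default is min(5% of heap, 100 MB)
    key_cache_size_in_mb: 512

    ALTER TABLE MyKeyspace.MyCf WITH bloom_filter_fp_chance = 0.01;

A lower bloom_filter_fp_chance spends more memory to avoid touching SSTables that don't contain the key.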


> Given that Cassandra provides off-heap row caching, on a
> machine with >32 GB RAM would it be wise to have a >10 GB row
> cache with an 8 GB Java heap?

If you use the off-heap cache, allow enough room for the filesystem's own cache, i.e. don't give all of RAM over to the off-heap cache. Also, the off-heap cache can slow you down with wide rows due to serialisation overhead, or with cache-invalidation thrashing if you are update heavy.

If you use the on-heap cache, pay close attention to GC cycles and memory stability: if you are cycling/evicting through the cache at a high rate, that can leave more garbage in memory than the collector can keep up with. And if the node doesn't have enough working memory after GC, it will _resize_ the key and row caches, which degrades read performance and with some workloads can turn into a vicious cycle.
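
The row cache knobs sit in cassandra.yaml in 1.1/1.2; sizes here are illustrative, and 0 disables the cache:

    row_cache_size_in_mb: 2048
    # SerializingCacheProvider stores rows off-heap;
    # ConcurrentLinkedHashCacheProvider keeps them on the heap.
    row_cache_provider: SerializingCacheProvider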


> For our SLAs, a read of at most 15-20 rows at once (using a multi slice)
> should not take more than 4 ms.

If you control your own hardware (and for this kind of latency demand you probably should/must), consider SSDs. You might want to carefully control background repair/compaction operations if predictable performance is your goal. You might want to avoid storing strings and use byte representations instead. If you have an application tier on the path, consider caching in that tier as well, to avoid the overhead of network calls and Thrift processing.
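
Compaction I/O, for example, can be throttled at runtime; the figure below is illustrative (MB/s), and the same knob exists as compaction_throughput_mb_per_sec in cassandra.yaml:

    nodetool setcompactionthroughput 8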

In a nutshell -

- Start with the defaults and tune based on small, discrete adjustments, leaving time to see the effect of each change. No one will know your workload better than you, and the questions you are asking are workload sensitive.

- Allow time for tuning, and spend time understanding the memory model and JVM GC.

- Be very careful with caches. Leave enough room for the OS's own disk cache.

- Get an SSD.


Bill

