Posted to user@cassandra.apache.org by Joseph Bowman <bo...@gmail.com> on 2009/12/11 18:27:56 UTC

Questions about optimizing interaction with Cassandra

Greetings,

Jsondra has turned from a quick hack into something I'm going to invest
some time in and get right. One of the things I'm looking at is that
Jsondra could also be used to optimize reads by acting as a cache.
Basically, the requirement at this point would be that Jsondra is the
only way applications talk to Cassandra. Something along the lines of

read request = check the memory cache; if the item exists, return it; else
query Cassandra; if it exists there, cache it and return it; else return
"does not exist"
write request = update the cached item or create a new one, then write to
Cassandra
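
A minimal sketch of the read and write paths described above, assuming a simple key/value interface. DictStore and CachingProxy are hypothetical names used only for illustration; DictStore is an in-memory stand-in for both memcached and Cassandra, not a real client for either.

```python
class DictStore:
    """Hypothetical in-memory stand-in for a key/value store.

    Used here in place of both memcached and Cassandra so the sketch
    runs without any external service.
    """

    def __init__(self):
        self.data = {}
        self.reads = 0  # counts get() calls, to show when the cache is hit

    def get(self, key):
        self.reads += 1
        return self.data.get(key)

    def set(self, key, value):
        self.data[key] = value


class CachingProxy:
    """Sketch of the proposed layer: all reads and writes go through it."""

    def __init__(self, cache, backend):
        self.cache = cache      # would be memcached
        self.backend = backend  # would be Cassandra

    def read(self, key):
        # Check the cache first; on a hit, return without touching the backend.
        value = self.cache.get(key)
        if value is not None:
            return value
        # On a miss, query the backend and cache the result if it exists.
        value = self.backend.get(key)
        if value is not None:
            self.cache.set(key, value)
        return value  # None means "does not exist"

    def write(self, key, value):
        # Update or create the cached item, then write to the backend.
        self.cache.set(key, value)
        self.backend.set(key, value)


cache, backend = DictStore(), DictStore()
proxy = CachingProxy(cache, backend)
proxy.write("user:1", "alice")
print(proxy.read("user:1"))
```

In this sketch a read after a write is served from the cache, so the backend sees no extra read for that key.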

Jsondra could be clustered by load balancing and using memcached for the
cache at this point.

The restriction is that if anything else writes to Cassandra directly,
there's no way to update the cache (unless that application interacts with
memcached as well).

Before I start working on something like this, I need to ask, is it even
necessary? Would it really provide a performance increase worth the added
complexity and dependence on things like memcached? I recall someone at Digg
gave a presentation where they found that Cassandra was fast even without
implementing a memcached layer. Does anyone on the list have any
suggestions/comments about this idea?

Re: Questions about optimizing interaction with Cassandra

Posted by Jonathan Ellis <jb...@gmail.com>.
On Fri, Dec 11, 2009 at 1:32 PM, gabriele renzi <rf...@gmail.com> wrote:
> FWIW, in our current application with 0.4 we implemented in-cassandra
> caching, where a query that reads many thousands of columns and
> takes some time is cached in a separate entry
> QueryCache[q]=serializedResult
> and we flush it on update.
> I don't see why that can't be done by cassandra itself (maybe it's
> already done in 0.5)

No, but it is worth exploring.

-Jonathan

Re: Questions about optimizing interaction with Cassandra

Posted by gabriele renzi <rf...@gmail.com>.
2009/12/11 Joseph Bowman <bo...@gmail.com>:

> Before I start working on something like this, I need to ask, is it even necessary? Would it really provide a performance increase worth the added complexity and dependence on things like memcached? I recall someone at Digg gave a presentation where they found that Cassandra was fast even without implementing a memcached layer. Does anyone on the list have any suggestions/comments about this idea?

It seems Cassandra is pretty fast when reading a small number of
columns, but can be slow for a large number.

Still, you can cache at the Cassandra level. If you have a spare
gigabyte of RAM to host a memcached instance, you can use it for that
instead :)
FWIW, in our current application with 0.4 we implemented in-Cassandra
caching, where the result of a query that reads many thousands of columns
and takes some time is cached in a separate entry
QueryCache[q]=serializedResult
and we flush it on update.
I don't see why that can't be done by Cassandra itself (maybe it's
already done in 0.5), although we will probably keep it all the same:
denormalizing denormalized stuff is fun.
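
A rough sketch of the QueryCache[q]=serializedResult idea, assuming the query is "read every column of one row" and that JSON is the serialization. QueryCachedStore and its plain dicts are hypothetical stand-ins for the column families, not a Cassandra client API.

```python
import json


class QueryCachedStore:
    """Hypothetical sketch: cache a serialized query result, flush on update."""

    def __init__(self):
        self.rows = {}         # stand-in for the data: row key -> {column: value}
        self.query_cache = {}  # stand-in for QueryCache: query -> serialized result
        self.slow_reads = 0    # counts how often the full (slow) read runs

    def read_row(self, row_key):
        # Serve the serialized result if this query has been cached.
        cached = self.query_cache.get(row_key)
        if cached is not None:
            return json.loads(cached)
        # Otherwise do the expensive read of all columns and cache the result.
        self.slow_reads += 1
        result = dict(self.rows.get(row_key, {}))
        self.query_cache[row_key] = json.dumps(result)
        return result

    def update(self, row_key, column, value):
        # Write the column, then flush the cached result for that row.
        self.rows.setdefault(row_key, {})[column] = value
        self.query_cache.pop(row_key, None)


store = QueryCachedStore()
for i in range(1000):
    store.update("timeline:42", f"col{i}", i)
store.read_row("timeline:42")  # slow path, fills the cache
store.read_row("timeline:42")  # served from the cached entry
print(store.slow_reads)
```

Flushing on every update keeps the cached entry from going stale, at the cost of one slow read after each write to that row.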


-- 
blog en: http://www.riffraff.info
blog it: http://riffraff.blogsome.com