Posted to user@cassandra.apache.org by E S <tr...@yahoo.com> on 2010/11/17 03:31:00 UTC

Smart Client Routing

I am considering building a system as follows:

1.  Data stored in Cassandra
2.  Webservice cluster (stateless) will pull data from cassandra and do business 
operations plus security enforcement
3.  Clients will hit the webservice cluster

I'm trying to maintain a low read latency and am worried about the number of 
hops.  The client will hit the webservice.  The webservice will hit a random 
node in the Cassandra cluster.  The Cassandra cluster will then route to the 
appropriate node and the data will flow all the way back.

How many of these hops can I remove?  I would bundle the Cassandra and 
webservice processes onto each box.  If I route the webservice to always go to 
the local node, I'll remove one hop.  Is it possible to optimize this further so 
that the client can use the Cassandra routing logic to go to the webservice that 
also houses a Cassandra node that contains the data?  In this case, there would 
only be one hop, and if the data is used frequently, it will likely reside in 
memory without requiring a separate caching layer.  This is an internal 
webservice, so I would be ok with a library on the client side to help with the 
routing.
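
To make the idea concrete, here is a minimal sketch of what such a client-side 
routing library might do.  It is an illustration only, not an existing Cassandra 
API.  It assumes the cluster uses RandomPartitioner (token = absolute value of 
the MD5 digest of the key read as a signed integer) and that the token-to-host 
map has already been obtained somehow, for example through the Thrift 
describe_ring call, which is not shown.  The names token_for, Ring, 
owner_of_token and owner_of_key are hypothetical, and replication is ignored: 
only the primary replica is returned.

```python
import bisect
import hashlib


def token_for(key: bytes) -> int:
    """Token for a row key, assuming RandomPartitioner-style MD5 hashing."""
    digest = hashlib.md5(key).digest()
    # Interpret the 16-byte digest as a signed integer and take its absolute value.
    return abs(int.from_bytes(digest, "big", signed=True))


class Ring:
    """Hypothetical client-side copy of the token ring (token -> host)."""

    def __init__(self, token_to_host):
        self._tokens = sorted(token_to_host)
        self._hosts = [token_to_host[t] for t in self._tokens]

    def owner_of_token(self, token: int) -> str:
        # A node owns the range (previous token, its own token], so pick the
        # first node whose token is >= the key's token, wrapping past the end.
        i = bisect.bisect_left(self._tokens, token)
        return self._hosts[i % len(self._hosts)]

    def owner_of_key(self, key: bytes) -> str:
        return self.owner_of_token(token_for(key))


if __name__ == "__main__":
    # Made-up tokens and addresses, for illustration only.
    ring = Ring({
        0: "10.0.0.1",
        2 ** 127 // 3: "10.0.0.2",
        2 * (2 ** 127 // 3): "10.0.0.3",
    })
    print(ring.owner_of_key(b"user42"))
```

The client would then send its request to the webservice running on the host 
returned by owner_of_key.  The ring copy goes stale whenever nodes join, leave 
or move, so a real library would need to refresh it.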

Is any of this possible?  I was looking at the cassandra apis and couldn't 
figure out a way.

Thanks for any help!



      

Re: Smart Client Routing

Posted by Jonathan Ellis <jb...@gmail.com>.
This is what the StorageProxy API does.  There is an example in
contrib/client_only.  There are some fairly strong limitations:

 - Java-only
 - No compatibility guarantees from version to version
 - You need a separate IP for each client

On Tue, Nov 16, 2010 at 9:17 PM, Aaron Morton <aa...@thelastpickle.com> wrote:
> No need to worry.
> I run REST requests through Varnish box > nginx / Tornado / Python box >
> Cassandra cluster and can get requests in and out of the stack in a couple
> of milliseconds. Using some old workstation HW and not paying much attention
> to tuning.
> Build it like a normal system and separate out the parts, if / when you have
> problems then you can look at tuning the cassandra cluster or other parts of
> the stack. There are normally a number of other issues to deal with before
> network IO.
> Hope that helps.
> Aaron
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com

Re: Smart Client Routing

Posted by Aaron Morton <aa...@thelastpickle.com>.
No need to worry. 

I run REST requests through Varnish box > nginx / Tornado / Python box > Cassandra cluster and can get requests in and out of the stack in a couple of milliseconds. That is using some old workstation HW and not paying much attention to tuning. 

Build it like a normal system and separate out the parts. If / when you have problems, you can then look at tuning the Cassandra cluster or other parts of the stack. There are normally a number of other issues to deal with before network IO.
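
One way to check whether the extra hops matter is to time requests through the whole stack before tuning anything. The following is a minimal sketch, not something from my setup: measure_latency_ms is a hypothetical helper, and `call` stands for whatever function issues one request against your webservice.

```python
import statistics
import time


def measure_latency_ms(call, runs=100):
    """Time `call` repeatedly and summarise the latencies in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        call()  # one end-to-end request; supplied by the caller
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    # Simple nearest-rank style index for the 99th percentile.
    p99_index = min(len(samples) - 1, int(0.99 * len(samples)))
    return {
        "median": statistics.median(samples),
        "p99": samples[p99_index],
        "max": samples[-1],
    }


if __name__ == "__main__":
    # Placeholder workload; replace the lambda with a real request.
    print(measure_latency_ms(lambda: time.sleep(0.001), runs=20))
```

Comparing the numbers for a request sent to the local node with one sent to a random node would show how much a hop actually costs in your environment.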

Hope that helps. 
Aaron

