Posted to user@cassandra.apache.org by re...@voodoowarez.com on 2013/06/30 10:48:47 UTC

Patterns for enabling Compute apps which only request Local Node's Data

Hello Cassandra-user ml, how is everyone?

Question; if we're co-locating our Cassandra and our compute application on the same nodes, are there any in-use
patterns in Cassandra user (or Cassandra dev) applications for having the compute application only pull data off the
localhost Cassandra process? If we have the ability to manage where we do compute, what options are there for keeping
compute happening on local data as much as possible? 

In the best case I can imagine:

If I have a KeyRange for a ColumnParent, there would be some way to know I'm going to fulfil that scan while only having
each node pull its own local data (achieving read consistency via digest).

If it helps, I'd be glad to entertain options that only worked when using virtual nodes.

Regards,
rektide

Re: Patterns for enabling Compute apps which only request Local Node's Data

Posted by Robert Coli <rc...@eventbrite.com>.

On Sun, Jun 30, 2013 at 1:48 AM, <re...@voodoowarez.com> wrote:

> Question; if we're co-locating our Cassandra and our compute application
> on the same nodes, are there any in-use
> patterns in Cassandra user (or Cassandra dev) applications for having the
> compute application only pull data off the
> localhost Cassandra process? If we have the ability to manage where we do
> compute, what options are there for keeping
> compute happening on local data as much as possible?
>

Cassandra's Hadoop support provides Hadoop-style data locality. One presumes
you could make use of this functionality even if you were not actually
running Hadoop map/reduce as the compute application.

http://wiki.apache.org/cassandra/HadoopSupport#ClusterConfig

=Rob
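
For illustration only, the locality idea discussed in this thread comes down to
knowing which token ranges each node owns, then running the work for a range on
a node that owns it. The sketch below is a toy model, not Cassandra's API: the
ring contents, the node addresses, and the helper names primary_ranges and
owner_of are all hypothetical, and it ignores replication factor and the
partitioner. It assumes the usual ring convention that a node holding token T
is the primary owner of the range (previous token, T].

```python
from bisect import bisect_left

# Hypothetical toy ring of (token, node) pairs; a real application would
# obtain this from the cluster rather than hard-code it. With virtual
# nodes, the same node would simply appear under several tokens.
RING = [(0, "10.0.0.1"), (100, "10.0.0.2"), (200, "10.0.0.3")]


def primary_ranges(ring, node):
    """Return the (start, end] token ranges whose primary owner is `node`."""
    ring = sorted(ring)
    ranges = []
    for i, (token, owner) in enumerate(ring):
        if owner == node:
            # Index -1 wraps to the last token, giving the wrap-around range.
            previous_token = ring[i - 1][0]
            ranges.append((previous_token, token))
    return ranges


def owner_of(ring, token):
    """Return the node whose token is the first one >= `token`, wrapping."""
    ring = sorted(ring)
    tokens = [t for t, _ in ring]
    index = bisect_left(tokens, token) % len(ring)
    return ring[index][1]
```

A compute process on 10.0.0.2 would call primary_ranges(RING, "10.0.0.2") and
restrict its scans to the ranges returned, leaving the other ranges to the
processes co-located with their owners.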