Posted to user@cassandra.apache.org by Rajkumar Gupta <ra...@gmail.com> on 2011/01/03 13:21:24 UTC

meaning of eventual consistency in Cassandra ?

What is the meaning of eventual consistency in Cassandra when nodes in
a single cluster do not maintain copies of the same data, but rather the
data is distributed among the nodes? Since a single piece of data is
recorded at a single place (node), why wouldn't Cassandra return the
most recent value from that single place of record? How do multiple
copies arise in this situation? Where are the replicas in a Cassandra
cluster?

Thanks

Re: meaning of eventual consistency in Cassandra ?

Posted by Peter Schuller <pe...@infidyne.com>.
> This means that nodes in a Cassandra cluster contain data that has been
> sharded onto several nodes, and this sharded data may be
> replicated further across several nodes? So Cassandra storage
> uses both sharding and replication for load balancing? Is
> this correct?

Yes, sort of (though depending on your definition of sharding it might
be slightly misleading). In short, replicas (copies of data) are placed
on a number of nodes (determined by the replication factor, or RF)
that participate in a ring, where each node has a token associated with
it. The row key (see http://wiki.apache.org/cassandra/DataModel)
determines, along with the so-called 'replication strategy', which
nodes in the ring should have replicas of the data.
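
As a rough illustration of the ring placement described above, here is a
minimal Python sketch. It assumes the simplest possible placement rule
(start at the first node whose token is at or after the key's token, then
walk clockwise until RF nodes are collected), MD5 as the key hash, and
made-up node names and tokens. The names `Ring` and `token_for` are
hypothetical and are not Cassandra's API; real replication strategies may
also take racks and datacenters into account.

```python
import bisect
import hashlib
from typing import Dict, List


def token_for(row_key: str) -> int:
    """Map a row key to a ring position (MD5 is an illustrative choice)."""
    return int(hashlib.md5(row_key.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Toy token ring: each node owns the range ending at its token."""

    def __init__(self, node_tokens: Dict[str, int], rf: int) -> None:
        # Sort nodes by token so the ring can be walked in order.
        self.entries = sorted((tok, node) for node, tok in node_tokens.items())
        self.tokens = [tok for tok, _ in self.entries]
        # RF cannot exceed the number of nodes in this toy model.
        self.rf = min(rf, len(self.entries))

    def replicas_for_token(self, key_token: int) -> List[str]:
        # First node whose token is >= the key's token, wrapping around.
        start = bisect.bisect_left(self.tokens, key_token) % len(self.tokens)
        # Take RF consecutive nodes clockwise from that position.
        return [
            self.entries[(start + i) % len(self.entries)][1]
            for i in range(self.rf)
        ]

    def replicas_for_key(self, row_key: str) -> List[str]:
        return self.replicas_for_token(token_for(row_key))


# Hypothetical four-node ring with hand-picked tokens and RF = 3.
ring = Ring({"A": 0, "B": 100, "C": 200, "D": 300}, rf=3)
print(ring.replicas_for_token(150))  # falls in C's range
print(ring.replicas_for_token(350))  # past the last token, wraps to A
```

So every row is both partitioned (its token picks a starting node) and
replicated (the next RF - 1 nodes on the ring hold copies as well), which
is where multiple copies of the same data come from.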

I just realized that I couldn't find a wiki page or section of the
Riptano docs that explained the DHT ring and its implications from a
top-down perspective. Am I missing something or is this something that
should be written, anyone?

Probably the best resource on this that I have found is
http://www.riptano.com/docs/0.6/operations/clustering

If you are interested in the reasoning behind it, I greatly recommend
the Amazon Dynamo paper:

   http://s3.amazonaws.com/AllThingsDistributed/sosp/amazon-dynamo-sosp2007.pdf

Cassandra does not exactly implement what is described, but it is
strongly inspired by it.

-- 
/ Peter Schuller

Re: meaning of eventual consistency in Cassandra ?

Posted by Rajkumar Gupta <ra...@gmail.com>.
This means that nodes in a Cassandra cluster contain data that has been
sharded onto several nodes, and this sharded data may be
replicated further across several nodes? So Cassandra storage
uses both sharding and replication for load balancing? Is
this correct?

On Mon, Jan 3, 2011 at 11:28 PM, Peter Schuller
<pe...@infidyne.com> wrote:
>> What is the meaning of eventual consistency in Cassandra when nodes in
>> a single cluster do not maintain copies of the same data, but rather the
>> data is distributed among the nodes? Since a single piece of data is
>> recorded at a single place (node), why wouldn't Cassandra return the
>> most recent value from that single place of record? How do multiple
>> copies arise in this situation? Where are the replicas in a Cassandra
>> cluster?
>
> There is normally not just a single copy. If you run with RF
> (replication factor) = 1, you have a single copy. But this is only
> useful if you don't care about redundancy at all.
>
> With multiple replicas, the consistency you get depends on what you're
> doing, for example on your choice of consistency level (see the levels
> listed on http://wiki.apache.org/cassandra/API).
>
> However, note that even with RF=1, there are some things that are still
> only "eventual". For example, if you submit a batch mutation,
> concurrent readers may see a partially applied batch mutation for a
> given row even though it is only being written to a single node.
>
> --
> / Peter Schuller
>

Re: meaning of eventual consistency in Cassandra ?

Posted by Peter Schuller <pe...@infidyne.com>.
> What is the meaning of eventual consistency in Cassandra when nodes in
> a single cluster do not maintain copies of the same data, but rather the
> data is distributed among the nodes? Since a single piece of data is
> recorded at a single place (node), why wouldn't Cassandra return the
> most recent value from that single place of record? How do multiple
> copies arise in this situation? Where are the replicas in a Cassandra
> cluster?

There is normally not just a single copy. If you run with RF
(replication factor) = 1, you have a single copy. But this is only
useful if you don't care about redundancy at all.

With multiple replicas, the consistency you get depends on what you're
doing, for example on your choice of consistency level (see the levels
listed on http://wiki.apache.org/cassandra/API).
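
To make the effect of the consistency level a bit more concrete, the
following Python sketch checks when a read is guaranteed to touch at least
one replica that acknowledged the latest write. It relies on the usual
overlap rule (replicas written + replicas read > RF), which is an
assumption brought in for illustration rather than something spelled out
in this thread. The function names are hypothetical, and the model ignores
node failures, hinted handoff and read repair.

```python
from itertools import combinations


def quorum(rf: int) -> int:
    """Smallest majority of rf replicas."""
    return rf // 2 + 1


def overlap_guaranteed(rf: int, w: int, r: int) -> bool:
    """True if any w written replicas and any r read replicas must intersect."""
    return w + r > rf


def overlap_guaranteed_bruteforce(rf: int, w: int, r: int) -> bool:
    """Same check by enumerating every possible write set and read set."""
    replicas = range(rf)
    return all(
        set(write_set) & set(read_set)  # empty intersection is falsy
        for write_set in combinations(replicas, w)
        for read_set in combinations(replicas, r)
    )


rf = 3
q = quorum(rf)
print(overlap_guaranteed(rf, q, q))  # write and read at quorum
print(overlap_guaranteed(rf, 1, 1))  # write and read a single replica
```

With RF=3, writing and reading at quorum (2 + 2 > 3) always overlaps,
while writing to one replica and reading from one replica (1 + 1 is not
greater than 3) may hit a replica that has not yet seen the write. That
window is what "eventual" refers to.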

However, note that even with RF=1, there are some things that are still
only "eventual". For example, if you submit a batch mutation,
concurrent readers may see a partially applied batch mutation for a
given row even though it is only being written to a single node.

-- 
/ Peter Schuller