Posted to user@cassandra.apache.org by Chris Goffinet <go...@digg.com> on 2009/05/18 19:24:38 UTC

Node Recovery

Scenario: I set up a 2-node cluster with a replication factor of 2 and
inserted a new key (1) into a table. It's replicated to both nodes. I
shut down node (2), deleted all of its data, then brought it back up. I
noticed that the first time I request that key from that node, it
returns an empty result (I was using get_slice), and then the node
pulls the data from the other node. On the next request to that node,
the data is there. How does one really know whether the data just isn't
there yet (so I should retry) or was never there to begin with?

---
Chris Goffinet
goffinet@digg.com






Re: Node Recovery

Posted by Jonathan Ellis <jb...@gmail.com>.
That's the price you pay for (a) eventual consistency in general and
(b) doing read repair in the background specifically.  Cassandra also
has functionality (called "strong read") to do a quorum read in the
foreground and repair if necessary, but that is not exposed in Thrift
yet.  Even with that, there are scenarios where you could get back
"no data" for a write that has been acked.  The only way to avoid it
entirely is to require acks from all replicas on every write and to
check all replicas on every read, which (in a large cluster) is
going to hurt availability.  Most apps are OK
trading off some consistency for availability.

-Jonathan
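
To make the difference concrete, here is a toy, in-memory sketch of the two
read paths described above. It is not Cassandra code: the names Replica,
weak_read, quorum_read and repair are hypothetical, each replica is just a
dict of (timestamp, value) pairs, and the "background" repair runs inline
right after the answer has been chosen. It replays the 2-node scenario from
the original question.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Versioned = Tuple[int, str]  # (timestamp, value); hypothetical storage format


@dataclass
class Replica:
    name: str
    data: Dict[str, Versioned] = field(default_factory=dict)


def newest(results: List[Optional[Versioned]]) -> Optional[Versioned]:
    """Pick the result with the highest timestamp, ignoring misses."""
    found = [r for r in results if r is not None]
    return max(found, key=lambda v: v[0]) if found else None


def repair(replicas: List[Replica], key: str, winner: Optional[Versioned]) -> None:
    """Copy the newest version to every replica that is missing it or is stale."""
    if winner is None:
        return
    for replica in replicas:
        current = replica.data.get(key)
        if current is None or current[0] < winner[0]:
            replica.data[key] = winner


def weak_read(replicas: List[Replica], key: str, coordinator: Replica) -> Optional[Versioned]:
    """Answer from the coordinator's local copy; repair only afterwards."""
    answer = coordinator.data.get(key)  # may be empty on a wiped node
    repair(replicas, key, newest([r.data.get(key) for r in replicas]))
    return answer


def quorum_read(replicas: List[Replica], key: str) -> Optional[Versioned]:
    """Read a majority of replicas, repair them, then answer with the newest."""
    quorum = len(replicas) // 2 + 1
    contacted = replicas[:quorum]
    winner = newest([r.data.get(key) for r in contacted])
    repair(contacted, key, winner)
    return winner


def demo() -> Tuple[Optional[Versioned], Optional[Versioned], Optional[Versioned]]:
    node1, node2 = Replica("node1"), Replica("node2")
    cluster = [node1, node2]
    for node in cluster:  # the write reaches both replicas
        node.data["1"] = (1, "hello")

    node2.data.clear()  # node 2 is wiped and brought back up
    first = weak_read(cluster, "1", coordinator=node2)   # empty result
    second = weak_read(cluster, "1", coordinator=node2)  # repaired by now

    node2.data.clear()  # wipe again, this time read with a quorum
    strong = quorum_read(cluster, "1")
    return first, second, strong


if __name__ == "__main__":
    print(demo())
```

In this sketch the first weak read from the wiped node comes back empty and
the second one finds the repaired data, while the quorum read returns the
value on the first try because, with 2 replicas, a majority means both nodes.
It does not model the scenarios mentioned above in which even a quorum read
can miss an acked write.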
