Posted to user@cassandra.apache.org by Jon Travis <jt...@p00p.org> on 2014/07/25 19:46:17 UTC

Replication factor 2 with immutable data

I have a couple questions regarding the availability of my data in a RF=2
scenario.

- The setup -
I am currently storing immutable data in a CF with RF=2 and
read_repair_chance = 0.0.  There is a lot of data, so bumping up to RF=3
would increase my storage costs quite dramatically.  For the most part, I
am only adding data to this CF (plus some nightly deletes).  Writes and
reads are both done at CL=ONE.

- The questions -
When I write a value, it is written to replicas A and B.  If B is down,
then A will still acknowledge the write and the write will succeed.  Great.
Now suppose B comes back up and, before B gets the handoff of the data
from A, a client attempts to read the recently-written data.  If that read
gets routed to replica B, the data will not exist there, and the read will
fail, correct?
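
A minimal sketch of that race, using a toy in-memory model rather than
Cassandra or its driver (the Replica class and both helper functions are
hypothetical names made up for illustration).  It suggests that the read
routed to B does not so much fail as come back empty:

```python
# Toy in-memory model of two replicas; not the Cassandra driver API.
class Replica:
    def __init__(self, name):
        self.name = name
        self.up = True
        self.rows = {}

def write_one(replicas, key, value):
    """Write at CL=ONE: succeed if at least one live replica stores the row."""
    acks = 0
    for replica in replicas:
        if replica.up:
            replica.rows[key] = value
            acks += 1
    return acks >= 1

def read_one(replica, key):
    """Read at CL=ONE from a single replica; None means 'no such row'."""
    return replica.rows.get(key)

a, b = Replica("A"), Replica("B")
b.up = False                          # B is down while the write happens
write_ok = write_one([a, b], "k1", "v1")
b.up = True                           # B is back, but the hint is not replayed yet
read_from_a = read_one(a, "k1")
read_from_b = read_one(b, "k1")
print(write_ok, read_from_a, read_from_b)   # True v1 None
```

The model leaves out hint replay entirely; it only covers the window
between B coming back up and B receiving the missed write.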

But what I really want is for the read to hit both A and B, and whichever
one returns the data then great -- I only need 1 of them to actually
acknowledge having it.

My questions are:
  - Is it possible to achieve consistency with this approach?  Even if I
try at CL=TWO and back off to CL=ONE in a failure condition, there still
seems to be a race where I could hit the replica without the data.
  - Does a replica 'not having the data' count towards the CL requirements?
 I.e. replica B responds, "Nope, don't have it" -- I don't want the CL to
be satisfied, because the data is either there or it is not.  I have not
done updates to the data.
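
To make both questions concrete, here is a rough sketch of a coordinator
in a toy model (read, read_two_then_one and Unavailable are hypothetical
names, not driver API).  It assumes, as I understand Cassandra to behave,
that an empty response from a live replica counts toward the CL just like
a response that carries data:

```python
# Toy model of a coordinator; names here are hypothetical, not driver API.
class Unavailable(Exception):
    """Raised when fewer replicas are alive than the consistency level needs."""

def read(replicas, key, cl):
    """Read `key` from the first `cl` live replicas.

    `replicas` is a list of (is_up, rows) pairs in routing order.
    """
    live = [rows for is_up, rows in replicas if is_up]
    if len(live) < cl:
        raise Unavailable(f"need {cl} replicas, only {len(live)} alive")
    responses = [rows.get(key) for rows in live[:cl]]
    # Every response counted toward cl, including the empty (None) ones.
    found = [value for value in responses if value is not None]
    return found[0] if found else None

def read_two_then_one(replicas, key):
    """Try CL=TWO, fall back to CL=ONE when a replica is down."""
    try:
        return read(replicas, key, 2)
    except Unavailable:
        return read(replicas, key, 1)

has_row = {"k1": "v1"}
missing_row = {}

# Both replicas up, and the one missing the row is routed first.
at_one = read([(True, missing_row), (True, has_row)], "k1", 1)
at_two = read([(True, missing_row), (True, has_row)], "k1", 2)
# The only replica holding the row is down: the fallback is answered alone.
fallback = read_two_then_one([(True, missing_row), (False, has_row)], "k1")
print(at_one, at_two, fallback)   # None v1 None
```

In this model the CL=TWO read finds the row because one of the two
responses carries it, while the fallback to CL=ONE reproduces the race:
the only replica left to answer is the one without the data.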

This feels a bit quorum-ish, where a quorum under RF=3 will ask 3 nodes for
the data and return success when 2 have consistent results.

It feels strange to be able to write data at RF=2, then with only 1 node
being down, not be able to read it ...

Thanks,

-- Jon

Re: Replication factor 2 with immutable data

Posted by Robert Coli <rc...@eventbrite.com>.
On Fri, Jul 25, 2014 at 10:46 AM, Jon Travis <jt...@p00p.org> wrote:

> I have a couple questions regarding the availability of my data in a RF=2
> scenario.
>

You have just explained why consistency does not work with a Replication
Factor of less than 3 and a Consistency Level of less than QUORUM.

Basically, with RF=2, QUORUM is ALL (both replicas), and you can't stay
available at ALL, because any single replica going down fails the request.

=Rob
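
The arithmetic behind that reply can be sketched as follows.  QUORUM is
floor(RF / 2) + 1 replicas, and a read is guaranteed to overlap an earlier
write when read CL + write CL > RF.  The function names are made up for
illustration and are not part of any Cassandra API:

```python
def quorum(rf):
    """Replicas that must respond for QUORUM at replication factor rf."""
    return rf // 2 + 1

def tolerated_down(rf):
    """Replicas that can be down while QUORUM still succeeds."""
    return rf - quorum(rf)

def reads_see_writes(read_cl, write_cl, rf):
    """True when every read replica set overlaps every write replica set."""
    return read_cl + write_cl > rf

# (rf, quorum size, replicas that may be down)
table = [(rf, quorum(rf), tolerated_down(rf)) for rf in (2, 3)]
print(table)                        # [(2, 2, 0), (3, 2, 1)]
print(reads_see_writes(1, 1, 2))    # ONE/ONE at RF=2: False
print(reads_see_writes(2, 2, 3))    # QUORUM/QUORUM at RF=3: True
```

By the same rule, writing at ONE and reading at ALL with RF=2 does
overlap (1 + 2 > 2), but such reads cannot succeed while either replica
is down.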