Posted to user@cassandra.apache.org by Dan Hendry <da...@gmail.com> on 2010/12/04 04:19:52 UTC

Confused about consistency

I am seeing fairly strange behavior in my Cassandra cluster.

Setup
 - 3 nodes (let's call them nodes 1, 2, and 3)
 - RF=2
 - A set of servers (producers) which write data to the cluster at
consistency level ONE
 - A set of servers (consumers/processors) which read data from the cluster
at consistency level ALL
 - Cassandra 0.7 (recent out of the svn branch, post beta 3)
 - Clients use the Pelops library

Situation:
 - Everything is humming along nicely
 - A Cassandra node (say 3) goes down (even with 24 GB of RAM, OOM errors
are the bane of my existence)
 - Producers continue to happily write to the cluster but consumers start
complaining by throwing TimedOutExceptions and UnavailableExceptions.
 - I stagger out of bed in the middle of the night and restart Cassandra on
node 3.
 - The consumers stop complaining and get back to business but generate
garbage data for the period node 3 was down. It's almost like half the data
is missing half the time. (Again, I am reading at consistency level ALL).
 - I force the consumers to reprocess data for the period node 3 was down.
They generate accurate output which is different from the first time round.

To be explicit, what seems to be happening is that the first read at
consistency level ALL gives "A,C,E" (for example) and the second read at
consistency level ALL gives "A,B,C,D,E". Is this a Cassandra bug? Is my
knowledge of consistency
levels flawed? My understanding is that you could achieve strongly
consistent behavior by writing at ONE and reading at ALL.

After this experience, my theory (uneducated, untested, and
under-researched) is that "strong consistency" applies only to column
values, not the set of columns (or super-columns in this case) which make up
a row. Any thoughts?
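
For readers following along, the expectation above rests on the usual overlap rule: a read is guaranteed to touch at least one replica that holds the latest write when R + W > N. The following is only a minimal sketch of that arithmetic; the function name and constants are hypothetical and are not part of Cassandra or Pelops.

```python
def reads_overlap_writes(replication_factor: int,
                         write_replicas: int,
                         read_replicas: int) -> bool:
    """Return True when every read set must intersect every write set.

    This is the R + W > N rule, with N being the replication factor.
    """
    return read_replicas + write_replicas > replication_factor


RF = 2      # replication factor from the setup above
ONE = 1     # replicas that must acknowledge a write at ONE
ALL = RF    # replicas that must answer a read at ALL

# Write at ONE, read at ALL: 2 + 1 > 2, so the sets must overlap.
print(reads_overlap_writes(RF, ONE, ALL))

# Write at ONE, read at ONE: 1 + 1 > 2 is false, so stale reads are possible.
print(reads_overlap_writes(RF, ONE, ONE))
```

Under this rule, writing at ONE and reading at ALL should be strongly consistent at RF=2, which is why the "A,C,E" result on a first read is surprising.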

Re: Confused about consistency

Posted by Jonathan Ellis <jb...@gmail.com>.

You're right, they should be the same.

Next time this happens, set the log level to debug (from
StorageService JMX) on the surviving nodes and let a couple of queries
fail before restarting the 3rd (and setting the level back to info).

On Sat, Dec 4, 2010 at 12:01 AM, Dan Hendry <da...@gmail.com> wrote:
> Doesn't consistency level ALL = QUORUM at RF=2?
>
> I have not had a chance to test your fix but I don't THINK this is the
> issue. If it is the issue, how do consistency levels ALL and QUORUM differ
> at this replication factor?



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com

Re: Confused about consistency

Posted by Dan Hendry <da...@gmail.com>.

Doesn't consistency level ALL = QUORUM at RF=2?

I have not had a chance to test your fix but I don't THINK this is the
issue. If it is the issue, how do consistency levels ALL and QUORUM differ
at this replication factor?
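
To make the question concrete: QUORUM is computed as floor(RF / 2) + 1, so it coincides with ALL at RF=2 and only diverges at higher replication factors. A small sketch of that calculation follows (the helper name is hypothetical, chosen for illustration):

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that make up a quorum: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


# Compare QUORUM with ALL (which is simply RF) for a few replication factors.
for rf in (1, 2, 3, 5):
    same = "same" if quorum(rf) == rf else "different"
    print(f"RF={rf}: QUORUM={quorum(rf)}, ALL={rf} ({same})")
```

At RF=2 both levels require 2 replicas; at RF=3 QUORUM needs 2 while ALL needs 3.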

On Sat, Dec 4, 2010 at 12:03 AM, Jonathan Ellis <jb...@gmail.com> wrote:

> I think you are running into
> https://issues.apache.org/jira/browse/CASSANDRA-1316, where, when an
> inconsistency was discovered on a QUORUM/ALL read, the repair was always
> performed at QUORUM instead of the original CL.  Thus, reading at ALL, you
> would see the correct answer on the 2nd read but you weren't
> guaranteed to see it on the first.
>
> This was fixed in 0.6.4 but apparently I botched the merge to the 0.7
> branch.  I corrected that just now, so when you update, you should be
> good to go.

Re: Confused about consistency

Posted by Jonathan Ellis <jb...@gmail.com>.

I think you are running into
https://issues.apache.org/jira/browse/CASSANDRA-1316, where, when an
inconsistency was discovered on a QUORUM/ALL read, the repair was always
performed at QUORUM instead of the original CL.  Thus, reading at ALL, you
would see the correct answer on the 2nd read but you weren't
guaranteed to see it on the first.

This was fixed in 0.6.4 but apparently I botched the merge to the 0.7
branch.  I corrected that just now, so when you update, you should be
good to go.
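
As a rough illustration of what a correct read at ALL is expected to return, the sketch below merges the columns from every replica and keeps the newest version of each. It is a toy model only: the `reconcile` function, the row layout, and the timestamps are all hypothetical, and this is not Cassandra's actual read path or repair code.

```python
from typing import Dict, Tuple

# A row is modeled as {column_name: (value, timestamp)}.
Row = Dict[str, Tuple[str, int]]


def reconcile(*replica_rows: Row) -> Row:
    """Merge rows from several replicas, keeping the newest version of each column."""
    merged: Row = {}
    for row in replica_rows:
        for name, (value, timestamp) in row.items():
            # Keep the column if it is new or strictly newer than what we hold.
            if name not in merged or timestamp > merged[name][1]:
                merged[name] = (value, timestamp)
    return merged


# Hypothetical state after an outage: one replica missed columns B and D.
up_to_date_replica: Row = {
    "A": ("a", 1), "B": ("b", 2), "C": ("c", 3), "D": ("d", 4), "E": ("e", 5),
}
stale_replica: Row = {"A": ("a", 1), "C": ("c", 3), "E": ("e", 5)}

merged = reconcile(stale_replica, up_to_date_replica)
print(sorted(merged))  # ['A', 'B', 'C', 'D', 'E']
```

In this model, a read that consults both replicas and resolves them before answering returns all five columns on the first attempt, which is the behavior the fix is meant to restore.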


Re: Confused about consistency

Posted by Peter Schuller <pe...@infidyne.com>.

>  - A Cassandra node (say 3) goes down (even with 24 GB of RAM, OOM errors
> are the bane of my existence)

Following up on this bit: OOM should not be the status quo. Have you
tweaked the JVM heap size to reflect your memtable sizes, etc.?

http://wiki.apache.org/cassandra/MemtableThresholds

-- 
/ Peter Schuller