Posted to user@cassandra.apache.org by Daniel Doubleday <da...@gmx.net> on 2010/12/14 20:55:37 UTC

org.apache.cassandra.service.ReadResponseResolver question

Hi

I'm sorry - don't want to be a pain in the neck with source questions. So please just ignore me if this is stupid:

Isn't org.apache.cassandra.service.ReadResponseResolver supposed to throw a DigestMismatchException if it receives a digest which does not match the digest of a read message?

If the messages contain multiple digest responses, it will drop all but one. So if any of the dropped digests is a mismatch to the version, that mismatch is simply ignored.
It can cope with multiple reads (versions) but not with multiple digests, and that's what it gets from quorum reads.

It might be an edge case, but I think that would break the quorum promise with rf > 3, because you could have 1 broken data message, 1 broken digest message and 2 good digest messages. If the 2 good messages were dropped, then the quorum read that should have triggered repair and conflict resolution would return old data.
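
To make the scenario above concrete, here is a minimal Java sketch. It is not the actual ReadResponseResolver code: the class, the method names, the DigestMismatch exception and the use of MD5 over a plain string are all assumptions made for illustration. It contrasts a resolver that compares the data reply against only one retained digest with one that compares it against every digest it received.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration only; not the Cassandra ReadResponseResolver. */
class DigestCheckSketch {

    /** Stand-in for a digest mismatch exception (hypothetical type). */
    static class DigestMismatch extends Exception {
        DigestMismatch(String message) { super(message); }
    }

    /** Digest of a row, modelled here as MD5 over a string (an assumption). */
    static byte[] digestOf(String rowData) {
        try {
            return MessageDigest.getInstance("MD5")
                    .digest(rowData.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // MD5 is mandatory on the Java platform
        }
    }

    /** Simplified model of the described problem: only one digest is retained and compared. */
    static String resolveCheckingOne(String dataReply, List<byte[]> digestReplies)
            throws DigestMismatch {
        byte[] retained = digestReplies.get(0); // every other digest is dropped
        if (!Arrays.equals(digestOf(dataReply), retained)) {
            throw new DigestMismatch("retained digest differs from the data reply");
        }
        return dataReply;
    }

    /** Compares the data reply against every digest reply. */
    static String resolveCheckingAll(String dataReply, List<byte[]> digestReplies)
            throws DigestMismatch {
        byte[] expected = digestOf(dataReply);
        for (byte[] digest : digestReplies) {
            if (!Arrays.equals(expected, digest)) {
                throw new DigestMismatch("a replica digest differs from the data reply");
            }
        }
        return dataReply;
    }

    public static void main(String[] args) {
        // 1 stale data reply, 1 stale digest, 2 good digests.
        List<byte[]> digests = Arrays.asList(digestOf("old"), digestOf("new"), digestOf("new"));
        try {
            System.out.println("one digest checked, returned: " + resolveCheckingOne("old", digests));
            resolveCheckingAll("old", digests);
        } catch (DigestMismatch e) {
            System.out.println("all digests checked: " + e.getMessage());
        }
    }
}
```

In this sketch the first variant returns the stale value because the one digest it keeps happens to be the stale one, while the second variant throws, which is the point at which a repair could be triggered.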

I just can't see what I'm not seeing here.

Cheers,
Daniel

Re: org.apache.cassandra.service.ReadResponseResolver question

Posted by Jonathan Ellis <jb...@gmail.com>.

That sounds like an interesting patch (as you point out we have had #982
open for a while), but I don't think we want to do something relatively
invasive on the 0.6 branch.  Let's target 0.7 or trunk.

On Wed, Dec 15, 2010 at 11:23 AM, Daniel Doubleday <daniel.doubleday@gmx.net> wrote:

>
> On Dec 14, 2010, at 9:20 PM, Jonathan Ellis wrote:
>
> Correct.  https://issues.apache.org/jira/browse/CASSANDRA-1830 is open to
> fix that.  If you'd like to review the patch there, that would be very
> helpful. :)
>
>
> That patch looks good to me :-) Should have checked jira first ...
>
> Speaking of which, https://issues.apache.org/jira/browse/CASSANDRA-982 is
> referenced there and seems to be pretty close to something I was trying to
> do the last 2 days.
>
> I'm not going to repeat the reasoning here, but it's in this thread:
> http://thread.gmane.org/gmane.comp.db.cassandra.user/10927/focus=10977
>
> Just wanted to mention that I implemented my idea and did some functional
> testing and load testing. Though certainly not enough ...
>
> But I was able to test
> - normal read behavior (all nodes up, two nodes up)
> - normal failure behavior (not enough nodes up)
> - behavior when the environment changes (affecting scores by controlling
> latency in the ReadVerbHandler, timeouts in the read path, exceptions in the
> read path of a selected node, nodes going down during a read)
>
> So far it looks pretty promising. Everything worked as expected.
>
> The only real drawback I found is when a read fails on a selected node
> (such as with an exception). As far as I understand it, there's no way to
> signal the readresolvehandler to return early in this case. Thus you have to
> wait for the timeout before the rest of the nodes are consulted. But I hope
> that failure detection + scores will be good enough to prevent this from
> happening too often.
>
> I did some load testing and compared with vanilla Cassandra. The test models
> one of the use cases we have in production: a chat app, so it writes and
> reads messages and offline notifications. It's of limited use though, since
> I was not able to reproduce our IO overload yet.
>
> But to give a first impression: in this rather CPU-bound test the patched
> version did ~20-25% more tests. The test ran on 3 nodes with rf 3 and quorum
> reads/writes, and the result was reproduced many times.
>
> I am currently working on a load test to reproduce the problem we had in our
> production environment last week.
>
> If someone's interested (note that this only makes sense - if at all - for
> quorum reads with the dynamic snitch):
>
> That's the patch I did to 0.6.8:  https://gist.github.com/742280
>
> And of course I'd be glad to get feedback if someone feels that I am about
> to lose my job...
>
> Thanks,
>
> Daniel
> smeet.com, Berlin


-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com

Re: org.apache.cassandra.service.ReadResponseResolver question

Posted by Daniel Doubleday <da...@gmx.net>.

On Dec 14, 2010, at 9:20 PM, Jonathan Ellis wrote:

> Correct.  https://issues.apache.org/jira/browse/CASSANDRA-1830 is open to fix that.  If you'd like to review the patch there, that would be very helpful. :)

That patch looks good to me :-) Should have checked jira first ...

Speaking of which, https://issues.apache.org/jira/browse/CASSANDRA-982 is referenced there and seems to be pretty close to something I was trying to do the last 2 days.

I'm not going to repeat the reasoning here, but it's in this thread: http://thread.gmane.org/gmane.comp.db.cassandra.user/10927/focus=10977

Just wanted to mention that I implemented my idea and did some functional testing and load testing. Though certainly not enough ...

But I was able to test:
- normal read behavior (all nodes up, two nodes up)
- normal failure behavior (not enough nodes up)
- behavior when the environment changes (affecting scores by controlling latency in the ReadVerbHandler, timeouts in the read path, exceptions in the read path of a selected node, nodes going down during a read)

So far it looks pretty promising. Everything worked as expected.

The only real drawback I found is when a read fails on a selected node (such as with an exception). As far as I understand it, there's no way to signal the readresolvehandler to return early in this case. Thus you have to wait for the timeout before the rest of the nodes are consulted. But I hope that failure detection + scores will be good enough to prevent this from happening too often.
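
As an illustration of the limitation just described, the following Java sketch shows what an early-return signal could look like in principle. It is not taken from Cassandra or from the patch: the Handler class, the Outcome enum and the onFailure callback are hypothetical names, and the sketch only models a caller that blocks until enough responses arrive, a failure is reported, or the timeout expires.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical illustration only; not Cassandra's response handler. */
class EarlyFailureSketch {

    enum Outcome { COMPLETE, FAILED_EARLY, TIMED_OUT }

    static class Handler {
        private final int blockFor;                          // responses needed, e.g. quorum
        private final AtomicInteger responses = new AtomicInteger();
        private final CountDownLatch finished = new CountDownLatch(1);

        Handler(int blockFor) { this.blockFor = blockFor; }

        /** Called when a selected node answers successfully. */
        void onResponse() {
            if (responses.incrementAndGet() >= blockFor) {
                finished.countDown();
            }
        }

        /** Called when a selected node reports an error; releases the waiter at once. */
        void onFailure() {
            finished.countDown();
        }

        /** Blocks until enough responses, a reported failure, or the timeout. */
        Outcome await(long timeout, TimeUnit unit) throws InterruptedException {
            if (!finished.await(timeout, unit)) {
                return Outcome.TIMED_OUT;
            }
            // Released either by enough responses or by a failure signal.
            return responses.get() >= blockFor ? Outcome.COMPLETE : Outcome.FAILED_EARLY;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Handler handler = new Handler(2);
        handler.onResponse();   // one node answered
        handler.onFailure();    // the other selected node threw an exception
        // The caller learns about the failure immediately and could now
        // consult the remaining nodes instead of waiting for the timeout.
        System.out.println(handler.await(5, TimeUnit.SECONDS));
    }
}
```

Without the onFailure path, the waiter in this sketch would only be released by the timeout, which corresponds to the behavior described above.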

I did some load testing and compared with vanilla Cassandra. The test models one of the use cases we have in production: a chat app, so it writes and reads messages and offline notifications. It's of limited use though, since I was not able to reproduce our IO overload yet.

But to give a first impression: in this rather CPU-bound test the patched version did ~20-25% more tests. The test ran on 3 nodes with rf 3 and quorum reads/writes, and the result was reproduced many times.

I am currently working on a load test to reproduce the problem we had in our production environment last week.

If someone's interested (note that this only makes sense - if at all - for quorum reads with the dynamic snitch):

That's the patch I did to 0.6.8:  https://gist.github.com/742280

And of course I'd be glad to get feedback if someone feels that I am about to lose my job... 

Thanks,

Daniel
smeet.com, Berlin


Re: org.apache.cassandra.service.ReadResponseResolver question

Posted by Jonathan Ellis <jb...@gmail.com>.

Correct.  https://issues.apache.org/jira/browse/CASSANDRA-1830 is open to
fix that.  If you'd like to review the patch there, that would be very
helpful. :)



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com