You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@kudu.apache.org by "David Alves (JIRA)" <ji...@apache.org> on 2016/10/14 19:49:20 UTC

[jira] [Commented] (KUDU-258) Create an integration test that performs writes with multiple consistency modes

    [ https://issues.apache.org/jira/browse/KUDU-258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15576306#comment-15576306 ] 

David Alves commented on KUDU-258:
----------------------------------

The gerrit link above is from the dark ages.

Here's the transcript of the non-bot part of the convo that happened there:
↩
Patch Set 1:
hm, when does the clock give you the same timestamp twice? couldn't we make it part of the clock contract that successive calls to Now don't return the same value?
Todd Lipcon
Apr 28, 2014
↩
Patch Set 1:
also does this have any impact on how we do snapshots for flush? if we're assigning "future" timestamps to writes for commit wait, and then we use MVCC for flush snapshots, can we miss those writes?
David Alves
Apr 28, 2014
↩
Patch Set 1:
both Now() and NowLatest() are monotonically increasing, but not against each other, i.e. say a call to NowLatest() returns 15 (10 + 5 error) later on there might be a call to Now() that also returns 15. If both are in-flight at the same time, in release more, we get a CHECK error when trying to commit he second one.
David Alves
Apr 28, 2014
↩
Patch Set 1:
was worried about that but after thinking about it for a while I think that is not a problem. Say we do a commit wait write at NowLatest() = 15, we then take the flush snap at Now() = 10 the flush will ignore the commit wait write, the commit wait write is still on the in-flights though and the second flush snap will include it as an in-flight.
Not sure that was clear, if you want we can discuss this through a hangout or something.
Todd Lipcon
Apr 28, 2014
↩
Patch Set 1:
What's the guarantee that the second flush would contain it in the snapshot? Couldn't the write still be in the future?
David Alves
Apr 28, 2014
↩
Patch Set 1:
when we prepared it (assigned the commit wait timestamp) it was added to the in-flights right? so this case just makes the in-flight interval larger. I.e. if we have a bunch of no_consistency txns and a commit_wait txn we might get a snapshot like: 10,11,12, 
↩
Patch Set 8: Code-Review+1
Can we write a test for this case? It would either blow up in StartTransactionAtLatest() or at commit time without this patch.
David Alves
May 8, 2014
↩
Patch Set 8:
though about it when I submitted this, but we can't do it without KUDU-156 (mock clock) and AFAIK that is not very high priority wise right now. Will add a note on this regard to KUDU-156 though.
Michael Percy
May 8, 2014
↩
Patch Set 8:
Is this fix high priority right now? Why don't we postpone this fix until we do the mock clock. I don't see why this has to block consensus going in either.
It's the best kind of bug... when you do something it doesn't like, it crashes.
David Alves
May 8, 2014
↩
Patch Set 8:
cause it's a bug I've seen in the wild? and that bug gets fixed by this change? why wouldn't we fix a bug?
Michael Percy
May 8, 2014
↩
Patch Set 8:
Well, it sucks that there's no unit test to verify the fix, that's my main concern. LMK if you want to discuss on IRC
David Alves
May 8, 2014
↩
Patch Set 8:
I get that unit tests are important and I try not to add anything without them. but seems like there's no good reason to solve a bug I've seen happening (that is very rare and only appears when really hammering a multi-machine cluster) and that got solved by this otherwise inconsequential 9lines patch.

reproducing some IRC conversation about this:
15:05 < todd> I'm thinking it's OK because the commit that has a "future" timestamp will commit-wait
15:05 < todd> so therefore it will be in-flight
15:05 < dralves> right
15:05 < todd> and once it's committed, then the MvccSnapshot will be after it
15:05 < dralves> exactly
15:06 < dralves> all that the mix of consistency levels adds to the snapshots stuff is that it makes the interval of 
                 in-flight transactions larger
15:07 < dralves> cause we're adding in-flights in the present and in the future
15:07 < dralves> but we never commit in the future, if that makes sense
15:09 < dralves> from another prespective we can think of commit wait transactiona as transactions that take a really 
                 long time
15:12 < todd> yup
15:17 < todd> dralves: I bet we're going to have some issues with component_lock fairness introducing latency
15:17 < todd> unrelated to your patch
15:17 < todd> but if you have a commit-wait txn, let's say it's sleeping 50ms... then the flush code tries to take 
              the w-lock
15:17 < todd> then any non-commit-wait txns are still blocked from taking the lock
15:18 < todd> I think we should eventually fix this by not (ab)using component_lock to quiesce txns
15:18 < todd> but instead we just need to do a txn epoch rollover type thing
15:18 < dralves> todd: agreed
15:18 < dralves> that would be true for long running transactions anyway
15:18 < todd> or a "transaction fence"
15:18 < todd> yea
15:18 < dralves> or make the lock a bit more unfair
15:18 < dralves> somehow
15:19 < todd> yea.. though if all we need is "any txns running prior to this point need to complete" then we can 
              implement that a lot more simply with no blocking
15:20 < dralves> yeah, agree the unfairness would also be weird
Todd Lipcon
May 14, 2014
↩
Change has been successfully cherry-picked as 85a8474346b7662b362f90d62e029eb2bc18951c

> Create an integration test that performs writes with multiple consistency modes
> -------------------------------------------------------------------------------
>
>                 Key: KUDU-258
>                 URL: https://issues.apache.org/jira/browse/KUDU-258
>             Project: Kudu
>          Issue Type: Sub-task
>          Components: tserver
>    Affects Versions: M3
>            Reporter: David Alves
>            Assignee: David Alves
>
> Right now we test consistency modes independently, but they will eventually coexist and that can spawn trouble (e.g. KUDU-242). We should have an integration test that runs writes on multiple consistency modes at the same time.
> Plus we should have the YCSB run on multiple consistency modes at the same time (need to revive/cleanup what I did for the HT paper)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)