Posted to issues@kudu.apache.org by "Todd Lipcon (JIRA)" <ji...@apache.org> on 2018/03/22 20:11:00 UTC

[jira] [Commented] (KUDU-2370) Allow accessing consensus metadata during flush/sync

    [ https://issues.apache.org/jira/browse/KUDU-2370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16410213#comment-16410213 ] 

Todd Lipcon commented on KUDU-2370:
-----------------------------------

Another example:
- RequestVote holds update_lock_ and lock_ while waiting on the sync of metadata
- RequestVote tries to respond quickly with a "busy" response in this case:
{code}
    // There is another vote or update concurrent with the vote. In that case, that
    // other request is likely to reset the timer, and we'll end up just voting
    // "NO" after waiting. To avoid starving RPC handlers and causing cascading
    // timeouts, just vote a quick NO.
    //
    // We still need to take the state lock in order to respond with term info, etc.
    ThreadRestrictions::AssertWaitAllowed();
    LockGuard l(lock_);
    return RequestVoteRespondIsBusy(request, response);
{code}

However, the LockGuard there ends up waiting until the other vote is done, which defeats the purpose of the quick response. Here we acquire the lock only to read the current term, but when rejecting a vote it would be fine to respond with an optimistic (not-yet-durable) term.
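
One way the busy path could avoid the lock is sketched below. This is only an illustration under stated assumptions, not Kudu's actual code: ConsensusSketch, VoteResponseSketch, AdvanceTerm, optimistic_term_ and durable_term_ are hypothetical names, and the real RequestVoteRespondIsBusy fills in a protobuf response with more than the term. The idea is to publish the new term to an atomic before the slow sync starts, so that a rejecting response can read it without waiting on lock_.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <mutex>

// Hypothetical stand-in for the vote response; not Kudu's real response type.
struct VoteResponseSketch {
  int64_t responder_term = 0;
  bool vote_granted = false;
  bool busy = false;
};

// Hypothetical, simplified stand-in for RaftConsensus.
struct ConsensusSketch {
  std::mutex lock_;                          // held across the slow metadata sync
  int64_t durable_term_ = 0;                 // guarded by lock_
  std::atomic<int64_t> optimistic_term_{0};  // readable without lock_

  // Advance the term. The optimistic value is published before the sync
  // starts, so concurrent readers can see it while lock_ is still held.
  void AdvanceTerm(int64_t new_term) {
    std::lock_guard<std::mutex> l(lock_);
    assert(new_term >= durable_term_);  // Raft terms never move backwards
    optimistic_term_.store(new_term, std::memory_order_release);
    // ... the slow fsync of consensus metadata would happen here ...
    durable_term_ = new_term;
  }

  // Busy path: builds a "NO" response without touching lock_.
  VoteResponseSketch RespondIsBusy() const {
    VoteResponseSketch resp;
    resp.responder_term = optimistic_term_.load(std::memory_order_acquire);
    resp.vote_granted = false;
    resp.busy = true;
    return resp;
  }
};
```

This only covers the rejection path. Granting a vote would still need the durable term and the lock.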

> Allow accessing consensus metadata during flush/sync
> ----------------------------------------------------
>
>                 Key: KUDU-2370
>                 URL: https://issues.apache.org/jira/browse/KUDU-2370
>             Project: Kudu
>          Issue Type: Improvement
>          Components: consensus, perf
>    Affects Versions: 1.8.0
>            Reporter: Todd Lipcon
>            Priority: Major
>
> In some cases when disks are overloaded or starting to go bad, flushing consensus metadata can take a significant amount of time. Currently, we hold the RaftConsensus::lock_ for the duration of things like voting or changing term, which blocks other requests such as writes or UpdateConsensus calls. There are certainly some cases where exposing "dirty" (non-durable) cmeta is illegal from a Raft perspective, but there are other cases where it is safe. For example:
> - Assume we receive a Write request, and we see that cmeta is currently busy flushing a change that marks the local replica as a FOLLOWER. In that case, if we wait on the lock, when we eventually acquire it, we'll just reject the request anyway. We might as well reject it immediately.
> - Assume we receive a Write request, and we see that cmeta is currently flushing a change that will mark the local replica as a LEADER in the next term. CheckLeadershipAndBindTerm can safely bind to the upcoming term rather than blocking until the flush completes.
> - Assume we receive an UpdateConsensus or Vote request for term N, and we see that we're currently flushing a change to term M > N. I think it's safe to reject the request even though the new term isn't yet durable.
> Probably a few other cases here where it's safe to act on not-yet-durable info.
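
The three cases in the quoted description can be written down as a small decision function. This is a rough sketch under assumptions, not Kudu's API: PendingFlush, RoleSketch, Action, OnWrite and OnTermedRequest are hypothetical names, and the snapshot of the in-flight cmeta change is assumed to be readable without holding lock_.

```cpp
#include <cstdint>

enum class RoleSketch { kFollower, kLeader };

// What a request handler does after peeking at the in-flight cmeta change.
enum class Action {
  kRejectNow,        // answer immediately using not-yet-durable state
  kBindPendingTerm,  // proceed, bound to the term that is being flushed
  kTakeLock          // no shortcut applies; fall back to acquiring lock_
};

// Hypothetical snapshot of the cmeta change that is currently being flushed.
struct PendingFlush {
  bool in_progress;
  RoleSketch role;  // role the local replica will have once the flush lands
  int64_t term;     // term that is being made durable
};

// Cases 1 and 2: a Write request arrives during a flush.
Action OnWrite(const PendingFlush& p) {
  if (!p.in_progress) return Action::kTakeLock;
  // Becoming a FOLLOWER: the write would be rejected after the wait anyway.
  if (p.role == RoleSketch::kFollower) return Action::kRejectNow;
  // Becoming LEADER in the pending term: bind to that term without blocking.
  return Action::kBindPendingTerm;
}

// Case 3: an UpdateConsensus or Vote request for term N arrives while a
// change to term M is being flushed.
Action OnTermedRequest(const PendingFlush& p, int64_t request_term) {
  if (p.in_progress && p.term > request_term) return Action::kRejectNow;
  return Action::kTakeLock;
}
```

Any case not listed in the description falls through to kTakeLock, which keeps today's blocking behavior as the default.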



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)