
Posted to issues@kudu.apache.org by "Alexey Serbin (Jira)" <ji...@apache.org> on 2019/10/14 22:47:00 UTC

[jira] [Updated] (KUDU-2452) Prevent follower from causing pre-elections when UpdateConsensus is slow

     [ https://issues.apache.org/jira/browse/KUDU-2452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexey Serbin updated KUDU-2452:
--------------------------------
    Labels: stability  (was: )

> Prevent follower from causing pre-elections when UpdateConsensus is slow
> ------------------------------------------------------------------------
>
>                 Key: KUDU-2452
>                 URL: https://issues.apache.org/jira/browse/KUDU-2452
>             Project: Kudu
>          Issue Type: Improvement
>    Affects Versions: 1.7.0
>            Reporter: William Berkeley
>            Priority: Major
>              Labels: stability
>
> Thanks to pre-elections (KUDU-1365), slow UpdateConsensus calls on a single follower don't disturb the whole tablet by calling elections. However, I sometimes see situations where one or more followers are constantly calling pre-elections while only rarely, if ever, overflowing their service queues. Occasionally, in 3x replicated tablets, the followers will get "lucky" and detect a leader failure at around the same time, and an election will happen.
> This background instability has caused bugs like KUDU-2343, which should be rare, to occur pretty frequently. The extra RequestConsensusVote RPCs also add a little more stress on the consensus service and on replicas' consensus locks. It also spams the logs, since there's generally no exponential backoff for these pre-elections because there's a successful heartbeat in between them.
> It seems like we can get into a situation where the average number of in-flight consensus requests is constant over time, so on average we are processing each heartbeat in less than the heartbeat interval, but some heartbeats take longer. Since UpdateConsensus calls to a replica are serialized, a few slow ones in a row trigger the failure detector, even though the follower receives every heartbeat in a timely manner and eventually responds successfully (and, on average, in a timely manner).
> It'd be nice to prevent these worthless pre-elections. A couple of ideas:
> 1. Separately calculate a backoff for failed pre-elections, and reset it when a pre-election succeeds or more generally when there's an election.
> 2. Don't count the time the follower is executing UpdateConsensus against the failure detector. [~mpercy] suggested stopping the failure detector during UpdateReplica() and resuming it when the function returns.
> 3. Move leader failure detection out-of-band of UpdateConsensus entirely.
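
Ideas 1 and 2 above can be illustrated with a small, self-contained simulation. This is only a sketch under stated assumptions, not Kudu code: the names FakeClock, PausableFailureDetector, PreElectionBackoff and simulate are hypothetical, the timeout, idle and backoff values are made up, and the model assumes the detector is snoozed only when an UpdateConsensus call completes.

```python
class FakeClock:
    """Hypothetical manual clock so the simulation is deterministic."""
    def __init__(self):
        self.now = 0.0

    def advance(self, seconds):
        self.now += seconds


class PausableFailureDetector:
    """Idea 2 (hypothetical): time spent while paused does not count
    toward the leader-failure timeout."""
    def __init__(self, clock, timeout_s):
        self._clock = clock
        self._timeout_s = timeout_s
        self._elapsed = 0.0          # unpaused time since the last snooze
        self._last_tick = clock.now
        self._paused = False

    def _accumulate(self):
        now = self._clock.now
        if not self._paused:
            self._elapsed += now - self._last_tick
        self._last_tick = now

    def pause(self):
        self._accumulate()
        self._paused = True

    def resume(self):
        self._accumulate()
        self._paused = False

    def snooze(self):
        # A heartbeat was handled successfully: restart the timeout.
        self._accumulate()
        self._elapsed = 0.0

    def leader_suspected(self):
        self._accumulate()
        return self._elapsed >= self._timeout_s


class PreElectionBackoff:
    """Idea 1 (hypothetical): a backoff tracked only for failed
    pre-elections, reset when a pre-election succeeds or an election occurs."""
    def __init__(self, base_s, max_s):
        self._base_s = base_s
        self._max_s = max_s
        self._failures = 0

    def on_pre_election_failed(self):
        self._failures += 1

    def reset(self):
        self._failures = 0

    def next_delay_s(self):
        return min(self._max_s, self._base_s * 2 ** self._failures)


def simulate(pause_during_update, processing_times_s=(0.2, 0.2, 2.0, 0.2)):
    """Count how often the detector fires while serialized heartbeats are
    processed. Assumed values: 1.5 s timeout, 0.3 s idle between calls."""
    clock = FakeClock()
    detector = PausableFailureDetector(clock, timeout_s=1.5)
    suspected = 0
    for processing_s in processing_times_s:
        if pause_during_update:
            detector.pause()          # entering UpdateConsensus
        clock.advance(processing_s)   # the call is being processed
        if detector.leader_suspected():
            suspected += 1            # would start a pre-election
        detector.snooze()             # the call completed successfully
        if pause_during_update:
            detector.resume()         # leaving UpdateConsensus
        clock.advance(0.3)            # idle until the next heartbeat
    return suspected
```

In this model, simulate(False) reports one spurious leader-failure detection caused by the single slow call, while simulate(True) reports none. The backoff object is independent of the detector: a caller would consult next_delay_s() before starting another pre-election and call reset() once a pre-election succeeds or an election takes place.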



--
This message was sent by Atlassian Jira
(v8.3.4#803005)