Posted to issues@kudu.apache.org by "Todd Lipcon (JIRA)" <ji...@apache.org> on 2016/12/19 05:33:58 UTC

[jira] [Resolved] (KUDU-1170) Queue should reset all_replicated_opid when becoming LEADER

     [ https://issues.apache.org/jira/browse/KUDU-1170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon resolved KUDU-1170.
-------------------------------
       Resolution: Cannot Reproduce
    Fix Version/s: 1.2.0

> Queue should reset all_replicated_opid when becoming LEADER
> -----------------------------------------------------------
>
>                 Key: KUDU-1170
>                 URL: https://issues.apache.org/jira/browse/KUDU-1170
>             Project: Kudu
>          Issue Type: Bug
>          Components: consensus
>    Affects Versions: Private Beta
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>             Fix For: 1.2.0
>
>
> Looking at the logs on a busy server, I see various cases like:
> {code}
> Queue going to LEADER mode. State: All replicated op: 10.6, Majority replicated op: 10.5,
> {code}
> I'm not sure whether it's actually causing downstream problems, but it definitely seems counter-intuitive: the all-replicated op (10.6) is ahead of the majority-replicated op (10.5), even though anything replicated to all peers has necessarily been replicated to a majority. I think the issue is that in SetLeaderMode, we reset majority_replicated_op based on the committed index, but we don't reset all_replicated. I think it's possible for the all_replicated watermark from a previous term to get ahead of the committed index when we hit the "cannot advance committed index until we've replicated something in our own term" rule (or something like it), but there may be some other race here.
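
The suspected sequence can be illustrated with a minimal, hypothetical sketch. This is not Kudu's actual code: `OpId`, `QueueState`, `WatermarksConsistent`, and both `SetLeaderMode...` helpers are names invented for illustration, and resetting all_replicated to the committed index (rather than to some other value) is an assumption, since the report only says it should be reset.

```cpp
#include <cassert>
#include <cstdint>
#include <tuple>

// Hypothetical stand-in for an operation id, printed as "term.index" in the log.
struct OpId {
  int64_t term;
  int64_t index;
};

// Orders op ids by term first, then by index.
inline bool operator<=(const OpId& a, const OpId& b) {
  return std::tie(a.term, a.index) <= std::tie(b.term, b.index);
}

// Hypothetical subset of the queue's watermark state.
struct QueueState {
  OpId all_replicated;
  OpId majority_replicated;
};

// Expected invariant: an op replicated to all peers is also replicated
// to a majority, so all_replicated must not be ahead of majority_replicated.
inline bool WatermarksConsistent(const QueueState& s) {
  return s.all_replicated <= s.majority_replicated;
}

// Behavior suspected in the report: only the majority watermark is reset
// to the committed index, so a stale all_replicated can end up ahead of it.
inline void SetLeaderModeResetMajorityOnly(QueueState* s, const OpId& committed) {
  s->majority_replicated = committed;
}

// Assumed fix: reset both watermarks when becoming LEADER.
inline void SetLeaderModeResetBoth(QueueState* s, const OpId& committed) {
  s->majority_replicated = committed;
  s->all_replicated = committed;
  assert(WatermarksConsistent(*s));
}
```

Starting from a state where both watermarks are at 10.6 and the committed index is 10.5, the first helper reproduces the logged state (all replicated 10.6, majority replicated 10.5), while the second leaves both watermarks at 10.5.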



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)