Posted to dev@sling.apache.org by "Stefan Egli (JIRA)" <ji...@apache.org> on 2015/11/03 15:56:27 UTC

[jira] [Resolved] (SLING-4640) Possibility of duplicate leaders w/discovery.impl on eventually consistent repo

     [ https://issues.apache.org/jira/browse/SLING-4640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stefan Egli resolved SLING-4640.
--------------------------------
    Resolution: Won't Fix
      Assignee: Stefan Egli

Marked this as Won't Fix for the following reasons:
* there is no standard way of knowing how large the read delay is in whichever underlying repository is used.
* if we're talking about Oak, we could come up with a way for Oak to expose such a delay
** however, for Oak specifically we have discovery.oak, which works around repository delays in general by using a lower-level collection for storing group-detection info (leases). So SLING-4640 is not relevant for discovery.oak. I don't think it makes sense to come up with yet another non-standard extension for Oak for something where we already have a different solution.

> Possibility of duplicate leaders w/discovery.impl on eventually consistent repo
> -------------------------------------------------------------------------------
>
>                 Key: SLING-4640
>                 URL: https://issues.apache.org/jira/browse/SLING-4640
>             Project: Sling
>          Issue Type: Bug
>          Components: Extensions
>    Affects Versions: Discovery Impl 1.1.0
>            Reporter: Stefan Egli
>            Assignee: Stefan Egli
>
> Note: This is a fork of SLING-3432 based on a [comment|https://issues.apache.org/jira/browse/SLING-3432?focusedCommentId=14495936&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14495936]. So here is that comment again:
> Note that [the above|https://issues.apache.org/jira/browse/SLING-3432?focusedCommentId=14492494&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14492494] does not solve the problem where the underlying repository is eventually consistent and the configured heartbeat timeout is too low to cover all possible delays (that such an eventually consistent repository might produce under load). Consider the following:
> # a cluster consisting of 3 nodes: A, B and C, A is the leader
> # writes from B and C are fast - and can be read by all 3 nodes fast
> # writes from A though are slow (i.e. A behaves asymmetrically: slow writes but fast reads)
> # at some point writes from A are slower than the configured heartbeat timeout: at this point B and C find out about this and vote on a new clusterView consisting only of B and C and (let's say) B becomes leader.
> #* meanwhile at A however: A is still happy: it sees the heartbeats of B and C in time and would not start a new voting.
> # at some later point (with a *certain read delay*) A sees that B and C have declared a new {{/establishedViews}} - at this point it would (according to the new rule above) immediately send a TOPOLOGY_CHANGING and things would be 'ok' again.
> #* *but* until it does send this event - *between 4. and 5. - we have two leaders: A and B*! -> thus we could still see the issues reported in SLING-3432 during that small timeframe (which is basically the amount of time it takes for the new established view declared by B and C to be read by A).
> #* at a later time, e.g. when the delays in the repository have come down, A would rejoin the cluster - but would have to *not become leader* again, as the leader is B and must stay stable.
> This IMHO highlights the problem that using an eventually consistent repository (that has no max guaranteed delay) is *not* pseudo-network-partition/duplicate-leader free under load.
> Note that what is described here is not fixed by SLING-4627.
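
As a rough illustration of the timeline in the quoted description, here is a toy model of steps 4 and 5. It is not code from discovery.impl: the function names, the tick-based clock and the parameter values are all assumptions made up for this sketch. It only shows that the duplicate-leader window is as long as the read delay with which A sees the new established view.

```python
def leaders_per_tick(last_visible_heartbeat, heartbeat_timeout, read_delay, ticks):
    """Return, for each tick, the set of nodes that believe they are leader.

    All parameters are hypothetical, measured in abstract ticks:
    - last_visible_heartbeat: tick of A's last heartbeat that B and C could read
    - heartbeat_timeout: ticks after which B and C consider A gone
    - read_delay: ticks until A can read the view established by B and C
    """
    # Step 4: B and C time out on A and establish a new view with B as leader.
    established_at = last_visible_heartbeat + heartbeat_timeout
    # Step 5: A reads the new established view only after the read delay,
    # and only then sends TOPOLOGY_CHANGING and stops acting as leader.
    a_notices_at = established_at + read_delay

    timeline = []
    for t in range(ticks):
        leaders = set()
        if t < a_notices_at:
            leaders.add("A")  # A is still "happy" and considers itself leader
        if t >= established_at:
            leaders.add("B")  # B was elected by the new view of B and C
        timeline.append(leaders)
    return timeline


def duplicate_leader_ticks(timeline):
    """Return the ticks at which more than one node believes it is leader."""
    return [t for t, leaders in enumerate(timeline) if len(leaders) > 1]


if __name__ == "__main__":
    timeline = leaders_per_tick(last_visible_heartbeat=2, heartbeat_timeout=3,
                                read_delay=2, ticks=10)
    print(duplicate_leader_ticks(timeline))  # prints [5, 6]
```

With a read delay of 0 the window is empty. With any positive read delay there are ticks with two leaders, and without a guaranteed upper bound on that delay the window cannot be bounded either.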



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)