Posted to dev@lucene.apache.org by "Mark Miller (JIRA)" <ji...@apache.org> on 2017/10/02 17:51:00 UTC

[jira] [Commented] (SOLR-11417) Crashed leader's hanging ephemeral will make restarting followers stuck in recovering

    [ https://issues.apache.org/jira/browse/SOLR-11417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16188501#comment-16188501 ] 

Mark Miller commented on SOLR-11417:
------------------------------------

bq. becomes RECOVERING, so it won't participate anymore.

My first thought would be to try detecting a connection-based error and, in that case, using the method that publishes state but does not update the last-state variable that gets checked.

It might even make sense to do that on any failure, not just connection errors - I'm not sure it's preferable to have a replica disable its own ability to be a leader - that kind of defeats the repeated attempts.
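
The idea above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Solr code: the class RecoveryFailureSketch, its methods, and the preserveOnAnyFailure flag are hypothetical names, and it assumes that leader eligibility is decided by checking whether the last-state variable is still ACTIVE.

```java
import java.net.SocketException;
import java.net.SocketTimeoutException;

// Hypothetical sketch: none of these names are Solr's actual classes or methods.
final class RecoveryFailureSketch {

    enum State { ACTIVE, RECOVERING }

    // If true, keep leader eligibility on any failure, not just connection errors.
    private final boolean preserveOnAnyFailure;
    // State visible to the rest of the cluster.
    private State publishedState = State.ACTIVE;
    // The "last state" variable; assumed to be what leader election checks.
    private State lastPublished = State.ACTIVE;

    RecoveryFailureSketch(boolean preserveOnAnyFailure) {
        this.preserveOnAnyFailure = preserveOnAnyFailure;
    }

    // Publishes a state and optionally leaves the last-state variable untouched.
    void publish(State state, boolean updateLastState) {
        publishedState = state;
        if (updateLastState) {
            lastPublished = state;
        }
    }

    // Walks the cause chain looking for a socket-level failure
    // (java.net.ConnectException is a subclass of SocketException).
    static boolean isConnectionError(Throwable failure) {
        Throwable cause = failure;
        for (int depth = 0; cause != null && depth < 16; depth++) {
            if (cause instanceof SocketException || cause instanceof SocketTimeoutException) {
                return true;
            }
            cause = cause.getCause();
        }
        return false;
    }

    // Called when a recovery attempt against the presumed leader fails.
    void onRecoveryFailure(Throwable failure) {
        boolean keepEligibility = preserveOnAnyFailure || isConnectionError(failure);
        publish(State.RECOVERING, !keepEligibility);
    }

    // Assumption: a replica may stand for election only if it was last ACTIVE.
    boolean eligibleForLeadership() {
        return lastPublished == State.ACTIVE;
    }

    State publishedState() {
        return publishedState;
    }
}
```

With preserveOnAnyFailure set to false, only a connection-type failure (for example a wrapped java.net.ConnectException) leaves the replica eligible; setting it to true corresponds to the "on any failure" variant, where the replica still publishes RECOVERING but never rules itself out of the election.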

> Crashed leader's hanging ephemeral will make restarting followers stuck in recovering
> -------------------------------------------------------------------------------------
>
>                 Key: SOLR-11417
>                 URL: https://issues.apache.org/jira/browse/SOLR-11417
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>    Affects Versions: 6.3
>            Reporter: Mano Kovacs
>         Attachments: SOLR-11417.png
>
>
> If replicas start up after a leader crash and within the ZK session timeout, the replicas
> * will lose leader election due to hanging ephemerals
> * will read stale data from ZK about the current leader
> * will fail recovery and get stuck in the recovering state
> If the leader is down permanently (e.g. hardware failure) and all replicas are affected, the shard will not come up (see also SOLR-7065).
> Tested on 6.3. See attached image for details.


