Posted to dev@lucene.apache.org by "Mano Kovacs (JIRA)" <ji...@apache.org> on 2017/10/03 12:48:00 UTC

[jira] [Created] (SOLR-11431) Leader candidate cannot become leader if replica responds 500 to PeerSync

Mano Kovacs created SOLR-11431:
----------------------------------

             Summary: Leader candidate cannot become leader if replica responds 500 to PeerSync
                 Key: SOLR-11431
                 URL: https://issues.apache.org/jira/browse/SOLR-11431
             Project: Solr
          Issue Type: Bug
      Security Level: Public (Default Security Level. Issues are Public)
    Affects Versions: 7.0
            Reporter: Mano Kovacs


When a leader candidate runs PeerSync against all replicas to download any missing updates, it tolerates failures. It uses the {{cantReachIsSuccess=true}} switch, which treats connection issues, 404, and 503 as success, since replicas being DOWN should not affect the process.

However, if a replica has disk issues, core initialization might fail, which results in {{500}} instead of {{503}}. A failing replica like that can prevent any other replica from becoming the leader.

Proposing either:
* Accepting {{500}} as "can't reach" so the leader candidate can go on
or
* Changing {{SolrCoreInitializationException}} to return {{503}} instead of {{500}}
* * this might be an API change, however
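
To make the first option concrete, here is a minimal sketch of the status classification described above. It is not Solr's PeerSync code: the class {{PeerSyncStatusSketch}}, the method {{countsAsSuccess}}, and the {{treat500AsUnreachable}} flag are hypothetical names used only for illustration, and connection failures (which carry no HTTP status) are left out.

```java
// Hypothetical illustration only; this is NOT the PeerSync implementation in Solr.
final class PeerSyncStatusSketch {

    private PeerSyncStatusSketch() {}

    /**
     * Decides whether a replica's HTTP status should let the leader candidate proceed.
     *
     * @param httpStatus            status code returned by the replica
     * @param cantReachIsSuccess    models the switch named in the issue
     * @param treat500AsUnreachable hypothetical flag modelling the first proposal
     */
    static boolean countsAsSuccess(int httpStatus,
                                   boolean cantReachIsSuccess,
                                   boolean treat500AsUnreachable) {
        // A normal 2xx response is always a success.
        if (httpStatus >= 200 && httpStatus < 300) {
            return true;
        }
        // Without the tolerant switch, every error status is a failure.
        if (!cantReachIsSuccess) {
            return false;
        }
        // Per the issue, 404 and 503 are already treated as "can't reach".
        if (httpStatus == 404 || httpStatus == 503) {
            return true;
        }
        // First proposal: also accept 500, e.g. from a failed core initialization.
        return treat500AsUnreachable && httpStatus == 500;
    }
}
```

With the flag off, {{countsAsSuccess(500, true, false)}} returns false, which corresponds to the blocked election reported here; with the flag on it returns true. The second option would need no change on the candidate side, since the replica would answer {{503}} and fall into the existing branch.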



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
