Posted to dev@lucene.apache.org by "Noble Paul (JIRA)" <ji...@apache.org> on 2015/11/04 11:28:27 UTC
[jira] [Commented] (SOLR-7569) Create an API to force a leader election between nodes

    [ https://issues.apache.org/jira/browse/SOLR-7569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14989269#comment-14989269 ] 

Noble Paul commented on SOLR-7569:
----------------------------------

Let's not keep the core admin command as OVERRIDELASTPUBLISHED. That name makes it look like a generic API, which others may abuse for unrelated purposes. Let's not expose what we are doing internally; keep the command name opaque.

This particular collection admin operation does not really have to go to the overseer. It can be performed by the receiving node itself, because clearing the LIR node does not have to be done at the overseer anyway.

> Create an API to force a leader election between nodes
> ------------------------------------------------------
>
>                 Key: SOLR-7569
>                 URL: https://issues.apache.org/jira/browse/SOLR-7569
>             Project: Solr
>          Issue Type: New Feature
>          Components: SolrCloud
>            Reporter: Shalin Shekhar Mangar
>            Assignee: Shalin Shekhar Mangar
>              Labels: difficulty-medium, impact-high
>         Attachments: SOLR-7569.patch, SOLR-7569.patch, SOLR-7569.patch, SOLR-7569.patch, SOLR-7569.patch, SOLR-7569.patch, SOLR-7569.patch, SOLR-7569.patch, SOLR-7569.patch, SOLR-7569.patch, SOLR-7569.patch, SOLR-7569.patch, SOLR-7569.patch, SOLR-7569.patch, SOLR-7569_lir_down_state_test.patch
>
>
> There are many reasons why Solr will not elect a leader for a shard, e.g. all replicas' last published state was recovery, or bugs which cause a leader to be marked as 'down'. While the best solution is that the shard never gets into this state, we need a manual way to fix it when it does. Right now we can do a series of dances involving bouncing the node (since recovery paths between bouncing and REQUESTRECOVERY are different), but that is difficult when running a large cluster. Although such a manual API may lead to some data loss, in some cases it is the only possible option to restore availability.
> This issue proposes to build a new collection API which can be used to force replicas into recovering a leader while avoiding data loss on a best effort basis.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
