Posted to dev@lucene.apache.org by "Mark Miller (JIRA)" <ji...@apache.org> on 2015/02/13 21:10:11 UTC

[jira] [Commented] (SOLR-7109) Indexing threads stuck during network partition can put leader into down state

    [ https://issues.apache.org/jira/browse/SOLR-7109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14320682#comment-14320682 ] 

Mark Miller commented on SOLR-7109:
-----------------------------------

I ran into a similar issue in SOLR-7065 with the new test I have there.

It's still exploratory, so I just took that DOWN publish out for the time being.

> Indexing threads stuck during network partition can put leader into down state
> ------------------------------------------------------------------------------
>
>                 Key: SOLR-7109
>                 URL: https://issues.apache.org/jira/browse/SOLR-7109
>             Project: Solr
>          Issue Type: Bug
>          Components: SolrCloud
>    Affects Versions: 4.10.3, 5.0
>            Reporter: Shalin Shekhar Mangar
>             Fix For: Trunk, 5.1
>
>
> I found this recently while running some Jepsen tests. Some threads get stuck on ZooKeeper operations for a long time in the ZkController.updateLeaderInitiatedRecoveryState method, and when they wake up they go ahead with setting the LIR state to down. In the meantime, however, a new leader has been elected, and sometimes you get into a state where the leader itself is put into recovery, causing the shard to reject all writes.
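
The race described above can be illustrated with a small, self-contained Java sketch. It is not Solr code: the class LirGuardSketch, its methods, and the "active"/"down" strings are all hypothetical names chosen for illustration, and the Runnable only stands in for a blocked ZooKeeper call. The idea shown is simply to re-check leadership after the slow operation returns and to skip the DOWN publish if leadership has changed.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch; none of these names are Solr or ZooKeeper APIs.
final class LirGuardSketch {
    // Node currently believed to be the shard leader.
    private final AtomicReference<String> leader;
    // Replica name -> state; replicas with no entry are treated as "active".
    private final Map<String, String> replicaState = new ConcurrentHashMap<>();

    LirGuardSketch(String initialLeader) {
        this.leader = new AtomicReference<>(initialLeader);
    }

    // Simulates a leader election completing.
    void electLeader(String node) {
        leader.set(node);
    }

    String stateOf(String replica) {
        return replicaState.getOrDefault(replica, "active");
    }

    // Runs the slow operation, then marks the replica down only if the
    // caller is still the leader. Returns true if the state was changed.
    boolean markDownIfStillLeader(String actingLeader, String replica, Runnable slowOperation) {
        slowOperation.run(); // stands in for a thread stuck on a ZooKeeper call
        if (!actingLeader.equals(leader.get())) {
            return false; // leadership changed while we were blocked: skip the publish
        }
        replicaState.put(replica, "down");
        return true;
    }

    public static void main(String[] args) {
        LirGuardSketch shard = new LirGuardSketch("nodeA");
        // nodeA tries to mark nodeB down, but nodeB is elected leader
        // while nodeA's thread is blocked.
        boolean applied = shard.markDownIfStillLeader(
                "nodeA", "nodeB", () -> shard.electLeader("nodeB"));
        System.out.println("applied=" + applied + " nodeB=" + shard.stateOf("nodeB"));
    }
}
```

Note that the check and the write in this sketch are two separate steps, so a leader change between them would still slip through; a real fix would presumably need the check and the update to be atomic with respect to leader election.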


