Posted to dev@lucene.apache.org by "Cao Manh Dat (JIRA)" <ji...@apache.org> on 2018/03/30 04:09:00 UTC

[jira] [Created] (SOLR-12166) Race condition in rejoinElection and registering replica

Cao Manh Dat created SOLR-12166:
-----------------------------------

             Summary: Race condition in rejoinElection and registering replica
                 Key: SOLR-12166
                 URL: https://issues.apache.org/jira/browse/SOLR-12166
             Project: Solr
          Issue Type: Bug
      Security Level: Public (Default Security Level. Issues are Public)
            Reporter: Cao Manh Dat
            Assignee: Cao Manh Dat


I found this case while beasting LIROnShardRestartTest. The sequence is:
 * ReplicaA is a candidate for the new leader. It tries to sync with the other replicas but for some reason fails to become the leader (e.g. because of the LIR flag).
 * ReplicaA calls rejoinElection, which starts the recovery process.
 * After rejoinElection, it wins the election for some reason (e.g. all replicas participated in the election, so the LIR flag is cleared).
 * ReplicaA registers itself as ACTIVE after winning the election.
 * The recovery process started above publishes ReplicaA as DOWN or RECOVERY.
 * We end up with a dead-end shard with a DOWN leader, so the other replicas can't recover from ReplicaA.
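
The interleaving above can be replayed with a small single-threaded sketch. This is not Solr code: the class ElectionRaceSketch, the Replica stand-in, and the publish and replay methods are all hypothetical names made up for illustration, and the sketch assumes that publishing a state is a plain last-writer-wins update with no check of whether the replica is currently the leader.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Solr code: replays the interleaving from the report.
class ElectionRaceSketch {
    enum State { ACTIVE, DOWN, RECOVERING }

    // Stand-in for the replica's published state in the cluster state.
    static final class Replica {
        State published = State.ACTIVE;
        boolean leader = false;
        final List<String> log = new ArrayList<>();

        // Assumption: last writer wins, with no check of leadership.
        void publish(State s, String who) {
            published = s;
            log.add(who + " -> " + s);
        }
    }

    // Replays the steps in order and returns the replica afterwards.
    static Replica replay() {
        Replica a = new Replica();

        // 1. ReplicaA fails to become leader (e.g. LIR flag) and calls
        //    rejoinElection, which starts a recovery that has not
        //    published anything yet.
        Runnable pendingRecovery = () -> a.publish(State.DOWN, "recovery");

        // 2. ReplicaA wins the new election and registers as ACTIVE.
        a.leader = true;
        a.publish(State.ACTIVE, "register");

        // 3. The recovery started in step 1 now runs and overwrites the state.
        pendingRecovery.run();
        return a;
    }

    public static void main(String[] args) {
        Replica a = replay();
        System.out.println(a.log);
        System.out.println("leader=" + a.leader + ", state=" + a.published);
    }
}
```

In this replay the replica ends up as leader with a published state of DOWN, which is the dead-end situation described in the last step. In the real system the two publishes come from different threads, so the order shown here is only one possible outcome.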



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
