Posted to issues@lucene.apache.org by "Cao Manh Dat (Jira)" <ji...@apache.org> on 2020/03/27 11:25:00 UTC

[jira] [Updated] (SOLR-14368) SyncStrategy result should not prevent a replica to become leader

     [ https://issues.apache.org/jira/browse/SOLR-14368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cao Manh Dat updated SOLR-14368:
--------------------------------
    Description: 
h2. History

In the early days of SolrCloud, a replica had to _sync_ with the other replicas before it could become leader. This process includes:
 * Comparing the tlog of the current replica (the leader candidate) with those of the other replicas. For example, if the candidate’s data is too far behind the others, that replica should not become leader.
 * Requesting the other replicas to sync back before the candidate becomes leader. For example, imagine the old leader was shut down while it was sending multiple updates (u1, u2, u3, u4) to the others:
 ** Replica A may have received updates (u1, u2).
 ** Replica B may have received updates (u3, u4).
 ** If replica A becomes leader and does not request replica B to sync back, replica B then has to go through a recovery process, which is costly.
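
To make the (u1, u2, u3, u4) example concrete, here is a minimal sketch of which updates each replica lacks after the old leader dies mid-send. It is an illustration only: the class name {{SyncBackSketch}} and the method {{missingUpdates}} are hypothetical and are not part of Solr's sync code, and real tlogs track versioned updates rather than plain strings.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; this is not Solr's sync implementation.
final class SyncBackSketch {

    // For each replica, compute the updates that some other replica received
    // but this replica did not. These are the updates a sync would exchange.
    static Map<String, Set<String>> missingUpdates(Map<String, List<String>> received) {
        // Union of every update seen by any replica.
        Set<String> all = new LinkedHashSet<>();
        for (List<String> updates : received.values()) {
            all.addAll(updates);
        }
        Map<String, Set<String>> missing = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : received.entrySet()) {
            Set<String> gap = new LinkedHashSet<>(all);
            gap.removeAll(entry.getValue());
            missing.put(entry.getKey(), gap);
        }
        return missing;
    }
}
```

With replica A holding (u1, u2) and replica B holding (u3, u4), the sketch reports that A lacks (u3, u4) and B lacks (u1, u2). Exchanging these few updates through a sync is what lets replica B avoid the costly recovery process.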

But this process has some problems:
 # We only sync with live replicas, so if no other replicas are live at the time of the election, the current replica can blindly become leader, which causes data loss. This problem was fixed in SOLR-11702.
 # Any IOException that is not caught properly during communication between the current replica and the others can prevent that replica from becoming leader.

h2. Idea

Basically, with the new ShardTerms information, we can pick any replica with the highest _term_ to become leader. The reason is that a replica’s _term_ effectively represents how up-to-date that replica is with the leader.

The only meaning of _sync_ with other replicas now is to prevent costly recovery processes from happening. Therefore SyncStrategy should not prevent a replica from becoming a leader.
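
The proposal can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the actual patch: {{LeaderElectionSketch}}, {{Sync}}, {{hasHighestTerm}} and {{mayBecomeLeader}} are hypothetical names, and terms are modelled as a plain map from replica name to term value.

```java
import java.io.IOException;
import java.util.Collections;
import java.util.Map;

// Hypothetical illustration only; names and structure are assumptions.
final class LeaderElectionSketch {

    // Stand-in for the sync step; it may fail or throw during communication.
    interface Sync {
        boolean syncWithReplicas() throws IOException;
    }

    // A candidate is eligible only if no replica has a higher term.
    static boolean hasHighestTerm(String candidate, Map<String, Long> terms) {
        Long own = terms.get(candidate);
        if (own == null) {
            return false; // unknown replica: not eligible
        }
        return own >= Collections.max(terms.values());
    }

    // The term check alone decides eligibility; sync is best effort.
    static boolean mayBecomeLeader(String candidate, Map<String, Long> terms, Sync sync) {
        if (!hasHighestTerm(candidate, terms)) {
            return false;
        }
        try {
            // The result is deliberately ignored: a failed sync only means
            // some replicas may need recovery later.
            sync.syncWithReplicas();
        } catch (IOException e) {
            // A communication failure must not block the election.
        }
        return true;
    }
}
```

In this sketch a replica holding the highest term is accepted even when the sync step returns false or throws an IOException, while a replica with a lower term is rejected regardless of the sync outcome.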

  was:Update later...


> SyncStrategy result should not prevent a replica to become leader
> -----------------------------------------------------------------
>
>                 Key: SOLR-14368
>                 URL: https://issues.apache.org/jira/browse/SOLR-14368
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Cao Manh Dat
>            Assignee: Cao Manh Dat
>            Priority: Major
>


