Posted to issues@solr.apache.org by "Justin Sweeney (Jira)" <ji...@apache.org> on 2023/03/07 20:01:00 UTC

[jira] [Created] (SOLR-16689) Inefficiencies in replication process

Justin Sweeney created SOLR-16689:
-------------------------------------

             Summary: Inefficiencies in replication process
                 Key: SOLR-16689
                 URL: https://issues.apache.org/jira/browse/SOLR-16689
             Project: Solr
          Issue Type: Improvement
      Security Level: Public (Default Security Level. Issues are Public)
    Affects Versions: 9.1.1
            Reporter: Justin Sweeney


There are a couple of inefficiencies in replication that can cause unnecessary CPU usage when replicas are added:
 # The [RecoveryStrategy.replicate()|https://github.com/apache/solr/blob/main/solr/core/src/java/org/apache/solr/cloud/RecoveryStrategy.java#L219] method makes a call to commit on the leader. This happens whenever a replica is reloaded. For PULL replicas in particular this isn't necessary, since we can just pull down whatever the latest data is and rely on other mechanisms to commit on the leader consistently. (As an aside, it seems like forcing a commit on the leader might never be necessary, but for this issue I've limited the focus to PULL replicas.)
 # When the leader has no data yet (index version is 0), a non-leader replica will repeatedly delete and recreate its core due to this case in IndexFetcher: [https://github.com/apache/solr/blob/main/solr/core/src/java/org/apache/solr/handler/IndexFetcher.java#L549]. This can cause unnecessary CPU usage until the leader has data indexed to it.
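
The two points above can be sketched as a pair of small decision functions. This is only an illustration under stated assumptions, not the code in RecoveryStrategy or IndexFetcher: the class ReplicationSketch, the enums, and the methods shouldCommitOnLeader and decideFetch are hypothetical names, and the guard for an empty leader is one possible way to avoid the repeated delete and recreate, not a fix confirmed by this issue.

```java
public class ReplicationSketch {

    /** Replica types as named in SolrCloud; this enum is local to the sketch. */
    public enum ReplicaType { NRT, TLOG, PULL }

    /** Hypothetical outcomes of comparing leader and local index versions. */
    public enum FetchAction { SKIP, CLEAR_LOCAL_INDEX, FETCH }

    /**
     * Point 1: a PULL replica only copies whatever the leader has already
     * committed, so it does not need to force a commit on the leader.
     * (Assumption: other replica types keep the current behavior.)
     */
    public static boolean shouldCommitOnLeader(ReplicaType type) {
        return type != ReplicaType.PULL;
    }

    /**
     * Point 2: when the leader reports index version 0 (no data yet),
     * only clear the local index if it actually holds data. If the local
     * index is empty as well, there is nothing to do, which avoids
     * deleting and recreating the core on every replication attempt.
     */
    public static FetchAction decideFetch(long leaderVersion, long localVersion) {
        if (leaderVersion == 0L) {
            return localVersion == 0L ? FetchAction.SKIP : FetchAction.CLEAR_LOCAL_INDEX;
        }
        // Assumption for the sketch: equal versions mean the replica is in sync.
        return leaderVersion == localVersion ? FetchAction.SKIP : FetchAction.FETCH;
    }

    public static void main(String[] args) {
        // A PULL replica skips the leader commit; an NRT replica does not.
        System.out.println("PULL commits on leader: " + shouldCommitOnLeader(ReplicaType.PULL));
        System.out.println("NRT commits on leader:  " + shouldCommitOnLeader(ReplicaType.NRT));

        // Empty leader and empty replica: no work is needed.
        System.out.println("leader=0, local=0:  " + decideFetch(0L, 0L));
        // Empty leader but the replica still has stale data: clear it once.
        System.out.println("leader=0, local=42: " + decideFetch(0L, 42L));
        // Leader is ahead of the replica: do a normal fetch.
        System.out.println("leader=7, local=3:  " + decideFetch(7L, 3L));
    }
}
```

Keeping both decisions as pure functions of the replica type and the two index versions makes them easy to unit test separately from the network and file-system work that replication otherwise involves.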



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
