Posted to issues@solr.apache.org by GitBox <gi...@apache.org> on 2022/10/21 17:49:44 UTC

[GitHub] [solr] justinrsweeney opened a new pull request, #1103: SOLR-16487: Pull Replica Efficiency Improvements

justinrsweeney opened a new pull request, #1103:
URL: https://github.com/apache/solr/pull/1103

   https://issues.apache.org/jira/browse/SOLR-16487
   
   # Description
   
   This makes improvements to the following inefficiencies in how non-leader replicas are handled:
   
   1. The [RecoveryStrategy.replicate()](https://github.com/apache/solr/blob/main/solr/core/src/java/org/apache/solr/cloud/RecoveryStrategy.java#L219) method issues a commit on the leader. This happens whenever a replica is reloaded. For PULL replicas in particular this isn't necessary, since we can just pull down whatever the latest data is and rely on other mechanisms to commit on the leader consistently. (As an aside, it seems like forcing a commit on the leader might never be necessary, but for this PR I've limited the change to PULL replicas.)
   2. When the leader has no data yet (index version is 0), a non-leader replica will repeatedly delete and recreate its core due to this case in IndexFetcher: https://github.com/apache/solr/blob/main/solr/core/src/java/org/apache/solr/handler/IndexFetcher.java#L549. This can cause unnecessary CPU usage until the leader has data indexed to it.
   3. The polling for replication is fairly simple, but can lead to polling too often. As an example, suppose you had the following config for commits:
   ```
   <autoCommit>
       <maxTime>15000</maxTime>
       <openSearcher>false</openSearcher>
   </autoCommit>
   
   <autoSoftCommit>
       <maxTime>60000</maxTime>
   </autoSoftCommit>
   ```
   The current logic would set up polling at half of the autoCommit time, i.e. every 7.5 seconds. However, since that commit doesn't open a new searcher, changes will only be reflected on the leader every 60 seconds. We can make this logic a bit smarter, knowing that the replication handler won't reflect changes until a new searcher is opened.
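
   To make the arithmetic concrete, here is a minimal sketch of the interval selection described above. It is not the code in this PR: the class and method names are hypothetical, and it assumes the existing "half of the commit time" rule is kept and simply applied to whichever commit actually opens a searcher.

   ```java
   // Hypothetical illustration only; these names do not exist in Solr.
   final class PollIntervalSketch {
       private PollIntervalSketch() {}

       /**
        * Returns the commit interval (ms) after which changes become visible,
        * or -1 if neither commit type yields one. Values <= 0 mean "disabled".
        */
       public static long visibleCommitIntervalMs(
               long autoCommitMs, boolean openSearcher, long autoSoftCommitMs) {
           if (autoCommitMs > 0 && openSearcher) {
               return autoCommitMs; // hard commit opens a searcher
           }
           if (autoSoftCommitMs > 0) {
               return autoSoftCommitMs; // otherwise only soft commits expose changes
           }
           return -1L; // assumption: caller falls back to some default
       }

       /** Assumption: poll at half of the visible commit interval, as the current logic does. */
       public static long pollIntervalMs(
               long autoCommitMs, boolean openSearcher, long autoSoftCommitMs) {
           long basis = visibleCommitIntervalMs(autoCommitMs, openSearcher, autoSoftCommitMs);
           return basis > 0 ? basis / 2 : -1L;
       }

       public static void main(String[] args) {
           // Config from the example above: autoCommit 15s (openSearcher=false), autoSoftCommit 60s.
           System.out.println(pollIntervalMs(15000, false, 60000)); // prints 30000
           // Same config, but with openSearcher=true on autoCommit.
           System.out.println(pollIntervalMs(15000, true, 60000));  // prints 7500
       }
   }
   ```

   Under these assumptions the example config would poll every 30 seconds instead of every 7.5 seconds.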
   
   # Solution
   
   This PR includes a number of small changes to fix the issues above:
   1. Adding a check for whether the current replica is a non-leader replica and, if so, skipping the commit call to the leader.
   2. When the leader has a version of 0, checking whether the current version is also 0 and doing nothing in that case. Previously this check used the generation, which starts at 1.
   3. Modifying the code that sets the polling interval for replication to use the autoCommit time only if openSearcher is true; otherwise it uses the soft commit time.
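
   The first two changes amount to two small decisions, sketched below. This is a simplified illustration, not the code in this PR: every name is hypothetical, the commit check is written for PULL replicas only (the scope stated in the description), and the non-zero leader case is collapsed into a single "fetch" outcome.

   ```java
   // Hypothetical illustration only; these names do not exist in Solr.
   final class ReplicaRecoverySketch {
       private ReplicaRecoverySketch() {}

       /** Local stand-in for the replica types; not Solr's own enum. */
       public enum ReplicaKind { NRT, TLOG, PULL }

       public enum Action { NOTHING_TO_DO, CLEAR_LOCAL_INDEX, FETCH_FROM_LEADER }

       /** Change 1: skip the commit on the leader when recovering a PULL replica. */
       public static boolean shouldCommitOnLeader(ReplicaKind kind) {
           return kind != ReplicaKind.PULL;
       }

       /**
        * Change 2: compare index versions rather than generations.
        * A fresh index has version 0 but generation 1, so a generation-based
        * check never sees the replica as "already empty".
        */
       public static Action onLeaderVersion(long leaderVersion, long localVersion) {
           if (leaderVersion == 0L) {
               // Leader has no data: only clear the local index if it holds something.
               return localVersion == 0L ? Action.NOTHING_TO_DO : Action.CLEAR_LOCAL_INDEX;
           }
           // Simplification: the real fetch path also compares versions before copying.
           return Action.FETCH_FROM_LEADER;
       }

       public static void main(String[] args) {
           System.out.println(shouldCommitOnLeader(ReplicaKind.PULL)); // prints false
           System.out.println(onLeaderVersion(0L, 0L));                // prints NOTHING_TO_DO
       }
   }
   ```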
   
   # Tests
   
   Please describe the tests you've developed or run to confirm this patch implements the feature or solves the problem.
   
   # Checklist
   
   Please review the following and check all that apply:
   
   - [ ] I have reviewed the guidelines for [How to Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms to the standards described there to the best of my ability.
   - [ ] I have created a Jira issue and added the issue ID to my pull request title.
   - [ ] I have given Solr maintainers [access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork) to contribute to my PR branch. (optional but recommended)
   - [ ] I have developed this patch against the `main` branch.
   - [ ] I have run `./gradlew check`.
   - [ ] I have added tests for my changes.
   - [ ] I have added documentation for the [Reference Guide](https://github.com/apache/solr/tree/main/solr/solr-ref-guide)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org


[GitHub] [solr] justinrsweeney closed pull request #1103: SOLR-16487: Pull Replica Efficiency Improvements

Posted by "justinrsweeney (via GitHub)" <gi...@apache.org>.
justinrsweeney closed pull request #1103: SOLR-16487: Pull Replica Efficiency Improvements
URL: https://github.com/apache/solr/pull/1103



[GitHub] [solr] sonatype-lift[bot] commented on pull request #1103: SOLR-16487: Pull Replica Efficiency Improvements

Posted by GitBox <gi...@apache.org>.
sonatype-lift[bot] commented on PR #1103:
URL: https://github.com/apache/solr/pull/1103#issuecomment-1287314111

   :warning: **313 God Classes** were detected by Lift in this project. [Visit the Lift web console](https://lift.sonatype.com/results/github.com/apache/solr/01GFXTMH7FHEGWCXP6WXF2MY64?tab=technical-debt&utm_source=github.com&utm_campaign=lift-comment&utm_content=apache\%20solr) for more details.

