Posted to issues@solr.apache.org by "Shawn Heisey (Jira)" <ji...@apache.org> on 2022/11/24 16:58:00 UTC

[jira] [Comment Edited] (SOLR-16561) Use autoSoftCommmitMaxTime as preferred poll interval of IndexFetcher

    [ https://issues.apache.org/jira/browse/SOLR-16561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17638364#comment-17638364 ] 

Shawn Heisey edited comment on SOLR-16561 at 11/24/22 4:57 PM:
---------------------------------------------------------------

On the ApacheSolr Slack, I encouraged [~sunh11373] to open this issue.  I think this is how the replication interval should have been determined from the beginning.  Opening a new searcher is particularly heavy on TLOG/PULL followers, as noted elsewhere with the report that none of the existing SegmentReader instances are re-used.  The current code is particularly aggressive for most users, given that the autoCommit in our example configs is set to 15 seconds.

Do any of my peers disagree with committing this change?  I think that this should be documented in the section on TLOG and PULL replicas, with some discussion about general recommendations regarding autoSoftCommit and autoCommit.  I will work on some documentation changes to pair with this.

 

 


was (Author: elyograg):
I encouraged [~sunh11373] to open this issue on ApacheSolr slack.  I think this is how the replication interval should have been determined from the beginning.  Opening a new searcher is particularly heavy on TLOG/PULL followers, as noted elsewhere with the report that all the existing SegmentReader instances are not re-used.  The current code is particularly aggressive for most users given that the autoCommit in our example configs is set to 15 seconds.

Do any of my peers disagree with committing this change?  I think that this should be documented in the section on TLOG and PULL replicas, with some discussion about general recommendations regarding autoSoftCommit and autoCommit.  I will work on some documentation changes to pair with this.

 

> Use autoSoftCommmitMaxTime as preferred poll interval of IndexFetcher
> ---------------------------------------------------------------------
>
>                 Key: SOLR-16561
>                 URL: https://issues.apache.org/jira/browse/SOLR-16561
>             Project: Solr
>          Issue Type: Improvement
>          Components: replication (java)
>    Affects Versions: 8.8.2
>            Reporter: Hang Sun
>            Assignee: Shawn Heisey
>            Priority: Minor
>              Labels: replication-performance
>         Attachments: SOLR-16561.patch
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> TLOG/PULL replicas use *IndexFetcher* to fetch segment files from leaders. Once new segment files are downloaded and merged into the existing index, a new Searcher is opened so that the updated data is made available to clients.  The poll interval is determined by the following code in *ReplicateFromLeader*:
> {code:java}
> if (uinfo.autoCommmitMaxTime != -1) {
>    pollIntervalStr = toPollIntervalStr(uinfo.autoCommmitMaxTime/2);
> } else if (uinfo.autoSoftCommmitMaxTime != -1) {
>    pollIntervalStr = toPollIntervalStr(uinfo.autoSoftCommmitMaxTime/2);
> }{code}
>  
> In a typical config for replication using TLOG/PULL replicas, where data visibility is less important (a trade-off to avoid NRT replicas), we set a short commit time to persist changes and a long soft-commit time to make changes visible.
>  
> {code:xml}
> <autoCommit>
>   <maxTime>15000</maxTime>
>   <openSearcher>false</openSearcher>
> </autoCommit>
> <autoSoftCommit>
>   <maxTime>3600000</maxTime>
> </autoSoftCommit>
> {code}
>  
> With the above config, the poll interval will be 15/2, or about 7 sec.  This leads to frequent opening of new Searchers, which has a large impact on realtime user queries, especially if a new Searcher takes a long time to warm up.  It also makes changes visible on followers ahead of leaders.
> Polling for new segment files is mostly about visibility, because TLOG replicas still receive updates to their tlog files via UpdateHandler (this is my understanding). It therefore seems more appropriate to use *autoSoftCommmitMaxTime* as the poll interval.
> I propose the change below, in which *autoSoftCommmitMaxTime* is chosen as the preferred interval.  This will make the poll interval much longer and bring the visibility order more in line with an eventual-consistency pattern.
>  
> {code:java}
> if (uinfo.autoSoftCommmitMaxTime != -1) {
>     pollIntervalStr = toPollIntervalStr(uinfo.autoSoftCommmitMaxTime);
> } else if (uinfo.autoCommmitMaxTime != -1) {
>     pollIntervalStr = toPollIntervalStr(uinfo.autoCommmitMaxTime);
> }
> {code}
> The change has been tried and showed much less impact on realtime queries.
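
To make the difference concrete, the two selection rules quoted in the description can be compared side by side on the sample config (autoCommit 15000 ms, autoSoftCommit 3600000 ms). This is only an illustrative sketch: the class and method names (PollIntervalSketch, currentPollMs, proposedPollMs) are hypothetical and are not part of Solr, and the sketch returns raw milliseconds rather than the formatted string that ReplicateFromLeader builds.

```java
// Illustrative sketch only; names here are hypothetical, not Solr APIs.
class PollIntervalSketch {
    // -1 means "not configured", mirroring the checks in the quoted snippets.
    static final int UNSET = -1;

    // Current rule as quoted: prefer autoCommit, and use half of it.
    static int currentPollMs(int autoCommitMaxTime, int autoSoftCommitMaxTime) {
        if (autoCommitMaxTime != UNSET) {
            return autoCommitMaxTime / 2;
        } else if (autoSoftCommitMaxTime != UNSET) {
            return autoSoftCommitMaxTime / 2;
        }
        return UNSET; // no interval derived from the commit settings
    }

    // Proposed rule as quoted: prefer autoSoftCommit, and use it unhalved.
    static int proposedPollMs(int autoCommitMaxTime, int autoSoftCommitMaxTime) {
        if (autoSoftCommitMaxTime != UNSET) {
            return autoSoftCommitMaxTime;
        } else if (autoCommitMaxTime != UNSET) {
            return autoCommitMaxTime;
        }
        return UNSET; // no interval derived from the commit settings
    }

    public static void main(String[] args) {
        int autoCommit = 15_000;        // <autoCommit><maxTime> from the sample config
        int autoSoftCommit = 3_600_000; // <autoSoftCommit><maxTime> from the sample config

        System.out.println("current:  " + currentPollMs(autoCommit, autoSoftCommit) + " ms");
        System.out.println("proposed: " + proposedPollMs(autoCommit, autoSoftCommit) + " ms");
    }
}
```

With the sample values, the current rule yields 7500 ms and the proposed rule yields 3600000 ms (one hour), which is the gap the description refers to.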



