Posted to issues@solr.apache.org by "Justin Sweeney (Jira)" <ji...@apache.org> on 2023/01/03 17:30:00 UTC

[jira] [Commented] (SOLR-16561) Use autoSoftCommmitMaxTime as preferred poll interval of IndexFetcher

    [ https://issues.apache.org/jira/browse/SOLR-16561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17654115#comment-17654115 ] 

Justin Sweeney commented on SOLR-16561:
---------------------------------------

Hey [~elyograg] [~sunh11373], just bumping this again. When you get a chance, could you take a look at the issue I linked? I think getting that linked issue merged/resolved could serve as a fix for this one.

> Use autoSoftCommmitMaxTime as preferred poll interval of IndexFetcher
> ---------------------------------------------------------------------
>
>                 Key: SOLR-16561
>                 URL: https://issues.apache.org/jira/browse/SOLR-16561
>             Project: Solr
>          Issue Type: Improvement
>          Components: replication (java)
>    Affects Versions: 8.8.2
>            Reporter: Hang Sun
>            Assignee: Shawn Heisey
>            Priority: Minor
>              Labels: replication-performance
>         Attachments: SOLR-16561.patch
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> TLOG/PULL replicas use *IndexFetcher* to fetch segment files from leaders. Once new segment files are downloaded and merged into the existing index, a new Searcher is opened so the updated data is made available to clients. The poll interval is determined by the following code in *ReplicateFromLeader*:
> {code:java}
> if (uinfo.autoCommmitMaxTime != -1) {
>    pollIntervalStr = toPollIntervalStr(uinfo.autoCommmitMaxTime/2);
> } else if (uinfo.autoSoftCommmitMaxTime != -1) {
>    pollIntervalStr = toPollIntervalStr(uinfo.autoSoftCommmitMaxTime/2);
> }{code}
>  
> In a typical replication config using TLOG/PULL replicas, where data visibility is less important (a trade-off to avoid NRT replicas), we set a short commit time to persist changes and a long soft-commit time to make changes visible.
>  
> {code:xml}
> <autoCommit>
>   <maxTime>15000</maxTime>
>   <openSearcher>false</openSearcher>
> </autoCommit>
> <autoSoftCommit>
>   <maxTime>3600000</maxTime>
> </autoSoftCommit>
> {code}
>  
> With the above config, the poll interval will be 15000 ms / 2 = 7500 ms, i.e. about 7 sec. This leads to frequent opening of new Searchers, which has a large impact on realtime user queries, especially if the new Searcher takes a long time to warm up. It also makes changes visible on followers ahead of leaders.
> Polling for new segment files is more about visibility, since TLOG replicas still get updates to their tlog files via UpdateHandler (this is my understanding), so it seems more appropriate to use *autoSoftCommmitMaxTime* as the poll interval.
> I would propose the change below, where *autoSoftCommmitMaxTime* is chosen as the preferred interval. This will make the poll interval much longer and bring the visibility order more in line with the eventual consistency pattern.
>  
> {code:java}
> if (uinfo.autoSoftCommmitMaxTime != -1) {
>     pollIntervalStr = toPollIntervalStr(uinfo.autoSoftCommmitMaxTime);
> } else if (uinfo.autoCommmitMaxTime != -1) {
>     pollIntervalStr = toPollIntervalStr(uinfo.autoCommmitMaxTime);
> }
> {code}
> The change has been tried and showed much less impact on realtime queries.
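
For reference, the two selection rules quoted in the description can be compared side by side with the example config values. This is a minimal sketch, not Solr code: the class and method names (PollIntervalSketch, currentPollMs, proposedPollMs, toHms) are hypothetical, and the h:m:s formatting in toHms is an assumption about what toPollIntervalStr produces.

```java
// Sketch only: mirrors the two selection rules quoted in the issue description.
// PollIntervalSketch, currentPollMs, proposedPollMs and toHms are hypothetical
// names, not Solr APIs. -1 means "not configured", as in the quoted snippets.
final class PollIntervalSketch {

    /** Current rule: prefer half of autoCommit maxTime, else half of autoSoftCommit maxTime. */
    static int currentPollMs(int autoCommitMs, int autoSoftCommitMs) {
        if (autoCommitMs != -1) return autoCommitMs / 2;
        if (autoSoftCommitMs != -1) return autoSoftCommitMs / 2;
        return -1; // neither configured; what Solr does then is outside this sketch
    }

    /** Proposed rule: prefer the full autoSoftCommit maxTime, else the full autoCommit maxTime. */
    static int proposedPollMs(int autoCommitMs, int autoSoftCommitMs) {
        if (autoSoftCommitMs != -1) return autoSoftCommitMs;
        if (autoCommitMs != -1) return autoCommitMs;
        return -1;
    }

    /** Assumed formatting: whole seconds rendered as h:m:s, dropping any sub-second remainder. */
    static String toHms(int ms) {
        int totalSec = ms / 1000;
        return (totalSec / 3600) + ":" + ((totalSec % 3600) / 60) + ":" + (totalSec % 60);
    }

    public static void main(String[] args) {
        int autoCommitMs = 15000;       // <autoCommit><maxTime> from the example config
        int autoSoftCommitMs = 3600000; // <autoSoftCommit><maxTime> from the example config
        System.out.println("current:  " + toHms(currentPollMs(autoCommitMs, autoSoftCommitMs)));
        System.out.println("proposed: " + toHms(proposedPollMs(autoCommitMs, autoSoftCommitMs)));
    }
}
```

With the example config, the current rule gives 7500 ms (0:0:7 under the assumed formatting) and the proposed rule gives 3600000 ms (1:0:0), which is the gap between polling every few seconds and polling once an hour.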


