Posted to dev@lucene.apache.org by "Shawn Heisey (JIRA)" <ji...@apache.org> on 2018/05/21 23:11:00 UTC

[jira] [Comment Edited] (SOLR-12382) new data not seen immediately after commit() on SolrCloud collection with only TLOG and PULL replicas

    [ https://issues.apache.org/jira/browse/SOLR-12382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16483184#comment-16483184 ] 

Shawn Heisey edited comment on SOLR-12382 at 5/21/18 11:10 PM:
---------------------------------------------------------------

Our documentation does say that NRT is the only replica type that supports soft commits.

https://lucene.apache.org/solr/guide/7_3/shards-and-indexing-data-in-solrcloud.html#all-nrt-replicas

If your code is doing a soft commit, then this is what's happening:

 * Changes are indexed.
 * A soft commit is called.
 * The leader does the commit, but only into memory.  That change is NOT replicated to the other replicas, because TLOG and PULL replicas copy the on-disk index.
 * Your first query is made.  It gets load balanced by the cluster to a replica other than the leader.
 * Within 15 seconds (your autoCommit interval), a hard commit is fired, flushing all segments to disk.  At this point, the changes to the index are on disk, so they are replicated.  When the index on a TLOG or PULL replica changes, that replica opens a new searcher.
 * A query made after the other replicas successfully open a new searcher will see the change, no matter which replica it is sent to.

The solution to this is to use only hard commits, or stick with NRT replicas.
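
As a rough illustration of the "only hard commits" option, here is a minimal sketch that builds the URL for an explicit hard commit against Solr's update handler. The host, port, and collection name are hypothetical placeholders, and the helper name is made up for this example; only the commit, softCommit, and waitSearcher request parameters are Solr's own. Note that, per the steps above, TLOG and PULL replicas would still show the change only after they have replicated the on-disk index and opened a new searcher.

```python
from urllib.parse import urlencode


def hard_commit_url(base_url, collection, wait_searcher=True):
    """Build an update-handler URL that requests an explicit hard commit.

    base_url and collection are placeholders supplied by the caller;
    nothing here contacts a server.
    """
    params = {
        "commit": "true",        # ask for a commit
        "softCommit": "false",   # make it a hard commit, not a soft one
        "waitSearcher": "true" if wait_searcher else "false",
    }
    return f"{base_url.rstrip('/')}/{collection}/update?{urlencode(params)}"


# Hypothetical host and collection name, for illustration only.
print(hard_commit_url("http://localhost:8983/solr", "mycollection"))
```

The resulting URL can be requested with any HTTP client; whether this alone is enough for a given cluster depends on how quickly the non-leader replicas pick up the new index.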



was (Author: elyograg):
We do have documentation that says NRT replica types are the only kind that support soft commits.

https://lucene.apache.org/solr/guide/7_3/shards-and-indexing-data-in-solrcloud.html#all-nrt-replicas

If your code is doing a soft commit, then this is what's happening:

 * Changes are indexed.
 * A soft commit is called.
 * The leader does the commit, but only into memory.  That change is NOT replicated to the other replicas.
 * Your first query is made.  It gets load balanced by the cluster to a replica other than the leader.
 * Within 15 seconds (your autoCommit interval), a hard commit is fired, flushing all segments to disk.  At this point, the changes to the index will be on disk, so they are replicated.
 * A query here will see the change, no matter which replica it is sent to.

The solution to that is to use hard commits or NRT replicas.


> new data not seen immediately after commit() on SolrCloud collection with only TLOG and PULL replicas
> -----------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-12382
>                 URL: https://issues.apache.org/jira/browse/SOLR-12382
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>    Affects Versions: 7.3
>         Environment: SolrCloud on Amazon Linux AMI 2018.03
>  
>            Reporter: Nguyen Nguyen
>            Priority: Major
>
> On a collection with TLOG/PULL replicas, queries that immediately follow commit(waitSearch:true) do NOT return newly added data until several seconds later.
> Tested the same scenario on another collection with only NRT replicas and found that it behaved as expected (the query returned newly added data right after commit(waitSearch:true) was called).
> 7.3 SolrCloud with 3 shards, each shard has 2 TLOG replicas + 1 PULL replica
> Commit settings
> <autoCommit> 
>   <maxTime>15000</maxTime> 
>   <openSearcher>false</openSearcher> 
> </autoCommit>
> <autoSoftCommit> 
>   <maxTime>-1</maxTime> 
> </autoSoftCommit>


