Posted to dev@lucene.apache.org by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2013/02/19 11:17:14 UTC

[jira] [Commented] (SOLR-4260) Inconsistent numDocs between leader and replica

    [ https://issues.apache.org/jira/browse/SOLR-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13581189#comment-13581189 ] 

Markus Jelsma commented on SOLR-4260:
-------------------------------------

Here's the index information for two cores of the same shard, running on different nodes.

{code}
<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">1</int>
</lst>
<lst name="index">
  <int name="numDocs">117744</int>
  <int name="maxDoc">118160</int>
  <int name="deletedDocs">416</int>
  <long name="version">3802</long>
  <int name="segmentCount">15</int>
  <bool name="current">true</bool>
  <bool name="hasDeletions">true</bool>
  <str name="directory">org.apache.lucene.store.NRTCachingDirectory:NRTCachingDirectory(org.apache.lucene.store.MMapDirectory@/opt/solr/cores/shard_h/data/index.20130211094737738 lockFactory=org.apache.lucene.store.NativeFSLockFactory@2ca7563d; maxCacheMB=48.0 maxMergeSizeMB=4.0)</str>
  <lst name="userData">
    <str name="commitTimeMSec">1361265544970</str>
  </lst>
  <date name="lastModified">2013-02-19T09:19:04.97Z</date>
</lst>
</response>

{code}

{code}
<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">0</int>
</lst>
<lst name="index">
  <int name="numDocs">117767</int>
  <int name="maxDoc">118181</int>
  <int name="deletedDocs">414</int>
  <long name="version">3772</long>
  <int name="segmentCount">13</int>
  <bool name="current">true</bool>
  <bool name="hasDeletions">true</bool>
  <str name="directory">org.apache.lucene.store.NRTCachingDirectory:NRTCachingDirectory(org.apache.lucene.store.MMapDirectory@/opt/solr/cores/shard_h/data/index.20130211105622621 lockFactory=org.apache.lucene.store.NativeFSLockFactory@684b4388; maxCacheMB=48.0 maxMergeSizeMB=4.0)</str>
  <lst name="userData">
    <str name="commitTimeMSec">1361265544937</str>
  </lst>
  <date name="lastModified">2013-02-19T09:19:04.937Z</date>
</lst>
</response>
{code}

We send updates/deletes to the cluster every 10-15 minutes. The shard does not become synchronized unless I remove the index of one of the nodes.
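
For reference, a minimal sketch of how the two responses above could be compared field by field. It assumes the responses are available as strings, trimmed here to the numeric fields of the index section. The names CORE_A, CORE_B, index_stats and diff_stats are illustrative only and are not part of Solr.

```python
import xml.etree.ElementTree as ET

# Trimmed copies of the two responses above (numeric index fields only).
# CORE_A is the first response, CORE_B the second.
CORE_A = """<response><lst name="index">
  <int name="numDocs">117744</int>
  <int name="maxDoc">118160</int>
  <int name="deletedDocs">416</int>
  <long name="version">3802</long>
  <int name="segmentCount">15</int>
</lst></response>"""

CORE_B = """<response><lst name="index">
  <int name="numDocs">117767</int>
  <int name="maxDoc">118181</int>
  <int name="deletedDocs">414</int>
  <long name="version">3772</long>
  <int name="segmentCount">13</int>
</lst></response>"""


def index_stats(xml_text):
    """Return the <int>/<long> children of <lst name="index"> as a dict."""
    root = ET.fromstring(xml_text)
    index = root.find("./lst[@name='index']")
    stats = {}
    for child in index:
        if child.tag in ("int", "long"):
            stats[child.get("name")] = int(child.text)
    return stats


def diff_stats(a, b):
    """Map each field that differs to (value_a, value_b, value_b - value_a)."""
    return {
        key: (a[key], b[key], b[key] - a[key])
        for key in a
        if key in b and a[key] != b[key]
    }


if __name__ == "__main__":
    a, b = index_stats(CORE_A), index_stats(CORE_B)
    for field, (va, vb, delta) in diff_stats(a, b).items():
        print(f"{field}: {va} vs {vb} (delta {delta:+d})")
```

In both responses numDocs equals maxDoc minus deletedDocs, so each core is internally consistent. The two cores differ from each other by 23 documents in numDocs.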
                
> Inconsistent numDocs between leader and replica
> -----------------------------------------------
>
>                 Key: SOLR-4260
>                 URL: https://issues.apache.org/jira/browse/SOLR-4260
>             Project: Solr
>          Issue Type: Bug
>          Components: SolrCloud
>    Affects Versions: 5.0
>         Environment: 5.0.0.2013.01.04.15.31.51
>            Reporter: Markus Jelsma
>            Priority: Critical
>             Fix For: 5.0
>
>
> After wiping all cores and reindexing some 3.3 million docs from Nutch using CloudSolrServer, we see inconsistencies between the leader and replica for some shards.
> Each core holds about 3.3k documents. For some reason 5 out of 10 shards have a small deviation in the number of documents. The leader and replica deviate by roughly 10-20 documents, not more.
> Results hopping ranks in the result set for identical queries got my attention: small IDF differences for exactly the same record caused it to shift positions in the result set. No records were indexed during those tests. Consecutive catch-all queries also return different numDocs values.
> We're running a 10-node test cluster with 10 shards and a replication factor of two, and frequently reindex using a fresh build from trunk. I had not seen this issue for quite some time until a few days ago.

