Posted to dev@lucene.apache.org by "Christian Schramm (JIRA)" <ji...@apache.org> on 2014/12/05 11:23:12 UTC

[jira] [Commented] (SOLR-5821) Search inconsistency on SolrCloud replicas

    [ https://issues.apache.org/jira/browse/SOLR-5821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14235336#comment-14235336 ] 

Christian Schramm commented on SOLR-5821:
-----------------------------------------

I can confirm that the problem still exists in 4.10.1.
I set up a cluster of 8 servers hosting a set of 9 cores, distributed across 4 shards, with 4 replicas.
The 8 Solr servers were running Tomcat 7 and using a ZooKeeper cluster (3 nodes).
I was using a load balancer to dispatch incoming queries across the 8 servers.
The cluster ran fine for a few hours, but under a lot of commits the replicas quickly got out of sync.
As a result, searching the same core on different hosts gave different results.
There were no errors, however, in ZooKeeper or in the Solr logs.
I guess and hope that this is not expected behaviour...?
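
One way to check for this kind of divergence is to ask each replica of a core for its own hit count and compare the numbers. The following is only a minimal sketch under assumptions: the base URLs and core name are hypothetical placeholders, and the helper names are made up for illustration. It relies on the standard `distrib=false` request parameter so that each core answers from its local index only; fetching the URLs and reading `response.numFound` is left out.

```python
from collections import Counter
from urllib.parse import urlencode


def replica_query_url(base_url, core, query="*:*"):
    """Build a select URL that is answered by the addressed core only."""
    params = {
        "q": query,
        "rows": 0,           # only the hit count is needed, not the documents
        "wt": "json",
        "distrib": "false",  # do not fan the request out to other cores
    }
    return f"{base_url.rstrip('/')}/{core}/select?{urlencode(params)}"


def divergent_replicas(counts):
    """Return the replicas whose hit count differs from the most common one.

    `counts` maps a replica name to the numFound value it reported.
    """
    if not counts:
        return {}
    majority, _ = Counter(counts.values()).most_common(1)[0]
    return {name: n for name, n in counts.items() if n != majority}


# Hypothetical host and core names, for illustration only.
url = replica_query_url("http://solr1:8080/solr/", "core1")
out_of_sync = divergent_replicas({"solr1": 1200, "solr2": 1200, "solr3": 1187})
```

Comparing counts right after a hard commit on all nodes avoids flagging replicas that merely have not yet made the latest updates visible.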

> Search inconsistency on SolrCloud replicas
> ------------------------------------------
>
>                 Key: SOLR-5821
>                 URL: https://issues.apache.org/jira/browse/SOLR-5821
>             Project: Solr
>          Issue Type: Bug
>          Components: SolrCloud
>    Affects Versions: 4.6.1, 4.7.1
>         Environment: SolrCloud:
> 1 shard, 2 replicas
> Both instances/replicas have identical hardware/software:
> CPU(s): 4
> RAM: 8Gb
> HDD: 100Gb
> OS: CentOS 6.5
> ZooKeeper 3.4.5
> Tomcat 8.0.3
> Solr 4.6.1
> Servers are utilized to run Solr only.
>            Reporter: Maxim Novikov
>            Priority: Critical
>              Labels: cloud, inconsistency, replica, search
>         Attachments: Screen Shot 2014-04-05 at 2.26.26 AM.png, Screen Shot 2014-04-05 at 2.26.41 AM.png
>
>
> We use the following infrastructure:
> SolrCloud with 1 shard and 2 replicas. The index is built using DataImportHandler (importing data from the database). The number of items in the index can vary from 100 to 100,000,000.
> After indexing part of the data (not necessarily all the data, it is enough to have a small number of items in the search index), we can observe that Solr instances (replicas) return different results for the same search queries. I believe it happens because some of the results have the same scores, and Solr instances return those in a random order.
> PS This is a critical issue for us as we use a load balancer to scale Solr through replicas, and as a result of this issue, we retrieve various results for the same queries all the time. They are not necessarily completely different, but even a couple of items that differ is a deal breaker.
> The expected behaviour would be to always get identical results for the same search queries from all replicas. Otherwise, this "cloud" setup simply works unreliably.
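
The hypothesis in the quoted description (documents with equal scores coming back in a different order from each replica) can be illustrated with a small sketch. It assumes the replicas hold the same documents and compute the same scores, and that `id` is the unique key field; the sample hits are made up. Adding a secondary sort on the unique key makes the order independent of how each replica happens to store its documents, which in Solr terms corresponds to a request such as `sort=score desc, id asc`.

```python
def rank(hits):
    """Order hits by score descending, breaking ties by unique id ascending."""
    return sorted(hits, key=lambda h: (-h["score"], h["id"]))


# Made-up results: same documents and scores, different internal order.
replica_a = [
    {"id": "doc2", "score": 1.0},
    {"id": "doc1", "score": 1.0},
    {"id": "doc3", "score": 0.5},
]
replica_b = [
    {"id": "doc1", "score": 1.0},
    {"id": "doc3", "score": 0.5},
    {"id": "doc2", "score": 1.0},
]

# With the tie-breaker, both replicas yield the same ordering.
same_order = rank(replica_a) == rank(replica_b)
```

This only addresses ordering among tied scores; it does not help when replicas actually contain different documents, as in the out-of-sync case described in the comment above.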



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
