Posted to solr-user@lucene.apache.org by Cool Techi <co...@outlook.com> on 2014/07/17 14:57:46 UTC

SolrCloud Issues

Hi,
We have moved to SolrCloud (version 4.8) and are facing several issues in this setup compared to the master/slave setup we have had for a while now:
1) We have a 2-shard setup with one replica each, and we notice that most of the time the replicas are in recovering status. What could be causing this?
2) Inconsistent result counts, especially when one of the nodes is recovering. I asked another question about this earlier. To our understanding, a recovering node doesn't return any results, so what else can cause this?
3) A Solr node goes down very frequently. There is no OOM or other error in the logs, but a node keeps going down. Also, at times we have noticed that Tomcat stops responding. Since there are so many parts to SolrCloud, quickly working out what's causing the issue is difficult, so if anyone else has faced this, any pointers would be very helpful.

Since this is happening in our UAT environment, we need a fix soon.
Regards,
Ayush

RE: SolrCloud Issues

Posted by Cool Techi <co...@outlook.com>.
1) The ZooKeepers are on the same nodes as Solr. Should we move them out? What would be the basic configuration of a machine that just runs ZooKeeper?
2) The servers are pretty big:
         2 x quad-core 64-bit processors
         96 GB of RAM
         500 GB SSD drive on which Solr resides

The index size across the two shards is 250 GB. How can we detect performance problems, if any? Also, what keeps a node in recovering mode most of the time? We index about 2K documents/minute.
What would be the ideal configuration for such a load and an increasing index size?
Regards,
Ayush




RE: SolrCloud Issues

Posted by Dan Murphy <Da...@buy4now.com>.
Have you deployed ZooKeeper on servers other than the Solr nodes?
If you have them on the Solr nodes, then you may be getting elections when under pressure.
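One way to see whether ZooKeeper itself is under pressure is to ask each server for its own statistics. The following is a minimal sketch, assuming the ZooKeeper servers accept the `mntr` four-letter command on the client port; the host name in the usage note is a placeholder, and `parse_mntr` and `zk_mntr` are hypothetical helper names, not part of Solr or ZooKeeper.

```python
import socket


def parse_mntr(text):
    """Turn tab-separated 'key<TAB>value' lines from mntr into a dict."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition("\t")
        if sep:  # skip lines that do not contain a tab
            stats[key.strip()] = value.strip()
    return stats


def zk_mntr(host, port=2181, timeout=5.0):
    """Send 'mntr' to one ZooKeeper server and return its stats as a dict.

    Assumption: port 2181 is the client port; adjust for your ensemble.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"mntr")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closes the connection after replying
                break
            chunks.append(data)
    return parse_mntr(b"".join(chunks).decode("utf-8", "replace"))
```

Calling `zk_mntr("zk1.example.com")` against each server and watching `zk_server_state` and `zk_max_latency` over time should show whether the leader keeps changing or latency spikes while Solr is busy.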


Re: SolrCloud Issues

Posted by Shawn Heisey <so...@elyograg.org>.
I don't have anything specific for you, but if you are having any kind
of performance issues at all, it can lead to bizarre SolrCloud behavior.

The basic ZooKeeper client timeout defaults to 15 seconds.  This is a
very long timeout, but if anything is happening that makes any part of
SolrCloud wait longer than 15 seconds, SolrCloud will think there's a
problem that needs recovery.
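
To see how often replicas actually drop out of the active state, the cluster state can be polled and logged. This is a rough sketch, assuming the Collections API `CLUSTERSTATUS` action is available on your version and that its JSON response nests replicas under `cluster` > `collections` > `shards` > `replicas`; the base URL is a placeholder, and both function names are hypothetical helpers.

```python
import json
from urllib.request import urlopen


def non_active_replicas(cluster_status):
    """Return sorted (collection, shard, replica, state) tuples for every
    replica whose state is anything other than 'active'."""
    found = []
    collections = cluster_status.get("cluster", {}).get("collections", {})
    for coll_name, coll in collections.items():
        for shard_name, shard in coll.get("shards", {}).items():
            for replica_name, replica in shard.get("replicas", {}).items():
                state = replica.get("state", "unknown")
                if state != "active":
                    found.append((coll_name, shard_name, replica_name, state))
    return sorted(found)


def fetch_cluster_status(base_url):
    """Fetch the cluster status as a dict.

    Assumption: base_url looks like "http://localhost:8983/solr".
    """
    url = base_url + "/admin/collections?action=CLUSTERSTATUS&wt=json"
    with urlopen(url, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Running `non_active_replicas(fetch_cluster_status(...))` once a minute and comparing the timestamps with GC logs or slow-query logs may show whether recoveries line up with long pauses.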

Here's a summary of common performance problems and some possible solutions:

http://wiki.apache.org/solr/SolrPerformanceProblems

Thanks,
Shawn