Posted to solr-user@lucene.apache.org by "Kelly, Frank" <fr...@here.com> on 2016/07/27 16:22:11 UTC

SolrCloud: Failure to recover on restart following OutOfMemoryError

Hi All,

 We have a SolrCloud cluster of 3 virtual machines, with 4GB assigned to the Java heap.
Recently we added a number of collections to the cluster, going from around 80 collections (each with 3 shards x 3 replicas) to 150 collections.
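
For a sense of scale, here is a rough count of the cores involved. It is only a sketch under two assumptions that are not confirmed above: that the newer collections also use the 3 shards x 3 replicas layout, and that replicas are spread evenly across the 3 VMs. The function name cores_per_node is made up for illustration.

```python
# Rough core count for a SolrCloud cluster.
# Assumptions (not confirmed by the post): every collection uses the same
# shards x replicas layout, and cores are spread evenly across the nodes.
def cores_per_node(collections, shards=3, replicas=3, nodes=3):
    total = collections * shards * replicas  # one core per shard replica
    return total, total / nodes


for n in (80, 150):
    total, per_node = cores_per_node(n)
    print(f"{n} collections -> {total} cores, ~{per_node:.0f} per VM")
```

Under those assumptions the cluster went from about 240 cores per VM to about 450.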

We then hit heap errors (OutOfMemoryError).
That wasn't the surprise. The surprise was that when I restarted with a larger Java heap (Xmx now 6GB), Solr did not and could not recover despite having enough memory.
It was complaining about ZooKeeper status: "ZooKeeper thinks I am the leader but I am not".
The only way I could recover was to shut down Solr, delete and recreate the ZooKeeper configs, restart Solr, and delete all collections.

Shouldn't Solr be able to recover on restart, or does an OutOfMemoryError cause some kind of ZooKeeper/Solr cluster state corruption that is unrecoverable?
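
To make the question easier to reason about, the state ZooKeeper holds for each replica can be listed before wiping anything. The sketch below is a diagnostic aid, not a fix. It assumes the Collections API CLUSTERSTATUS action is available in the running Solr version and that its JSON response nests collections, shards and replicas as the code expects. The base URL and the function names summarize_replicas and fetch_cluster_status are hypothetical.

```python
import json
from urllib.request import urlopen


def summarize_replicas(cluster_status):
    """Return (leaderless_shards, non_active_replicas) from a parsed
    CLUSTERSTATUS response. The nesting used here is an assumption about
    the response layout: cluster -> collections -> shards -> replicas."""
    leaderless, non_active = [], []
    collections = cluster_status.get("cluster", {}).get("collections", {})
    for coll_name, coll in collections.items():
        for shard_name, shard in coll.get("shards", {}).items():
            has_leader = False
            for replica_name, replica in shard.get("replicas", {}).items():
                state = replica.get("state")
                if state != "active":
                    non_active.append((coll_name, shard_name, replica_name, state))
                elif str(replica.get("leader", "false")).lower() == "true":
                    # Only an active replica flagged as leader counts here.
                    has_leader = True
            if not has_leader:
                leaderless.append((coll_name, shard_name))
    return leaderless, non_active


def fetch_cluster_status(base_url):
    # base_url is a placeholder, e.g. "http://localhost:8983/solr".
    url = f"{base_url}/admin/collections?action=CLUSTERSTATUS&wt=json"
    with urlopen(url) as resp:
        return json.load(resp)
```

Running summarize_replicas(fetch_cluster_status(base_url)) against one node after the restart would show which shards have no active leader and which replicas are stuck in a non-active state. Note that this reports what ZooKeeper has recorded, which may differ from what each node believes about itself.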

-Frank
Frank Kelly
Principal Software Engineer
Predictive Analytics Team (SCBE/HAC/CDA)

HERE
5 Wayside Rd, Burlington, MA 01803, USA
42° 29' 7" N 71° 11' 32" W

Re: SolrCloud: Failure to recover on restart following OutOfMemoryError

Posted by "Kelly, Frank" <fr...@here.com>.
Hi Folks,

 I didn't hear back from anyone on this and just wanted to ping again to see if anyone had any thoughts.
FWIW we are using SolrCloud (Solr 5.3.1) with a separate dedicated ZooKeeper ensemble.

-Frank