Posted to solr-user@lucene.apache.org by Cat Bieber <cb...@techtarget.com> on 2013/08/29 22:16:47 UTC

cores/shards with no leader

Hello,

We're running Solr 4.2.0 and recently converted to SolrCloud. We've got
16 cores, each with 1 shard, 3 ZooKeeper instances, and 4 replicas of each
core. We're suddenly having trouble with very slow Tomcat restarts
(15-45 minutes), and even when we can get a few replicas up, we aren't
seeing a leader for many of our cores. I tried issuing a reload command
through the cores admin, but it fails because there is no leader. Is
there any way to force a leader election? Restarting Tomcat on individual
servers in the cluster doesn't seem to help. We do have some cores that
are serving requests properly and would prefer not to shut down the whole
cluster if possible -- this is a production system.
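
For anyone trying to see which cores are affected, here is a minimal sketch that lists shards with no replica flagged as leader. It assumes the cluster state has already been copied out of ZooKeeper (/clusterstate.json) and parsed into a dict, and that it uses the 4.x layout of collection -> "shards" -> shard -> "replicas", with the leader carrying a "leader": "true" property. The function name and the sample data are hypothetical, not taken from this cluster.

```python
def shards_without_leader(cluster_state):
    """Return sorted (collection, shard) pairs where no replica is flagged as leader.

    Assumes the 4.x clusterstate.json layout described above (an assumption).
    """
    missing = []
    for collection, coll_data in cluster_state.items():
        for shard, shard_data in coll_data.get("shards", {}).items():
            replicas = shard_data.get("replicas", {})
            # The leader flag is stored as the string "true" on one replica.
            has_leader = any(
                str(replica.get("leader", "")).lower() == "true"
                for replica in replicas.values()
            )
            if not has_leader:
                missing.append((collection, shard))
    return sorted(missing)


# Hypothetical sample: core_a has a leader, core_b does not.
sample_state = {
    "core_a": {"shards": {"shard1": {"replicas": {
        "node1": {"state": "active", "leader": "true"},
        "node2": {"state": "active"},
    }}}},
    "core_b": {"shards": {"shard1": {"replicas": {
        "node1": {"state": "down"},
        "node2": {"state": "recovering"},
    }}}},
}

print(shards_without_leader(sample_state))
```

This only reports what ZooKeeper currently records; it does not trigger an election.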

In addition, some cores are reporting a peculiar error, stack trace 
below. The cores that report this problem seem to be completely down 
across all replicas.

ERROR org.apache.solr.servlet.SolrDispatchFilter  - null:org.apache.solr.common.SolrException: Error opening new searcher
         at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1415)
         at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1527)
         at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1304)
         at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1239)
         at org.apache.solr.request.SolrQueryRequestBase.getSearcher(SolrQueryRequestBase.java:94)
         at org.apache.solr.servlet.cache.HttpCacheHeaderUtil.calcLastModified(HttpCacheHeaderUtil.java:145)
         at org.apache.solr.servlet.cache.HttpCacheHeaderUtil.doCacheHeaderValidation(HttpCacheHeaderUtil.java:218)
         at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:334)
         at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:141)
         at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215)
         at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
         at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
         at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:172)
         at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
         at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:117)
         at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:581)
         at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:108)
         at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:174)
         at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:879)
         at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:665)
         at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:528)
         at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:81)
         at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:689)
         at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.RuntimeException: Already closed
         at org.apache.solr.core.CachingDirectoryFactory.get(CachingDirectoryFactory.java:237)
         at org.apache.solr.core.CachingDirectoryFactory.get(CachingDirectoryFactory.java:222)
         at org.apache.solr.core.SolrCore.getNewIndexDir(SolrCore.java:244)
         at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1326)

Has anyone seen either of these issues before? I'm having trouble finding
any information on either situation.

Thanks,
-Cat

Re: cores/shards with no leader

Posted by Srivatsan <ra...@gmail.com>.
I am also facing this issue. I am using Solr 4.3.0.




Re: cores/shards with no leader

Posted by Erick Erickson <er...@gmail.com>.
Shawn's link gives you a fairly concise version; there's a
longer treatment (if you can stay awake) here:
http://searchhub.org/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/

FWIW,
Erick



On Thu, Aug 29, 2013 at 6:21 PM, Cat Bieber <cb...@techtarget.com> wrote:

> Thanks Shawn. We'll give an upgrade a try and see if that helps.
> -Cat
>
>
> On 08/29/2013 04:32 PM, Shawn Heisey wrote:
>
>> On 8/29/2013 2:16 PM, Cat Bieber wrote:
>>
>>
>>> We're running solr 4.2.0 and recently converted to SolrCloud. We've got
>>> 16 cores, each with 1 shard. 3 zookeeper instances, 4 replicas of each
>>> core. We're suddenly having trouble ...
>>>
>>>
>> Solr 4.2.0 had a number of bugs.  They were severe enough that a 4.2.1
>> version was quickly released afterwards.  It should be possible to
>> upgrade without changing your config.  You should probably upgrade to
>> 4.4, but that would be less straightforward.
>>
>>
>>
>

Re: cores/shards with no leader

Posted by Cat Bieber <cb...@techtarget.com>.
Thanks Shawn. We'll give an upgrade a try and see if that helps.
-Cat

On 08/29/2013 04:32 PM, Shawn Heisey wrote:
> On 8/29/2013 2:16 PM, Cat Bieber wrote:
>    
>> We're running solr 4.2.0 and recently converted to SolrCloud. We've got
>> 16 cores, each with 1 shard. 3 zookeeper instances, 4 replicas of each
>> core. We're suddenly having trouble ...
>>      
> Solr 4.2.0 had a number of bugs.  They were severe enough that a 4.2.1
> version was quickly released afterwards.  It should be possible to
> upgrade without changing your config.  You should probably upgrade to
> 4.4, but that would be less straightforward.
>
>    

Re: cores/shards with no leader

Posted by Shawn Heisey <so...@elyograg.org>.
On 8/29/2013 2:16 PM, Cat Bieber wrote:
> We're running solr 4.2.0 and recently converted to SolrCloud. We've got
> 16 cores, each with 1 shard. 3 zookeeper instances, 4 replicas of each
> core. We're suddenly having trouble with very slow tomcat restarts
> (15-45 minutes) and even when we can get a few replicas up, we aren't
> seeing a leader for many of our cores. I tried issuing a reload command
> through the cores admin, but it fails because there is no leader. Is
> there any way to cause an election? Restarting tomcat on individual
> servers in the cluster doesn't seem to help. We do have some cores that
> are serving request properly and would prefer not to shut down the whole
> cluster if possible -- this is a production system.

Solr 4.2.0 had a number of bugs.  They were severe enough that a 4.2.1 
version was quickly released afterwards.  It should be possible to 
upgrade without changing your config.  You should probably upgrade to 
4.4, but that would be less straightforward.

http://lucene.apache.org/solr/4_2_1/changes/Changes.html#4.2.1.bug_fixes

The very slow restarts are probably caused by the problem described at
the URL below, which also describes the solution:

http://wiki.apache.org/solr/SolrPerformanceProblems#Slow_startup
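
One quick way to check, assuming the slow startup is the transaction-log replay case (large tlog files that Solr has to replay on restart; that cause is an assumption here, so compare against the wiki section): a minimal sketch that sums the size of every "tlog" directory under a Solr home. The function name and the path in the usage comment are hypothetical.

```python
import os


def tlog_sizes(solr_home):
    """Map each directory named 'tlog' under solr_home to the total bytes of its files."""
    sizes = {}
    for dirpath, _dirnames, filenames in os.walk(solr_home):
        if os.path.basename(dirpath) == "tlog":
            # Only files directly inside the tlog directory are counted.
            sizes[dirpath] = sum(
                os.path.getsize(os.path.join(dirpath, name)) for name in filenames
            )
    return sizes


# Example usage (path is hypothetical):
#   for path, size in sorted(tlog_sizes("/path/to/solr/home").items()):
#       print(size, path)
```

If some cores show transaction logs that are very large relative to their indexes, that would be consistent with slow log replay at startup; small logs would point to a different cause.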

Thanks,
Shawn