Posted to solr-user@lucene.apache.org by patrick conant <pa...@gmail.com> on 2014/01/07 17:57:54 UTC

no servers hosting shard

In our Solr setup we have two shards, each running on two servers.  The
server that was the leader for one of the shards ran into a problem, and
since we restarted the service, Solr has not elected a new leader for that
shard.

The stack traces from the logs are below, and the 'Cloud Dump' from the
Solr console is attached.  We're running Solr 4.4.0.  Any guidance on how
to recover from this?  Restarting or redeploying the service doesn't seem
to make any difference.

Thanks,
Pat.


2014-01-07 00:00:10,754 [http-8080-62] ERROR org.apache.solr.core.SolrCore - org.apache.solr.common.SolrException: no servers hosting shard:
at org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:149)
at org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:119)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
at java.lang.Thread.run(Thread.java:662)

2014-01-07 09:38:33,701 [http-8080-21] ERROR org.apache.solr.core.SolrCore - org.apache.solr.common.SolrException: No registered leader was found, collection:customerOrderSearch slice:shard1
at org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:487)
at org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:470)
at org.apache.solr.update.processor.DistributedUpdateProcessor.setupRequest(DistributedUpdateProcessor.java:223)
at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:428)
at org.apache.solr.handler.loader.XMLLoader.processUpdate(XMLLoader.java:246)
at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:173)
at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1904)
at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:659)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:362)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:158)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:861)
at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:606)
at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
at java.lang.Thread.run(Thread.java:662)
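
For anyone triaging a similar log, here is a minimal sketch of pulling the
affected collection and shard out of "No registered leader was found"
messages like the one above. The function name and the regular expression
are illustrative assumptions based only on the message text in this thread,
not on any Solr API; whitespace is matched loosely because mail clients and
log viewers may wrap the line.

```python
import re

# Pattern derived from the message text quoted in this thread.
# \s+ / \s* tolerate line wrapping between the parts of the message.
NO_LEADER = re.compile(
    r"No\s+registered\s+leader\s+was\s+found,\s*"
    r"collection:(?P<collection>\S+)\s+slice:(?P<shard>\S+)"
)


def shards_without_leader(log_text):
    """Return sorted, de-duplicated (collection, shard) pairs reported as leaderless."""
    return sorted(
        {(m.group("collection"), m.group("shard"))
         for m in NO_LEADER.finditer(log_text)}
    )


# The log line from this thread, wrapped the way it arrived by mail.
sample = (
    "2014-01-07 09:38:33,701 [http-8080-21] ERROR org.apache.solr.core.SolrCore\n"
    "- org.apache.solr.common.SolrException: No registered leader was found,\n"
    "collection:customerOrderSearch slice:shard1\n"
)
print(shards_without_leader(sample))
```

Running this over a whole log file shows quickly whether the problem is
confined to one shard, as it was here.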

Re: no servers hosting shard

Posted by patrick conant <pa...@gmail.com>.
We found a way to recover.  The following sequence allowed everything to
start up successfully:

- Stop all Solr instances.
- Stop all ZooKeeper instances.
- Start all ZooKeeper instances.
- Start the Solr instances one at a time.

Restarting the first Solr instance took several minutes, but the subsequent
instances started up much more quickly.
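
The sequence above can be sketched as a small script. This is only an
outline under stated assumptions: the stop, start and wait_until_ready
callables are hypothetical hooks you would supply yourself (for example,
wrappers around your service manager), the node names are made up, and
waiting for ZooKeeper before starting Solr is an added precaution rather
than something we tested separately.

```python
def full_restart(solr_nodes, zk_nodes, stop, start, wait_until_ready):
    """Apply the stop/start order described in this message.

    stop, start and wait_until_ready are caller-supplied callables
    (hypothetical hooks); each takes a kind ("solr" or "zookeeper")
    and a node name.
    """
    for node in solr_nodes:              # 1. stop all Solr instances
        stop("solr", node)
    for node in zk_nodes:                # 2. stop all ZooKeeper instances
        stop("zookeeper", node)
    for node in zk_nodes:                # 3. start all ZooKeeper instances
        start("zookeeper", node)
    for node in zk_nodes:                # assumption: wait for ZooKeeper first
        wait_until_ready("zookeeper", node)
    for node in solr_nodes:              # 4. start Solr one instance at a time
        start("solr", node)
        wait_until_ready("solr", node)   # finish this node before the next


# Dry run with recording stubs; the node names are placeholders.
calls = []
full_restart(
    ["solr-a", "solr-b"],
    ["zk-1", "zk-2", "zk-3"],
    stop=lambda kind, node: calls.append(("stop", kind, node)),
    start=lambda kind, node: calls.append(("start", kind, node)),
    wait_until_ready=lambda kind, node: calls.append(("wait", kind, node)),
)
for call in calls:
    print(call)
```

Passing the actions in as callables keeps the ordering logic separate from
however the services are actually managed, and makes a dry run like the one
above possible before touching a live cluster.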

Cheers,
Pat.






Re: no servers hosting shard

Posted by patrick conant <pa...@gmail.com>.
After a full bounce of Tomcat, I'm now getting a new exception (below).  I
can browse the ZooKeeper config in the Solr admin UI, and can confirm that
there's a node for '/collections/customerOrderSearch/leaders/shard2', but
no node for '/collections/customerOrderSearch/leaders/shard1'.  Any ideas
or guidance on how to recover would still be appreciated.  We've restarted
all three ZooKeeper instances and both Solr instances, but that hasn't made
any difference.

--p.




2014-01-07 10:06:14,980 [coreLoadExecutor-4-thread-1] ERROR org.apache.solr.core.CoreContainer - null:org.apache.solr.common.cloud.ZooKeeperException:
at org.apache.solr.core.ZkContainer.registerInZk(ZkContainer.java:309)
at org.apache.solr.core.CoreContainer.registerCore(CoreContainer.java:556)
at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:365)
at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:356)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.solr.common.SolrException: Error getting leader from zk for shard shard1
at org.apache.solr.cloud.ZkController.getLeader(ZkController.java:864)
at org.apache.solr.cloud.ZkController.register(ZkController.java:773)
at org.apache.solr.cloud.ZkController.register(ZkController.java:723)
at org.apache.solr.core.ZkContainer.registerInZk(ZkContainer.java:286)
... 11 more
Caused by: org.apache.solr.common.SolrException: Could not get leader props
at org.apache.solr.cloud.ZkController.getLeaderProps(ZkController.java:911)
at org.apache.solr.cloud.ZkController.getLeaderProps(ZkController.java:875)
at org.apache.solr.cloud.ZkController.getLeader(ZkController.java:839)
... 14 more
Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /collections/customerOrderSearch/leaders/shard1
at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1151)
at org.apache.solr.common.cloud.SolrZkClient$7.execute(SolrZkClient.java:252)
at org.apache.solr.common.cloud.SolrZkClient$7.execute(SolrZkClient.java:249)
at org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:65)
at org.apache.solr.common.cloud.SolrZkClient.getData(SolrZkClient.java:249)
at org.apache.solr.cloud.ZkController.getLeaderProps(ZkController.java:889)
... 16 more
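
The NoNode error above matches what the admin UI showed: a leader node for
shard2 but none for shard1. A minimal sketch of that comparison follows. The
path layout is taken from the trace; the helper names are made up for
illustration, and the list of existing paths is assumed to come from
whatever ZooKeeper client or UI you use to list nodes (not shown here).

```python
def leader_path(collection, shard):
    """Build the leader node path in the layout seen in the trace above."""
    return "/collections/{}/leaders/{}".format(collection, shard)


def missing_leaders(collection, shards, existing_paths):
    """Return the shards whose leader node is absent from existing_paths."""
    existing = set(existing_paths)
    return [s for s in shards if leader_path(collection, s) not in existing]


# State described in this message: only shard2 has a leader node.
existing = ["/collections/customerOrderSearch/leaders/shard2"]
print(missing_leaders("customerOrderSearch", ["shard1", "shard2"], existing))
```

An empty result would mean every shard has a leader node registered.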


