Posted to solr-user@lucene.apache.org by yriveiro <ya...@gmail.com> on 2013/10/19 01:04:45 UTC

Leader election fails at some point

Hi,

In this screenshot I have a shard with two replicas and no leader:

http://picpaste.com/qf2jdkj8.png

On the machine hosting the replica shown in green, I found this exception:

INFO  - dat5 - 2013-10-18 22:48:04.775;
org.apache.solr.handler.admin.CoreAdminHandler; Going to wait for
coreNodeName: 192.168.20.106:8983_solr_statistics-13_shard18_replica4,
state: recovering, checkLive: true, onlyIfLeader: true
ERROR - dat5 - 2013-10-18 22:48:04.775;
org.apache.solr.common.SolrException; org.apache.solr.common.SolrException:
We are not the leader
	at
org.apache.solr.handler.admin.CoreAdminHandler.handleWaitForStateAction(CoreAdminHandler.java:824)
	at
org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:192)
	at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
	at
org.apache.solr.servlet.SolrDispatchFilter.handleAdminRequest(SolrDispatchFilter.java:655)
	at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:246)
	at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:195)
	at
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
	at
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
	at
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
	at
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
--
	at
org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:942)
	at
org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1004)
	at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:640)
	at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
	at
org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
	at
org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
	at
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
	at
org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
	at java.lang.Thread.run(Unknown Source)

On the machine with the shard in recovery state I found this exception:

INFO  - dat6 - 2013-10-18 22:48:44.131;
org.apache.solr.cloud.ShardLeaderElectionContext; Running the leader process
for shard shard18
INFO  - dat6 - 2013-10-18 22:48:44.137;
org.apache.solr.cloud.ShardLeaderElectionContext; Checking if I should try
and be the leader.
INFO  - dat6 - 2013-10-18 22:48:44.138;
org.apache.solr.cloud.ShardLeaderElectionContext; My last published State
was recovering, I won't be the leader.
INFO  - dat6 - 2013-10-18 22:48:44.139;
org.apache.solr.cloud.ShardLeaderElectionContext; There may be a better
leader candidate than us - going back into recovery
INFO  - dat6 - 2013-10-18 22:48:44.142;
org.apache.solr.update.DefaultSolrCoreState; Running recovery - first
canceling any ongoing recovery
WARN  - dat6 - 2013-10-18 22:48:44.142;
org.apache.solr.cloud.RecoveryStrategy; Stopping recovery for
zkNodeName=192.168.20.106:8983_solr_statistics-13_shard18_replica4core=statistics-13_shard18_replica4
INFO  - dat6 - 2013-10-18 22:48:45.131;
org.apache.solr.cloud.RecoveryStrategy; Finished recovery process.
core=statistics-13_shard18_replica4
INFO  - dat6 - 2013-10-18 22:48:45.131;
org.apache.solr.cloud.RecoveryStrategy; Starting recovery process. 
core=statistics-13_shard18_replica4 recoveringAfterStartup=false
INFO  - dat6 - 2013-10-18 22:48:45.131; org.apache.solr.cloud.ZkController;
publishing core=statistics-13_shard18_replica4 state=recovering
INFO  - dat6 - 2013-10-18 22:48:45.132; org.apache.solr.cloud.ZkController;
numShards not found on descriptor - reading it from system property
INFO  - dat6 - 2013-10-18 22:48:45.141;
org.apache.solr.client.solrj.impl.HttpClientUtil; Creating new http client,
config:maxConnections=128&maxConnectionsPerHost=32&followRedirects=false
ERROR - dat6 - 2013-10-18 22:48:45.143;
org.apache.solr.common.SolrException; Error while trying to recover.
core=statistics-13_shard18_replica4:org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:
We are not the leader
	at
org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:424)
	at
org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180)
	at
org.apache.solr.cloud.RecoveryStrategy.sendPrepRecoveryCmd(RecoveryStrategy.java:198)
	at
org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:342)
	at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:219)

With no leader we can't index data, because a 503 HTTP status code is
returned.

Is this normal behaviour or a bug?
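
In case it helps anyone reproduce this, here is a minimal sketch of how a
leaderless shard could be spotted from the cluster state rather than from the
screenshot. It assumes the clusterstate.json layout I believe Solr 4.x uses
(collection -> "shards" -> shard -> "replicas", with the leader replica carrying
"leader": "true" as a string). The function name, the sample data, the
core_node names and the second IP address are made up for illustration.

```python
import json


def shards_without_leader(cluster_state):
    """Return sorted (collection, shard) pairs where no replica is marked leader."""
    leaderless = []
    for collection, coll_data in cluster_state.items():
        for shard, shard_data in coll_data.get("shards", {}).items():
            replicas = shard_data.get("replicas", {}).values()
            # Assumption: the leader replica has "leader": "true" (a string, not a bool).
            if not any(r.get("leader") == "true" for r in replicas):
                leaderless.append((collection, shard))
    return sorted(leaderless)


# Hypothetical sample mimicking the situation above: shard18 has one active
# and one recovering replica, and neither is flagged as leader.
sample = json.loads("""
{
  "statistics-13": {
    "shards": {
      "shard17": {
        "replicas": {
          "core_node1": {"state": "active", "leader": "true"},
          "core_node2": {"state": "active"}
        }
      },
      "shard18": {
        "replicas": {
          "core_node3": {"state": "active",
                         "node_name": "192.168.20.105:8983_solr"},
          "core_node4": {"state": "recovering",
                         "node_name": "192.168.20.106:8983_solr"}
        }
      }
    }
  }
}
""")

print(shards_without_leader(sample))
```

On a real cluster the dictionary would come from the clusterstate.json node in
ZooKeeper instead of the inline sample.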



-----
Best regards
--
View this message in context: http://lucene.472066.n3.nabble.com/Leader-election-fails-in-some-point-tp4096514.html