You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucene.apache.org by "Stephan Lagraulet (JIRA)" <ji...@apache.org> on 2015/08/18 09:41:46 UTC
[jira] [Created] (SOLR-7940) [CollectionAPI] Frequent Cluster Status timeout

Stephan Lagraulet created SOLR-7940:
---------------------------------------

             Summary: [CollectionAPI] Frequent Cluster Status timeout
                 Key: SOLR-7940
                 URL: https://issues.apache.org/jira/browse/SOLR-7940
             Project: Solr
          Issue Type: Bug
          Components: SolrCloud
    Affects Versions: 4.10.2
         Environment: Ubuntu on Azure
            Reporter: Stephan Lagraulet


Very often we have a timeout when we call http://server2:8080/solr/admin/collections?action=CLUSTERSTATUS&wt=json
{code}
{"responseHeader": 
{"status": 500,
"QTime": 180100},
"error": 
{"msg": "CLUSTERSTATUS the collection time out:180s",
"trace": "org.apache.solr.common.SolrException: CLUSTERSTATUS the collection time out:180s\n\tat org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:368)\n\tat org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:320)\n\tat org.apache.solr.handler.admin.CollectionsHandler.handleClusterStatus(CollectionsHandler.java:640)\n\tat org.apache.solr.handler.admin.CollectionsHandler.handleRequestBody(CollectionsHandler.java:220)\n\tat org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)\n\tat org.apache.solr.servlet.SolrDispatchFilter.handleAdminRequest(SolrDispatchFilter.java:729)\n\tat org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:267)\n\tat org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)\n\tat org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1338)\n\tat org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:484)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:119)\n\tat org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:524)\n\tat org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:233)\n\tat org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1065)\n\tat org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:413)\n\tat org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:192)\n\tat org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:999)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)\n\tat org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:250)\n\tat org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:149)\n\tat org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:111)\n\tat org.eclipse.jetty.server.Server.handle(Server.java:350)\n\tat org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:454)\n\tat org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:890)\n\tat org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:944)\n\tat org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:630)\n\tat org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:230)\n\tat org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:77)\n\tat org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:606)\n\tat org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:46)\n\tat org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:603)\n\tat org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:538)\n\tat java.lang.Thread.run(Thread.java:745)\n",
"code": 500}}
{code}

The cluster has 3 SolR nodes with 6 small collections replicated on all nodes.
We were using this api to monitor cluster state but it was failing every 10 minutes. We switched by using ZkStateReader in CloudSolrServer and it has been working for a day without problems.

Is there a kind of deadlock as this call was been made on the three nodes concurrently?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org