Posted to dev@lucene.apache.org by "Peter Horvath (JIRA)" <ji...@apache.org> on 2016/05/18 14:21:13 UTC

[jira] [Updated] (SOLR-9129) Solr Cloud hangs when creating large number of collections and node fails to recover after restart

     [ https://issues.apache.org/jira/browse/SOLR-9129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Horvath updated SOLR-9129:
--------------------------------
    Attachment: exception1.txt

null:org.apache.solr.common.SolrException: Could not fully create collection: FOOBAR

> Solr Cloud hangs when creating large number of collections and node fails to recover after restart
> --------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-9129
>                 URL: https://issues.apache.org/jira/browse/SOLR-9129
>             Project: Solr
>          Issue Type: Bug
>          Components: Server
>    Affects Versions: 6.0
>         Environment: OS: GNU Linux, kernel 4.4.0-22 on x86_64 (Ubuntu Linux 16.04 LTS (64-bit))
> RAM: 16 GB
> CPU: Intel Core i7-4720HQ CPU @ 2.60GHz × 8
> Java version: Oracle JDK 1.8.0_92 (x64) build 1.8.0_92-b14 Java HotSpot(TM) 64-Bit Server VM (build 25.92-b14, mixed mode)
>            Reporter: Peter Horvath
>         Attachments: exception1.txt
>
>
> I attempted to benchmark SolrCloud to see how well it would work with some sample data set of ours. 
> I wanted to create about 2500 empty collections first to see how that would scale.
> Unfortunately, the test was not successful. Solr started failing after creating around 2000 collections, and the cluster failed to recover after a complete restart, which is quite concerning to me.
> I based my environment on the cloud example (I use the same config set as the gettingstarted example collection, etc.), so I have a vanilla install and used the following commands to bring the nodes online:
> .../solr/6.0.0/bin/solr start -m 2g -cloud -p 8983 -s ".../solr/6.0.0/example/cloud/node1/solr"
> .../solr/6.0.0/bin/solr start -m 2g -cloud -p 7574 -s ".../solr/6.0.0/example/cloud/node2/solr" -z localhost:9983
> .../solr/6.0.0/bin/solr start -m 2g -cloud -p 8984 -s ".../solr/6.0.0/example/cloud/node3/solr" -z localhost:9983
> .../solr/6.0.0/bin/solr start -m 2g -cloud -p 7575 -s ".../solr/6.0.0/example/cloud/node4/solr" -z localhost:9983
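
The report does not include the script used to create the collections. A minimal sketch of such a loop against the Solr Collections API (action=CREATE) might look like the following. The collection name prefix, the shard and replica counts, the timeout, and the config set name "gettingstarted" are assumptions made for illustration; they are not taken from the report.

```python
import urllib.parse
import urllib.request


def create_collection_url(base_url, name, num_shards=1, replication_factor=1,
                          config_name="gettingstarted"):
    """Build a Collections API CREATE URL for one collection.

    The defaults are assumptions; the report does not state the shard count,
    the replication factor, or the exact config set name.
    """
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
        "collection.configName": config_name,
        "wt": "json",
    }
    return base_url.rstrip("/") + "/admin/collections?" + urllib.parse.urlencode(params)


def create_collections(base_url, count, prefix="bench_", timeout=180):
    """Create `count` empty collections one by one; return the names that failed."""
    failed = []
    for i in range(count):
        name = f"{prefix}{i:04d}"  # hypothetical naming scheme
        url = create_collection_url(base_url, name)
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                resp.read()
        except OSError as exc:  # URLError, HTTPError and socket timeouts are OSErrors
            failed.append(name)
            print(f"{name}: {exc}")
    return failed


if __name__ == "__main__":
    # Port 8983 is the first node started above; 2500 is the target from the report.
    create_collections("http://localhost:8983/solr", 2500)
```

Recording the failed names rather than stopping at the first error makes it easier to see at which collection count the cluster starts rejecting requests.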
> After about 2000 collections were created, Solr hung and REST requests started failing. I found the following entry in the logs, which I could relate to the failed REST request. For further logs, please see the attachment of this issue.
> null:org.apache.solr.common.SolrException: Could not fully create collection: FOOBAR
> 	at org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:266)
> 	at org.apache.solr.handler.admin.CollectionsHandler.handleRequestBody(CollectionsHandler.java:197)
> 	at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:155)
> 	at org.apache.solr.servlet.HttpSolrCall.handleAdminRequest(HttpSolrCall.java:658)
> 	at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:441)
> 	at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:229)
> 	at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:184)
> 	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1668)
> 	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:581)
> 	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
> 	at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
> 	at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
> 	at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1160)
> 	at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:511)
> 	at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
> 	at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1092)
> 	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
> 	at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
> 	at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
> 	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
> 	at org.eclipse.jetty.server.Server.handle(Server.java:518)
> 	at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:308)
> 	at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:244)
> 	at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
> 	at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
> 	at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
> 	at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceAndRun(ExecuteProduceConsume.java:246)
> 	at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:156)
> 	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:654)
> 	at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:572)
> 	at java.lang.Thread.run(Thread.java:745)
> After the affected Solr instance failed to recover, I decided to restart the whole cluster (using the official solr stop and start commands). Unfortunately, after this, at least one node remained spinning in ZooKeeper logic, creating more than four thousand (!!) threads.
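
The report does not say how the thread count was measured. On Linux (the reported environment), one way to check it for a given Solr JVM is to read the "Threads:" field of /proc/<pid>/status. The sketch below assumes the process ID of the node is already known; the function names are hypothetical.

```python
def parse_thread_count(status_text):
    """Return the value of the 'Threads:' field from /proc/<pid>/status text, or None."""
    for line in status_text.splitlines():
        if line.startswith("Threads:"):
            return int(line.split(":", 1)[1].strip())
    return None


def thread_count(pid):
    """Read the current thread count of a running process (Linux only)."""
    with open(f"/proc/{pid}/status") as fh:
        return parse_thread_count(fh.read())
```

Calling thread_count(pid) repeatedly while the node restarts would show whether the number of threads keeps growing or levels off.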



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org