Posted to dev@lucene.apache.org by "Shalin Shekhar Mangar (JIRA)" <ji...@apache.org> on 2018/09/06 03:54:00 UTC

[jira] [Commented] (SOLR-12748) OOM on Solr cloud example startup on master

    [ https://issues.apache.org/jira/browse/SOLR-12748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16605216#comment-16605216 ] 

Shalin Shekhar Mangar commented on SOLR-12748:
----------------------------------------------

Attaching console logs from both nodes. The console logs also have the full thread dump.
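For reference, a thread dump like the ones in these console logs can be captured from a hung node roughly as follows (a sketch; the pgrep pattern assumes a stock example install and may need adjusting):

```shell
# Capture a full thread dump from each running Solr JVM. SIGQUIT (kill -3)
# makes the JVM print the dump to its console log; jstack writes the same
# dump to a file instead, which is useful when the console is not captured.
for pid in $(pgrep -f "solr.*start.jar"); do
  kill -3 "$pid"                            # dump lands in the node's console log
  jstack "$pid" > "threaddump-${pid}.txt"   # or capture it to a file
done
```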

> OOM on Solr cloud example startup on master
> -------------------------------------------
>
>                 Key: SOLR-12748
>                 URL: https://issues.apache.org/jira/browse/SOLR-12748
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: SolrCloud
>            Reporter: Shalin Shekhar Mangar
>            Priority: Major
>             Fix For: master (8.0), 7.5
>
>         Attachments: logs.tar.gz
>
>
> I was testing a fix for SOLR-11966 on master with SHA {{89d598e9e891e87825d45aabea45e708a51ba860}}. My local state was admittedly dirty, having cycled through many full cluster restarts, including running embedded ZK, then pointing Solr to an external ZK, and cycling back again to embedded ZK. On one such restart, after pointing Solr back to embedded ZK from the external one, the start script hung for a while on the soft-commit step and then Solr ran out of memory.
> I ran through the following commands during these tests:
> {code}
> cd solr
> ant server
> ./bin/solr -e cloud -noprompt
> ./bin/solr stop -all
> ./bin/solr -e cloud -noprompt
> ./bin/solr stop -all
> ./bin/solr -e cloud -noprompt
> ./bin/solr stop -all
> ./bin/solr -c -p 8983 -s "example/cloud/node1/solr/" -z localhost:2181
> ./bin/solr stop -all
> ./bin/solr -c -p 8983 -s "example/cloud/node1/solr/" -z localhost:2181
> ./bin/solr -c -p 7574 -s "example/cloud/node2/solr/" -z localhost:2181
> ./bin/solr stop -all
> ./bin/solr start -e cloud -noprompt
> {code}
> {code}
> *** [WARN] *** Your open file limit is currently 32768.  
>  It should be set to 65000 to avoid operational disruption. 
>  If you no longer wish to see this warning, set SOLR_ULIMIT_CHECKS to false in your profile or solr.in.sh
> INFO  - 2018-09-06 08:51:41.989; org.apache.solr.util.configuration.SSLCredentialProviderFactory; Processing SSL Credential Provider chain: env;sysprop
> Welcome to the SolrCloud example!
> Starting up 2 Solr nodes for your example SolrCloud cluster.
> Solr home directory /home/shalin/work/oss/lucene-solr/solr/example/cloud/node1/solr already exists.
> /home/shalin/work/oss/lucene-solr/solr/example/cloud/node2 already exists.
> Starting up Solr on port 8983 using command:
> "bin/solr" start -cloud -p 8983 -s "example/cloud/node1/solr"
> *** [WARN] *** Your open file limit is currently 32768.  
>  It should be set to 65000 to avoid operational disruption. 
>  If you no longer wish to see this warning, set SOLR_ULIMIT_CHECKS to false in your profile or solr.in.sh
> Waiting up to 180 seconds to see Solr running on port 8983 [|]  
> Started Solr server on port 8983 (pid=20475). Happy searching!
>     
> Starting up Solr on port 7574 using command:
> "bin/solr" start -cloud -p 7574 -s "example/cloud/node2/solr" -z localhost:9983
> *** [WARN] *** Your open file limit is currently 32768.  
>  It should be set to 65000 to avoid operational disruption. 
>  If you no longer wish to see this warning, set SOLR_ULIMIT_CHECKS to false in your profile or solr.in.sh
> Waiting up to 180 seconds to see Solr running on port 7574 [|]  
> Started Solr server on port 7574 (pid=20843). Happy searching!
> INFO  - 2018-09-06 08:51:50.156; org.apache.solr.common.cloud.ConnectionManager; zkClient has connected
> INFO  - 2018-09-06 08:51:50.178; org.apache.solr.common.cloud.ZkStateReader; Updated live nodes from ZooKeeper... (0) -> (2)
> INFO  - 2018-09-06 08:51:50.197; org.apache.solr.client.solrj.impl.ZkClientClusterStateProvider; Cluster at localhost:9983 ready
> Collection 'gettingstarted' already exists! Skipping collection creation step.
> Enabling auto soft-commits with maxTime 3 secs using the Config API
> POSTing request to Config API: http://localhost:8983/solr/gettingstarted/config
> {"set-property":{"updateHandler.autoSoftCommit.maxTime":"3000"}}
> ^CJava HotSpot(TM) 64-Bit Server VM warning: Exception java.lang.OutOfMemoryError occurred dispatching signal SIGINT to handler- the VM may need to be forcibly terminated
> {code}
> After I saw the OOM warning, I used {{kill -3}} to get thread dumps from both Solr processes and then had to kill both processes manually. It is also curious that the OOM killer script did not fire.
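For context, the OOM killer script is wired in at the JVM level: the start script passes a {{-XX:OnOutOfMemoryError}} flag pointing at a handler script that logs and kills the process. A minimal sketch of that wiring (the handler script path, its contents, and the heap size below are assumptions for illustration, not Solr's actual script):

```shell
# Write a minimal OOM handler that logs the pid and kills the process,
# mimicking what an OnOutOfMemoryError hook script typically does.
cat > /tmp/oom_handler.sh <<'EOF'
#!/bin/sh
echo "OOM handler fired for pid $1" >> /tmp/oom_handler.log
kill -9 "$1"
EOF
chmod +x /tmp/oom_handler.sh

# The JVM invokes the handler when an OutOfMemoryError is thrown, e.g.:
#   java -Xmx16m -XX:OnOutOfMemoryError="/tmp/oom_handler.sh %p" -jar start.jar
# (%p expands to the JVM's pid)
```

One known caveat with this mechanism is that the hook runs by forking from an already memory-starved JVM, so under severe pressure the handler itself can fail to launch, which may explain the script not firing here.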



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org