Posted to dev@lucene.apache.org by "Per Steffensen (Updated) (JIRA)" <ji...@apache.org> on 2012/03/26 13:06:27 UTC

[jira] [Updated] (SOLR-3273) 404 Not Found on action=PREPRECOVERY

     [ https://issues.apache.org/jira/browse/SOLR-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Per Steffensen updated SOLR-3273:
---------------------------------

    Description: 
We have an application based on a recent copy of 4.0-SNAPSHOT. We have a performance test setup that exercises our application (and therefore, indirectly, Solr(Cloud)). When we run the performance test against a SolrCloud setup without replication, everything runs very nicely for days. When we add replication to the setup, the same performance test shows some problems, which we will report (and maybe help fix) in separate issues here in JIRA.

About the setup: it is a little more complex than described below, but I believe the description covers enough.
We have two Solr servers, which we start from <solr-install>/example using this command (the ZooKeepers have been started beforehand). We first start Solr on server1, and then start Solr on server2 after Solr on server1 has finished starting up:
{code}
nohup java -Xmx4096m -Dcom.sun.management.jmxremote -DzkHost=server1:2181,server2:2181,server3:2181 -Dbootstrap_confdir=./myapp/conf -Dcollection.configName=myapp_conf -Dsolr.solr.home=./myapp -Djava.util.logging.config.file=logging.properties -jar start.jar >./myapp/logs/stdout.log 2>./myapp/logs/stderr.log &
{code}
The ./myapp/solr.xml looks like this on server1:
{code:xml}
<?xml version="1.0" encoding="UTF-8" ?>
<solr persistent="false">
  <cores adminPath="/admin/myapp" host="server1" hostPort="8983" hostContext="solr">
    <core name="collA_slice1_shard1" instanceDir="." dataDir="collA_slice1_data" collection="collA" shard="slice1" />
  </cores>
</solr>
{code}
The ./myapp/solr.xml looks like this on server2:
{code:xml}
<?xml version="1.0" encoding="UTF-8" ?>
<solr persistent="false">
  <cores adminPath="/admin/myapp" host="server2" hostPort="8983" hostContext="solr">
    <core name="collA_slice1_shard2" instanceDir="." dataDir="collA_slice1_data" collection="collA" shard="slice1" />
  </cores>
</solr>
{code}

The first thing we observe is that Solr server1 (running collA_slice1_shard1) seems to start up nicely, but when Solr server2 (running collA_slice1_shard2) is started later, it quickly reports the following in its solr.log and keeps doing so for a long time:
{code}
SEVERE: Error while trying to recover:org.apache.solr.common.SolrException: Not Found

request: http://server1:8983/solr/admin/cores?action=PREPRECOVERY&core=collA_slice1_shard1&nodeName=server2%3A8983_solr&coreNodeName=server2%3A8983_solr_collA_slice1_shard2&state=recovering&checkLive=true&pauseFor=6000&wt=javabin&version=2
        at org.apache.solr.common.SolrExceptionPropagationHelper.decodeFromMsg(SolrExceptionPropagationHelper.java:40)
        at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:445)
        at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:264)
        at org.apache.solr.cloud.RecoveryStrategy.sendPrepRecoveryCmd(RecoveryStrategy.java:188)
        at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:285)
        at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:206)
{code}

Please note that we have slightly changed the way errors are logged, but basically this means that Solr server2 gets a "404 Not Found" on its request "http://server1:8983/solr/admin/cores?action=PREPRECOVERY&core=collA_slice1_shard1&nodeName=server2%3A8983_solr&coreNodeName=server2%3A8983_solr_collA_slice1_shard2&state=recovering&checkLive=true&pauseFor=6000&wt=javabin&version=2" to Solr server1.
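
As a reading aid, the failing request can be split into its target and parameters. The following is only a minimal sketch in Python using the standard library; the helper name describe_request is hypothetical and is not part of Solr or SolrJ. The URL is copied from the log excerpt above.

```python
from urllib.parse import urlsplit, parse_qs

# Request URL copied from the log excerpt in this report.
url = ("http://server1:8983/solr/admin/cores?action=PREPRECOVERY"
       "&core=collA_slice1_shard1&nodeName=server2%3A8983_solr"
       "&coreNodeName=server2%3A8983_solr_collA_slice1_shard2"
       "&state=recovering&checkLive=true&pauseFor=6000&wt=javabin&version=2")


def describe_request(url):
    """Hypothetical helper: return (host:port, path, decoded query parameters)."""
    parts = urlsplit(url)
    # parse_qs percent-decodes values, so %3A becomes ':'.
    params = {name: values[0] for name, values in parse_qs(parts.query).items()}
    return parts.netloc, parts.path, params


host, path, params = describe_request(url)
print(host, path)
for name, value in params.items():
    print(f"  {name} = {value}")
```

Decoded this way, the request is addressed to the path /solr/admin/cores on server1, and nodeName and coreNodeName identify server2 and its core collA_slice1_shard2 as the sender.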

It seems the Solr servers do not agree on where to send these requests and where to listen for them.
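
One possible reading, which is an assumption and has not been confirmed in this report: the solr.xml on server1 sets adminPath="/admin/myapp", while the failing request is sent to /admin/cores. The sketch below, in Python with only the standard library, compares the two paths. The helper names are hypothetical, and the idea that the core admin handler is reachable at hostContext plus adminPath is an assumption about how this configuration is interpreted.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

# solr.xml of server1, copied from this report.
SOLR_XML_SERVER1 = """<?xml version="1.0" encoding="UTF-8" ?>
<solr persistent="false">
  <cores adminPath="/admin/myapp" host="server1" hostPort="8983" hostContext="solr">
    <core name="collA_slice1_shard1" instanceDir="." dataDir="collA_slice1_data" collection="collA" shard="slice1" />
  </cores>
</solr>
"""

# Path portion of the failing request, copied from the log excerpt.
REQUEST_URL = "http://server1:8983/solr/admin/cores?action=PREPRECOVERY"


def configured_admin_path(solr_xml):
    """Hypothetical helper: '/<hostContext><adminPath>' from a solr.xml string.

    Assumption: the core admin handler listens at this path.
    """
    cores = ET.fromstring(solr_xml.encode("utf-8")).find("cores")
    return "/" + cores.get("hostContext").strip("/") + cores.get("adminPath")


def requested_admin_path(url):
    """Hypothetical helper: the path the recovering node actually called."""
    return urlsplit(url).path


configured = configured_admin_path(SOLR_XML_SERVER1)
requested = requested_admin_path(REQUEST_URL)
print("configured:", configured)
print("requested: ", requested)
print("mismatch:  ", configured != requested)
```

If that assumption holds, a sender that always targets /admin/cores would miss a receiver configured with a different adminPath, which would be consistent with the 404 described above.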

Regards, Per Steffensen

  was:
We have an application based on a recent copy of 4.0-SNAPSHOT. We have a performance test setup that exercises our application (and therefore, indirectly, Solr(Cloud)). When we run the performance test against a SolrCloud setup without replication, everything runs very nicely for days. When we add replication to the setup, the same performance test shows some problems, which we will report (and maybe help fix) in separate issues here in JIRA.

About the setup: it is a little more complex than described below, but I believe the description covers enough.
We have two Solr servers, which we start from <solr-install>/example using this command (the ZooKeepers have been started beforehand). We first start Solr on server1, and then start Solr on server2 after Solr on server1 has finished starting up:
<pre>
nohup java -Xmx4096m -Dcom.sun.management.jmxremote -DzkHost=server1:2181,server2:2181,server3:2181 -Dbootstrap_confdir=./myapp/conf -Dcollection.configName=myapp_conf -Dsolr.solr.home=./myapp -Djava.util.logging.config.file=logging.properties -jar start.jar >./myapp/logs/stdout.log 2>./myapp/logs/stderr.log &
</pre>
The ./myapp/solr.xml looks like this on server1:
<pre>
<?xml version="1.0" encoding="UTF-8" ?>
<solr persistent="false">
  <cores adminPath="/admin/myapp" host="server1" hostPort="8983" hostContext="solr">
    <core name="collA_slice1_shard1" instanceDir="." dataDir="collA_slice1_data" collection="collA" shard="slice1" />
  </cores>
</solr>
</pre>
The ./myapp/solr.xml looks like this on server2:
<pre>
<?xml version="1.0" encoding="UTF-8" ?>
<solr persistent="false">
  <cores adminPath="/admin/myapp" host="server2" hostPort="8983" hostContext="solr">
    <core name="collA_slice1_shard2" instanceDir="." dataDir="collA_slice1_data" collection="collA" shard="slice1" />
  </cores>
</solr>
</pre>

The first thing we observe is that Solr server1 (running collA_slice1_shard1) seems to start up nicely, but when Solr server2 (running collA_slice1_shard2) is started later, it quickly reports the following in its solr.log and keeps doing so for a long time:
<pre>
SEVERE: Error while trying to recover:org.apache.solr.common.SolrException: Not Found

request: http://server1:8983/solr/admin/cores?action=PREPRECOVERY&core=collA_slice1_shard1&nodeName=server2%3A8983_solr&coreNodeName=server2%3A8983_solr_collA_slice1_shard2&state=recovering&checkLive=true&pauseFor=6000&wt=javabin&version=2
        at org.apache.solr.common.SolrExceptionPropagationHelper.decodeFromMsg(SolrExceptionPropagationHelper.java:40)
        at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:445)
        at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:264)
        at org.apache.solr.cloud.RecoveryStrategy.sendPrepRecoveryCmd(RecoveryStrategy.java:188)
        at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:285)
        at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:206)
</pre>

Please note that we have slightly changed the way errors are logged, but basically this means that Solr server2 gets a "404 Not Found" on its request "http://server1:8983/solr/admin/cores?action=PREPRECOVERY&core=collA_slice1_shard1&nodeName=server2%3A8983_solr&coreNodeName=server2%3A8983_solr_collA_slice1_shard2&state=recovering&checkLive=true&pauseFor=6000&wt=javabin&version=2" to Solr server1.

It seems the Solr servers do not agree on where to send these requests and where to listen for them.

Regards, Per Steffensen

    
> 404 Not Found on action=PREPRECOVERY
> ------------------------------------
>
>                 Key: SOLR-3273
>                 URL: https://issues.apache.org/jira/browse/SOLR-3273
>             Project: Solr
>          Issue Type: Bug
>          Components: SolrCloud
>    Affects Versions: 4.0
>         Environment: Any
>            Reporter: Per Steffensen
