Posted to solr-user@lucene.apache.org by "Pouliot, Scott" <Sc...@peoplefluent.com> on 2017/06/27 14:22:50 UTC

Master/Slave out of sync

Hey guys...

Has anyone else had a master/slave setup get out of sync and stay that way until the core is optimized or Solr is restarted?  It has been happening more and more frequently lately, and I'm looking for a solution.  We're running Solr 6.2 on these instances under Jetty.

I do see log entries like the following at the moment, but the problem has also occurred WITHOUT these errors in the past.  As far as I can tell, this error just means the master core was still loading when the slave tried to replicate:

2017-06-23 00:44:08.624 ERROR (indexFetcher-677-thread-1) [   x:Client1] o.a.s.h.IndexFetcher Master at: http://master:8080/solr/Client1 is not available. Index fetch failed. Exception: Error from server at http://master:8080/solr/Client1: Expected mime type application/octet-stream but got text/html. <html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>
<title>Error 503 {metadata={error-class=org.apache.solr.common.SolrException,root-error-class=org.apache.solr.common.SolrException},msg=SolrCore is loading,code=503}</title>
</head>
<body><h2>HTTP ERROR 503</h2>
<p>Problem accessing /solr/Client1/replication. Reason:
<pre>    {metadata={error-class=org.apache.solr.common.SolrException,root-error-class=org.apache.solr.common.SolrException},msg=SolrCore is loading,code=503}</pre></p>
</body>
</html>

Our setup looks something like this:

Master
                Client Core 1
                Client Core 2
                Client Core 3

Slave
                Client Core 1
                Client Core 2
                Client Core 3

Master Config
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <!-- Replicate on 'startup' and 'commit'. 'optimize' is also a valid value for replicateAfter. -->
    <str name="replicateAfter">startup</str>
    <str name="replicateAfter">commit</str>

    <!-- The default value of reservation is 10 secs. See the documentation below. Normally, you should not need to specify this. -->
    <str name="commitReserveDuration">00:00:10</str>
  </lst>
  <!-- Keep only 1 backup. Using this parameter precludes using the "numberToKeep" request parameter. (Solr3.6 / Solr4.0) -->
  <!-- (For this to work in conjunction with "backupAfter" with Solr 3.6.0, see bug fix https://issues.apache.org/jira/browse/SOLR-3361 ) -->
  <str name="maxNumberOfBackups">1</str>
</requestHandler>


Slave Config
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <!-- Fully qualified URL of the master core. It can also be passed as a request parameter to the fetchindex command. -->
    <str name="masterUrl">http://master:8080/solr/${solr.core.name}</str>

    <!-- Interval at which the slave should poll the master. Format is HH:mm:ss. If this is absent, the slave does not poll automatically,
         but a fetchindex can still be triggered from the admin UI or the HTTP API. -->
    <str name="pollInterval">00:00:45</str>
  </lst>
</requestHandler>
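
Since the slave config comment mentions that a fetchindex can be triggered over HTTP instead of waiting for the poll, here is a minimal Python sketch of building such request URLs. The host "slave:8080" and the helper name replication_url are assumptions made for illustration only; "fetchindex" and "details" are commands of the replication handler, and masterUrl is the request parameter named in the config comment.

```python
from urllib.parse import urlencode


def replication_url(base_url, core, command, **params):
    """Build a URL for a core's /replication handler (hypothetical helper)."""
    query = {"command": command, "wt": "json"}
    query.update(params)  # e.g. masterUrl=... to override the configured master
    return f"{base_url.rstrip('/')}/{core}/replication?{urlencode(query)}"


# Assumed slave host; the core name is taken from the log excerpt.
SLAVE = "http://slave:8080/solr"

# Ask the slave to pull the index now instead of waiting for pollInterval.
print(replication_url(SLAVE, "Client1", "fetchindex"))

# Ask the slave to report its replication state.
print(replication_url(SLAVE, "Client1", "details"))
```

Requesting the fetchindex URL by hand on a slave that looks stale is one way to tell a polling problem apart from a fetch that fails outright.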

Master screenshot: [attachment image001.png not preserved in the archive]

Slave screenshot: [attachment image002.png not preserved in the archive]

RE: Master/Slave out of sync

Posted by "Pouliot, Scott" <Sc...@peoplefluent.com>.
I figured the attachments would get stripped, but it was worth a shot!  They were just screenshots showing that the version numbers on the master and slave don't match.
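
Since the screenshots did not survive, the same comparison can be sketched in text form. This is only a rough Python illustration: it assumes each core's /replication?command=details&wt=json response has already been fetched and parsed, and that it carries "indexVersion" and "generation" under a "details" key, which is how I understand the response to be laid out. The function names and the sample numbers are made up.

```python
def replication_state(details_response):
    """Return (indexVersion, generation) from a parsed details response.

    The key names are an assumption about the response layout.
    """
    details = details_response["details"]
    return int(details["indexVersion"]), int(details["generation"])


def in_sync(master_response, slave_response):
    """True when both cores report the same index version and generation."""
    return replication_state(master_response) == replication_state(slave_response)


# Made-up sample values, only to show the shape of the check.
master = {"details": {"indexVersion": 1498225448624, "generation": 42}}
slave = {"details": {"indexVersion": 1498225400000, "generation": 41}}

print(in_sync(master, slave))
```

Logging both tuples for each core every poll interval would show when the slave stops following the master, which might help line the problem up with entries in the Solr logs.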

Here are the Master/Slave commit settings:

<autoCommit> 
       <maxTime>180000</maxTime> 
       <openSearcher>false</openSearcher> 
</autoCommit>

<autoSoftCommit> 
       <maxTime>60000</maxTime> 
</autoSoftCommit>

-----Original Message-----
From: Erick Erickson [mailto:erickerickson@gmail.com] 
Sent: Tuesday, June 27, 2017 11:17 AM
To: solr-user <so...@lucene.apache.org>
Subject: Re: Master/Slave out of sync

First, attachments are almost always stripped by the mail program, so we can't see anything.

Hmmm, does look odd. What happens if you issue a commit against the slave via a URL? I.e.
http://server:port/solr/core/update?commit=true

And what are the autocommit settings on the slave?

Best,
Erick


Re: Master/Slave out of sync

Posted by Erick Erickson <er...@gmail.com>.
First, attachments are almost always stripped by the mail program, so we
can't see anything.

Hmmm, does look odd. What happens if you issue a commit against the slave
via a URL? I.e.
http://server:port/solr/core/update?commit=true

And what are the autocommit settings on the slave?

Best,
Erick
