Posted to solr-user@lucene.apache.org by csscouter <ti...@verizon.net> on 2012/08/05 13:04:01 UTC

Stopping replication?

Hello, and Help! We've just moved solr (3.3.0) to a new set of servers, and
the slaves are not working. The new servers have the DNS cnames of the
previous servers, and no configuration files have changed. The master shows
index generation 1940, but slaves show generation 1. I have replicateAfter
commit and startup defined, pollInterval is 60 seconds and I've tried
"http://slave_host:port/solr/replication?command=fetchindex" and
"http://slave_host:port/solr/replication?command=enablepoll".

It is worth noting that the admin replication page for the master
(http://master_host:port/solr/admin/replication) shows a title "Solr
Replication <name> Master", but that same page on each of the slave servers
shows "Solr Replication <name>". I seem to recall that the slaves USED TO
say "Solr Replication <name> Slave".

I need to resolve the problem this is causing, and the simplest solution
seems to me to just stop replication. How does one do that (or fix the
problem I'm experiencing)?

Your help is most welcome and appreciated.

Tim H



--
View this message in context: http://lucene.472066.n3.nabble.com/Stopping-replication-tp3999272.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Stopping replication?

Posted by csscouter <ti...@verizon.net>.
Erick,

Thank you for the courtesy of your reply.

I was able to figure out the problem, and for the benefit of the list,
here is the analysis. Judging by the caliber of those on this list, this is
likely too basic for the interests of most, but newbies (among whom I still
classify myself) might benefit. Here's what occurred:

Recall that the version I'm using is 3.3. I don't know if these comments can
extend to versions other than 3.3, but I suspect so.

I noted in my initial "plea": /I seem to recall that the slaves USED TO
say "Solr Replication <name> Slave"./ It turns out that is indeed the case, and
that was a clue that they weren't being recognized as slave servers. The
file solrconfig.xml contains the configuration setup for replication,
under the entry <requestHandler name="/replication" ...> ... </requestHandler>.
A slave "knows" it's a slave by the following entry:


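<lst name="slave">
  <str name="enable">true</str>
  <str name="masterUrl">http://<host>:<port>/<solr home location, in my case
'apache-solr-3.3.0'>/replication</str>
  <str name="pollInterval">00:00:60</str>
</lst>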


The key here is the line <str name="enable">true</str>. There is at least
one "fancy" way to define "trueness" or "falseness": define the value as a
parameter and pass the parameter's value in to Solr when it starts. The
reason for using this technique is to allow a single solrconfig.xml file to
be deployed to all servers running Solr, and then to configure those servers
as slaves or the master at the time the servers start. (The information on
doing this is in the Solr wiki documentation for Replication at
http://wiki.apache.org/solr/SolrReplication#enable.2BAC8-disable_master.2BAC8-slave_in_a_node,
incidentally.)

In my case, I'm running solr under WebLogic 10.3.2 application server. I had
defined the line:

  <str name="enable">true</str>

as:

  <str name="enable">${org.apache.solr.handler.enable.slave:false}</str>

in my solrconfig.xml, and had been starting the WebLogic managed servers
with the parameter "-Dorg.apache.solr.handler.enable.master=false". Note
that this parameter deals with the *master* and not the slave. This was
working in my existing environment, and despite the fact that no
"-Dorg.apache.solr.handler.enable.slave=true" parameter was being passed in
from WebLogic, the slaves were able to recognize themselves as slaves. In
the new WebLogic environment, this was no longer the case. I don't know why
at this point.
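
For anyone following along, the way such a placeholder resolves can be
sketched in a few lines of Python. This is only an illustration of the
"${property:default}" form, not Solr's actual code; the function name
resolve and the dictionaries standing in for the JVM system properties
(the -D flags) are made up for the example.

```python
import re

# Matches ${name} or ${name:default}, the placeholder form used in solrconfig.xml.
PLACEHOLDER = re.compile(r"\$\{([^:}]+)(?::([^}]*))?\}")


def resolve(text, props):
    """Replace each placeholder with its property value, else its default.

    `props` is a hypothetical stand-in for the JVM system properties
    that would be set with -D flags at startup.
    """
    def substitute(match):
        name, default = match.group(1), match.group(2)
        if name in props:
            return props[name]
        if default is not None:
            return default
        raise KeyError("no value and no default for " + name)

    return PLACEHOLDER.sub(substitute, text)


SLAVE_ENABLE = "${org.apache.solr.handler.enable.slave:false}"

# Only the *master* flag is passed, as in the startup described above.
print(resolve(SLAVE_ENABLE, {"org.apache.solr.handler.enable.master": "false"}))

# The *slave* flag is passed explicitly.
print(resolve(SLAVE_ENABLE, {"org.apache.solr.handler.enable.slave": "true"}))
```

With only the master flag supplied, the slave placeholder falls back to
its default of false, which matches what the new servers showed. It does
not explain why the old environment behaved differently.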

To solve the problem for the short term, I created a separate file for the
slave servers that bypassed the whole parameter-resolution mechanism by
defining that line under the slave configuration in its solrconfig.xml as:

  <str name="enable">true</str>

That, of course, now leaves me with 2 solrconfig.xml files - one for the
master server, and one for the slave servers. My bottom line is that at
least it's now working, people are not being impacted, and I can
troubleshoot the underlying issue at a more leisurely pace.

Hope this helps someone, somewhere. Erick, thanks for taking an interest.

Tim Hibbs




Re: Stopping replication?

Posted by Erick Erickson <er...@gmail.com>.
Thanks for wrapping that up.....

Erick

On Mon, Aug 6, 2012 at 2:52 PM, csscouter <ti...@verizon.net> wrote:
> Erick,
>
> Thank you for the courtesy of your reply.
>
> I was able to figure out the problem, and for the benefit of the list, I
> list the analysis. Judging by the caliber of those on this list, this is
> likely too basic for the interests of most, but newbies (among whom I still
> classify myself) might benefit. Here's what occurred:
>
> Recall that the version I'm using is 3.3. I don't know if these comments can
> extend to versions other than 3.3, but I suspect so.
>
> I noted in my initial "plea": /I seem to recall that the slaves USED TO
>> say "Solr Replication <name> Slave"./ It turns out that is indeed the
>> case, and that was a clue that they weren't being recognized as slave
>> servers. The file solrconfig.xml contains the configuration setup for
>> replication, under the entry "<requestHandler name=&quot;/replication ...
>> &lt;/requestHandler>. A slave "knows" it's a slave by the following entry:
>
> <lst name="slave">
>   <str name="enable">true</str>
>   <str name="masterUrl">http://<host>:<port>/<solr home location, in my case
> 'apache-solr-3.3.0'>/replication</str>
>   <str name="pollInterval">00:00:60</str>
> </lst>
>
> The key here is the line <str name="enable">true</str>. There is at least
> one "fancy" way to define "trueness" or "falseness" - by defining the value
> as a parameter, and passing the resolution to the parameter in to solr when
> it starts. The reason for using this technique is to allow a single
> solrconfig.xml file to be deployed to all servers running solr, and then
> configuring those servers as slaves or the master at the time the servers
> start. (The information on doing this is in the solr wiki documentation for
> Replication at
> http://wiki.apache.org/solr/SolrReplication#enable.2BAC8-disable_master.2BAC8-slave_in_a_node,
> incidentally).
>
> In my case, I'm running solr under WebLogic 10.3.2 application server. I had
> defined the line:
>
>   <str name="enable">true</str>
>
> as:
>
>   <str name="enable">${org.apache.solr.handler.enable.slave:false}</str>
>
> in my solrconfig.xml, and had been starting the WebLogic managed servers
> with the parameter "-Dorg.apache.solr.handler.enable.master=false". Note
> that this parameter deals with the *master* and not the slave. This was
> working in my existing environment, and despite the fact that no
> "-Dorg.apache.solr.handler.enable.slave=true" parameter was being passed in
> from WebLogic, the slaves were able to recognize themselves as slaves. In
> the new WebLogic environment, this was no longer the case. I don't know why
> at this point.
>
> To solve the problem for the short term, I created a separate file for the
> slave servers that bypassed the whole parameter-resolution mechanism by
> defining that line under the slave configuration in its solrconfig.xml as:
>
>   <str name="enable">true</str>
>
> That, of course, now leaves me with 2 solrconfig.xml files - one for the
> master server, and one for the slave servers. My bottom line is that at
> least it's now working, people are not being impacted, and I can
> troubleshoot the underlying issue at a more leisurely pace.
>
> Hope this helps someone, somewhere. Erick, thanks for taking an interest.
>
> Tim Hibbs
>
>
>
> --
> View this message in context: http://lucene.472066.n3.nabble.com/Stopping-replication-tp3999272p3999447.html
> Sent from the Solr - User mailing list archive at Nabble.com.

Re: Stopping replication?

Posted by Erick Erickson <er...@gmail.com>.
First thing: Are you absolutely certain that the slaves are pointing at the
master? My first guess would be that somehow your slaves aren't talking
to the master. Take a look at the log files on one of the slaves. You should
see some information about the replication attempt, and that would be good
info to share.

Second thing I'd try: take a test machine and point it at your master. First
blow away the entire index (rm -rf <solr home>/data/index). Make sure you
remove the index directory itself, not just its contents. See if that works.

Third thing: it's vaguely possible that replication on the master is disabled.
The same screen that had the HTTP commands on it also has some
commands to ensure that the master is willing to do replications. The logs
on the master might help here too.

If none of that makes any difference, could you post the relevant portions
of the logs?
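
In case it helps, here is a small Python sketch that assembles the
replication handler URLs for the usual commands. The host names, the port
8983, the "solr" path and the helper name replication_url are all
placeholders made up for the example. The command names (details,
fetchindex, abortfetch, disablepoll, enablepoll, indexversion,
disablereplication, enablereplication) are the ones listed on the
SolrReplication wiki page, so check them against your version.

```python
from urllib.parse import urlencode

# Commands accepted by the replication handler, grouped by the node they apply to.
SLAVE_COMMANDS = {"details", "fetchindex", "abortfetch", "disablepoll", "enablepoll"}
MASTER_COMMANDS = {"details", "indexversion", "disablereplication", "enablereplication"}


def replication_url(host, port, command, core_path="solr"):
    """Build the URL for one replication command (nothing is sent)."""
    if command not in SLAVE_COMMANDS | MASTER_COMMANDS:
        raise ValueError("unknown replication command: " + command)
    query = urlencode({"command": command})
    return "http://{}:{}/{}/replication?{}".format(host, port, core_path, query)


# Stop a slave from polling, then inspect its view of the master.
# Host names and port are hypothetical placeholders.
print(replication_url("slave_host", 8983, "disablepoll"))
print(replication_url("slave_host", 8983, "details"))
# Ask the master for its latest replicable index version.
print(replication_url("master_host", 8983, "indexversion"))
```

Opening the disablepoll URL on each slave is the closest thing to "just
stop replication"; enablepoll turns polling back on.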

Best
Erick

On Sun, Aug 5, 2012 at 7:04 AM, csscouter <ti...@verizon.net> wrote:
> Hello, and Help! We've just moved solr (3.3.0) to a new set of servers, and
> the slaves are not working. The new servers have the DNS cnames of the
> previous servers, and no configuration files have changed. The master shows
> index generation 1940, but slaves show generation 1. I have replicateAfter
> commit and startup defined, pollInterval is 60 seconds and I've tried
> "http://slave_host:port/solr/replication?command=fetchindex" and
> "http://slave_host:port/solr/replication?command=enablepoll".
>
> It is of note that the admin replication page for the master
> (http://slave_host:port/solr/admin/replication) shows a title "Solr
> Replication <name> Master", but that same page on each of the slave servers
> shows "Solr Replication <name>". I seem to recall that the slaves USED TO
> say "Solr Replication <name> Slave".
>
> I need to resolve the problem this is causing, and the simplest solution
> seems to me to just stop replication. How does one do that (or fix the
> problem I'm experiencing)?
>
> Your help is most welcomed and appreciated.
>
> Tim H
>
>
>
> --
> View this message in context: http://lucene.472066.n3.nabble.com/Stopping-replication-tp3999272.html
> Sent from the Solr - User mailing list archive at Nabble.com.